Hi John,

The documentation says:

    array : pyarrow.Array or pyarrow.ChunkedArray
        A ChunkedArray instead of an Array is returned if:

        - the object data overflowed binary storage.
        - the object's ``__arrow_array__`` protocol method returned a
          chunked array.

Overflowing binary storage means exceeding the 2^31 - 1 byte limit for
BinaryType or StringType/UTF8. We thought this was better than failing,
since the output of pyarrow.array is often used to instantiate a
pyarrow.Table, which will not argue with the ChunkedArray.

Depending on your input data you might hazard a guess as to whether the
overflow will occur, but it will be application-dependent.

- Wes

On Tue, Dec 3, 2019 at 10:51 AM John Muehlhausen <j...@jgm.org> wrote:
>
> Given input data and a type, how do we predict whether array() will produce
> ChunkedArray?
>
> I figure the formula involves:
> - the length of input
> - the type, and max length (to be conservative) for variable length types
> - some constant(s) that Arrow knows internally... that may change in the
> future?
>
> Should there be an API to make this easy? Am I missing one that already
> exists?
>
> Thanks,
> John
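
For the "guess" mentioned above, here is a minimal sketch of a conservative estimate for string or binary input. It is plain Python and does not call into Arrow. The helper names (estimated_data_bytes, may_overflow, may_overflow_upper_bound) are hypothetical, and the only constant used is the 2^31 - 1 byte limit quoted from the documentation. The exact point at which Arrow starts a new chunk is an assumption here and may differ, so treat a result near the limit as "could go either way".

```python
# Limit quoted in the documentation for BinaryType / StringType data.
BINARY_LIMIT_BYTES = 2**31 - 1


def estimated_data_bytes(values):
    """Sum the payload size of str/bytes values; None entries count as 0."""
    total = 0
    for v in values:
        if v is None:
            continue
        # Strings are measured by their UTF-8 encoded length.
        total += len(v.encode("utf-8")) if isinstance(v, str) else len(v)
    return total


def may_overflow(values, limit=BINARY_LIMIT_BYTES):
    """Guess whether the combined payload exceeds the limit (assumed boundary)."""
    return estimated_data_bytes(values) > limit


def may_overflow_upper_bound(n_values, max_value_bytes, limit=BINARY_LIMIT_BYTES):
    """Conservative check from the input length and a maximum value length."""
    return n_values * max_value_bytes > limit
```

Whatever the estimate says, the result of pyarrow.array can still be checked after the fact with isinstance(result, pyarrow.ChunkedArray).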