+1 (non-binding)

On Thu, Apr 25, 2019 at 1:58 PM Philipp Moritz <[email protected]> wrote:
> +1 (binding)
>
> On Thu, Apr 25, 2019 at 1:34 PM Wes McKinney <[email protected]> wrote:
>
> > +1 (binding)
> >
> > On Thu, Apr 25, 2019 at 3:33 PM Wes McKinney <[email protected]> wrote:
> > >
> > > In a recent mailing list discussion [1] Micah Kornfield has proposed
> > > to add new list and variable-size binary and unicode types to the
> > > Arrow columnar format with 64-bit signed integer offsets, to be used
> > > in addition to the existing 32-bit offset varieties. These will be
> > > implemented as new types in the Type union in Schema.fbs (the
> > > particular names can be debated in the PR that implements them):
> > >
> > > LargeList
> > > LargeBinary
> > > LargeString [UTF8]
> > >
> > > While very large contiguous columns are not a principal use case for
> > > the columnar format, it has been observed empirically that there are
> > > applications that use the format to represent datasets where
> > > realizations of data can sometimes exceed the 2^31 - 1 "capacity" of a
> > > column and cannot be easily (or at all) split into smaller chunks.
> > >
> > > Please vote whether to accept the changes. The vote will be open for at
> > > least 72 hours.
> > >
> > > [ ] +1 Accept the additions to the columnar format
> > > [ ] +0
> > > [ ] -1 Do not accept the changes because...
> > >
> > > Thanks,
> > > Wes
> > >
> > > [1]:
> > > https://lists.apache.org/thread.html/8088eca21b53906315e2bbc35eb2d246acf10025b5457eccc7a0e8a3@%3Cdev.arrow.apache.org%3E
