ARROW-6837 (which, er, includes ARROW-6836) and ARROW-5916 have PRs.

Would appreciate some feedback.  I will finish the Python part of 6837 when
I know I'm on the right track.

Thanks,
John

On Thu, Oct 10, 2019 at 9:54 AM John Muehlhausen <j...@jgm.org> wrote:

> The format change is ARROW-6836 ... add a custom_metadata:[KeyValue] field
> to the Footer table in File.fbs
>
> The other change (slicing a recordbatch to honor RecordBatch.length rather
> than array length if the former is smaller) will hopefully not affect the
> format.
>
>
> On Wed, Oct 9, 2019 at 11:55 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
>> Hi John,
>>
>> Since the 1.0.0 release is focused on Format stability, probably the
>> only real "blockers" will be ensuring that we have hardened multiple
>> implementations (in particular C++ and Java) of the columnar format as
>> specified with integration tests to prove it. The issues you listed
>> sound more like C++ library changes to me?
>>
>> If you want to propose Format-related changes, that would need to
>> happen right away; otherwise the ship will sail on that.
>>
>> - Wes
>>
>> On Wed, Oct 9, 2019 at 9:08 PM John Muehlhausen <j...@jgm.org> wrote:
>> >
>> > ARROW-5916
>> > ARROW-6836/6837
>> >
>> > These are of particular interest to me because they enable recordbatch
>> > "incrementalism" which is useful for streaming applications:
>> >
>> > ARROW-5916 allows a recordbatch to pre-allocate space for future records
>> > that have not yet been populated, making it safe for readers to consume
>> > the partial batch.
>> >
>> > ARROW-6836/6837 allows a file of record batches to be extended at the
>> > end, without re-writing the beginning, while including the idea that the
>> > custom_metadata may change with each update.  (custom_metadata in the
>> > Schema is not a good candidate because Schema also appears at the
>> > beginning of the file.)
>> >
>> > While these are not blockers for me quite yet, they soon will be!  If I
>> > wanted to ensure that these are in 1.0, what is my deadline for
>> > implementation and test cases?  Can such a note be made on the wiki?
>> > Should I change the priority in Jira?
>> >
>> > Thanks,
>> > John
>> >
>> > On Wed, Oct 9, 2019 at 2:57 PM Neal Richardson
>> > <neal.p.richard...@gmail.com> wrote:
>> >
>> > > Congratulations everyone on 0.15! I know a lot of hard work went into
>> > > it, not only in the software itself but also in the build and release
>> > > process.
>> > >
>> > > Once you've caught your breath from the release, we should start
>> > > thinking about what's in scope for our next release, the big 1.0. To
>> > > get us started (or restarted, since we did discuss 1.0 before the
>> > > flatbuffer alignment issue came up), I've created
>> > > https://cwiki.apache.org/confluence/display/ARROW/Arrow+1.0.0+Release
>> > > based on our past release wiki pages.
>> > >
>> > > A good place to begin would be to list, either in "blocker" Jiras or
>> > > bullet points on the document, the key features and tasks we must
>> > > resolve before 1.0. For example, I get the sense that we need to
>> > > overhaul the documentation, but that should be expressed in a more
>> > > concrete, actionable way.
>> > >
>> > > Neal
>> > >
>>
>
