I just filed https://issues.apache.org/jira/browse/ARROW-9362 and plan
to work on it tomorrow.

David

On 7/7/20, Wes McKinney <wesmck...@gmail.com> wrote:
> I don't recall a ticket for the Java work but you're certainly a good
> candidate to take the lead on it.
>
> On Tue, Jul 7, 2020 at 3:16 PM David Li <li.david...@gmail.com> wrote:
>>
>> I see there's ARROW-9258 to do the backwards compatibility work for
>> C++ and ARROW-9333 to expose this for Python; is there any ticket or
>> anyone planning on doing this for Java? Otherwise I'm willing to look
>> at it so that we can do some testing with Flight.
>>
>> Best,
>> David
>>
>> On 6/29/20, Wes McKinney <wesmck...@gmail.com> wrote:
>> > Thanks David. Indeed it seems that exposing IpcWriteOptions is going
>> > to be critical here. I'd like to avoid an "environment variable"
>> > workaround at the C++ level, instead only providing such things in e.g.
>> > Python, like we did for the alignment patch.
>> >
>> > On Mon, Jun 29, 2020 at 9:30 AM David Li <li.david...@gmail.com> wrote:
>> >>
>> >> This would cause compatibility issues for Flight servers/clients
>> >> between versions as well. The situation is a little worse since
>> >> IpcWriteOptions isn't exposed and so you can't control what version
>> >> you write. But just exposing them in lieu of a full negotiation (which
>> >> we should start thinking about) should be enough to work through this.
>> >>
>> >> I see there's https://issues.apache.org/jira/browse/ARROW-8190 so I'll
>> >> try to tackle this soon (and do the same for Java) since it should be
>> >> independent of whether the format change goes through.
>> >>
>> >> Best,
>> >> David
>> >>
>> >> On 6/28/20, Wes McKinney <wesmck...@gmail.com> wrote:
>> >> > I opened a PR https://github.com/apache/arrow/pull/7566
>> >> >
>> >> > We should prioritize getting through the other format changes, but we
>> >> > can vote on this in the meantime if there is consensus
>> >> >
>> >> > On Fri, Jun 26, 2020 at 2:58 PM Micah Kornfield
>> >> > <emkornfi...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> I agree; I think we have to do this given the number of changes in
>> >> >> flight (especially union types).
>> >> >>
>> >> >> On Fri, Jun 26, 2020 at 7:29 AM Wes McKinney <wesmck...@gmail.com>
>> >> >> wrote:
>> >> >>
>> >> >> > I created a JIRA about this
>> >> >> >
>> >> >> > https://issues.apache.org/jira/browse/ARROW-9231
>> >> >> >
>> >> >> > This issue is quite important so please take a look.
>> >> >> >
>> >> >> > On Thu, Jun 25, 2020 at 8:53 AM Wes McKinney
>> >> >> > <wesmck...@gmail.com>
>> >> >> > wrote:
>> >> >> > >
>> >> >> > > On Thu, Jun 25, 2020 at 5:31 AM Antoine Pitrou
>> >> >> > > <anto...@python.org>
>> >> >> > wrote:
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > Le 25/06/2020 à 12:18, Antoine Pitrou a écrit :
>> >> >> > > > >
>> >> >> > > > > Le 25/06/2020 à 00:40, Wes McKinney a écrit :
>> >> >> > > > >> hi folks,
>> >> >> > > > >>
>> >> >> > > > >> This has come up in some other contexts, but I believe it would be a
>> >> >> > > > >> good idea to increment the version number in Schema.fbs starting with
>> >> >> > > > >> 1.0.0 to separate the pre-1.0 and post-1.0 worlds
>> >> >> > > > >>
>> >> >> > > > >> https://github.com/apache/arrow/blob/master/format/Schema.fbs#L22
>> >> >> > > > >>
>> >> >> > > > >> Given that we are contemplating a number of changes to assist with
>> >> >> > > > >> forward compatibility and a breaking serialization change for unions,
>> >> >> > > > >> this would seem prudent so that we do not risk breaking compatibility
>> >> >> > > > >> with 0.17.1 and prior.
>> >> >> > > > >>
>> >> >> > > > >> Given that there are no major backwards incompatibilities, there
>> >> >> > > > >> should be no problem with 1.0.0 readers reading data generated by
>> >> >> > > > >> libraries <= 0.17.1.
>> >> >> > > > >
>> >> >> > > > > Actually, it seems that a dense array with top-level null values
>> >> >> > > > > (represented in 0.17.1 fashion) would need non-trivial rewriting of its
>> >> >> > > > > offsets and child arrays (at least one child array) to represent the
>> >> >> > > > > nulls at the child level.
>> >> >> > > > >
>> >> >> > > > > This is unless we keep the top-level union null bitmap in C++ and only
>> >> >> > > > > avoid emitting it on the IPC side.  Which would be a slightly weird
>> >> >> > > > > arrangement, but would limit incompatibilities on the C++ API side.
>> >> >> > > >
>> >> >> > > > Actually, if we do this, the same problem will appear on the IPC write
>> >> >> > > > side (C++-created dense union arrays with a top-level null bitmap will
>> >> >> > > > need regenerating some of the child buffers).
>> >> >> > >
>> >> >> > > I see. Well, I think we can shut down this issue by giving up on
>> >> >> > > Union forward compatibility with V4 / pre-1.0 libraries.
>> >> >> > >
>> >> >> > > > Regards
>> >> >> > > >
>> >> >> > > > Antoine.
>> >> >> >
>> >> >
>> >
>
