Hi Bryan -- the way things are going, if we were to block the 1.0.0
release on completing the Java work, it could be a very long wait
(more than 6 months from now). I don't think that's acceptable. The
Versioning document was formally adopted last August, so a year will
soon have elapsed since we said we wanted to have everything
integration tested.

With what I'm proposing, the primary things that would not be tested
(if there is no progress in Java) are:

* custom_metadata fields
* Extension Types
* Large (64-bit offset) variable size types
* Delta and Replacement Dictionaries
* Unions

These do not seem like huge sacrifices, or at least not ones that
compromise the stability of the columnar format. Of course, if some of
them are completed in the next 10-12 weeks, then that's great.

- Wes

On Tue, Apr 21, 2020 at 12:12 PM Bryan Cutler <[email protected]> wrote:
>
> I really would like to see a 1.0.0 release with complete implementations
> for C++ and Java. From my experience, that interoperability has been a
> major selling point for the project. That being said, my time for
> contributions has been pretty limited lately and I know that Java has been
> lagging, so if the rest of the community would like to push forward with a
> reduced scope, that is okay with me. I'll still continue to do what I can
> on Java to fill in the gaps.
>
> Bryan
>
> On Tue, Apr 21, 2020 at 8:47 AM Wes McKinney <[email protected]> wrote:
>
> > Hi all -- are there some opinions about this?
> >
> > Thanks
> >
> > On Thu, Apr 16, 2020 at 5:30 PM Wes McKinney <[email protected]> wrote:
> > >
> > > hi folks,
> > >
> > > Previously we had discussed a plan for making a 1.0.0 release based on
> > > completeness of columnar format integration tests and making
> > > forward/backward compatibility guarantees as formalized in
> > >
> > > https://github.com/apache/arrow/blob/master/docs/source/format/Versioning.rst
> > >
> > > In particular, we wanted to demonstrate comprehensive Java/C++
> > > interoperability.
> > >
> > > As time has passed we have stalled out a bit on completing integration
> > > tests for the "long tail" of data types and columnar format features.
> > >
> > > https://docs.google.com/spreadsheets/d/1Yu68rn2XMBpAArUfCOP9LC7uHb06CQrtqKE5vQ4bQx4/edit?usp=sharing
> > >
> > > As such I wanted to propose a reduction in scope so that we can make a
> > > 1.0.0 release sooner. The plan would be as follows:
> > >
> > > * Endeavor to have integration tests implemented and working in at
> > > least one reference implementation (likely to be the C++ library). It
> > > seems important to verify that what's in Columnar.rst is able to be
> > > unambiguously implemented.
> > > * Indicate in Versioning.rst or another place in the documentation the
> > > list of data types or advanced columnar format features (like
> > > delta/replacement dictionaries) that are not yet fully integration
> > > tested.
> > >
> > > Some of the essential protocol stability details and all of the most
> > > commonly used data types have been stable for a long time now,
> > > particularly after the recent alignment change. The current list of
> > > features that aren't being tested for cross-implementation
> > > compatibility should not pose risk to downstream users.
> > >
> > > Thoughts about this? The 1.0.0 release is an important milestone for
> > > the project and will help build continued momentum in developer and
> > > user community growth.
> > >
> > > Thanks
> > > Wes
> >
