I think it is worthwhile to pursue this, but I fear that if we do not
proceed very carefully, unforeseen complications could arise, creating even
more work for the release managers.

> In general I think that this is not something we either need to or want
> to implement from 0 to 100.
> Incrementally evolving and evaluating our process is key for such a core
> change.
I strongly agree with this. Perhaps we could start with just one of the
"easier" cases (such as C# or JS), and use that as a "pilot project" of
sorts.

> C/GLib, Python, and Ruby are all tightly coupled to C++ at the moment and
> should not be a first priority.
I agree. I think it would be best to hold off on Java also, in part because
of how the Java docs are integrated with the C++ and Python docs and
controlled by the version selector menu.

Ian

On Tue, Apr 9, 2024 at 4:45 AM David Li <lidav...@apache.org> wrote:

> Java has JNI parts, but I think they do not necessarily need to release at
> the same time as C++, especially since the JAR bundles the libraries; Java
> could just pick up the latest version of the C++ library whenever it
> releases. It would make it harder if the next step is to also decouple the
> repositories, though.
>
> JB, I see what you're saying but I think we want to avoid declaring a
> "core" Arrow library as the implication is not fair to the independent and
> fully-featured implementations in Go, Rust, etc. But that is just a matter
> of wording.
>
> On Tue, Apr 9, 2024, at 17:06, Jean-Baptiste Onofré wrote:
> > Hi,
> >
> > Yeah, to be honest, I was more focused on Java versioning.
> >
> > Maybe, we can "group" Arrow components in two major areas: the "core"
> > libs and the components using the "core" libs.
> > C++ can have its own versioning, and the rest are decoupled from each
> > other, but they will depend on the C++ release.
> >
> > I think it's doable and probably "cleaner".
> >
> > Regards
> > JB
> >
> > On Mon, Apr 8, 2024 at 3:55 PM Weston Pace <weston.p...@gmail.com> wrote:
> >>
> >> > Probably major versions should match between C++ and PyArrow, but I guess
> >> > we could have diverging minor and patch versions. Or at least patch
> >> > versions, given that a new minor version is usually cut for bug fixes too.
> >>
> >> I believe even this would be difficult.  Stable ABIs are very finicky in
> >> C++.  If the public API surface changes in any way then it can lead to
> >> subtle bugs if pyarrow were to link against an older version.  I also am
> >> not sure there is much advantage in trying to separate pyarrow from
> >> arrow-cpp since they are almost always changing in lockstep (e.g. any
> >> change to arrow-cpp enables functionality in pyarrow).
> >>
> >> I think we should maybe focus on a few more obvious cases.
> >>
> >> I think C#, JS, Java, and Go are the most obvious candidates to decouple.
> >> Even then, we should probably only separate these candidates if they have
> >> willing release managers.
> >>
> >> C/GLib, Python, and Ruby are all tightly coupled to C++ at the moment and
> >> should not be a first priority.  I would have guessed that R is also in
> >> this list but Jacob reported in the original email that they are already
> >> somewhat decoupled?
> >>
> >> I don't know anything about Swift or MATLAB.
> >>
> >> On Mon, Apr 8, 2024 at 6:23 AM Alessandro Molina
> >> <alessan...@voltrondata.com.invalid> wrote:
> >>
> >> > On Sun, Apr 7, 2024 at 3:06 PM Andrew Lamb <al...@influxdata.com> wrote:
> >> >
> >> > >
> >> > > We have had separate releases / votes for Arrow Rust (and Arrow
> >> > > DataFusion) and it has served us quite well. The version schemes have
> >> > > diverged substantially from the monorepo (we are on version 51.0.0 in
> >> > > arrow-rs, for example) and it doesn't seem to have caused any large
> >> > > confusion with users.
> >> > >
> >> > >
> >> > I think that versioning will require additional thinking for libraries
> >> > like PyArrow, Java, etc.
> >> > For Rust this is a non-problem because there is no link to the C++ library.
> >> >
> >> > PyArrow instead is based on what the C++ library provides,
> >> > so there is a direct link between the features provided by C++ in a
> >> > specific version
> >> > and the features provided in PyArrow at a specific version.
> >> >
> >> > More or less, PyArrow 20 should have the same bug fixes that C++ 20 has,
> >> > and diverging the two versions would easily lead to confusion.
> >> > Probably major versions should match between C++ and PyArrow, but I guess
> >> > we could have diverging minor and patch versions. Or at least patch
> >> > versions, given that a new minor version is usually cut for bug fixes too.
> >> >
>
