Thank you for bringing this up. I'm in favor of this. I think there are several motivations but the main ones are:
1. Decoupling the versions will allow components to have no release, or
   only a minor release, when there are no breaking changes.
2. We do have some vote fatigue, I think, and we don't want to make that
   worse.
3. Anything we can do to ease the burden on release managers is good.

If I understand what you are describing, then I think it satisfies points
1 & 2. I am not familiar enough with the release management process to
speak to point 3.

> Voting in one thread on
> all components/a subset of components per voter and the surrounding
> technicalities is something I would like to hear some opinions on.

I am in favor of decoupling the version numbers. I do think batched
quarterly releases are still a good thing to avoid vote fatigue. Perhaps
we can have a single vote on a batch of version numbers (e.g. please vote
on the batched release containing CPP version X, Go version Y, JS version
Z).

> A more meta question is about the messaging that different versioning
> schemes carry, as it might no longer be obvious on first glance which
> versions are compatible or have the newest features.

I am not concerned about this. One of the advantages of Arrow is that we
have a stable C ABI (C Data Interface) and a stable IPC mechanism (IPC
serialization), and this means that version compatibility is rarely a
difficulty or major concern. Plus, regarding individual features, our
solution already requires a compatibility table
(https://arrow.apache.org/docs/status.html). Changing the versioning
strategy will not make this any worse.

On Thu, Mar 28, 2024 at 1:42 PM Jacob Wujciak <assignu...@apache.org> wrote:

> Hello Everyone!
>
> I would like to resurface the discussion of separate
> versioning/releases/voting for monorepo components. We have previously
> touched on this topic mostly in the community meetings and spread across
> multiple, only tangentially related threads.
> I think a focused discussion can
> be a bit more results oriented, especially now that we almost regularly
> deviate from the quarterly release cadence with minor releases. My hope is
> that discussing this and adapting our process can lower the amount of work
> required and ease the pressure on our release managers (Thank you Raúl and
> Kou!).
>
> I think the base of the topic is the separate versioning for components as
> otherwise separate releases only have limited value. From a technical
> perspective standalone implementations like Go or JS are the easiest to
> handle in that regard, they can just follow their ecosystem standards,
> which has been requested by users already (major releases in Go require
> manual editing across a code base as dependencies are usually pinned to a
> major version).
>
> For Arrow C++ bindings like Arrow R and PyArrow having distinct versions
> would require additional work to both enable the use of different versions
> and ensure version compatibility is monitored and potentially updated if
> needed.
>
> For Arrow R we have already implemented these changes for different reasons
> and have backwards compatibility with libarrow >= 13.0.0. From a user
> standpoint of PyArrow this is likely irrelevant as most users get binary
> wheels from PyPI, if a user regularly builds PyArrow from source they are
> also capable of managing potentially different libarrow version
> requirements as this is already necessary to build the package just with an
> exact version match.
>
> A more meta question is about the messaging that different versioning
> schemes carry, as it might no longer be obvious on first glance which
> versions are compatible or have the newest features. Though I would argue
> that this is a marginal concern at best as there is no guarantee of feature
> parity between different components with the same version. Breaking that
> implicit expectation with separate versions could be seen as clearer.
> If a
> component only receives dependency bumps or minor bug fixes, releasing this
> component with a patch version aligns much better with expectations than a
> major version bump. In addition there are already several differently
> versioned libraries in the apache/arrow-* ecosystem that are released
> outside of the monorepo release process. A proper support policy for each
> component would also be required but could just default to 'current major
> release' as it is now.
>
> From an ASF perspective there is no requirement to release the entire
> repository at once as the actual release artifact is the source tarball. As
> long as that is verified and voted on by the PMC it is an official release.
>
> This brings me to the release process and voting. I think it is pretty
> clear that completely decoupling all components and their release processes
> isn't feasible at the moment, mainly from a technical perspective
> (crossbow) and would likely also lead to vote fatigue. We have made efforts
> to make the verification required for the vote easier and will continue
> these efforts. Though I can see some of the components managing their own
> releases (e.g. R, as we do with post release tasks already due to CRAN) a
> continued quarterly 'batch release' seems like a more appealing solution
> and would still allow us to use separate versions. Voting in one thread on
> all components/a subset of components per voter and the surrounding
> technicalities is something I would like to hear some opinions on.
>
> In my opinion being stricter with release requirements for components might
> lead to smaller/less active components not releasing. This seems like a
> bad thing at first glance but might also spur the user community to get
> involved when the reassuring, regular releases dry up and reflect the
> reality of the development situation of the component.
>
> I am eager to hear your thoughts!
>
> Best
> Jacob