I am suggesting we always skip the number. So only one component gets the next one :) In your example Hive trunk would be 2.3, and if SA is released again it would become 2.4. Otherwise we’d need a compat table cause versions will be totally out of sync.
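
To make the shared-sequence idea concrete, here is a minimal Python sketch. It is only an illustration: the class `SharedVersionSequence` and its methods are hypothetical names, not anything that exists in Hive or storage-api, and the `latest_before` lookup is my assumption about how "the order is obvious" might be used in practice.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SharedVersionSequence:
    """Hypothetical helper: one minor-version counter shared by all components."""

    major: int = 2
    next_minor: int = 2  # the thread starts numbering at 2.2
    history: List[Tuple[str, int]] = field(default_factory=list)

    def release(self, component: str) -> str:
        """Give the next unused minor version to `component` and advance the counter."""
        minor = self.next_minor
        self.history.append((component, minor))
        self.next_minor += 1
        return f"{self.major}.{minor}"

    def latest_before(self, component: str, minor: int) -> Optional[str]:
        """Newest release of `component` with a minor version below `minor`, if any."""
        earlier = [m for c, m in self.history if c == component and m < minor]
        return f"{self.major}.{max(earlier)}" if earlier else None


# Release order taken from the earlier example in this thread:
# storage-api 2.2, storage-api 2.3, Hive 2.4, storage-api 2.5.
seq = SharedVersionSequence()
released = [seq.release(c) for c in ("storage-api", "storage-api", "hive", "storage-api")]
print(released)  # ['2.2', '2.3', '2.4', '2.5']

# Which storage-api release precedes Hive 2.4 in the shared sequence?
print(seq.latest_before("storage-api", 4))  # 2.3
```

Each number is used by exactly one component, so the gaps in either component's history are expected, and the ordering across both components can be read straight off the version numbers without a separate compat table.
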
On 16/8/19, 16:31, "Owen O'Malley" <omal...@apache.org> wrote:
>That won't necessarily work, especially in the beginning. If we release SA
>2.2.0 and use it for Hive trunk with the assumption that the next Hive
>release will be 2.2. What do we do when we need to make an incompatible
>change in SA? I guess we could release SA as 2.3.0 and when hive makes its
>next release skip over Hive 2.2 and go straight to Hive 2.3.0. In general I
>think that we'd be better off with the release numbers not tied together.
>
>.. Owen
>
>On Fri, Aug 19, 2016 at 4:14 PM, Sergey Shelukhin <ser...@hortonworks.com>
>wrote:
>
>> Can we just run the versions thru? I.e. increment it every time but
>> release only one component (or both if they happen to align I guess).
>> E.g. storage-api will be released at 2.2, and say 2.3 if it moves fast,
>> then Hive 2.4, then storage-api 2.5, etc.
>> That might make it easier to reason about compatibility because the order
>> is obvious.
>>
>> On 16/8/19, 09:04, "Sergio Pena" <sergio.p...@cloudera.com> wrote:
>>
>> >I see Parquet is currently using the SearchArgument class for predicates
>> >push down.
>> >Will this class be part of the new sub-module or project?
>> >
>> >Following Sushanth idea, can we have other API interfaces in the new
>> >project that other components can use?
>> >Perhaps having this may be a good reason to create a project.
>> >
>> >I'm -1 with the 4th minor version. As Owen mentioned, changing the 4th
>> >version number for incompatible changes is ugly and confusing.
>> >I like the new project idea more, +1, but the storage-api may be too small
>> >for a new project.
>> >
>> >- Sergio
>> >
>> >On Wed, Aug 17, 2016 at 2:05 PM, Owen O'Malley <omal...@apache.org> wrote:
>> >
>> >> On Wed, Aug 17, 2016 at 10:46 AM, Alan Gates <alanfga...@gmail.com> wrote:
>> >>
>> >> > +1 for making the API clean and easy for other projects to work with. A
>> >> > few questions:
>> >> >
>> >> > 1) Would this also make it easier for Parquet and others to implement
>> >> > Hive’s ACID interfaces?
>> >> >
>> >>
>> >> Currently the ACID interfaces haven't been moved over to storage-api,
>> >> although it would make sense to do so at some point.
>> >>
>> >> >
>> >> > 2) Would we make any attempt to coordinate version numbers between Hive
>> >> > and the storage module, or would a given version of Hive just depend on a
>> >> > given version of the storage module?
>> >> >
>> >>
>> >> The two options that I see are:
>> >>
>> >> * Let the numbers run separately starting from 2.2.0.
>> >> * Tie the numbers together with an additional level of versioning (eg.
>> >> 2.2.0.0).
>> >>
>> >> I think that letting the two version numbers diverge is better in the long
>> >> term. For example, if you need to make an incompatible change, it is pretty
>> >> ugly to do it as a fourth level version number (eg. an incompatible change
>> >> from 2.2.0.0 to 2.2.0.1). At the beginning, I expect that storage-api would
>> >> move faster than Hive, but as it stabilizes I expect it might start moving
>> >> slower than Hive.
>> >>
>> >> I'd propose that we have Hive's build use a released version of storage-api
>> >> rather than a snapshot.
>> >>
>> >> Thoughts?
>> >>
>> >> Owen
>> >>
>> >> > Alan.
>> >> >
>> >> > > On Aug 15, 2016, at 17:01, Owen O'Malley <omal...@apache.org> wrote:
>> >> > >
>> >> > > All,
>> >> > >
>> >> > > As part of moving ORC out of Hive, we pulled all of the vectorization
>> >> > > storage and sarg classes into a separate module, which is named
>> >> > > storage-api. Although it is currently only used by ORC, it could be used
>> >> > > by Parquet or Avro if they wanted to make a fast vectorized reader that
>> >> > > read directly in to Hive's VectorizedRowBatch without needing a shim or
>> >> > > data copy. Note that this is in many ways similar to pulling the Arrow
>> >> > > project out of Drill.
>> >> > >
>> >> > > This unfortunately still leaves us with a circular dependency between Hive
>> >> > > and ORC. I'd hoped that storage-api wouldn't change that much, but that
>> >> > > doesn't seem to be happening. As a result, ORC ends up shipping its own
>> >> > > fork of storage-api.
>> >> > >
>> >> > > Although we could make a new project for just the storage-api, I think it
>> >> > > would be better to make it a subproject of Hive that is released
>> >> > > independently.
>> >> > >
>> >> > > What do others think?
>> >> > >
>> >> > > Owen