great idea!

On Tue, Aug 3, 2021 at 8:49 AM Andy Grove <andygrov...@gmail.com> wrote:

> I also like the idea of moving arrow2/parquet2 into the official repos.
> This is effectively what we did with Ballista, which is still experimental.
> Ballista was simpler because it depends on DataFusion rather than the other
> way around, but I like the idea of using feature flags to enable DataFusion
> on arrow2/parquet2.
>
> I don't see any reason why we wouldn't be able to also release
> arrow2/parquet2 with suitable 0.x.x versioning as well (as we plan on doing
> with Ballista) and releasing would be much easier if they are in the
> official repos.
>
>
> On Tue, Aug 3, 2021 at 7:13 AM paddy horan <paddyho...@hotmail.com> wrote:
>
> > Hi Jorge,
> >
> > What do you think about moving Arrow2 into the main Arrow repo where it is
> > only enabled via an "experimental" feature flag? This would allow
> > development of Arrow2 to proceed in the main repo but also this would be a
> > clear signal that Arrow2 is <1.0. When we feel ready (i.e. Arrow2 is 1.0)
> > we can release it in the next main release with Arrow2 being the default
> > and move the existing implementation behind a "legacy" feature flag.
> >
> > Here is why I think this might work well:
> > - People contributing to the Arrow project will naturally contribute to
> > Arrow2. At the moment, some people will still contribute to Arrow instead
> > of Arrow2 just by virtue of it being the "official" implementation.
> > However, if both are in one repo people will want to contribute to the
> > "future", i.e. Arrow2.
> > - the experimental flag will be a clear signal to the existing Arrow
> > community that Arrow2 is the future but that it is <1.0
> > - existing users will be well supported in this transition
> > - In general, I think the longer that development proceeds in separate
> > repos the harder it will be to eventually merge the two in a way that
> > supports existing users.
> >
> > Do you think this would work?
> >
> > Paddy
> >
> > -----Original Message-----
> > From: Jorge Cardoso Leitão <jorgecarlei...@gmail.com>
> > Sent: Monday, August 2, 2021 1:59 PM
> > To: dev@arrow.apache.org
> > Subject: Re: [Discuss] [Rust] Arrow2/parquet2 going foward
> >
> > Hi,
> >
> > Sorry for the delay.
> >
> > If there is a path towards an official release under a <1.0.0 versioning
> > schema aligned with the rest of the Rust ecosystem and in line with the
> > stability of the API, then IMO we should move all development to within
> > Apache experimental asap (I can handle this and the likely IP clearance
> > round). If we require a release >=1.X.Y to it and/or a schedule, then I
> > prefer to keep expectations aligned and postpone any movement.
> >
> > Under the move situation, I was thinking of something as follows:
> >
> > * gradually stop maintaining "arrow" in crates, offering a maintenance
> > window over which we release patches (*)
> > * work towards achieving feature parity on arrow2/parquet2 on the
> > experimental repos.
> > * keep releasing arrow2/parquet2 under a 0.X model during the step above
> > (**)
> > * migrate to arrow-rs and archive experimentals (***)
> > * break arrow2 into smaller crates so that we can version the APIs at a
> > different cadence
> > * once a crate reaches some stability (this is always opinionated, but it
> > is fine), we bump it to 1.0 and announce a maintenance plan à la tokio
> > <https://tokio.rs/blog/2020-12-tokio-1-0>.
> >
> > (*) e.g. "we will continue to patch the arrow crate up to at least 6
> > months starting after the first release of arrow2 that supports
> > a) nested parquet read and write
> > b) union array (including IPC integration tests)
> > c) map array (including IPC integration tests)"
> >
> > (**) officially or un-officially (I would suggest officially so that we
> > can acknowledge everyone's work on it, but no strong feelings)
> >
> > (***) something like:
> > 1. place arrow2 on top of a clear arrow repo so that the full contribution
> > history up to that point is preserved
> > 2. make arrow-rs the home of arrow2 (i.e. we start releasing arrow2 from
> > arrow-rs) and archive the experimental repos; create arrow-rs-parquet or
> > something for parquet2.
> >
> > In summary, the core pain point for me is the current versioning of arrow,
> > which I feel is incompatible with my goals for arrow2 and the ecosystem I
> > envision it supporting :)
> >
> > Best,
> > Jorge
> >
> > On Fri, Jul 30, 2021 at 8:44 PM Wes McKinney <wesmck...@gmail.com> wrote:
> >
> > > I think it would also be fine to push "beta" arrow2 crates out of a
> > > repo under apache/ so long as they are not marked on crates.io as
> > > being Apache-official releases. There's a possible slippery slope
> > > there, but as long as we are on a path to formalizing the releases I
> > > think it is okay.
> > >
> > > On Fri, Jul 30, 2021 at 1:07 PM Andrew Lamb <al...@influxdata.com> wrote:
> > >
> > > > Jorge -- do you feel like we have a resolution on what to do with
> > > > arrow2 in the near term?
> > > >
> > > > The current state of affairs seems to me that arrow2 is released
> > > > from https://github.com/jorgecarleitao/arrow2 to crates.io (which
> > > > is fine). Are you happy with keeping development in the
> > > > jorgecarleitao repo where you will retain maximal control and
> > > > flexibility until it is ready to start integrating?
> > > >
> > > > Or would you prefer to put it into one of the apache repos and
> > > > subject its development and release to the normal Arrow governance
> > > > model (tarball, vote, etc)?
> > > >
> > > > Since you are the primary author/architect I think you should have a
> > > > substantial say at this stage.
> > > >
> > > > Andrew
> > > >
> > > >
> > > > On Tue, Jul 27, 2021 at 7:16 PM Andrew Lamb <al...@influxdata.com> wrote:
> > > >
> > > > > I would be happy with this approach. Thank you for the suggestion
> > > > >
> > > > > This hybrid approach of both arrow and arrow2 in the same repo
> > > > > seems better to me than separate repos.
> > > > >
> > > > > What I really care about is ensuring we don't have two crates/APIs
> > > > > indefinitely -- as long as we are continually making progress
> > > > > towards unification that is what is important to me.
> > > > >
> > > > > Andrew
> > > > >
> > > > > On Tue, Jul 27, 2021 at 1:40 PM Andy Grove <andygrov...@gmail.com> wrote:
> > > > >
> > > > >> Apologies for being late to this discussion.
> > > > >>
> > > > >> There is a hybrid option to consider here where we add the arrow2
> > > > >> code into the arrow crate as a separate module, so we release one
> > > > >> crate containing the "old" API (which we can mark as deprecated)
> > > > >> as well as the new API. Java did a similar thing a long time ago
> > > > >> with "java.io" versus "java.nio" (new IO).
> > > > >>
> > > > >> I agree that the versioning wouldn't be ideal, but this seems
> > > > >> like it might be a pragmatic compromise?
> > > > >>
> > > > >> Thanks,
> > > > >>
> > > > >> Andy.
> > > > >>
> > > > >>
> > > > >> On Tue, Jul 20, 2021 at 5:41 AM Andrew Lamb <al...@influxdata.com> wrote:
> > > > >>
> > > > >> > What I meant is that when you decide arrow2 is suitable for
> > > > >> > release to existing arrow users, I stand ready to help you
> > > > >> > incorporate it into arrow.
> > > > >> >
> > > > >> > All the feedback I have heard so far from the rest of the
> > > > >> > community is that we are ready. One might even say we are
> > > > >> > anxious to do so :)
> > > > >> >
> > > > >> > Andrew