I agree with you both.

Users would love to have a project with multi-year maintenance with a
completely stable backwards compatible API (aka what tokio has promised)
that does everything they need.

However, building such software is (very) costly, both initially and even
more so for the ongoing maintenance. Until users of Rust/Arrow show a need
for such maintenance (demonstrated by a willingness to pay the cost), I
don't see how to make it happen.

Evidence, in my mind, of the lack of demand for longer 'supported' releases:
no one I know of has asked for, let alone volunteered to help create, an
arrow-rs maintenance release (e.g. 4.4.1) with just bug fixes. We have
all the process set up to make it happen, but no one cares yet.
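
To make concrete what such a bug-fix-only release would mean for downstream
users, here is a minimal sketch of the caret rule that Cargo applies to a
plain version requirement when the major version is 1 or higher. The `parse`
and `caret_matches` helpers are hypothetical, written only for this
illustration; they are not Cargo's implementation, and they ignore
pre-release tags and the special handling of 0.x versions.

```rust
/// A semantic version as (major, minor, patch).
type Version = (u64, u64, u64);

/// Parses "4.4.1" into (4, 4, 1); returns None on malformed input.
fn parse(v: &str) -> Option<Version> {
    let mut parts = v.split('.').map(|p| p.parse::<u64>().ok());
    let version = (parts.next()??, parts.next()??, parts.next()??);
    // Reject trailing components such as "4.4.1.0".
    if parts.next().is_some() {
        return None;
    }
    Some(version)
}

/// Simplified caret rule for major >= 1: same major version, and not older
/// than the requirement. (0.x versions follow a stricter rule, not modelled.)
fn caret_matches(req: Version, candidate: Version) -> bool {
    candidate.0 == req.0 && candidate >= req
}

fn main() {
    let req = parse("4.4.0").expect("valid version");
    for candidate in ["4.4.1", "4.5.0", "5.0.0"] {
        let v = parse(candidate).expect("valid version");
        println!("^4.4.0 accepts {}: {}", candidate, caret_matches(req, v));
    }
}
```

Under this rule a 4.4.1 release would be picked up by anyone depending on
`arrow = "4.4"` through a plain `cargo update`, while 5.0.0 would not.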

I agree with Adam that there is middle ground here and I don't see any
insurmountable incompatibilities in release versions or processes.

Andrew

On Fri, Aug 6, 2021 at 5:31 AM Adam Lippai <a...@rigo.sk> wrote:

> Hi,
>
> Thanks for the detailed answer.
>
> In contrast to my previous email, my opinionated part:
>
> Generally I like the idea of smaller crates, it helps with a lot of stuff
> (different targets, build time), but those benefits can be achieved by
> feature gates too.
> The upside would be out-of-sync crate releases.
>
> Maintenance is important; historically speaking, I've seen it solved for
> open source by private companies offering it as a paid service.
> You are right that currently only 3 months of support is provided for free,
> but personally I don't see that as an issue.
> There are professional libraries and software with close to 100% market
> share in their field which support the last or last two versions only
> (Chrome, OS-es, compilers).
> I find it hard to imagine we'd want to do it *better*; that sounds like an
> illusion, but I'd like to be wrong on this one :)
> Professionally speaking, when picking projects, having Apache (or other)
> governance and community is more important for the businesses I worked
> with, than the release schedule or API stability / versioning.
>
>
> Based on the above and that there are about a dozen active Rust arrow
> contributors, any promise for reliable maintenance over years would be a
> lie in my eyes.
> DataFusion, Polars, odbc2parquet and others had issues with the changes
> being too slow, not too fast.
>
> I'm a big advocate of middle grounds and I still believe that your efforts
> and ideal setup are compatible with arrow-rs; nobody would stop you creating
> a 5.23.0 release next to the 6.1.0 if you'd want to backport anything and
> nobody would stop you cutting an out-of-schedule 6.2 or even 7.0 release if
> it's to ensure security. The frequent Apache release process - which we
> were afraid of - was smooth so far, with surprisingly nice support from
> members of different languages / implementations.
>
> Also I believe that any plan you'd have turning arrow2 into arrow-rs 6.0
> would be more than welcome on a public vote, along with the technical
> changes you propose (e.g. cutting a separate arrow-io crate).
>
>
> At least 6 key members showed their excitement for your changes in this
> thread and even more on Slack/GitHub ;)
>
> Best regards,
> Adam Lippai
>
> On Fri, Aug 6, 2021 at 10:07 AM Jorge Cardoso Leitão <
> jorgecarlei...@gmail.com> wrote:
>
> > Hi,
> >
> > Thanks for your input.
> >
> > Every time there is a new major release, all new development shifts towards
> > that new API and users of previous APIs are left behind. It is not just a
> > matter of SemVer and size of version numbers, there is a whole development
> > shift to be on top of the new API.
> >
> > I disagree that software that has a major release every 3 months and no
> > maintenance window over previous versions is stable. I alluded to the Tokio
> > example because Tokio 1.0 recently became the runtime of Rust-based AWS
> > Lambda functions [1]; this commitment is only possible by enforcing API
> > stability and maintenance beyond a 3 month period (at least 3 years in
> > their case).
> >
> > Also, imo the current major version number is not meaningless: divided by
> > the software age, it constitutes the historical release pattern and is
> > usually a good predictor of the pattern used in future releases.
> >
> > The evidence is that we haven't been able to support any version for any
> > period of time; recently, Andrew has been doing amazing work at supporting
> > the latest version for a period of 3 months. I.e. an application that
> > depends on `arrow = ^5.0` has a support window of 3 months. Given that we
> > have not backported any security fixes to previous versions, it is
> > reasonable to assume that security patches are also applied within a 3
> > month period only.
> >
> > As a contributor to arrow2, I would rather not have arrow2 under Apache
> > Arrow than have to release it under its current versioning and scheduling
> > (this is similar to some of Julia's concerns). As a contributor to Apache
> > Arrow, I currently cannot guarantee a maintenance window over arrow-rs for
> > any period of time because it is unsafe by design and I do not have the
> > motivation to fix it. As both, I am confident that the core of arrow2 will
> > soon reach a point where we can live with and develop on top of it for at
> > least a year. This is not true of the whole API surface, though: there are
> > APIs that we will need to change more often until stability can be
> > promised.
> >
> > So, I am requesting that we tie the discussion of arrow2 to how it will be
> > released.
> >
> > Could a middle ground be somewhere along the lines of splitting the crate
> > into smaller crates that are versioned independently? I.e. continue to
> > release `arrow` under the same versioning and cadence, and create 3 new
> > crates, arrow-core, arrow-compute, and arrow-io (see also [2]), that would
> > have their own versioning at 0.X until stability is achieved, based on
> > arrow2's code base. The migration of the `arrow` crate to arrow2's API
> > would be to re-export from the smaller crates (e.g. `pub use
> > arrow_core::array`).
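
The re-export idea in the paragraph above can be sketched in a single file.
This is only an illustration under stated assumptions: the two modules stand
in for separately published crates, and `Int32Array` is a toy type invented
here, not the arrow2 API.

```rust
// Stand-in for a hypothetical `arrow-core` crate on its own 0.X schedule.
mod arrow_core {
    pub mod array {
        /// Toy array type used only for this sketch.
        #[derive(Debug, PartialEq)]
        pub struct Int32Array(pub Vec<i32>);

        impl Int32Array {
            /// Number of values held by the array.
            pub fn len(&self) -> usize {
                self.0.len()
            }
        }
    }
}

// Stand-in for the `arrow` facade crate, which keeps its own version number.
mod arrow {
    // The facade adds no code of its own: it re-exports the smaller crate's
    // module, so `arrow::array` and `arrow_core::array` are the same items.
    pub use super::arrow_core::array;
}

fn main() {
    let a = arrow::array::Int32Array(vec![1, 2, 3]);
    println!("len via facade: {}", a.len());
}
```

Because the facade only re-exports, a value built through `arrow::array` has
the same type as one built through `arrow_core::array`, so users of either
path can exchange data without conversion.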
> >
> > [1] https://crates.io/crates/lambda_runtime/0.3.1/dependencies
> > [2] https://github.com/jorgecarleitao/arrow2/issues/257
> >
> > Best,
> > Jorge
> >
> >
> > On Thu, Aug 5, 2021 at 11:53 PM Adam Lippai <a...@rigo.sk> wrote:
> >
> > > Not taking sides, just two technical notes below.
> > >
> > > Semver.org clearly defines (
> > > https://semver.org/#how-do-i-know-when-to-release-100) the versions >1.0.0.
> > > * If it's used in production, it's 1.0.0.
> > > * If it provides an API others depend on then it's 1.0.0.
> > > * If you intend to keep backward compatibility, it's 1.0.0.
> > > TL;DR: 1.0.0 marks the point from which we guarantee that
> > > non-production releases are marked (alpha, beta, rc) and that breaking,
> > > backwards-incompatible (API) changes result in a major version bump. This
> > > we already do, 4x per year.
> > >
> > > The second fact is that arrow2 uses the arrow name, but it doesn't have
> > > Apache governance. It's not released from GitHub.com/apache, there are no
> > > formal releases, there are no votes. This is not correct or fair usage of
> > > the brand (on the same level as DataFuse, or db-benchmark calling a custom
> > > R implementation arrow) even if it's "unofficial". My understanding is that
> > > arrow2 can be an unofficial implementation with a different name or an
> > > arrow-rs experiment with the intention to merge the code, but not both.
> > >
> > > I think both issues could be solved and I really value and like the arrow2
> > > work so far. That's the right way. I hope we'll see it in prod either way
> > > as soon as it's ready.
> > >
> > > Best regards,
> > > Adam Lippai
> > >
> > > On Wed, Aug 4, 2021, 08:25 QP Hou <houqp....@gmail.com> wrote:
> > >
> > > > Just my two cents.
> > > >
> > > > I think we all have the same goal here, which is to accelerate the
> > > > transitioning of arrow to arrow2 as the official arrow rust
> > > > implementation.
> > > >
> > > > In my opinion, the biggest gain we can get from merging two projects
> > > > into one repo is to have some kind of a policy to enforce that every
> > > > new feature/test added to the current arrow implementation also needs
> > > > to be added to the arrow2 implementation. This way, we can make sure
> > > > the gap between arrow and arrow2 is closing on every iteration.
> > > > Without this, I tend to agree with Jorge that merging two repos would
> > > > add more overhead to his work and slow him down.
> > > >
> > > > For those who want to contribute to arrow2 to accelerate the
> > > > transition, I don't think they would have a problem sending PRs to the
> > > > arrow2 repo. For those who are not interested in contributing to
> > > > arrow2, merging the arrow2 code base into the current arrow-rs repo
> > > > won't incentivize them to contribute. Merging arrow2 into current
> > > > arrow-rs repo could help with discovery. But I think this can be
> > > > achieved by adding a big note in the current arrow-rs README to
> > > > encourage contributions to the arrow2 repo as well.
> > > >
> > > > At the end of the day, Jorge is currently the sole active contributor
> > > > to the arrow2 implementation, so I think he would have the most say on
> > > > what's the most productive way to push arrow2 forward. The only
> > > > concern I have with regards to merging arrow2 into arrow-rs right now
> > > > is that Jorge spends all the effort to do the merge, and then it turns
> > > > out that he is still the only active contributor to arrow2 within
> > > > arrow-rs, but with more overhead that he has to deal with.
> > > >
> > > > As for maintaining semantic versioning for arrow2, Andy had a good
> > > > point that we could still release arrow2 with its own versioning even
> > > > if we merge it into the arrow-rs repo. So I don't think we should
> > > > worry/focus too much about versioning in our discussion. Velocity to
> > > > close the gap between arrow-rs and arrow2 is the most important thing.
> > > >
> > > > Lastly, I do agree with Andrew that it would be good to only maintain
> > > > a single arrow crate in crates.io in the long run. As he mentioned,
> > > > when the current arrow2 code base becomes stable, we could still
> > > > release it under the arrow namespace in crates.io with a major version
> > > > bump. The absolute value in the major version doesn't really matter as
> > > > long as we stick to the convention that a breaking change will result in
> > > > a major version bump.
> > > >
> > > > Thanks,
> > > > QP
> > > >
> > > >
> > > >
> > > > > On Tue, Aug 3, 2021 at 5:31 PM paddy horan <paddyho...@hotmail.com> wrote:
> > > > >
> > > > > Hi Jorge,
> > > > >
> > > > > > I see value in consolidating development in a single repo and
> > > > > > releasing under the existing arrow crate. Regarding versioning, I
> > > > > > think as long as we follow semantic versioning we are fine. I don't
> > > > > > think it's worth migrating to a different repo and crate to comply
> > > > > > with the de-facto standard you mention.
> > > > >
> > > > > Just one person's opinion though,
> > > > > Paddy
> > > > >
> > > > >
> > > > > -----Original Message-----
> > > > > From: Jorge Cardoso Leitão <jorgecarlei...@gmail.com>
> > > > > Sent: Tuesday, August 3, 2021 5:23 PM
> > > > > To: dev@arrow.apache.org
> > > > > Subject: Re: [Discuss] [Rust] Arrow2/parquet2 going foward
> > > > >
> > > > > Hi Paddy,
> > > > >
> > > > > > > What do you think about moving Arrow2 into the main Arrow repo where
> > > > > > > it is only enabled via an "experimental" feature flag?
> > > > >
> > > > > AFAIK this is already possible:
> > > > > > * add `arrow2 = { version = "0.2.0", optional = true }` to Cargo.toml
> > > > > * add `#[cfg(feature = "arrow2")]\npub mod arrow2;\n` to lib.rs
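
A minimal sketch of the feature-gated module described in the two bullets
above. It assumes the standard Cargo behaviour that an optional dependency
named `arrow2` implicitly defines a feature of the same name; the inline
module body and the `active_backend` helper are hypothetical stand-ins for
the real crate, added only for illustration.

```rust
// Compiled only when the crate is built with `--features arrow2`.
#[cfg(feature = "arrow2")]
pub mod arrow2 {
    /// Placeholder for the API that the real arrow2 crate would provide.
    pub fn backend() -> &'static str {
        "arrow2"
    }
}

/// Reports which implementation this build exposes (feature enabled).
#[cfg(feature = "arrow2")]
pub fn active_backend() -> &'static str {
    arrow2::backend()
}

/// Reports which implementation this build exposes (feature disabled).
#[cfg(not(feature = "arrow2"))]
pub fn active_backend() -> &'static str {
    "arrow"
}

fn main() {
    // Built without the feature, this prints "active backend: arrow".
    println!("active backend: {}", active_backend());
}
```

Users who never enable the feature do not compile the gated module at all,
which is why only dependents that activate the flag pay the extra cost.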
> > > > >
> > > > > > We do this kind of thing to expose APIs from non-arrow crates such as
> > > > > > parts of the parquet-format-rs crate, and it is generally the way to
> > > > > > go when a crate wants to expose a third-party API.
> > > > >
> > > > > > I would not recommend doing this, though: by exposing arrow2 from
> > > > > > arrow, we double the compilation time and binary size of all
> > > > > > dependencies that activate the flag. Furthermore, there are users of
> > > > > > arrow2 that do not need the arrow crate, which this model would not
> > > > > > support.
> > > > >
> > > > > > AFAIK where development happens is unrelated to this aspect; Rust
> > > > > > enables this by design.
> > > > >
> > > > > > > but also this would be a clear signal that Arrow2 is <1.0.
> > > > > > > the experimental flag will be a clear signal to the existing Arrow
> > > > > > > community that Arrow2 is the future but that it is <1.0
> > > > >
> > > > > > arrow2 is already <1.0 <https://crates.io/crates/arrow2>.
> > > > > > My argument is that arrow/arrow-flight/parquet are not versioned
> > > > > > according to the Rust community standards: it is a de facto practice
> > > > > > in Rust to delay major releases until the API is stable. See Tokio's
> > > > > > blog post about their 1.0 <https://tokio.rs/blog/2020-12-tokio-1-0>
> > > > > > (i.e. "[...] we commit to holding back on a Tokio 2.0 release for at
> > > > > > least 3 years."). The 10 most downloaded crates:
> > > > >
> > > > > *
> > > >
> > >
> >
> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcrates.io%2Fcrates%2Frand&amp;data=04%7C01%7C%7Ca37de2cddc6e447a777b08d956c4dbce%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637636225764521997%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=sBxp1XYBLl6OIV57nM%2FGsZO0AmbgyBeRaoPANEvdZGE%3D&amp;reserved=0
> > > > (0.8.4)
> > > > > *
> > > >
> > >
> >
> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcrates.io%2Fcrates%2Fsyn&amp;data=04%7C01%7C%7Ca37de2cddc6e447a777b08d956c4dbce%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637636225764521997%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=oeQliVwSgrvgART7r49XeiM%2F72TYa7hX8M3QyVDrqsk%3D&amp;reserved=0
> > > > (1.0.74)
> > > > > *
> > > >
> > >
> >
> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcrates.io%2Fcrates%2Flibc&amp;data=04%7C01%7C%7Ca37de2cddc6e447a777b08d956c4dbce%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637636225764521997%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=OULOu9vhaWEgnavRqedebM7ceZRsVnaF7YjYuq1MJ3Y%3D&amp;reserved=0
> > > > (0.2.98)
> > > > > *
> > > >
> > >
> >
> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcrates.io%2Fcrates%2Frand_core&amp;data=04%7C01%7C%7Ca37de2cddc6e447a777b08d956c4dbce%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637636225764521997%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=mx6X86bNRis6UykbWR%2FWTGEgAjq8h6JylmOSAQlfsh0%3D&amp;reserved=0
> > > > (0.6.3)
> > > > > * quote (1.0.9)
> > > > > * unicode-xid (0.2.2)
> > > > > * proc-macro2 (1.0.28)
> > > > > * cfg-if (1.0.0)
> > > > > *
> > > >
> > >
> >
> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcrates.io%2Fcrates%2Fserde&amp;data=04%7C01%7C%7Ca37de2cddc6e447a777b08d956c4dbce%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637636225764521997%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=p%2FNgTB0839C1%2F1Zn4GeEnRtvr0hiFhOuBJ5tF76aW5E%3D&amp;reserved=0
> > > > (1.0.126)
> > > > > * bitflags (1.2.1)
> > > > >
> > > > > > These are small crates with a small scope, but even larger projects
> > > > > > share the same pattern:
> > > > > >
> > > > > > * crossbeam <https://crates.io/crates/crossbeam> (0.8.1)
> > > > > > * rocket <https://crates.io/crates/rocket> (0.5)
> > > > > > * polars <https://crates.io/crates/polars> (0.14.8)
> > > > > > * tower <https://crates.io/crates/tower> (0.4.8)
> > > > > > * Tokio <https://crates.io/crates/tokio> (1.9.0)
> > > > > > * hyper <https://crates.io/crates/hyper> (0.14.11)
> > > > >
> > > > > > Crates that arrow depends on
> > > > > > <https://github.com/apache/arrow-rs/blob/master/arrow/Cargo.toml>,
> > > > > > that DataFusion depends on
> > > > > > <https://github.com/apache/arrow-datafusion/blob/master/datafusion/Cargo.toml>,
> > > > > > all share the same pattern of being either 0.X, 1.X when their API is
> > > > > > stable, and 2.X when they needed a large change in the API. This
> > > > > > contrasts with Apache Arrow's releases where we are now at 5.0 (and
> > > > > > we have yet to arrive at a safe design).
> > > > >
> > > > > > existing users will be well supported in this transition
> > > > >
> > > > > > How so? imo people either PR to the arrow/arrow2 code base or they
> > > > > > won't. This is largely independent of where the development of either
> > > > > > arrow2 or arrow happens; people google the crate, click on the
> > > > > > repository link and file an issue or open a PR.
> > > > >
> > > > > > > In general, I think the longer that development proceeds in separate
> > > > > > > repos the harder it will be to eventually merge the two in a way that
> > > > > > > supports existing users.
> > > > >
> > > > > > How so? I may be mistaken, but API design is unrelated to which repo
> > > > > > the development happens in: it is primarily driven by who is designing
> > > > > > it and from where or who they are inspired by. Both arrow and
> > > > > > parquet's crate design are inspired by the C++ implementation and have
> > > > > > gradually been migrated to "idiomatic" Rust, as "idiomatic" is
> > > > > > becoming more well defined in Rust.
> > > > > > Arrow2 is inspired by the current crate and the pains of using it in
> > > > > > DataFusion. Datafuse, a fork of datafusion, recently migrated to arrow2
> > > > > > <https://github.com/datafuselabs/datafuse/pull/1239>:
> > > > > > +1,947 −3,484, which shows that the crate is capturing important
> > > > > > patterns from the arrow crate and exposing ones that are useful /
> > > > > > result in less code for the same or higher performance.
> > > > >
> > > > > > On the opposite side, merging the development of crates under the same
> > > > > > repo leads to: more triaging of PRs; more work for releases and
> > > > > > changelogs; tagging based on crates; multiple READMEs in subpaths of
> > > > > > the repo; curation of the CI to accommodate this; a workspace with
> > > > > > many crates, each with its own set of dependencies, increasing
> > > > > > compilation and development time; mixed commit logs; difficulties in
> > > > > > reverts and cherry-picks; more difficulty finding stuff in the repo.
> > > > > > See e.g. how tokio-rs does it: <https://github.com/tokio-rs>, even for
> > > > > > small crates like bytes <https://github.com/tokio-rs/bytes>.
> > > > >
> > > > > Best,
> > > > > Jorge
> > > > >
> > > > > > On Tue, Aug 3, 2021 at 3:13 PM paddy horan <paddyho...@hotmail.com> wrote:
> > > > >
> > > > > > Hi Jorge,
> > > > > >
> > > > > > > What do you think about moving Arrow2 into the main Arrow repo where
> > > > > > > it is only enabled via an "experimental" feature flag? This would
> > > > > > > allow development of Arrow2 to proceed in the main repo but also this
> > > > > > > would be a clear signal that Arrow2 is <1.0. When we feel ready (i.e.
> > > > > > > Arrow2 is 1.0) we can release it in the next main release with Arrow2
> > > > > > > being the default and move the existing implementation behind a
> > > > > > > "legacy" feature flag.
> > > > > >
> > > > > > Here is why I think this might work well:
> > > > > > >  - People contributing to the Arrow project will naturally contribute
> > > > > > > to Arrow2. At the moment, some people will still contribute to Arrow
> > > > > > > instead of Arrow2 just by virtue of it being the "official"
> > > > > > > implementation. However, if both are in one repo people will want to
> > > > > > > contribute to the "future", i.e. Arrow2.
> > > > > > >  - the experimental flag will be a clear signal to the existing Arrow
> > > > > > > community that Arrow2 is the future but that it is <1.0
> > > > > > >  - existing users will be well supported in this transition
> > > > > > >  - In general, I think the longer that development proceeds in
> > > > > > > separate repos the harder it will be to eventually merge the two in a
> > > > > > > way that supports existing users.
> > > > > >
> > > > > > > Do you think this would work?
> > > > > >
> > > > > > Paddy
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Jorge Cardoso Leitão <jorgecarlei...@gmail.com>
> > > > > > Sent: Monday, August 2, 2021 1:59 PM
> > > > > > To: dev@arrow.apache.org
> > > > > > Subject: Re: [Discuss] [Rust] Arrow2/parquet2 going foward
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Sorry for the delay.
> > > > > >
> > > > > > > If there is a path towards an official release under a <1.0.0
> > > > > > > versioning schema aligned with the rest of the Rust ecosystem and in
> > > > > > > line with the stability of the API, then IMO we should move all
> > > > > > > development to within Apache experimental asap (I can handle this and
> > > > > > > the likely IP clearance round). If we require a release >=1.X.Y for it
> > > > > > > and/or a schedule, then I prefer to keep expectations aligned and
> > > > > > > postpone any movement.
> > > > > >
> > > > > > > Under the move situation, I was thinking of something as follows:
> > > > > >
> > > > > > > * gradually stop maintaining "arrow" in crates, offering a maintenance
> > > > > > > window over which we release patches (*)
> > > > > > > * work towards achieving feature parity on arrow2/parquet2 on the
> > > > > > > experimental repos.
> > > > > > > * keep releasing arrow2/parquet2 under a 0.X model during the step
> > > > > > > above (**)
> > > > > > > * migrate to arrow-rs and archive experimentals (***)
> > > > > > > * break arrow2 into smaller crates so that we can version the APIs at
> > > > > > > a different cadence
> > > > > > > * once a crate reaches some stability (this is always opinionated, but
> > > > > > > it is fine), we bump it to 1.0 and announce a maintenance plan ala
> > > > > > > tokio <https://tokio.rs/blog/2020-12-tokio-1-0>.
> > > > > >
> > > > > > > (*) e.g. "we will continue to patch the arrow crate up to at least 6
> > > > > > > months starting after the first release of arrow2 that supports
> > > > > > a) nested parquet read and write
> > > > > > b) union array (including IPC integration tests)
> > > > > > c) map array (including IPC integration tests)"
> > > > > >
> > > > > > > (**) officially or un-officially (I would suggest officially so that
> > > > > > > we can acknowledge everyone's work on it, but no strong feelings)
> > > > > >
> > > > > > > (***) something like:
> > > > > > > 1. place arrow2 on top of a clear arrow repo so that the full
> > > > > > > contribution history up to that point is preserved
> > > > > > > 2. make arrow-rs the home of arrow2 (i.e. we start releasing arrow2
> > > > > > > from arrow-rs) and archive the experimental repos; create
> > > > > > > arrow-rs-parquet or something for parquet2.
> > > > > >
> > > > > > > In summary, the core pain point for me is the current versioning of
> > > > > > > arrow, which I feel is incompatible with my goals for arrow2 and the
> > > > > > > ecosystem I envision it supporting :)
> > > > > >
> > > > > > Best,
> > > > > > Jorge
> > > > > >
> > > > > > > On Fri, Jul 30, 2021 at 8:44 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > > > > >
> > > > > > > > I think it would also be fine to push "beta" arrow2 crates out of a
> > > > > > > > repo under apache/ so long as they are not marked on crates.io as
> > > > > > > > being Apache-official releases. There's a possible slippery slope
> > > > > > > > there, but as long as we are on a path to formalizing the releases I
> > > > > > > > think it is okay.
> > > > > > >
> > > > > > > > On Fri, Jul 30, 2021 at 1:07 PM Andrew Lamb <al...@influxdata.com> wrote:
> > > > > > >
> > > > > > > > > Jorge -- do you feel like we have a resolution on what to do with
> > > > > > > > > arrow2 in the near term?
> > > > > > > > >
> > > > > > > > > The current state of affairs seems to me that arrow2 is released
> > > > > > > > > from https://github.com/jorgecarleitao/arrow2 to crates.io (which
> > > > > > > > > is fine). Are you happy with keeping development in the
> > > > > > > > > jorgecarleitao repo where you will retain maximal control and
> > > > > > > > > flexibility until it is ready to start integrating?
> > > > > > > >
> > > > > > > > > Or would you prefer to put it into one of the apache repos and
> > > > > > > > > subject its development and release to the normal Arrow governance
> > > > > > > > > model (tarball, vote, etc)?
> > > > > > > >
> > > > > > > > > Since you are the primary author/architect I think you should have
> > > > > > > > > a substantial say at this stage.
> > > > > > > >
> > > > > > > > Andrew
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Tue, Jul 27, 2021 at 7:16 PM Andrew Lamb <al...@influxdata.com> wrote:
> > > > > > > >
> > > > > > > > > > I would be happy with this approach. Thank you for the
> > > > > > > > > > suggestion.
> > > > > > > > > >
> > > > > > > > > > This hybrid approach of both arrow and arrow2 in the same repo
> > > > > > > > > > seems better to me than separate repos.
> > > > > > > > > >
> > > > > > > > > > What I really care about is ensuring we don't have two
> > > > > > > > > > crates/APIs indefinitely -- as long as we are continually making
> > > > > > > > > > progress towards unification that is what is important to me.
> > > > > > > > >
> > > > > > > > > Andrew
> > > > > > > > >
> > > > > > > > > > On Tue, Jul 27, 2021 at 1:40 PM Andy Grove <andygrov...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > >> Apologies for being late to this discussion.
> > > > > > > > >>
> > > > > > > > > >> There is a hybrid option to consider here where we add the
> > > > > > > > > >> arrow2 code into the arrow crate as a separate module, so we
> > > > > > > > > >> release one crate containing the "old" API (which we can mark
> > > > > > > > > >> as deprecated) as well as the new API. Java did a similar thing
> > > > > > > > > >> a long time ago with "java.io" versus "java.nio" (new IO).
> > > > > > > > > >>
> > > > > > > > > >> I agree that the versioning wouldn't be ideal, but this seems
> > > > > > > > > >> like it might be a pragmatic compromise?
> > > > > > > > >>
> > > > > > > > >> Thanks,
> > > > > > > > >>
> > > > > > > > >> Andy.
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > > >> On Tue, Jul 20, 2021 at 5:41 AM Andrew Lamb <al...@influxdata.com> wrote:
> > > > > > > > >>
> > > > > > > > > >> > What I meant is that when you decide arrow2 is suitable for
> > > > > > > > > >> > release to existing arrow users, I stand ready to help you
> > > > > > > > > >> > incorporate it into arrow.
> > > > > > > > > >> >
> > > > > > > > > >> > All the feedback I have heard so far from the rest of the
> > > > > > > > > >> > community is that we are ready. One might even say we are
> > > > > > > > > >> > anxious to do so :)
> > > > > > > > >> >
> > > > > > > > >> > Andrew
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > >
> >
>
