Julian, thank you for your insight. I very much agree with it.

> I think the ASF is wrong on this. I think it needs to provide a home
> for medium-sized projects such as sqlparser-rs in an existing
> top-level project;

It could be said that DataFusion fits this model  -- it isn't really an
"Arrow" project but needed a place to live and grow, and the Arrow ASF
community provided that.

Andrew




On Mon, Feb 26, 2024 at 1:09 PM Julian Hyde <jh...@apache.org> wrote:

> I am torn on this.
>
> On one hand, I am a big fan of components that are standalone - have
> no more dependencies than necessary, and are self-evidently
> standalone. So, I think that re-absorbing sqlparser-rs back into
> DataFusion would not be a good step. It would reduce the perception
> that it is standalone.
>
> On the other hand, it sounds as if sqlparser-rs would benefit by
> having an Apache-like community around it. DataFusion isn't a perfect
> fit - there is not much overlap between DataFusion and sqlparser-rs
> users - but it takes a lot of effort to create and run a top-level
> project, and DataFusion is already up and running.
>
> The tension is that people want to consume components that they
> perceive to be standalone, and yet the ASF wants to create communities
> that produce either a single large component or sets of highly-coupled
> components. The ASF used to do 'umbrella projects' whose sub-projects
> were in the same subject area but had few or no dependencies on one
> another. For example, Apache DB [ https://db.apache.org/ ] has JDO, Derby
> and Torque. And Commons included many useful Java libraries. Umbrella
> projects caused problems during the Jakarta and Hadoop eras, and now
> are strongly discouraged at the ASF.
>
> I think the ASF is wrong on this. I think it needs to provide a home
> for medium-sized projects such as sqlparser-rs in an existing
> top-level project; maybe those projects grow into top-level projects,
> or maybe they remain medium-sized projects. This is especially
> necessary in the Rust community, where there are many exciting
> projects, but they are almost all happening outside ASF. (This is
> exactly where Java was in ~2005. Maybe we need a rust-commons or
> rust-db?)
>
> My conclusion is to leave sqlparser-rs where it is for now, but to
> continue talking about what might be an attractive home for it in ASF.
>
> Julian
>
> On Mon, Feb 26, 2024 at 8:12 AM Andrew Lamb <al...@influxdata.com> wrote:
> >
> > Sorry for the late reply,
> >
> > I think sqlparser-rs users are quite a bit more varied than DataFusion
> > users, and there is not a large overlap between the contributors of the
> > two projects. I currently seem to be the one reviewing / merging most
> > sqlparser-rs PRs, and I would definitely love some more help.
> >
> > However, given that the project is not an Apache project, I did not have
> > good luck attracting help.  A related discussion is here [1].
> >
> > If the DataFusion community would like to accelerate releases, we can
> also
> > try to do that without bringing it into Apache governance. Specifically,
> it
> > would be great to have help reviewing the PRs -- the actual release
> process
> > is pretty low overhead. The reviews are what take the vast majority of
> the
> > maintenance time.
> >
> > Andrew
> >
> > [1]: https://github.com/sqlparser-rs/sqlparser-rs/issues/818
> >
> >
> >
> > On Sat, Feb 17, 2024 at 4:44 PM Aldrin <octalene....@pm.me.invalid>
> wrote:
> >
> > > > Do users of sqlparser-rs mostly use DataFusion? I don't know the
> > > > community, but it seems like it would be an annoying change for users
> > > > who use it with a different query engine. Just a thought.
> > >
> > >
> > >
> > > > On Sat, Feb 17, 2024 at 10:26, Andy Grove <andygrov...@gmail.com> wrote:
> > >
> > > I agree that it simplifies shipping new SQL features in DataFusion
> since we
> > > can develop the changes in the parser concurrently with the changes in
> > > other DataFusion crates and then release them all together.
> > >
> > > The name of the crate would not need to change, so downstream users
> should
> > > see no impact.
> > >
> > > We would need to decide if we want to keep a separate version number or
> > > bring it in line with DataFusion version numbers (I have no preference
> > > either way).
> > >
> > >
> > >
> > > On Sat, Feb 17, 2024 at 11:09 AM Mehmet Ozan Kabak <o...@synnada.ai>
> > > wrote:
> > >
> > > > Doing this will probably reduce the time-to-ship for DataFusion
> features
> > > > that need parsing support due to increased convenience, so I’m
> inclined
> > > to
> > > > see it in a positive light.
> > > >
> > > > What would be the impact of doing this on people who use only
> > > > sqlparser-rs, if any?
> > > >
> > > > > On Feb 17, 2024, at 7:16 PM, Andy Grove <andygrov...@gmail.com>
> wrote:
> > > > >
> > > > > The sqlparser-rs project [1] seems to have become the de-facto SQL
> > > parser
> > > > > for Rust, with almost 4 million downloads so far. This was
> originally
> > > > part
> > > > > of DataFusion very early on, and I moved it into a separate project
> > > > because
> > > > > it seemed useful for other projects. This was before DataFusion was
> > > known
> > > > > as a composable query engine, and with hindsight, I probably should
> > > have
> > > > > left it as part of the DataFusion project.
> > > > >
> > > > > Now that DataFusion has a reputation as a composable query engine,
> I
> > > > think
> > > > > it would make sense to move this code back into DataFusion, where
> it
> > > > would
> > > > > benefit from a larger community of maintainers.
> > > > >
> > > > > I would like to hear thoughts from the Apache Arrow / DataFusion
> > > > community.
> > > > > Does this seem like a good idea?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Andy.
> > > > >
> > > > > [1] https://github.com/sqlparser-rs/sqlparser-rs
> > > >
> > > >
> > >
> > >
>
