One way "moving sqlparser-rs into DataFusion" could look is that the
code/repo is moved from the sqlparser-rs [1] organization to the apache
organization. For example:

https://github.com/sqlparser-rs/sqlparser-rs
to
https://github.com/apache/datafusion-sqlparser

We could continue development separately from any other code and release it
as a separate artifact, but use the same overarching governance structure
(voting on releases, committer access, etc.).

To follow this model, I think the largest work item would be to run the IP
clearance process, and since sqlparser-rs has many distinct contributors,
that may take a while.

Andrew



On Wed, Feb 28, 2024 at 1:45 AM Aldrin <octalene....@pm.me.invalid> wrote:

> Maybe it would be valuable to more explicitly define "moving back into
> DataFusion project".
>
> I assumed it meant absorbing into the datafusion repo, but it occurs to me
> that may not be the case. Then, how would sqlparser-rs be "moved"?
>
>
>
> # ------------------------------
> # Aldrin
>
>
> https://github.com/drin/
> https://gitlab.com/octalene
> https://keybase.io/octalene
>
>
> On Tuesday, February 27th, 2024 at 16:20, Chak-Pong Chung <
> chakpongch...@gmail.com> wrote:
>
> > There are cases where people need datafusion but not a SQL parser. For
> > example, people building a composable query engine for graph or other data
> > modality may not choose SQL as the DSL. Decoupling them seems to be a good
> > idea.
> >
>
> > On Tue, Feb 27, 2024, 6:20 AM Mehmet Ozan Kabak o...@synnada.ai wrote:
> >
>
> > > In this case, maybe we can bring sqlparser-rs into the ASF umbrella
> > > following the arrow-datafusion model?
> > >
>
> > > Once DataFusion becomes a top-level project, we could move it to
> > > datafusion-sqlparser-rs; it would be a quasi-independent project, just like
> > > how DataFusion is today w.r.t. Arrow. But it would get most benefits of
> > > having a community behind it.
> > >
>
> > > > On Feb 27, 2024, at 2:11 AM, Andrew Lamb al...@influxdata.com wrote:
> > > >
>
> > > > Julian, thank you for your insight. I very much agree with it.
> > > >
>
> > > > > I think the ASF is wrong on this. I think it needs to provide a
> home
> > > > > for medium-sized projects such as sqlparser-rs in an existing
> > > > > top-level project;
> > > >
>
> > > > It could be said that DataFusion fits this model -- it isn't really
> an
> > > > "Arrow" project but needed a place to live and grow, and the Arrow
> ASF
> > > > community provided that.
> > > >
>
> > > > Andrew
> > > >
>
> > > > On Mon, Feb 26, 2024 at 1:09 PM Julian Hyde jh...@apache.org wrote:
> > > >
>
> > > > > I am torn on this.
> > > > >
>
> > > > > > On one hand, I am a big fan of components that are standalone - have
> > > > > > no more dependencies than necessary, and are self-evidently
> > > > > standalone. So, I think that re-absorbing sqlparser-rs back into
> > > > > DataFusion would not be a good step. It would reduce the perception
> > > > > that it is standalone.
> > > > >
>
> > > > > On the other hand, it sounds as if sqlparser-rs would benefit by
> > > > > having an Apache-like community around it. DataFusion isn't a
> perfect
> > > > > fit - there is not much overlap between DataFusion and sqlparser-rs
> > > > > users - but it takes a lot of effort to create and run a top-level
> > > > > project, and DataFusion is already up and running.
> > > > >
>
> > > > > > The tension is that people want to consume components that they
> > > > > > perceive to be standalone, and yet the ASF wants to create communities
> > > > > > that produce either a single large component or sets of highly-coupled
> > > > > > components. The ASF used to do 'umbrella projects' whose sub-projects
> > > > > > were in the same subject area but had little or no dependencies. For
> > > > > > example, Apache DB [ https://db.apache.org/ ] has JDO, Derby and
> > > > > > Torque. And commons included many useful Java libraries. Umbrella
> > > > > > projects caused problems during the Jakarta and Hadoop eras, and now
> > > > > > are strongly discouraged at the ASF.
> > > > >
>
> > > > > I think the ASF is wrong on this. I think it needs to provide a
> home
> > > > > for medium-sized projects such as sqlparser-rs in an existing
> > > > > top-level project; maybe those projects grow into top-level
> projects,
> > > > > or maybe they remain medium-sized projects. This is especially
> > > > > necessary in the Rust community, where there are many exciting
> > > > > projects, but they are almost all happening outside ASF. (This is
> > > > > exactly where Java was in ~2005. Maybe we need a rust-commons or
> > > > > rust-db?)
> > > > >
>
> > > > > My conclusion is to leave sqlparser-rs where it is for now, but to
> > > > > continue talking about what might be an attractive home for it in
> ASF.
> > > > >
>
> > > > > Julian
> > > > >
>
> > > > > On Mon, Feb 26, 2024 at 8:12 AM Andrew Lamb al...@influxdata.com
> > > > > wrote:
> > > > >
>
> > > > > > Sorry for the late reply,
> > > > > >
>
> > > > > > > I think sqlparser-rs users are quite a bit more varied than DataFusion
> > > > > > > and there is not a large overlap between the contributors of the two
> > > > > > > projects. I currently seem to be the one reviewing / merging most
> > > > > > > sqlparser-rs reviews, and I would definitely love some more help.
> > > > > >
>
> > > > > > > However, given that the project is not an Apache project, I did not
> > > > > > > have good luck attracting help. A related discussion is here [1].
> > > > > >
>
> > > > > > > If the DataFusion community would like to accelerate releases, we can
> > > > > > > also try to do that without bringing it into Apache governance.
> > > > > > > Specifically, it would be great to have help reviewing the PRs -- the
> > > > > > > actual release process is pretty low overhead. The reviews are what take
> > > > > > > the vast majority of the maintenance time.
> > > > > >
>
> > > > > > Andrew
> > > > > >
>
> > > > > > On Sat, Feb 17, 2024 at 4:44 PM Aldrin octalene....@pm.me.invalid
> > > > > > wrote:
> > > > > >
>
> > > > > > > do users of sqlparser-rs mostly use datafusion? I don't know
> the
> > > > > > > community, but it seems like it would be an annoying change
> for users
> > > > > > > who
> > > > > > > use it with a different query engine. Just a thought
> > > > > > >
>
> > > > > > > Sent from Proton Mail https://proton.me/mail/home for iOS
> > > > > > >
>
> > > > > > > > On Sat, Feb 17, 2024 at 10:26, Andy Grove <andygrov...@gmail.com> wrote:
> > > > > > >
>
> > > > > > > I agree that it simplifies shipping new SQL features in
> DataFusion
> > > > > > > since we
> > > > > > > can develop the changes in the parser concurrently with the
> changes in
> > > > > > > other DataFusion crates and then release them all together.
> > > > > > >
>
> > > > > > > The name of the crate would not need to change, so downstream
> users
> > > > > > > should
> > > > > > > see no impact.
> > > > > > >
>
> > > > > > > We would need to decide if we want to keep a separate version
> number
> > > > > > > or
> > > > > > > bring it in line with DataFusion version numbers (I have no
> preference
> > > > > > > either way).
> > > > > > >
>
> > > > > > > On Sat, Feb 17, 2024 at 11:09 AM Mehmet Ozan Kabak
> o...@synnada.ai
> > > > > > > wrote:
> > > > > > >
>
> > > > > > > > Doing this will probably reduce the time-to-ship for
> DataFusion
> > > > > > > > features
> > > > > > > > that need parsing support due to increased convenience, so
> I’m
> > > > > > > > inclined
> > > > > > > > to
> > > > > > > > see it in a positive light.
> > > > > > > >
>
> > > > > > > > What would be the impact of doing this on people who use only
> > > > > > > > sqlparser-rs, if any?
> > > > > > > >
>
> > > > > > > > > On Feb 17, 2024, at 7:16 PM, Andy Grove
> andygrov...@gmail.com
> > > > > > > > > wrote:
> > > > > > > > >
>
> > > > > > > > > > The sqlparser-rs project [1] seems to have become the
> de-facto SQL
> > > > > > > > > parser
> > > > > > > > > for Rust, with almost 4 million downloads so far. This was
> > > > > > > > > originally
> > > > > > > > > part
> > > > > > > > > of DataFusion very early on, and I moved it into a
> separate project
> > > > > > > > > because
> > > > > > > > > it seemed useful for other projects. This was before
> DataFusion was
> > > > > > > > > known
> > > > > > > > > as a composable query engine, and with hindsight, I
> probably should
> > > > > > > > > have
> > > > > > > > > left it as part of the DataFusion project.
> > > > > > > > >
>
> > > > > > > > > Now that DataFusion has a reputation as a composable query
> engine,
> > > > > > > > > I
> > > > > > > > > think
> > > > > > > > > it would make sense to move this code back into
> DataFusion, where
> > > > > > > > > it
> > > > > > > > > would
> > > > > > > > > benefit from a larger community of maintainers.
> > > > > > > > >
>
> > > > > > > > > I would like to hear thoughts from the Apache Arrow /
> DataFusion
> > > > > > > > > community.
> > > > > > > > > Does this seem like a good idea?
> > > > > > > > >
>
> > > > > > > > > Thanks,
> > > > > > > > >
>
> > > > > > > > > Andy.
> > > > > > > > >
>
> > > > > > > > > > [1] https://github.com/sqlparser-rs/sqlparser-rs
