Hi David and Weston,

With your and other relevant maintainers' permission, we’d like to take one of the following actions, in order of preference:
- Transfer ownership of the flight-spark-connector repo to the Apache GitHub org
- Make a new repo in Apache and push the code
- Keep the repo where it is and work to get it added to the “used-by” or “powered-by” list on the Arrow website.

For the first two options we are not looking for help maintaining or developing. We want a logical location for the project in the short / medium term.

Would you be able to help us implement one of these options?

I like Weston's idea of listing the project on the “powered-by” list regardless of the option chosen.

Thanks,
Kyle

On 2022/10/21 15:58:06 Weston Pace wrote:
> > Maybe to take a step back - why do we want this in the Arrow
> > repositories/under Arrow governance?
>
> I think this is the important question. What is the goal here?
>
> If the goal is to help spread awareness then we can link to a repo
> somewhere (e.g. a "projects that use Arrow" section or something). For
> example, I could eventually see something like [1] for ADBC.
>
> If the goal is to share some kind of CI infrastructure burden (e.g.
> ensure a library runs everywhere that Arrow can run) then the contrib
> repo might be more useful than a repo-per-project but I think we'll
> need some more general discussion on how to make this happen.
>
> If the goal is to share maintenance / development cost or find new
> developers then I don't think any approach works. Most Arrow
> developers are quite adept at ignoring the parts of the repo they
> don't need to interact with.
>
> [1] https://jwt.io/libraries
>
> On Fri, Oct 21, 2022 at 8:48 AM Antoine Pitrou <an...@python.org> wrote:
> >
> > On 21/10/2022 at 17:35, David Li wrote:
> > > Maybe to take a step back - why do we want this in the Arrow
> > > repositories/under Arrow governance?
> > >
> > > I'm excited to see more integrations and use cases for Flight and Flight
> > > SQL in the wild, but I think it would be good to see a true ecosystem
> > > around this, and so I don't think -every- integration needs to end up in
> > > the Arrow repos. And there is a cost to set up CI, releases, etc. (ADBC
> > > is still getting set up there, and my hope at least is that most
> > > integrations will eventually be provided by the database systems, not by
> > > Arrow.)
> > >
> > > That said I'm not necessarily opposed. We've discussed similar 'contrib'
> > > things in the past [1][2]. It may be worth reviewing the discussions
> > > there and discussing how this project would address the criteria proposed.
> >
> > The problem is that Arrow is so broad nowadays that a "contrib" repo
> > would end up hosting a hodgepodge of entirely disparate subprojects with
> > no common maintenance/release policies, and disjoint development and
> > user communities.
> >
> > A separate Apache repo for each subproject is probably better, even
> > though there might be a small setup overhead.
> >
> > Regards
> >
> > Antoine.
> >
> > >
> > > [1]: https://lists.apache.org/thread/nfr3tq1tb5tvr34zg5z7on8xglfsj79t
> > > [2]: https://lists.apache.org/thread/yshp4b3g34kxovzvf6x48pzj0894qbw5
> > > (though you may have to dig to find the responses - the UI didn't link
> > > them up)
> > >
> > > On Fri, Oct 21, 2022, at 11:08, Kyle Brooks wrote:
> > >> Hi David and Antoine,
> > >>
> > >> Long-term I completely agree that this should belong in Apache Spark.
> > >> I also agree that Flight SQL or ADBC would be a good enhancement for
> > >> users. We are planning on implementing Flight SQL support soon. ADBC
> > >> doesn't look mature enough right now for this use case. We will keep
> > >> an eye on it.
> > >>
> > >> Short-term, I'd like to propose either creating an Arrow contrib repo
> > >> or making a separate Apache repo just for the Flight Spark Connector.
> > >>
> > >> We would need help facilitating this within Apache / Arrow.
> > >>
> > >> Thank you,
> > >> Kyle
> > >>
> > >> On 2022/10/18 23:44:49 David Li wrote:
> > >>> Given the probable need for IP clearance, getting it into Arrow would
> > >>> also be a Process(TM) unfortunately. We also don't really have a great
> > >>> place for "not quite in tree" projects; there have been discussions of
> > >>> a 'contrib' repo in the past, but nothing has materialized.
> > >>>
> > >>> That said - have you shown this to Spark users? I'd guess there'd be
> > >>> more enthusiasm there, especially if there are particular data
> > >>> source(s) you anticipate this would make available to them. (Though
> > >>> again, Flight SQL or ADBC over plain Flight RPC might be a more
> > >>> attractive target for such a Spark plugin.)
> > >>>
> > >>> -David
> > >>>
> > >>> On Tue, Oct 18, 2022, at 16:50, Matt Phelps wrote:
> > >>>> Hi David and Antoine,
> > >>>>
> > >>>> Thanks for your input. Based on past experience talking to some other
> > >>>> Arrow / Spark developers, we anticipate that it would take a long time
> > >>>> to get into Spark. Our plan was to build up a user base in the Arrow
> > >>>> community before submitting for inclusion to Spark. Is there a place
> > >>>> the code can live in the meantime?
> > >>>>
> > >>>> Matt Phelps
> > >>>>
> > >>>>
> > >>>> From: Antoine Pitrou <an...@python.org>
> > >>>> Date: Monday, October 17, 2022 at 2:48 PM
> > >>>> To: dev@arrow.apache.org <de...@arrow.apache.org>
> > >>>> Subject: Re: [DISCUSS] Integrate existing Spark connector for Flight
> > >>>>
> > >>>> On 17/10/2022 at 21:27, David Li wrote:
> > >>>>> Hey Matt,
> > >>>>>
> > >>>>> This is cool to see.
> > >>>>> To be clear, this is an implementation of Spark
> > >>>>> DataSourceV2 using Arrow Flight?
> > >>>>>
> > >>>>> I think the questions I have are:
> > >>>>>
> > >>>>> - Does this belong under Arrow, or under Spark - I lean towards it
> > >>>>> being closer to Spark than Arrow;
> > >>>>
> > >>>> FWIW, that is my feeling as well.
> > >>>>
> > >>>> Regards
> > >>>>
> > >>>> Antoine.
> > >>>