As a potential "end user developer" (and aspiring contributor), I was immediately excited when I first saw this.
I work at a trading firm, and my team has developed an IPC mechanism for efficiently transmitting pandas dataframes both remotely via TCP and locally via shared memory, with the same interface for the application developer in both cases. The data in the dataframes may change rapidly, so when communicating locally via shared memory, if the shape of the dataframe doesn't change, we update the memory in place, coordinating between the producer and consumer via TCP.

We intend to move away from our remote TCP mechanism towards Arrow Flight, or a lighter-weight version of Arrow IPC. For the local shared memory mechanism, which we previously did not have a good answer for, it seems like Dissociated Arrow IPC maps quite well to our problem. Some features that enable our use case are:

- Updating existing batches in place is supported
- The interface is pretty similar to Flight

I'd imagine we're not the only financial firm to implement something like this, given how widespread pandas usage is, so that could be a place to seek feedback.

As I was reading the proposal initially, I gleaned that the most important audience was those writing interfaces to GPUs, remote memory, non-standard transports, etc. It wasn't clear to me whether updating batches in place (and the producer/consumer coordination that comes with that) was supported or encouraged as part of the proposal.

Regardless, as an end user, this seems like an easier and more efficient way to glue pieces in the Arrow ecosystem together if it were adopted broadly.

Paul

On Tue, Feb 27, 2024 at 6:05 PM Matt Topol <zotthewiz...@gmail.com> wrote:

> I'll continue my efforts of trying to reach out to other interested
> parties, but if anyone else here has any contacts or connections that they
> think might be interested please forward them the link to the Google doc.
>
> I really do want to get as much engagement and feedback as possible on
> this.
>
> Thanks!
>
> On Tue, Feb 27, 2024, 6:38 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > Have there been efforts to proactively reach out to other third parties
> > that might have an interest in this or be a potential user at some point?
> > There are a lot of interested parties in Arrow that may not actively follow
> > the mailing list.
> >
> > Seems like folks from the Dask, Ray, RAPIDS (especially folks at NVIDIA or
> > working on UCX), or other communities like that might have constructive
> > thoughts about this. DLPack (https://dmlc.github.io/dlpack/latest/) also
> > seems adjacent and worth reaching out to. Other ideas for projects or
> > companies that could be reached out to for feedback.
> >
> > On Tue, Feb 27, 2024 at 5:23 PM Antoine Pitrou <anto...@python.org> wrote:
> >
> > >
> > > If there's no engagement, then I'm afraid it might mean that third
> > > parties have no interest in this. I don't really have any solution for
> > > generating engagement except nagging and pinging people explicitly :-)
> > >
> > >
> > > On 27/02/2024 at 19:09, Matt Topol wrote:
> > > > I would like to see the same Antoine, currently given the lack of
> > > > engagement (both for OR against) I was going to take the silence as assent
> > > > and hope for non-Voltron Data PMC members to vote in this.
> > > >
> > > > If anyone has any suggestions on how we could potentially generate more
> > > > engagement and discussion on this, please let me know as I want as many
> > > > parties in the community as possible to be part of this.
> > > >
> > > > Thanks everyone.
> > > >
> > > > --Matt
> > > >
> > > > On Tue, Feb 27, 2024 at 12:48 PM Antoine Pitrou <anto...@python.org> wrote:
> > > >
> > > >>
> > > >> Hello,
> > > >>
> > > >> I'd really like to see more engagement and criticism from non-Voltron
> > > >> Data parties before this is formally adopted as an Arrow spec.
> > > >>
> > > >> Regards
> > > >>
> > > >> Antoine.
> > > >>
> > > >>
> > > >> On 27/02/2024 at 18:35, Matt Topol wrote:
> > > >>> Hey all,
> > > >>>
> > > >>> I'd like to propose a vote for us to officially adopt the protocol
> > > >>> described in the google doc[1] for Dissociated Arrow IPC Transports. This
> > > >>> proposal was originally discussed at [2]. Once this proposal is adopted, I
> > > >>> will work on adding the necessary documentation to the Arrow website along
> > > >>> with examples etc.
> > > >>>
> > > >>> The vote will be open for at least 72 hours.
> > > >>>
> > > >>> [ ] +1 Accept this Proposal
> > > >>> [ ] +0
> > > >>> [ ] -1 Do not accept this proposal because...
> > > >>>
> > > >>> Thank you everyone!
> > > >>>
> > > >>> --Matt
> > > >>>
> > > >>> [1]: https://docs.google.com/document/d/1zHbnyK1r6KHpMOtEdIg1EZKNzHx-MVgUMOzB87GuXyk/edit#heading=h.38515dnp2bdb
> > > >>> [2]: https://lists.apache.org/thread/tn5wt4p52f6kqjtx3tjxqd9122n4pf94
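
For anyone following the thread, the in-place update pattern described at the top of this message (rewrite the shared memory when the shape is unchanged, rather than allocating a new segment) can be sketched roughly as follows. This is only an illustration using Python's standard library, not the mechanism described above and not anything defined by the proposal: the fixed four-value float64 layout and all variable names are assumptions, both sides run in one process for brevity, and the TCP coordination between producer and consumer is left out entirely.

```python
import struct
from multiprocessing import shared_memory

# Assumed fixed "shape": four little-endian float64 values.
N_VALUES = 4
FMT = f"<{N_VALUES}d"
SIZE = struct.calcsize(FMT)

# Producer: allocate the segment once and write the initial batch.
producer = shared_memory.SharedMemory(create=True, size=SIZE)
struct.pack_into(FMT, producer.buf, 0, 1.0, 2.0, 3.0, 4.0)

# Consumer: attach by name. In a real system the name would be sent
# over a control channel (TCP in the setup described above).
consumer = shared_memory.SharedMemory(name=producer.name)
first = struct.unpack_from(FMT, consumer.buf, 0)

# Producer: the shape is unchanged, so overwrite the same bytes in place
# instead of allocating a new segment.
struct.pack_into(FMT, producer.buf, 0, 10.0, 20.0, 30.0, 40.0)

# Consumer: reads the new values from the same mapping, no re-attach needed.
second = struct.unpack_from(FMT, consumer.buf, 0)

# Clean up: close both handles, then remove the segment.
consumer.close()
producer.close()
producer.unlink()
```

The sketch deliberately omits the hard part, which is telling the consumer when an update is complete so it never reads a half-written batch; that is the producer/consumer coordination the question above is about.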