I am interested in this as well, but I haven't gotten to a point where I can 
offer valuable input (I haven't tried other transports). I know of a third party 
working with Arrow in HPC environments that may be interested in the proposal, 
and I can ask whether they'd be willing to provide feedback.

I glanced at the document before, but I'll go through it again to see if there 
is anything I can comment on.



# ------------------------------
# Aldrin


https://github.com/drin/
https://gitlab.com/octalene
https://keybase.io/octalene


On Tuesday, February 27th, 2024 at 17:43, Paul Whalen <pgwha...@gmail.com> 
wrote:

> As a potential "end user developer" (and aspiring contributor), this
> immediately excited me when I first saw it.
> 

> I work at a trading firm, and my team has developed an IPC mechanism for
> efficiently transmitting pandas dataframes both remotely via TCP and
> locally via shared memory, where the interface for the application
> developer is the same for both. The data in the dataframes may change
> rapidly, so when communicating locally via shared memory, if the shape of
> the dataframe doesn't change, we update the memory in place, coordinating
> between the producer and consumer via TCP.
> 

> We intend to move away from our remote TCP mechanism towards Arrow Flight,
> or a lighter-weight version of Arrow IPC. For the local shared memory
> mechanism, for which we previously did not have a good answer, it seems
> like Dissociated Arrow IPC maps quite well to our problem.
> 

> So some features that enable our use case are:
> - Updating existing batches in place is supported
> - The interface is pretty similar to Flight
> 

> I'd imagine we're not the only financial firm to implement something like
> this, given how widespread pandas usage is, so that could be a place to
> seek feedback.
> 

> As I was reading the proposal initially, I gleaned that the most important
> audience was those writing interfaces to GPUs/remote memory/non-standard
> transports/etc. And it wasn't clear to me whether updating batches in
> place (and the producer/consumer coordination that comes with that) was
> supported or encouraged as part of the proposal. But regardless, as an end
> user, this seems like an easier and more efficient way to glue pieces in
> the Arrow ecosystem together if it were adopted broadly.
> 

> Paul
> 

> On Tue, Feb 27, 2024 at 6:05 PM Matt Topol zotthewiz...@gmail.com wrote:
> 

> > I'll continue my efforts of trying to reach out to other interested
> > parties, but if anyone else here has any contacts or connections that they
> > think might be interested please forward them the link to the Google doc.
> > 

> > I really do want to get as much engagement and feedback as possible on
> > this.
> > 

> > Thanks!
> > 

> > On Tue, Feb 27, 2024, 6:38 PM Wes McKinney wesmck...@gmail.com wrote:
> > 

> > > Have there been efforts to proactively reach out to other third parties
> > > that might have an interest in this or be a potential user at some point?
> > > There are a lot of interested parties in Arrow that may not actively
> > > follow the mailing list.
> > > 

> > > Seems like folks from the Dask, Ray, RAPIDS (especially folks at NVIDIA
> > > or working on UCX), or other communities like that might have constructive
> > > thoughts about this. DLPack (https://dmlc.github.io/dlpack/latest/) also
> > > seems adjacent and worth reaching out to. Are there other ideas for
> > > projects or companies that could be reached out to for feedback?
> > > 

> > > On Tue, Feb 27, 2024 at 5:23 PM Antoine Pitrou anto...@python.org wrote:
> > > 

> > > > If there's no engagement, then I'm afraid it might mean that third
> > > > parties have no interest in this. I don't really have any solution for
> > > > generating engagement except nagging and pinging people explicitly :-)
> > > > 

> > > > Le 27/02/2024 à 19:09, Matt Topol a écrit :
> > > > 

> > > > > I would like to see the same, Antoine. Currently, given the lack of
> > > > > engagement (both for OR against), I was going to take the silence as
> > > > > assent and hope for non-Voltron Data PMC members to vote on this.
> > > > > 

> > > > > If anyone has any suggestions on how we could potentially generate
> > > > > more engagement and discussion on this, please let me know as I want
> > > > > as many parties in the community as possible to be part of this.
> > > > > 

> > > > > Thanks everyone.
> > > > > 

> > > > > --Matt
> > > > > 

> > > > > On Tue, Feb 27, 2024 at 12:48 PM Antoine Pitrou anto...@python.org wrote:
> > > > > 

> > > > > > Hello,
> > > > > > 

> > > > > > I'd really like to see more engagement and criticism from
> > > > > > non-Voltron Data parties before this is formally adopted as an
> > > > > > Arrow spec.
> > > > > > 

> > > > > > Regards
> > > > > > 

> > > > > > Antoine.
> > > > > > 

> > > > > > Le 27/02/2024 à 18:35, Matt Topol a écrit :
> > > > > > 

> > > > > > > Hey all,
> > > > > > > 

> > > > > > > I'd like to propose a vote for us to officially adopt the protocol
> > > > > > > described in the Google doc [1] for Dissociated Arrow IPC
> > > > > > > Transports. This proposal was originally discussed at [2]. Once
> > > > > > > this proposal is adopted, I will work on adding the necessary
> > > > > > > documentation to the Arrow website along with examples etc.
> > > > > > > 

> > > > > > > The vote will be open for at least 72 hours.
> > > > > > > 

> > > > > > > [ ] +1 Accept this Proposal
> > > > > > > [ ] +0
> > > > > > > [ ] -1 Do not accept this proposal because...
> > > > > > > 

> > > > > > > Thank you everyone!
> > > > > > > 

> > > > > > > --Matt
> > > > > > > 

> > > > > > > [1]:
> > > > > > > https://docs.google.com/document/d/1zHbnyK1r6KHpMOtEdIg1EZKNzHx-MVgUMOzB87GuXyk/edit#heading=h.38515dnp2bdb

