Absolutely, it will be marked experimental until we see some people using
it and can get more real-world feedback.

There are also already a couple of things, discussed in the comments, that will
be followed up on as expansions after the initial adoption.

On Thu, Mar 21, 2024, 11:42 AM David Li <lidav...@apache.org> wrote:

> I think let's try again. Would it be reasonable to declare this
> 'experimental' for the time being, just as we did with Flight/Flight
> SQL/etc?
>
> On Tue, Mar 19, 2024, at 15:24, Matt Topol wrote:
> > Hey All, It's been another month and we've gotten a whole bunch of feedback
> > and engagement on the document from a variety of individuals. A few others
> > and I have proactively attempted to reach out to as many third parties as we
> > could, hoping to pull in more engagement as well. While it would be great to
> > get even more feedback, the comments have slowed down and we haven't gotten
> > anything in a few days at this point.
> >
> > If there are no objections, I'd like to try to open up for voting again to
> > officially adopt this as a protocol to add to our docs.
> >
> > Thanks all!
> >
> > --Matt
> >
> > On Sat, Mar 2, 2024 at 6:43 PM Paul Whalen <pgwha...@gmail.com> wrote:
> >
> >> Agreed that it makes sense not to focus on in-place updating for this
> >> proposal.  I’m not even sure it’s a great fit as a “general purpose” Arrow
> >> protocol, because of all the assumptions and restrictions required as you
> >> noted.
> >>
> >> I took another look at the proposal and don’t think there’s anything
> >> preventing in-place updating in the future - ultimately the data body could
> >> just be in the same location for subsequent messages.
> >>
> >> Thanks!
> >> Paul
> >>
> >> On Fri, Mar 1, 2024 at 5:28 PM Matt Topol <zotthewiz...@gmail.com> wrote:
> >>
> >> > > @pgwhalen: As a potential "end user developer," (and aspiring
> >> > > contributor) this immediately excited me when I first saw it.
> >> >
> >> > Yay! Good to hear that!
> >> >
> >> > > @pgwhalen: And it wasn't clear to me whether updating batches in
> >> > > place (and the producer/consumer coordination that comes with that) was
> >> > > supported or encouraged as part of the proposal.
> >> >
> >> > So, updating batches in place was not a particular use case we were
> >> > targeting with this approach. Instead, the goal is to use shared memory to
> >> > produce and consume the buffers/batches without having to physically copy
> >> > the data. Trying to update a batch in place is a dangerous prospect for a
> >> > number of reasons, but as you've mentioned it can technically be made safe
> >> > if the shape is staying the same and you're only modifying fixed-width data
> >> > types (i.e. not only is the *shape* unchanged but the sizes of the
> >> > underlying data buffers are also remaining unchanged). The producer/consumer
> >> > coordination that would be needed for updating batches in place is not part
> >> > of this proposal but is definitely something we can look into as a
> >> > follow-up to this for extending it. There are a number of discussions that
> >> > would need to be had around that, so I don't want to add more complexity to
> >> > this already complex proposal.
> >> >
> >> > That said, if you or anyone else sees something in this proposal that
> >> > would hinder or prevent being able to use it for your use case, please let
> >> > me know so we can address it. Even though the proposal as it currently
> >> > exists doesn't fully support the in-place updating of batches, I don't want
> >> > to make things harder for us in such a follow-up where we'd end up
> >> > requiring an entirely new protocol to support that.
> >> >
> >> > > @octalene.dev: I know of a third party that is interested in Arrow for
> >> > > HPC environments that could be interested in the proposal and I can see
> >> > > if they're interested in providing feedback.
> >> >
> >> > Awesome! Thanks much!
> >> >
> >> >
> >> > For reference to anyone who hasn't looked at the document in a while,
> >> > since the original discussion thread on this I have added a full
> >> > "Background Context" page to the beginning of the proposal to help anyone
> >> > who isn't already familiar with the issues this protocol is trying to solve
> >> > or isn't already familiar with ucx or libfabric transports to better
> >> > understand *why* I'm proposing this and what it is trying to solve. The
> >> > point of this background information is to help ensure that anyone who
> >> > might have thoughts on protocols in general or APIs should still be able to
> >> > understand the base reasons and goals that we're trying to achieve with
> >> > this protocol proposal. You don't need to already understand managing
> >> > GPU/device memory or ucx to be able to have meaningful input on the
> >> > document.
> >> >
> >> > Thanks again to all who have contributed so far, and please spread the
> >> > word to any contacts that you think might be interested in this for their
> >> > particular use cases.
> >> >
> >> > --Matt
> >> >
> >> > On Wed, Feb 28, 2024 at 1:39 AM Aldrin <octalene....@pm.me.invalid> wrote:
> >> >
> >> > > I am interested in this as well, but I haven't gotten to a point where I
> >> > > can have valuable input (I haven't tried other transports). I know of a
> >> > > third party that is interested in Arrow for HPC environments that could
> >> > > be interested in the proposal and I can see if they're interested in
> >> > > providing feedback.
> >> > >
> >> > > I glanced at the document before but I'll go through again to see if
> >> > > there is anything I can comment on.
> >> > >
> >> > >
> >> > >
> >> > > # ------------------------------
> >> > > # Aldrin
> >> > >
> >> > >
> >> > > https://github.com/drin/
> >> > > https://gitlab.com/octalene
> >> > > https://keybase.io/octalene
> >> > >
> >> > >
> >> > > On Tuesday, February 27th, 2024 at 17:43, Paul Whalen <pgwha...@gmail.com> wrote:
> >> > >
> >> > > > As a potential "end user developer," (and aspiring contributor) this
> >> > > > immediately excited me when I first saw it.
> >> > > >
> >> > >
> >> > > > I work at a trading firm, and my team has developed an IPC mechanism
> >> > > > for efficiently transmitting pandas dataframes both remotely via TCP
> >> > > > and locally via shared memory, where the interface for the application
> >> > > > developer is the same for both. The data in the dataframes may change
> >> > > > rapidly, so when communicating locally via shared memory, if the shape
> >> > > > of the dataframe doesn't change, we update the memory in place,
> >> > > > coordinating between the producer and consumer via TCP.
> >> > > >
> >> > >
> >> > > > We intend to move away from our remote TCP mechanism towards Arrow
> >> > > > Flight, or a lighter-weight version of Arrow IPC. For the local shared
> >> > > > memory mechanism, which we previously did not have a good answer for, it
> >> > > > seems like Dissociated Arrow IPC maps quite well to our problem.
> >> > > >
> >> > >
> >> > > > So some features that enable our use case are:
> >> > > > - Updating existing batches in place is supported
> >> > > > - The interface is pretty similar to Flight
> >> > > >
> >> > >
> >> > > > I'd imagine we're not the only financial firm to implement something
> >> > > > like this, given how widespread pandas usage is, so that could be a
> >> > > > place to seek feedback.
> >> > > >
> >> > >
> >> > > > As I was reading the proposal initially, I gleaned that the most
> >> > > > important audience was those writing interfaces to GPUs/remote
> >> > > > memory/non-standard transports/etc. And it wasn't clear to me whether
> >> > > > updating batches in place (and the producer/consumer coordination that
> >> > > > comes with that) was supported or encouraged as part of the proposal.
> >> > > > But regardless, as an end user, this seems like an easier and more
> >> > > > efficient way to glue pieces in the Arrow ecosystem together if it was
> >> > > > adopted broadly.
> >> > > >
> >> > >
> >> > > > Paul
> >> > > >
> >> > >
> >> > > > On Tue, Feb 27, 2024 at 6:05 PM Matt Topol zotthewiz...@gmail.com wrote:
> >> > > >
> >> > >
> >> > > > > I'll continue my efforts of trying to reach out to other interested
> >> > > > > parties, but if anyone else here has any contacts or connections that
> >> > > > > they think might be interested please forward them the link to the
> >> > > > > Google doc.
> >> > > > >
> >> > >
> >> > > > > I really do want to get as much engagement and feedback as possible
> >> > > > > on this.
> >> > > > >
> >> > >
> >> > > > > Thanks!
> >> > > > >
> >> > >
> >> > > > > On Tue, Feb 27, 2024, 6:38 PM Wes McKinney wesmck...@gmail.com wrote:
> >> > > > >
> >> > >
> >> > > > > > Have there been efforts to proactively reach out to other third
> >> > > > > > parties that might have an interest in this or be a potential user
> >> > > > > > at some point? There are a lot of interested parties in Arrow that
> >> > > > > > may not actively follow the mailing list.
> >> > > > > >
> >> > >
> >> > > > > > Seems like folks from the Dask, Ray, RAPIDS (especially folks at
> >> > > > > > NVIDIA or working on UCX), or other communities like that might have
> >> > > > > > constructive thoughts about this. DLPack
> >> > > > > > (https://dmlc.github.io/dlpack/latest/) also seems adjacent and
> >> > > > > > worth reaching out to. Other ideas for projects or companies that
> >> > > > > > could be reached out to for feedback.
> >> > > > > >
> >> > >
> >> > > > > > On Tue, Feb 27, 2024 at 5:23 PM Antoine Pitrou anto...@python.org wrote:
> >> > > > > >
> >> > >
> >> > > > > > > If there's no engagement, then I'm afraid it might mean that third
> >> > > > > > > parties have no interest in this. I don't really have any solution
> >> > > > > > > for generating engagement except nagging and pinging people
> >> > > > > > > explicitly :-)
> >> > > > > > >
> >> > >
> >> > > > > > > On 27/02/2024 at 19:09, Matt Topol wrote:
> >> > > > > > >
> >> > >
> >> > > > > > > > I would like to see the same, Antoine. Currently, given the lack
> >> > > > > > > > of engagement (both for OR against), I was going to take the
> >> > > > > > > > silence as assent and hope for non-Voltron Data PMC members to
> >> > > > > > > > vote on this.
> >> > > > > > > >
> >> > >
> >> > > > > > > > If anyone has any suggestions on how we could potentially generate
> >> > > > > > > > more engagement and discussion on this, please let me know as I
> >> > > > > > > > want as many parties in the community as possible to be part of
> >> > > > > > > > this.
> >> > > > > > > >
> >> > >
> >> > > > > > > > Thanks everyone.
> >> > > > > > > >
> >> > >
> >> > > > > > > > --Matt
> >> > > > > > > >
> >> > >
> >> > > > > > > > On Tue, Feb 27, 2024 at 12:48 PM Antoine Pitrou anto...@python.org wrote:
> >> > > > > > > >
> >> > >
> >> > > > > > > > > Hello,
> >> > > > > > > > >
> >> > >
> >> > > > > > > > > I'd really like to see more engagement and criticism from
> >> > > > > > > > > non-Voltron Data parties before this is formally adopted as an
> >> > > > > > > > > Arrow spec.
> >> > > > > > > > >
> >> > >
> >> > > > > > > > > Regards
> >> > > > > > > > >
> >> > >
> >> > > > > > > > > Antoine.
> >> > > > > > > > >
> >> > >
> >> > > > > > > > > On 27/02/2024 at 18:35, Matt Topol wrote:
> >> > > > > > > > >
> >> > >
> >> > > > > > > > > > Hey all,
> >> > > > > > > > > >
> >> > >
> >> > > > > > > > > > I'd like to propose a vote for us to officially adopt the
> >> > > > > > > > > > protocol described in the google doc[1] for Dissociated Arrow
> >> > > > > > > > > > IPC Transports. This proposal was originally discussed at [2].
> >> > > > > > > > > > Once this proposal is adopted, I will work on adding the
> >> > > > > > > > > > necessary documentation to the Arrow website along with
> >> > > > > > > > > > examples etc.
> >> > > > > > > > > >
> >> > >
> >> > > > > > > > > > The vote will be open for at least 72 hours.
> >> > > > > > > > > >
> >> > >
> >> > > > > > > > > > [ ] +1 Accept this Proposal
> >> > > > > > > > > > [ ] +0
> >> > > > > > > > > > [ ] -1 Do not accept this proposal because...
> >> > > > > > > > > >
> >> > >
> >> > > > > > > > > > Thank you everyone!
> >> > > > > > > > > >
> >> > >
> >> > > > > > > > > > --Matt
> >> > > > > > > > > >
> >> > >
> >> > > > > > > > > > [1]:
> >> > > > > > > > > > https://docs.google.com/document/d/1zHbnyK1r6KHpMOtEdIg1EZKNzHx-MVgUMOzB87GuXyk/edit#heading=h.38515dnp2bdb
>
