I'm sorry for the very late reply.  Until yesterday I had no real sense of
what this proposal was about, so I had stayed out of the discussion.

I'm +0 only because it isn't clear what we are voting on.  There is a
written doc with no implementation or PR, and I think there could be one.
For example, does any ADBC client respect this protocol today?  If a Flight
server responds with an S3/HTTP URI, will the ADBC client download the
files from the correct place?  Will it at least notice that the URI is not
a gRPC URI and give an "I don't have a connector for downloading from
HTTP/S3" error?  In general, I think we do want this in Flight (see
comments below) and I am very supportive of the idea.  If adopting this as
an experimental proposal helps move it forward, then I think that's fine.
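
To make the question about non-gRPC URIs concrete, here is a minimal sketch
of the kind of client-side check I have in mind.  It is not taken from ADBC
or from any Flight client: `pick_connector`, `UnsupportedLocationError`, the
`connectors` mapping, and the set of gRPC schemes are all hypothetical names
chosen for illustration.

```python
from urllib.parse import urlsplit

# Schemes a plain gRPC Flight client is assumed to handle (illustrative list).
GRPC_SCHEMES = {"grpc", "grpc+tcp", "grpc+tls", "grpc+unix"}


class UnsupportedLocationError(Exception):
    """Raised when no connector is registered for a location's URI scheme."""


def pick_connector(uri, connectors):
    """Return the name of the connector that should fetch `uri`.

    `connectors` is a hypothetical mapping from URI scheme (e.g. "s3") to
    the name of whatever component knows how to download from it.
    """
    scheme = urlsplit(uri).scheme.lower()
    if scheme in GRPC_SCHEMES:
        return "grpc"
    if scheme in connectors:
        return connectors[scheme]
    # Fail loudly instead of trying to dial a non-gRPC URI over gRPC.
    raise UnsupportedLocationError(
        f"I don't have a connector for downloading from {scheme!r}: {uri}"
    )
```

A client with only the gRPC transport would call this with an empty mapping
and surface the error; a client that also knows how to read from object
storage would register something under "s3" or "https".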

That being said, I do want to express support for the proposal as a
concept, at least the "disassociated transports" portion (I can't speak to
UCX/etc.).  I was speaking with someone yesterday who explained that they
ended up not choosing Flight for an internal project because Flight didn't
support something called "cloud fetch", which I have now learned is
described in [1].  I recalled having looked at this proposal before, and
this person was interested and encouraged to hear that it is being
considered for Flight.  This proposal, as I understand it, should make it
possible for cloud servers to support a cloud-fetch-style API.  From the
discussion I got the impression that the cloud fetch approach is useful
and generally applicable.

So a big +1 for the idea of disassociated transports, but I'm not sure why
we need a vote to start working on it (though I'm not opposed if a vote
helps).

[1]
https://www.databricks.com/blog/2021/08/11/how-we-achieved-high-bandwidth-connectivity-with-bi-tools.html

On Thu, Mar 28, 2024 at 1:04 PM Matt Topol <zotthewiz...@gmail.com> wrote:

> I'll keep this new vote open for at least the next 72 hours. As before
> please reply with:
>
> [ ] +1 Accept this Proposal
> [ ] +0
> [ ] -1 Do not accept this proposal because...
>
> Thanks everyone!
>
> On Wed, Mar 27, 2024 at 7:51 PM Benjamin Kietzman <bengil...@gmail.com>
> wrote:
>
> > +1
> >
> > On Tue, Mar 26, 2024, 18:36 Matt Topol <zotthewiz...@gmail.com> wrote:
> >
> > > Should I start a new thread for a new vote? Or repeat the original vote
> > > email here?
> > >
> > > Just asking since there hasn't been any responses so far.
> > >
> > > --Matt
> > >
> > > On Thu, Mar 21, 2024 at 11:46 AM Matt Topol <zotthewiz...@gmail.com>
> > > wrote:
> > >
> > > > > Absolutely, it will be marked experimental until we see some people
> > > > > using it and can get more real-world feedback.
> > > > >
> > > > > There's also already a couple things that will be followed-up on
> > > > > after the initial adoption for expansion which were discussed in the
> > > > > comments.
> > > >
> > > > On Thu, Mar 21, 2024, 11:42 AM David Li <lidav...@apache.org> wrote:
> > > >
> > > >> I think let's try again. Would it be reasonable to declare this
> > > >> 'experimental' for the time being, just as we did with Flight/Flight
> > > >> SQL/etc?
> > > >>
> > > >> On Tue, Mar 19, 2024, at 15:24, Matt Topol wrote:
> > > > >> > Hey All, It's been another month and we've gotten a whole bunch of
> > > > >> > feedback and engagement on the document from a variety of
> > > > >> > individuals. Myself and a few others have proactively attempted to
> > > > >> > reach out to as many third parties as we could, hoping to pull more
> > > > >> > engagement also. While it would be great to get even more feedback,
> > > > >> > the comments have slowed down and we haven't gotten anything in a
> > > > >> > few days at this point.
> > > > >> >
> > > > >> > If there's no objections, I'd like to try to open up for voting
> > > > >> > again to officially adopt this as a protocol to add to our docs.
> > > >> >
> > > >> > Thanks all!
> > > >> >
> > > >> > --Matt
> > > >> >
> > > > >> > On Sat, Mar 2, 2024 at 6:43 PM Paul Whalen <pgwha...@gmail.com>
> > > > >> > wrote:
> > > > >> >
> > > > >> >> Agreed that it makes sense not to focus on in-place updating for
> > > > >> >> this proposal.  I’m not even sure it’s a great fit as a “general
> > > > >> >> purpose” Arrow protocol, because of all the assumptions and
> > > > >> >> restrictions required as you noted.
> > > > >> >>
> > > > >> >> I took another look at the proposal and don’t think there’s
> > > > >> >> anything preventing in-place updating in the future - ultimately
> > > > >> >> the data body could just be in the same location for subsequent
> > > > >> >> messages.
> > > >> >>
> > > >> >> Thanks!
> > > >> >> Paul
> > > >> >>
> > > > >> >> On Fri, Mar 1, 2024 at 5:28 PM Matt Topol <zotthewiz...@gmail.com>
> > > > >> >> wrote:
> > > >> >>
> > > >> >> > > @pgwhalen: As a potential "end user developer," (and aspiring
> > > >> >> > contributor) this
> > > >> >> > immediately excited me when I first saw it.
> > > >> >> >
> > > >> >> > Yay! Good to hear that!
> > > >> >> >
> > > > >> >> > > @pgwhalen: And it wasn't clear to me whether updating batches
> > > > >> >> > in place (and the producer/consumer coordination that comes with
> > > > >> >> > that) was supported or encouraged as part of the proposal.
> > > >> >> >
> > > > >> >> > So, updating batches in place was not a particular use-case we
> > > > >> >> > were targeting with this approach. Instead using shared memory to
> > > > >> >> > produce and consume the buffers/batches without having to
> > > > >> >> > physically copy the data. Trying to update a batch in place is a
> > > > >> >> > dangerous prospect for a number of reasons, but as you've
> > > > >> >> > mentioned it can technically be made safe if the shape is staying
> > > > >> >> > the same and you're only modifying fixed-width data types (i.e.
> > > > >> >> > not only is the *shape* unchanged but the sizes of the underlying
> > > > >> >> > data buffers are also remaining unchanged). The producer/consumer
> > > > >> >> > coordination that would be needed for updating batches in place
> > > > >> >> > is not part of this proposal but is definitely something we can
> > > > >> >> > look into as a follow-up to this for extending it. There's a
> > > > >> >> > number of discussions that would need to be had around that so I
> > > > >> >> > don't want to add on another complexity to this already complex
> > > > >> >> > proposal.
> > > >> >> >
> > > > >> >> > That said, if you or anyone see something in this proposal that
> > > > >> >> > would hinder or prevent being able to use it for your use case
> > > > >> >> > please let me know so we can address it. Even though the proposal
> > > > >> >> > as it currently exists doesn't fully support the in-place
> > > > >> >> > updating of batches, I don't want to make things harder for us in
> > > > >> >> > such a follow-up where we'd end up requiring an entirely new
> > > > >> >> > protocol to support that.
> > > >> >> >
> > > > >> >> > > @octalene.dev: I know of a third party that is interested in
> > > > >> >> > Arrow for HPC environments that could be interested in the
> > > > >> >> > proposal and I can see if they're interested in providing
> > > > >> >> > feedback.
> > > >> >> >
> > > >> >> > Awesome! Thanks much!
> > > >> >> >
> > > >> >> >
> > > > >> >> > For reference to anyone who hasn't looked at the document in a
> > > > >> >> > while, since the original discussion thread on this I have added
> > > > >> >> > a full "Background Context" page to the beginning of the proposal
> > > > >> >> > to help anyone who isn't already familiar with the issues this
> > > > >> >> > protocol is trying to solve or isn't already familiar with ucx or
> > > > >> >> > libfabric transports to better understand *why* I'm proposing
> > > > >> >> > this and what it is trying to solve. The point of this background
> > > > >> >> > information is to help ensure that anyone who might have thoughts
> > > > >> >> > on protocols in general or APIs should still be able to
> > > > >> >> > understand the base reasons and goals that we're trying to
> > > > >> >> > achieve with this protocol proposal. You don't need to already
> > > > >> >> > understand managing GPU/device memory or ucx to be able to have
> > > > >> >> > meaningful input on the document.
> > > >> >> >
> > > > >> >> > Thanks again to all who have contributed so far and please spread
> > > > >> >> > to any contacts that you think might be interested in this for
> > > > >> >> > their particular use cases.
> > > >> >> >
> > > >> >> > --Matt
> > > >> >> >
> > > > >> >> > On Wed, Feb 28, 2024 at 1:39 AM Aldrin <octalene....@pm.me.invalid>
> > > > >> >> > wrote:
> > > >> >> >
> > > > >> >> > > I am interested in this as well, but I haven't gotten to a
> > > > >> >> > > point where I can have valuable input (I haven't tried other
> > > > >> >> > > transports). I know of a third party that is interested in
> > > > >> >> > > Arrow for HPC environments that could be interested in the
> > > > >> >> > > proposal and I can see if they're interested in providing
> > > > >> >> > > feedback.
> > > >> >> > >
> > > > >> >> > > I glanced at the document before but I'll go through again to
> > > > >> >> > > see if there is anything I can comment on.
> > > >> >> > >
> > > >> >> > >
> > > >> >> > >
> > > >> >> > > # ------------------------------
> > > >> >> > > # Aldrin
> > > >> >> > >
> > > >> >> > >
> > > >> >> > > https://github.com/drin/
> > > >> >> > > https://gitlab.com/octalene
> > > >> >> > > https://keybase.io/octalene
> > > >> >> > >
> > > >> >> > >
> > > > >> >> > > On Tuesday, February 27th, 2024 at 17:43, Paul Whalen
> > > > >> >> > > <pgwha...@gmail.com> wrote:
> > > >> >> > >
> > > > >> >> > > > As a potential "end user developer," (and aspiring
> > > > >> >> > > > contributor) this immediately excited me when I first saw it.
> > > >> >> > > >
> > > >> >> > >
> > > > >> >> > > > I work at a trading firm, and my team has developed an IPC
> > > > >> >> > > > mechanism for efficiently transmitting pandas dataframes both
> > > > >> >> > > > remotely via TCP and locally via shared memory, where the
> > > > >> >> > > > interface for the application developer is the same for both.
> > > > >> >> > > > The data in the dataframes may change rapidly, so when
> > > > >> >> > > > communicating locally via shared memory, if the shape of the
> > > > >> >> > > > dataframe doesn't change, we update the memory in place,
> > > > >> >> > > > coordinating between the producer and consumer via TCP.
> > > >> >> > > >
> > > >> >> > >
> > > > >> >> > > > We intend to move away from our remote TCP mechanism towards
> > > > >> >> > > > Arrow Flight, or a lighter-weight version of Arrow IPC. For
> > > > >> >> > > > the local shared memory mechanism which we previously did not
> > > > >> >> > > > have a good answer for, it seems like Disassociated Arrow IPC
> > > > >> >> > > > maps quite well to our problem.
> > > >> >> > > >
> > > >> >> > >
> > > >> >> > > > So some features that enable our use case are:
> > > >> >> > > > - Updating existing batches in place is supported
> > > >> >> > > > - The interface is pretty similar to Flight
> > > >> >> > > >
> > > >> >> > >
> > > > >> >> > > > I'd imagine we're not the only financial firm to implement
> > > > >> >> > > > something like this, given how widespread pandas usage is, so
> > > > >> >> > > > that could be a place to seek feedback.
> > > >> >> > > >
> > > >> >> > >
> > > > >> >> > > > As I was reading the proposal initially, I gleaned that the
> > > > >> >> > > > most important audience was those writing interfaces to
> > > > >> >> > > > GPUs/remote memory/non-standard transports/etc. And it wasn't
> > > > >> >> > > > clear to me whether updating batches in place (and the
> > > > >> >> > > > producer/consumer coordination that comes with that) was
> > > > >> >> > > > supported or encouraged as part of the proposal. But
> > > > >> >> > > > regardless, as an end user, this seems like an easier and more
> > > > >> >> > > > efficient way to glue pieces in the Arrow ecosystem together
> > > > >> >> > > > if it was adopted broadly.
> > > >> >> > > >
> > > >> >> > >
> > > >> >> > > > Paul
> > > >> >> > > >
> > > >> >> > >
> > > > >> >> > > > On Tue, Feb 27, 2024 at 6:05 PM Matt Topol
> > > > >> >> > > > zotthewiz...@gmail.com wrote:
> > > >> >> > > >
> > > >> >> > >
> > > > >> >> > > > > I'll continue my efforts of trying to reach out to other
> > > > >> >> > > > > interested parties, but if anyone else here has any contacts
> > > > >> >> > > > > or connections that they think might be interested please
> > > > >> >> > > > > forward them the link to the Google doc.
> > > >> >> > > > >
> > > >> >> > >
> > > > >> >> > > > > I really do want to get as much engagement and feedback as
> > > > >> >> > > > > possible on this.
> > > >> >> > > > >
> > > >> >> > >
> > > >> >> > > > > Thanks!
> > > >> >> > > > >
> > > >> >> > >
> > > > >> >> > > > > On Tue, Feb 27, 2024, 6:38 PM Wes McKinney
> > > > >> >> > > > > wesmck...@gmail.com wrote:
> > > >> >> > > > >
> > > >> >> > >
> > > > >> >> > > > > > Have there been efforts to proactively reach out to other
> > > > >> >> > > > > > third parties that might have an interest in this or be a
> > > > >> >> > > > > > potential user at some point? There are a lot of
> > > > >> >> > > > > > interested parties in Arrow that may not actively follow
> > > > >> >> > > > > > the mailing list.
> > > >> >> > > > > >
> > > >> >> > >
> > > > >> >> > > > > > Seems like folks from the Dask, Ray, RAPIDS (especially
> > > > >> >> > > > > > folks at NVIDIA or working on UCX), or other communities
> > > > >> >> > > > > > like that might have constructive thoughts about this.
> > > > >> >> > > > > > DLPack (https://dmlc.github.io/dlpack/latest/) also seems
> > > > >> >> > > > > > adjacent and worth reaching out to. Other ideas for
> > > > >> >> > > > > > projects or companies that could be reached out to for
> > > > >> >> > > > > > feedback.
> > > >> >> > > > > >
> > > >> >> > >
> > > > >> >> > > > > > On Tue, Feb 27, 2024 at 5:23 PM Antoine Pitrou
> > > > >> >> > > > > > anto...@python.org wrote:
> > > >> >> > > > > >
> > > >> >> > >
> > > > >> >> > > > > > > If there's no engagement, then I'm afraid it might mean
> > > > >> >> > > > > > > that third parties have no interest in this. I don't
> > > > >> >> > > > > > > really have any solution for generating engagement
> > > > >> >> > > > > > > except nagging and pinging people explicitly :-)
> > > >> >> > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > Le 27/02/2024 à 19:09, Matt Topol a écrit :
> > > >> >> > > > > > >
> > > >> >> > >
> > > > >> >> > > > > > > > I would like to see the same Antoine, currently given
> > > > >> >> > > > > > > > the lack of engagement (both for OR against) I was
> > > > >> >> > > > > > > > going to take the silence as assent and hope for
> > > > >> >> > > > > > > > non-Voltron Data PMC members to vote in this.
> > > >> >> > > > > > > >
> > > >> >> > >
> > > > >> >> > > > > > > > If anyone has any suggestions on how we could
> > > > >> >> > > > > > > > potentially generate more engagement and discussion on
> > > > >> >> > > > > > > > this, please let me know as I want as many parties in
> > > > >> >> > > > > > > > the community as possible to be part of this.
> > > >> >> > > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > > Thanks everyone.
> > > >> >> > > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > > --Matt
> > > >> >> > > > > > > >
> > > >> >> > >
> > > > >> >> > > > > > > > On Tue, Feb 27, 2024 at 12:48 PM Antoine Pitrou
> > > > >> >> > > > > > > > anto...@python.org wrote:
> > > >> >> > > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > > > Hello,
> > > >> >> > > > > > > > >
> > > >> >> > >
> > > > >> >> > > > > > > > > I'd really like to see more engagement and criticism
> > > > >> >> > > > > > > > > from non-Voltron Data parties before this is formally
> > > > >> >> > > > > > > > > adopted as an Arrow spec.
> > > >> >> > > > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > > > Regards
> > > >> >> > > > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > > > Antoine.
> > > >> >> > > > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > > > Le 27/02/2024 à 18:35, Matt Topol a écrit :
> > > >> >> > > > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > > > > Hey all,
> > > >> >> > > > > > > > > >
> > > >> >> > >
> > > > >> >> > > > > > > > > > I'd like to propose a vote for us to officially
> > > > >> >> > > > > > > > > > adopt the protocol described in the google doc[1]
> > > > >> >> > > > > > > > > > for Dissociated Arrow IPC Transports. This proposal
> > > > >> >> > > > > > > > > > was originally discussed at 2. Once this proposal
> > > > >> >> > > > > > > > > > is adopted, I will work on adding the necessary
> > > > >> >> > > > > > > > > > documentation to the Arrow website along with
> > > > >> >> > > > > > > > > > examples etc.
> > > >> >> > > > > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > > > > The vote will be open for at least 72 hours.
> > > >> >> > > > > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > > > > [ ] +1 Accept this Proposal
> > > >> >> > > > > > > > > > [ ] +0
> > > >> >> > > > > > > > > > [ ] -1 Do not accept this proposal because...
> > > >> >> > > > > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > > > > Thank you everyone!
> > > >> >> > > > > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > > > > --Matt
> > > >> >> > > > > > > > > >
> > > >> >> > >
> > > >> >> > > > > > > > > > [1]:
> > > >> >> > > > >
> > > >> >> > >
> > > >> >> > > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> https://docs.google.com/document/d/1zHbnyK1r6KHpMOtEdIg1EZKNzHx-MVgUMOzB87GuXyk/edit#heading=h.38515dnp2bdb
> > > >> >> >
> > > >> >>
> > > >>
> > > >
> > >
> >
>
