[ 
https://issues.apache.org/jira/browse/ARROW-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197330#comment-16197330
 ] 

Jacques Nadeau commented on ARROW-249:
--------------------------------------

Some key performance requirements from my perspective (note that I'm just 
skipping all the basic requirements):

* Send N sidecar ByteBufs in addition to a structured message on the sending 
side, only doing a single copy: the kernel copying the ByteBuf to the socket.
* Receive and pull out a single sidecar ByteBuf on the receiving side, keeping 
it off heap for its lifetime and minimizing copies. Preferably zero copies but 
potentially a single copy (from a circular or rotating buffer to a final target 
buffer).
* Support a cooperative backpressure methodology when using multiple streams on 
a single connection.
* Handle out of direct memory situations without disrupting the connection 
(message failure as opposed to connection failure).
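
To make the first two requirements concrete, here is a minimal sketch using 
plain java.nio rather than Netty ByteBufs or the GRPC APIs. It is not how 
base-rpc or GRPC frame messages: the frame layout and the names SidecarFraming, 
send, receive and Frame are hypothetical, chosen only for illustration. The 
sender hands the header, the structured message and the sidecar buffers to one 
gathering write, and the receiver reads each sidecar straight into a direct 
(off-heap) buffer.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.GatheringByteChannel;
import java.nio.channels.ReadableByteChannel;
import java.util.ArrayList;
import java.util.List;

// Hypothetical frame layout (an assumption, not taken from base-rpc or GRPC):
// [int messageLength][int sidecarCount][int sidecarLength] * N
// [message bytes][sidecar bytes] * N
final class SidecarFraming {

    /** A received frame: the structured message plus its off-heap sidecars. */
    static final class Frame {
        final byte[] message;
        final List<ByteBuffer> sidecars;

        Frame(byte[] message, List<ByteBuffer> sidecars) {
            this.message = message;
            this.sidecars = sidecars;
        }
    }

    /** Sends one message and N sidecars; assumes a blocking channel. */
    static void send(GatheringByteChannel out, byte[] message, ByteBuffer... sidecars)
            throws IOException {
        ByteBuffer header = ByteBuffer.allocate(8 + 4 * sidecars.length);
        header.putInt(message.length).putInt(sidecars.length);

        ByteBuffer[] frame = new ByteBuffer[2 + sidecars.length];
        frame[0] = header;
        frame[1] = ByteBuffer.wrap(message); // small heap buffer, no user-level copy
        long remaining = 0;
        for (int i = 0; i < sidecars.length; i++) {
            // duplicate() shares the content without copying it and leaves
            // the caller's position and limit untouched.
            ByteBuffer view = sidecars[i].duplicate();
            header.putInt(view.remaining());
            frame[2 + i] = view;
            remaining += view.remaining();
        }
        header.flip();
        remaining += header.remaining() + message.length;

        // Gathering write: the sidecars are never merged into one user-space buffer.
        while (remaining > 0) {
            remaining -= out.write(frame);
        }
    }

    /** Receives one frame, placing every sidecar in its own direct buffer. */
    static Frame receive(ReadableByteChannel in) throws IOException {
        ByteBuffer fixed = readFully(in, ByteBuffer.allocate(8));
        int messageLength = fixed.getInt();
        int count = fixed.getInt();
        ByteBuffer lengths = readFully(in, ByteBuffer.allocate(4 * count));
        ByteBuffer message = readFully(in, ByteBuffer.allocate(messageLength));

        List<ByteBuffer> sidecars = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // Read directly into the final off-heap target buffer.
            sidecars.add(readFully(in, ByteBuffer.allocateDirect(lengths.getInt())));
        }
        return new Frame(message.array(), sidecars);
    }

    // Fills the target completely, then flips it so it is ready for reading.
    private static ByteBuffer readFully(ReadableByteChannel in, ByteBuffer target)
            throws IOException {
        while (target.hasRemaining()) {
            if (in.read(target) < 0) {
                throw new EOFException("channel closed mid-frame");
            }
        }
        target.flip();
        return target;
    }
}
```

This sketch leaves out backpressure and out-of-direct-memory handling. A real 
receiver would presumably need to treat a failed direct allocation as a failure 
of that one message while still draining its bytes, so the connection stays 
usable.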

I'm trying to piece together how we might do this using some less-than-ideal 
usage of the GRPC APIs. Not sure how bad the shenaniganry will need to be. Note 
that we have something we've built up at Dremio (and before) that we've found 
very reliable for the performance requirements above:

* https://github.com/dremio/dremio-oss/tree/master/services/base-rpc

Obviously, this is less well supported/contributed/featured than GRPC, but it 
is already used at many large-scale production customers. It doesn't do all the 
things that GRPC does, but it does meet the requirements above very well. On 
the flip side, it doesn't have all the client/language support, security 
features, compression features, etc.




> [Format] Define GRPC IDL / wire protocol for messaging with Arrow data
> ----------------------------------------------------------------------
>
>                 Key: ARROW-249
>                 URL: https://issues.apache.org/jira/browse/ARROW-249
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Format
>            Reporter: Wes McKinney
>            Assignee: Jacques Nadeau
>             Fix For: 1.0.0
>
>
> In addition to memory maps / shared memory, we should be able to assemble one 
> or more Arrow arrays in a memory block suitable for sending over the wire 
> with GRPC (http://www.grpc.io/). This can be similarly adapted to other 
> messaging / RPC frameworks. 
> We can continue to use Flatbuffers for the metadata serialization. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
