[
https://issues.apache.org/jira/browse/ARROW-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014648#comment-16014648
]
Wes McKinney edited comment on ARROW-1047 at 5/17/17 7:21 PM:
--------------------------------------------------------------
The benefits of this work is that stream readers and writers would not need to
know about the underlying transport (whether the messages are being written
directly to a byte channel, or placed in a queue to be sent asynchronously
through some RPC protocol).
was (Author: wesmckinn):
The benefits of this work is that stream readers and writers would not need to
know about the underlying transport (whether the messaging are being written
directly to a byte channel, or placed in a queue to be sent asynchronously
through some RPC protocol).
> [Java] Add generalized stream writer and reader interfaces that are decoupled
> from IO / message framing
> -------------------------------------------------------------------------------------------------------
>
> Key: ARROW-1047
> URL: https://issues.apache.org/jira/browse/ARROW-1047
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Java - Vectors
> Reporter: Wes McKinney
>
> cc [~julienledem] [~elahrvivaz] [~nongli]
> The ArrowWriter
> https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/file/ArrowWriter.java
> accepts a WriteableByteChannel where the stream is written
> It would be useful to be able to support other kinds of message framing and
> transport, like GRPC or HTTP. So rather than writing a complete Arrow stream
> as a single contiguous byte stream, the component messages (schema,
> dictionaries, and record batches) would be framed as separate messages in the
> underlying protocol.
> So if we were using ProtocolBuffers and gRPC as the underlying transport for
> the stream, we could encapsulate components of an Arrow stream in objects
> like:
> {code:language=protobuf}
> message ArrowMessagePB {
> required bytes serialized_data;
> }
> {code}
> If the transport supports zero copy, that is obviously better than
> serializing then parsing a protocol buffer.
> We should do this work in C++ as well to support more flexible stream
> transport.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)