[ 
https://issues.apache.org/jira/browse/AVRO-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12830356#action_12830356
 ] 

Todd Lipcon commented on AVRO-406:
----------------------------------

Here's one proposal I discussed a bit with Philip yesterday:

An RPC can only be considered "streamable" if at least one of the following is 
true:
# The response type is array<X> and marked "streamed" in the protocol.
# The last request parameter is array<X> and marked "streamed" in the protocol.

The "streamed" marking is probably best suited as a mixin property (see 
AVRO-404) since this should be entirely backwards-compatible for RPC 
clients/servers that don't support streaming. It just specifies to the RPC 
client/server that a particular API should be used.

Let's take the following example RPCs for discussion:

{code}
record Chunk {
  fixed checksum(4);
  bytes data;
}
streamed array<Chunk> getBlock(int blockId);
PutResult putBlock(int blockId, streamed array<Chunk> chunks);
{code}

The non-streamed Java interfaces would look like:
{code}
List<Chunk> getBlock(int blockId);
PutResult putBlock(int blockId, List<Chunk> chunks);
{code}

If streaming is enabled for these RPCs, the interfaces would change to:

{code}
Iterable<Chunk> getBlock(int blockId);
PutResult putBlock(int blockId, Iterable<Chunk> chunks);
{code}

On the server side, the Iterable passed to putBlock would stream the data in 
from the network as it is iterated. getBlock, in turn, would be responsible for 
returning an Iterable that generates response packets until the block has been 
entirely sent.
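
To make the Iterable-based contract above concrete, here is a minimal sketch of 
what a server-side implementation might look like. Everything in it is an 
assumption for illustration: the Chunk and PutResult classes are hand-written 
stand-ins for generated types, BlockStoreSketch is a hypothetical in-memory 
store, CRC32 is an arbitrary choice of 4-byte checksum, and no Avro APIs are 
used.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.zip.CRC32;

// Hypothetical stand-ins for the generated Chunk and PutResult types.
final class Chunk {
  final byte[] checksum; // 4 bytes, matching "fixed checksum(4)"
  final byte[] data;
  Chunk(byte[] checksum, byte[] data) { this.checksum = checksum; this.data = data; }
}

final class PutResult {
  final int bytesWritten;
  PutResult(int bytesWritten) { this.bytesWritten = bytesWritten; }
}

// Hypothetical in-memory block store, used only to make the sketch runnable.
final class BlockStoreSketch {
  static final int CHUNK_SIZE = 4; // tiny on purpose, for illustration
  private final Map<Integer, byte[]> blocks = new HashMap<>();

  // Assumption: the 4-byte checksum is a CRC32 of the chunk data.
  static byte[] crc32(byte[] data) {
    CRC32 crc = new CRC32();
    crc.update(data, 0, data.length);
    return ByteBuffer.allocate(4).putInt((int) crc.getValue()).array();
  }

  // Returns a lazy Iterable: each Chunk is built only when next() is called.
  Iterable<Chunk> getBlock(int blockId) {
    final byte[] block = blocks.getOrDefault(blockId, new byte[0]);
    return () -> new Iterator<Chunk>() {
      private int offset = 0;
      @Override public boolean hasNext() { return offset < block.length; }
      @Override public Chunk next() {
        if (!hasNext()) throw new NoSuchElementException();
        int end = Math.min(offset + CHUNK_SIZE, block.length);
        byte[] data = Arrays.copyOfRange(block, offset, end);
        offset = end;
        return new Chunk(crc32(data), data);
      }
    };
  }

  // Consumes chunks one at a time as the Iterable yields them.
  // A real server would write each chunk to disk instead of buffering it here.
  PutResult putBlock(int blockId, Iterable<Chunk> chunks) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (Chunk c : chunks) {
      if (!Arrays.equals(c.checksum, crc32(c.data))) {
        throw new IllegalStateException("checksum mismatch in block " + blockId);
      }
      out.write(c.data, 0, c.data.length);
    }
    blocks.put(blockId, out.toByteArray());
    return new PutResult(out.size());
  }
}
```

In a real RPC layer the Iterable handed to putBlock would be backed by the 
network connection rather than an in-memory list, but the handler code would 
look the same.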

Users of this would probably do something like the following most often:

{code}
record ReadHeader {
  int statusCode;
  ...
}
union ReadPacket { ReadHeader, Chunk };
streamed array<ReadPacket> getBlock(int blockId);
{code}

and then document that it will always send one ReadHeader followed by an 
unspecified number of chunks.
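
A client consuming such a stream would have to enforce the "one header, then 
chunks" convention itself, since the schema alone does not express it. The 
following is a rough sketch of that check. ReadHeader and Chunk are hypothetical 
hand-written stand-ins for the generated classes, the union is modelled as plain 
Object, and treating statusCode 0 as success is an assumption made only for this 
example.

```java
import java.util.Iterator;

// Hypothetical stand-ins for the generated union branches.
final class ReadHeader {
  final int statusCode;
  ReadHeader(int statusCode) { this.statusCode = statusCode; }
}

final class Chunk {
  final byte[] data;
  Chunk(byte[] data) { this.data = data; }
}

final class ReadPacketSketch {
  // Reads a header-then-chunks stream and returns the total payload size.
  static long readBlock(Iterable<?> packets) {
    Iterator<?> it = packets.iterator();
    if (!it.hasNext()) {
      throw new IllegalStateException("empty stream: expected a ReadHeader");
    }
    Object first = it.next();
    if (!(first instanceof ReadHeader)) {
      throw new IllegalStateException("first packet must be a ReadHeader");
    }
    ReadHeader header = (ReadHeader) first;
    if (header.statusCode != 0) { // assumption: 0 means success
      throw new IllegalStateException("read failed, status " + header.statusCode);
    }
    long total = 0;
    while (it.hasNext()) {
      Object packet = it.next();
      if (!(packet instanceof Chunk)) {
        throw new IllegalStateException("only Chunks may follow the header");
      }
      total += ((Chunk) packet).data.length;
    }
    return total;
  }
}
```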

The streaming parameters and streaming response could certainly be used 
together (e.g., to provide a putBlock() that "acks" sequence numbers as they're 
written to disk).
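
As an illustration of combining the two, the sketch below shows a putBlock whose 
streamed response yields one ack per chunk, pulling each chunk from the streamed 
request only when the next ack is requested. SeqChunk, Ack, and the explicit 
"disk" parameter are hypothetical names introduced for this example; the 
proposal itself does not define them.

```java
import java.io.ByteArrayOutputStream;
import java.util.Iterator;

// Hypothetical chunk type carrying a sequence number.
final class SeqChunk {
  final long seqNo;
  final byte[] data;
  SeqChunk(long seqNo, byte[] data) { this.seqNo = seqNo; this.data = data; }
}

// Hypothetical ack type for the streamed response.
final class Ack {
  final long seqNo;
  Ack(long seqNo) { this.seqNo = seqNo; }
}

final class AckingPutSketch {
  // Each call to next() on the returned iterator reads one chunk from the
  // request stream, writes it to "disk" (a byte buffer standing in for real
  // storage), and only then produces the ack for that chunk.
  static Iterable<Ack> putBlock(int blockId, Iterable<SeqChunk> chunks,
                                ByteArrayOutputStream disk) {
    return () -> {
      final Iterator<SeqChunk> in = chunks.iterator();
      return new Iterator<Ack>() {
        @Override public boolean hasNext() { return in.hasNext(); }
        @Override public Ack next() {
          SeqChunk c = in.next(); // throws NoSuchElementException when exhausted
          disk.write(c.data, 0, c.data.length);
          return new Ack(c.seqNo);
        }
      };
    };
  }
}
```

Because both sides are lazy, nothing is written until the RPC layer starts 
draining the response, which is the point at which it would also be pulling 
request chunks off the wire.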

For an event-driven server or client, the APIs would probably be more 
callback-oriented, but let's start here.
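
For comparison, a callback-oriented variant might look roughly like the sketch 
below. This is purely illustrative: ChunkHandler and CallbackSketch are 
hypothetical names, and an event-driven implementation would invoke the 
callbacks from its I/O loop rather than from the simple Iterable adapter shown 
here.

```java
// Hypothetical callback interface for receiving a streamed array.
interface ChunkHandler {
  void onChunk(byte[] data);
  void onComplete();
  void onError(Exception e);
}

final class CallbackSketch {
  // Adapter that drives a ChunkHandler from an Iterable, showing how the two
  // API styles relate. Exactly one of onComplete or onError is called when
  // the failure comes from the stream or from onChunk.
  static void drive(Iterable<byte[]> chunks, ChunkHandler handler) {
    try {
      for (byte[] chunk : chunks) {
        handler.onChunk(chunk);
      }
    } catch (RuntimeException e) {
      handler.onError(e);
      return;
    }
    handler.onComplete();
  }
}
```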

> Support streaming RPC calls
> ---------------------------
>
>                 Key: AVRO-406
>                 URL: https://issues.apache.org/jira/browse/AVRO-406
>             Project: Avro
>          Issue Type: New Feature
>          Components: java, spec
>            Reporter: Todd Lipcon
>
> Avro nicely supports chunking of container types into multiple frames. We 
> need to expose this to the RPC layer to facilitate use cases like the Hadoop 
> Datanode where a single "RPC" can yield far more data than should be buffered 
> in memory.
