[ https://issues.apache.org/jira/browse/AVRO-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831253#action_12831253 ]
Todd Lipcon commented on AVRO-406:
----------------------------------

bq. if you say only the first enclosing array is 'streaming' that means the sub-array is NOT streamed, right?

Correct. To be really technical, on a wire level we *could* stream any structure that is "tail streamable"... by which I mean array<foo>, or array<array<foo>>, or array<array<MyRecord>> where MyRecord's last field is "tail streamable". However, it will be impossible to enforce that clients or servers consume/provide the values in the correct order. For example:

{code}
void doStuff(Iterable<Iterable<Foo>> inputFoos) {
  for (Iterable<Foo> fooIter : inputFoos) {
    for (Foo foo : fooIter) {
      // do something with foo
    }
  }
}
{code}

could work, since the user is consuming the input in the same order it's being serialized on the wire. However, if the outer iterator were advanced before all of the inner iterator's data was consumed, it would no longer work (the second array<Foo> isn't available until the first array<Foo> is done). Granted, we could "skip ahead" at this point, but I think this complexity would be very bad, and probably not clear for framework users either.

For your use case, could you get by with a bit more application-level logic and change your array<array<Cell>> to something more like:

{code}
record ResponseChunk {
  boolean continuingPreviousRow;
  array<Cell> cells;
}

array<ResponseChunk> getCells(...)
{code}

where you'd send a few cells at a time in a ResponseChunk, and unwrap them on the other side into whatever user-level API you want?

bq. If so, then streaming excessively large objects in the process of streaming normal and other associated objects might not be the right thing to do.

Sorry, I couldn't parse this sentence. Can you explain further what you mean? I guess you're referring to streaming large binary values? If so, I think it will be impossible to do it in a general way from the API even if the wire protocol supports it.
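To make the ResponseChunk idea a bit more concrete, here is a rough sketch of the application-level logic on both sides. Everything in it is an assumption for illustration only: the class name ChunkSketch, the chunk() and unwrap() helpers, and the maxCells parameter are hypothetical and not part of Avro, and the cell type is left generic. It also builds plain in-memory lists to show just the framing; a real streaming version would produce and consume the chunks lazily.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names are part of Avro. */
class ChunkSketch {

    /** Plain-Java stand-in for the ResponseChunk record proposed in the comment. */
    static final class ResponseChunk<C> {
        final boolean continuingPreviousRow;
        final List<C> cells;

        ResponseChunk(boolean continuingPreviousRow, List<C> cells) {
            this.continuingPreviousRow = continuingPreviousRow;
            this.cells = cells;
        }
    }

    /** Sender side: split each row into chunks of at most maxCells cells. */
    static <C> List<ResponseChunk<C>> chunk(List<List<C>> rows, int maxCells) {
        if (maxCells < 1) {
            throw new IllegalArgumentException("maxCells must be >= 1");
        }
        List<ResponseChunk<C>> out = new ArrayList<ResponseChunk<C>>();
        for (List<C> row : rows) {
            if (row.isEmpty()) {
                // Emit an empty chunk so that empty rows survive the round trip.
                out.add(new ResponseChunk<C>(false, new ArrayList<C>()));
                continue;
            }
            for (int start = 0; start < row.size(); start += maxCells) {
                int end = Math.min(start + maxCells, row.size());
                // Only the first chunk of a row starts a new row.
                out.add(new ResponseChunk<C>(start > 0,
                        new ArrayList<C>(row.subList(start, end))));
            }
        }
        return out;
    }

    /** Receiver side: stitch the flat sequence of chunks back into rows. */
    static <C> List<List<C>> unwrap(Iterable<ResponseChunk<C>> chunks) {
        List<List<C>> rows = new ArrayList<List<C>>();
        for (ResponseChunk<C> chunk : chunks) {
            if (!chunk.continuingPreviousRow || rows.isEmpty()) {
                rows.add(new ArrayList<C>());
            }
            rows.get(rows.size() - 1).addAll(chunk.cells);
        }
        return rows;
    }
}
```

The point of the flag is that the outer array<ResponseChunk> is the only thing that has to be streamed, and the receiver can rebuild row boundaries by reading the chunks strictly in wire order.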
The large binary values can always be "chunked" as above and it shouldn't be a big hassle for developers, right? (should be noted this is probably an "advanced feature" that only a few hardcore apps will need to use... in particular HBase and Hadoop :) )

> Support streaming RPC calls
> ---------------------------
>
>                 Key: AVRO-406
>                 URL: https://issues.apache.org/jira/browse/AVRO-406
>             Project: Avro
>          Issue Type: New Feature
>          Components: java, spec
>            Reporter: Todd Lipcon
>
> Avro nicely supports chunking of container types into multiple frames. We
> need to expose this to RPC layer to facilitate use cases like the Hadoop
> Datanode where a single "RPC" can yield far more data than should be buffered
> in memory.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.