Thanks for the response, that makes sense. I'll update the documentation based on this and file a JIRA about potentially splitting out each buffer as it comes off the wire.
On Tue, Feb 26, 2019 at 9:30 AM Jacques Nadeau <jacq...@apache.org> wrote:

> > 1. What is meant by "sidecar patterns" [2] on the data buffer bytes?
>
> The idea that really what we want is a structured body plus a bunch of
> bytes that are traveling alongside and have an arbitrary encoding (a.k.a. a
> sidecar). This isn't supported by the GRPC/proto definitions so instead we
> define the bytes as the last field in the protocol and then actually hand
> write the code to handle that outside the context of protobuf.
>
> > 2. Was using "repeated bytes data_body" considered instead of a single
> > body value? If we put back the "page id" in the buffer metadata (removed
> > in [3]), it seems like we would get more flexibility for managing the data
> > coming off the wire (e.g. if there was a buffer per repeated element, we
> > could free memory immediately on column projections).
>
> It seems like a bunch of extra encoding & duplication of what we already
> have in the data_header. A consumer that wants to do a chunked read can
> still do so by decoding the data_header and it avoids having to have
> interspersing arbitrary small encoded values between each of the buffers
> when writing the batch to the wire.
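
For the documentation, a rough sketch of the sidecar idea as described above: a small structured header travels first, and the raw buffer bytes follow as one trailing blob that is handled by hand-written code. This is only an illustration under stated assumptions, not the actual Flight wire format. JSON stands in for the real header encoding, and the names `write_frame`, `read_buffers`, and `buffer_lengths` are hypothetical.

```python
import json
import struct
from typing import Dict, List, Set


def write_frame(header: dict, buffers: List[bytes]) -> bytes:
    """Serialize a structured header followed by one trailing body blob.

    Assumption: JSON is a stand-in for the real header encoding.
    """
    # Record each buffer's length in the header so a reader can locate it
    # in the body without any per-buffer framing on the wire.
    meta = dict(header, buffer_lengths=[len(b) for b in buffers])
    header_bytes = json.dumps(meta).encode("utf-8")
    # 4-byte little-endian header length, the header, then the raw body
    # bytes as the final "field" of the frame.
    return struct.pack("<I", len(header_bytes)) + header_bytes + b"".join(buffers)


def read_buffers(frame: bytes, wanted: Set[int]) -> Dict[int, bytes]:
    """Decode the header, then copy out only the requested buffers."""
    view = memoryview(frame)
    (header_len,) = struct.unpack_from("<I", view, 0)
    meta = json.loads(bytes(view[4:4 + header_len]))
    offset = 4 + header_len
    selected: Dict[int, bytes] = {}
    for index, length in enumerate(meta["buffer_lengths"]):
        if index in wanted:
            # Only projected buffers are materialized; the rest are skipped.
            selected[index] = bytes(view[offset:offset + length])
        offset += length
    return selected


frame = write_frame({"schema": "demo"}, [b"aaaa", b"bb", b"cccccc"])
projected = read_buffers(frame, {0, 2})
```

Here the reader performs a chunked read of buffers 0 and 2 using only the lengths recorded in the header, with no extra encoded values between buffers in the body.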