Reading http://thrift.apache.org/static/thrift-20070401.pdf, I find, in section 4.2:
"Thrift structures are designed to support encoding into a streaming protocol. The implementation should never need to frame or compute the entire data length of a structure prior to encoding it."

However, since principle and practice can diverge, I started by asking about this on the IRC channel, and Bryan Duxbury answered that streaming is not supported. I am wondering whether I am approaching my problem the wrong way, so let me describe it.

The job is to run a potentially large blob of text through a set of analytical components and get back a large number of small result items. Note that this is an online process; it is not appropriate to chop the blob into chunks and feed it to a map-reduce system. Pushing one giant string in one direction is no problem for Thrift or anything like it. Coming back the other way, however, I am looking at a long sequence of the form [ type1, struct1, type2, struct2, ... ], and I do not want to hold all of it in memory at once.

What would the readers of this list suggest as an approach to this?
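For concreteness, one shape I could imagine is a cursor-style interface: the client submits the blob once, then pulls bounded batches of results with repeated calls until the server reports exhaustion. The sketch below is plain Python rather than real Thrift code, and every name in it (AnalysisService, submit, get_next_batch, the tokenizing "analysis") is hypothetical, just illustrating the pattern, not any actual Thrift API:

```python
from typing import Dict, Iterator, List, Tuple

# Stand-in for the (type, struct) pairs the analytics would produce.
Item = Tuple[str, dict]

class AnalysisService:
    """Server side: keeps a lazy result stream per job and hands out
    bounded batches, so neither side holds the full sequence in memory."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Iterator[Item]] = {}

    def submit(self, job_id: str, blob: str) -> None:
        # The real analytical components would run here; this fakes a
        # long stream of small items derived from the blob.
        self._jobs[job_id] = (("token", {"text": w}) for w in blob.split())

    def get_next_batch(self, job_id: str, max_items: int) -> List[Item]:
        # Each call returns at most max_items results; an empty list
        # signals that the job is finished.
        it = self._jobs[job_id]
        batch: List[Item] = []
        for _ in range(max_items):
            try:
                batch.append(next(it))
            except StopIteration:
                break
        return batch

def consume(service: AnalysisService, job_id: str, max_items: int = 2):
    """Client side: pull batches until the server reports exhaustion."""
    while True:
        batch = service.get_next_batch(job_id, max_items)
        if not batch:
            return
        for item in batch:
            yield item
```

Over Thrift, submit and get_next_batch would become service methods, with the batch as a list<ResultItem> of a union or tagged struct. The obvious cost is one round trip per batch, which is why I am asking whether there is a better-supported approach.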
