On Fri, Aug 26, 2011 at 1:38 PM, Joel Meyer <[email protected]> wrote:
> On Fri, Aug 26, 2011 at 8:01 AM, Benson Margulies
> <[email protected]> wrote:
>
>> Reading http://thrift.apache.org/static/thrift-20070401.pdf, I find,
>> in section 4.2:
>>
>> "Thrift structures are designed to support encoding into a streaming
>> protocol. The implementation should never need to frame or compute the
>> entire data length of a structure prior to encoding it."
>>
>> However, due to the possible gap between principle and practice, I
>> started by asking about this on the IRC channel, and Bryan Duxbury
>> answered that streaming is not supported.
>>
>> I am wondering if perhaps I am approaching my problem the wrong way,
>> so I am writing to describe it.
>>
>> The job in question is to run a potentially large blob of text into a
>> set of analytical components and return a large number of small items
>> that result. Note this is an online process; it's not appropriate to
>> chop the blob into chunks and feed it to a map-reduce system.
>>
>> There's no reason to ask Thrift or something like it to be involved in
>> pushing one giant string in one direction. Coming back the other way,
>> however, what I am looking at is a long sequence of the form [ type1,
>> struct1, type2, struct2 ... ]. I do not want to have all of them in
>> memory at once.
>>
>> What would the readers of this list suggest as an approach to this?
>>
>
> As you noted, there's no reason to use thrift for sending the long string,
> but on receiving end you could have a service definition something like
> this:
>
> service ParsedService {
>  void type1( 1: Struct1 s ),
>  void type2( 2: Struct2 s ),
>  ...
> }
>
> Then when sending back the result of parsing/processing your string you'd
> just end up with a lot of calls like:
>
> ParsedService.type1(struct1);
> ParsedService.type2(struct2);
> ParsedService.type1(struct1);
>

Thanks, this illuminates Bryan's IRC note. How does a service manage
state in this case, or will that be self-evident if I read some
example services? Ooh, I see: you have reversed the client and the
server, with the server acting as a client. OK, this deserves some
thought.
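
To check my understanding of the reversed roles, here is a minimal
sketch in plain Python with no Thrift dependency. The names Struct1,
Struct2, ParsedServiceHandler, analyze and stream_results are
hypothetical stand-ins, not generated Thrift code, and the "analysis"
is a toy word splitter. The point is only that the side which sent the
text hosts the handler (and therefore holds the state), while the
analyzer pushes one small item at a time instead of building the whole
result list.

```python
from dataclasses import dataclass
from typing import Iterator, Union


@dataclass
class Struct1:
    """Hypothetical stand-in for a Thrift-generated struct."""
    token: str


@dataclass
class Struct2:
    """Hypothetical stand-in for a second result type."""
    length: int


class ParsedServiceHandler:
    """Lives on the side that sent the text; it acts as the server for results.

    State is kept on the handler instance, so it accumulates across calls.
    """

    def __init__(self) -> None:
        self.tokens_seen = 0
        self.total_length = 0

    def type1(self, s: Struct1) -> None:
        self.tokens_seen += 1

    def type2(self, s: Struct2) -> None:
        self.total_length += s.length


def analyze(text: str) -> Iterator[Union[Struct1, Struct2]]:
    """Toy analyzer: yields results lazily, one item at a time."""
    for word in text.split():
        yield Struct1(word)
        yield Struct2(len(word))


def stream_results(text: str, sink: ParsedServiceHandler) -> None:
    """The analyzer acts as the client: one call per result item."""
    for item in analyze(text):
        if isinstance(item, Struct1):
            sink.type1(item)
        else:
            sink.type2(item)


handler = ParsedServiceHandler()
stream_results("a potentially large blob of text", handler)
print(handler.tokens_seen, handler.total_length)
```

In a real deployment the sink would be a Thrift client stub talking to
a handler in the other process; here it is a direct method call so the
sketch stays runnable on its own.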



> (You could add some sort of identifier if you need to tie the original
> string back to the parts.) Because the rpc calls are void, you're not
> waiting for a response from the server and it's very much like streaming the
> results back. Anyway, that's just the first approach that comes to mind;
> no doubt there are others that may work even better.
>
> HTH,
> Joel
>
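
A note on the "not waiting" part, with a sketch. As far as I know, a
plain void method in Thrift still waits for an empty reply; it is the
`oneway` modifier in the IDL that makes a call fire-and-forget. The
snippet below imitates that behaviour in plain Python with a queue and
a worker thread, and tags each item with a request identifier so the
results can be tied back to the original string. Everything here (the
names consume, inbox, results, and the "doc-1" style ids) is
hypothetical and only illustrates the idea; it is not Thrift API. The
consumer groups items in memory purely so the outcome can be
inspected; a real consumer would process each item and drop it.

```python
import queue
import threading
from collections import defaultdict

_DONE = object()  # sentinel telling the consumer to stop


def consume(inbox, results):
    """Receive (request_id, item) messages and group them by request id."""
    while True:
        message = inbox.get()
        if message is _DONE:
            break
        request_id, item = message
        results[request_id].append(item)


inbox = queue.Queue()
results = defaultdict(list)
worker = threading.Thread(target=consume, args=(inbox, results))
worker.start()

# The sender only enqueues and moves on; it never waits for a reply.
inbox.put(("doc-1", "struct1"))
inbox.put(("doc-2", "struct1"))
inbox.put(("doc-1", "struct2"))
inbox.put(_DONE)

worker.join()  # wait for the consumer to drain the queue before inspecting
print(dict(results))
```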
