This sounds interesting, though I also agree with David that staying simple has 
its benefits.

Given the complexity of this feature, I have somewhat low confidence that this 
would get consistently implemented across all of Thrift's language targets. 
Hence my gut leans toward seeing it as an extension package or be built on top 
of Thrift.

We've let a fair number of one-language-only things into the language libraries 
themselves, but I feel we should generally a little more protective over which 
features make it into the IDL itself and hold a pretty strict standard about 
"works-in-all-languages" there.

Definitely worth exploring though and would be interesting to see what the IDL 
implementation looks like.

One issue with the proposed design is that it seems somewhat inherently limited 
to blocking I/O, since the data transport semantics get mixed in with the 
application data access semantics via the Iterator construct. This might be 
fine, as a nonblocking interface for this functionality is likely to get pretty 
unwieldy, but it means that this wouldn't ever really be able to work properly 
with something like the C++ TNonBlockingServer. Doing so requires coordination 
of control flow between the core server loop and the application code, two bits 
that Thrift has specifically aimed to abstract away from each other.

Cheers,
Mark

-----Original Message-----
From: David Reiss [mailto:dre...@facebook.com] 
Sent: Sunday, February 28, 2010 9:29 AM
To: thrift-dev@incubator.apache.org
Subject: Re: Streaming responses

Thanks for the well-written proposal.  If we were to implement this,
I think you have described the way we would do it.

My main objection to implementing streaming RPC is a fear of feature creep.
Keeping Thrift small(ish) makes it a lot easier to make sure that all of
the language implementations have good support for all features.

What to other committers think?

--David

Jeff Brown wrote:
> Today I encountered a case where it would be useful if Thrift provided
> native support for streaming RPC.  To be sure, everything I needed to do I
> could implement (albeit with difficult) using the existing infrastructure
> but I feel it could be made much better by extending the IDL.
> 
> Please forgive me if this idea has already been discussed here.
> 
> *Background:
> *
> Sometimes a service needs to perform a long-running operation that
> periodically yields output.  Rather than waiting for the full result to be
> processed, it would be nice if clients can get a head start on the
> intermediate output.  This is particularly important when the results are
> intended to be presented to an end user.
> 
> A simple way to accomplish this trick is to allow services to write multiple
> messages to the protocol.
> 
> 
> *Proposed IDL extensions:
> *
> * Option 1: New *stream* keyword admitted as a method return type.
> 
> *struct Response
> {
>     1: required string item;
> }
> 
> service Example
> {
>     stream<Response> Method()
> }
> *
> With this approach, the method can be conceived of as returning a stream of
> zero or more responses.  This is particularly appealing because it leverages
> familiar patterns of iteration.
> 
> The server side implementation can be conveniently written as an iterator
> method.  Here's a C# example.  Similar syntax exists in many popular
> languages but it is not essential for the API.
> 
> public IEnumerable<Response> Method()
> {
>     yield return new Response() { Item = "a" };
>     // expensive work
>     yield return new Response() { Item = "b" };
> }
> 
> Likewise the client side can be modeled as a consumer of an iterator.
> 
> public void Consumer()
> {
>     IEnumerable<Response> responses = client.Method();
>     foreach (Response response in responses)
>     {
>         Console.WriteLine(response.item);
>     }
> }
> 
> * Option 2: New *streams* keyword admitted as an auxiliary clause similar to
> throws.
> 
> *struct Item
> {
>     1: required string name
> }
> 
> struct Response
> {
>     1: required string summary
> }
> 
> service Example
> {
>     Response Method() streams(1: Item item)
> }
> *
> With this approach, the method has both a standard return value and can
> stream items on the side.  It could potentially be set up to allow a method
> to provide multiple streams.
> 
> The server side implementation is straightforward.  The method just needs to
> be provided with an object that can publish to the stream in addition to
> whatever arguments it receives.
> 
> public Response Method(TStream<Item> stream)
> {
>     stream.Send(new Item() { name = "a" });
>     // expensive work
>     stream.Send(new Item() { name = "b" });
>     return new Reponse() { summary = "2 items" });
> }
> 
> The client side is not so nice since we can't leverage the iterator pattern
> because we could have multiple distinct streams.  Here's a synchronous
> implementation using a callback.
> 
> public void Consumer()
> {
>     TStreamCallback<Item> callback = (item) => Console.WriteLine(item.name);
>     Response response = client.Method(callback);
> }
> 
> 
> *Proposed protocol extension:*
> 
> To make this work, the server needs to be able to send multiple stream
> messages before the final reply.
> 
> One way to do that is to add a new TMessageType value called Stream that
> indicates that the TMessage contains a streaming response.  When the client
> receives a message of type Stream, it makes the content available to the
> consumer and then (conceptually) loops again until it sees a final message
> of type Reply.
> 
> The final reply message might only indicate success / failure if the stream
> is empty or if everything of interest has already been streamed by the time
> the method returns.
> 
> Adding a new TMessageType is a non-invasive change that won't break existing
> clients or servers unless they use streaming methods.
> 
> Jeff.

Reply via email to