Today I encountered a case where it would be useful if Thrift provided
native support for streaming RPC.  To be sure, everything I needed to do I
could implement (albeit with difficulty) using the existing infrastructure,
but I feel it could be made much better by extending the IDL.

Please forgive me if this idea has already been discussed here.

*Background:*

Sometimes a service needs to perform a long-running operation that
periodically yields output.  Rather than waiting for the full result to be
processed, it would be nice if clients could get a head start on the
intermediate output.  This is particularly important when the results are
intended to be presented to an end user.

A simple way to accomplish this trick is to allow services to write multiple
messages to the protocol.


*Proposed IDL extensions:*

* Option 1: New *stream* keyword admitted as a method return type.

struct Response
{
    1: required string item;
}

service Example
{
    stream<Response> Method()
}

With this approach, the method can be conceived of as returning a stream of
zero or more responses.  This is particularly appealing because it leverages
familiar patterns of iteration.

The server side implementation can be conveniently written as an iterator
method.  Here's a C# example.  Similar syntax exists in many popular
languages but it is not essential for the API.

public IEnumerable<Response> Method()
{
    yield return new Response() { item = "a" };
    // expensive work
    yield return new Response() { item = "b" };
}

Likewise the client side can be modeled as a consumer of an iterator.

public void Consumer()
{
    IEnumerable<Response> responses = client.Method();
    foreach (Response response in responses)
    {
        Console.WriteLine(response.item);
    }
}

* Option 2: New *streams* keyword admitted as an auxiliary clause similar to
throws.

struct Item
{
    1: required string name
}

struct Response
{
    1: required string summary
}

service Example
{
    Response Method() streams(1: Item item)
}

With this approach, the method has both a standard return value and can
stream items on the side.  It could potentially be set up to allow a method
to provide multiple streams.

The server side implementation is straightforward.  The method just needs to
be provided with an object that can publish to the stream in addition to
whatever arguments it receives.

public Response Method(TStream<Item> stream)
{
    stream.Send(new Item() { name = "a" });
    // expensive work
    stream.Send(new Item() { name = "b" });
    return new Response() { summary = "2 items" };
}

The client side is not so nice.  Because a method could have multiple
distinct streams, we can't leverage the iterator pattern.  Here's a
synchronous implementation using a callback.

public void Consumer()
{
    TStreamCallback<Item> callback = (item) => Console.WriteLine(item.name);
    Response response = client.Method(callback);
}


*Proposed protocol extension:*

To make this work, the server needs to be able to send multiple stream
messages before the final reply.

One way to do that is to add a new TMessageType value called Stream that
indicates that the TMessage contains a streaming response.  When the client
receives a message of type Stream, it makes the content available to the
consumer and then (conceptually) loops again until it sees a final message
of type Reply.
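
To make the receive loop concrete, here is a rough sketch in Python (chosen
only for brevity; nothing here is existing Thrift API).  The first four
message type values match Thrift's current TMessageType; STREAM = 5, the
Message tuple, fake_server and receive are all hypothetical names invented
for illustration.

```python
from enum import IntEnum
from typing import Callable, Iterable, Iterator, NamedTuple


class MessageType(IntEnum):
    # The first four values match Thrift's existing TMessageType.
    CALL = 1
    REPLY = 2
    EXCEPTION = 3
    ONEWAY = 4
    # Hypothetical new value proposed in this message; not part of Thrift.
    STREAM = 5


class Message(NamedTuple):
    type: MessageType
    name: str        # method name
    payload: object  # stands in for the serialized struct


def fake_server(method: str) -> Iterator[Message]:
    """Stand-in for the wire: two streamed items, then the final reply."""
    yield Message(MessageType.STREAM, method, "a")
    # expensive work would happen here
    yield Message(MessageType.STREAM, method, "b")
    yield Message(MessageType.REPLY, method, "2 items")


def receive(messages: Iterable[Message],
            on_item: Callable[[object], None]) -> object:
    """Hand each Stream payload to on_item until the final Reply arrives."""
    for msg in messages:
        if msg.type == MessageType.STREAM:
            on_item(msg.payload)
        elif msg.type == MessageType.REPLY:
            return msg.payload
        else:
            # A real client would decode EXCEPTION here; the sketch just fails.
            raise RuntimeError(f"unexpected message type: {msg.type.name}")
    raise RuntimeError("connection closed before the final reply")


items = []
summary = receive(fake_server("Method"), items.append)
print(items, summary)
```

The callback sees each item as soon as its message is read, and the return
value of receive is whatever the final Reply carries.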

The final reply message might only indicate success / failure if the stream
is empty or if everything of interest has already been streamed by the time
the method returns.
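
For the Option 1 style, where the reply carries nothing but success or
failure, the client could expose the messages as a plain iterator.  Again a
Python sketch under assumptions: the message kinds are simple strings and
iterate_stream is a made-up helper, not anything Thrift generates.

```python
from typing import Iterable, Iterator, Tuple

# Message kinds as plain strings for brevity; "stream" is the hypothetical
# new type, "reply" and "exception" mirror the existing ones.
Message = Tuple[str, object]


def iterate_stream(messages: Iterable[Message]) -> Iterator[object]:
    """Yield streamed items; the final reply only marks success or failure."""
    for kind, payload in messages:
        if kind == "stream":
            yield payload
        elif kind == "reply":
            return  # success: nothing left to deliver, the iterator ends
        elif kind == "exception":
            raise RuntimeError(str(payload))  # failure surfaces mid-iteration
        else:
            raise RuntimeError(f"unexpected message kind: {kind}")
    raise RuntimeError("connection closed before the final reply")


wire = [("stream", "a"), ("stream", "b"), ("reply", None)]
received = list(iterate_stream(wire))
print(received)
```

An empty stream is then just a reply with no preceding stream messages, and
a failure shows up as an exception raised while the consumer is iterating.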

Adding a new TMessageType is a non-invasive change that won't break existing
clients or servers unless they use streaming methods.

Jeff.
