Make Thrift EventMachine Compatible
-----------------------------------

                 Key: THRIFT-146
                 URL: https://issues.apache.org/jira/browse/THRIFT-146
             Project: Thrift
          Issue Type: Improvement
          Components: Library (Ruby)
         Environment: Ruby and EventMachine
            Reporter: Ben Taitelbaum


I wrote a prototype EventMachine client/server for Thrift (BinaryProtocol), and 
while it's not fully functional, the results are so promising that I think it 
would be worth either including an EM client/server with the Thrift 
distribution, or at least reworking the libraries so that they can support EM 
without having to copy/paste all the code I did.

I'm attaching my prototype as an example. It requires the eventmachine and 
statemachine gems. It's wired to support a simple echo service. It's a little 
confusing in that I reused the same statemachine for both client and server. 
It also doesn't handle exceptions, and it has a bug where it'll handle all 
requests from one client before handling requests from another, but hey, it's 
a prototype.

EM uses very different semantics from a typical client/server, and this makes 
it difficult to wire it to the current Thrift protocols. For example, you can 
currently call prot.read(n) to read n bytes off a stream, blocking until they 
are available. With EM, if n bytes aren't available, you simply don't handle 
the call yet. I handled this by trying to read as many bytes as needed and 
keeping track of how far to back up in the stream, only transitioning to the 
next state in the statemachine if we were able to read all the bytes needed. 
This actually works fairly well, but it would be better if we had a byte count 
at each step so we could tell right away if we had enough data.
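
A minimal sketch of the read-or-back-up approach described above, in plain 
Ruby. RewindableBuffer and its methods are hypothetical names chosen for 
illustration; they are not part of Thrift, EventMachine, or the attached 
prototype. The idea is that a read either consumes everything it needs or 
rewinds to where it started, so the state machine only advances on success.

```ruby
# Hypothetical helper for illustration only.
class RewindableBuffer
  class NotEnoughData < StandardError; end

  def initialize
    @data = String.new(encoding: Encoding::BINARY)
    @pos = 0
  end

  # Append bytes as they arrive from the network.
  def <<(chunk)
    @data << chunk.b
    self
  end

  # Read exactly n bytes, or raise without consuming anything.
  def read(n)
    raise NotEnoughData if @data.bytesize - @pos < n
    bytes = @data.byteslice(@pos, n)
    @pos += n
    bytes
  end

  # Run the block against the buffer. If any read inside it runs out of
  # data, back up to the starting position and return nil so the caller
  # can stay in its current state. On success, drop the consumed bytes.
  def attempt
    start = @pos
    result = yield self
    @data = @data.byteslice(@pos, @data.bytesize - @pos)
    @pos = 0
    result
  rescue NotEnoughData
    @pos = start
    nil
  end
end

# Usage: a string encoded as a 4-byte big-endian length plus its bytes.
read_string = ->(b) { len = b.read(4).unpack1("N"); b.read(len) }

buf = RewindableBuffer.new
buf << "\x00\x00\x00\x05he"
first = buf.attempt(&read_string)   # nil: only part of the string arrived
buf << "llo"
second = buf.attempt(&read_string)  # "hello": all bytes were available
```

The cost of this design is that a partially available message gets re-parsed 
from the start each time more data arrives, which is why a byte count known up 
front would be preferable.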

The use of EM allows for new asynchronous semantics for Thrift calls, in that 
the client returns right away from a call, and its callback is called when a 
result (or exception) is available. Backwards compatibility could still be 
achieved by blocking on a call, but at least in my experience, in most cases 
having callbacks would be a huge optimization. There are some cases, though, 
where I do want to ensure either that calls happen sequentially or that they 
are at least handled in the order they were made.
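
To make the callback semantics concrete, here is a small sketch in plain Ruby 
with no EM dependency. AsyncEchoClient, echo, and handle_reply are hypothetical 
names, not an existing Thrift API. The sketch assumes each call is tagged with 
a sequence id so that a reply can be matched to its callback whenever it 
arrives.

```ruby
# Hypothetical client for illustration only.
class AsyncEchoClient
  # The block stands in for whatever actually writes the request to the
  # connection; it receives the sequence id and the message.
  def initialize(&send_request)
    @send_request = send_request
    @next_seqid = 0
    @pending = {}
  end

  # Returns immediately. The callback runs later, when a reply arrives.
  def echo(message, &callback)
    seqid = (@next_seqid += 1)
    @pending[seqid] = callback
    @send_request.call(seqid, message)
    seqid
  end

  # Called by the connection once a complete reply has been decoded.
  # Exactly one of result or error is expected to be non-nil.
  def handle_reply(seqid, result = nil, error = nil)
    callback = @pending.delete(seqid)
    callback&.call(result, error)
  end
end

# Usage: replies arrive out of order, and callbacks fire in arrival order.
sent = []
client = AsyncEchoClient.new { |seqid, msg| sent << [seqid, msg] }

results = []
client.echo("a") { |res, _err| results << res }
client.echo("b") { |res, _err| results << res }

client.handle_reply(2, "b")
client.handle_reply(1, "a")
# results is now ["b", "a"]
```

As the usage shows, nothing here guarantees that callbacks run in call order. 
A client that needs ordering would have to queue replies until earlier ones 
have been delivered, or block on each call.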

These are the changes I believe would have to be made in order to support an EM 
thrift connection:
1. syntax to indicate that certain methods can be called asynchronously with 
callbacks. Since async is already a reserved keyword, and is changing to 
noreturn, what about using async for this?
2. I would propose moving all reading and writing to the same class. So instead 
of having ThriftStruct#read, we would have Protocol#read_struct. It seems out 
of place to me to have a model handle its own reading / writing.
3. support for byte counts in the protocols. It would be helpful to have a 
clear way to get the number of bytes that must be read to retrieve message 
headers, arguments, and results. Alternatively, the protocols could support 
read_or_backup semantics, but as you can see in my prototype, this can get 
messy.
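
As an illustration of point 3, the sketch below shows how a known byte count 
removes the need to back up. It assumes a hypothetical wire format in which 
every message is preceded by a 4-byte big-endian length; FrameReader is a 
made-up name, and the receive_data method is only modeled on the shape of an 
EM connection handler, not taken from the prototype.

```ruby
# Hypothetical reader for illustration only.
class FrameReader
  HEADER_SIZE = 4 # assumed 4-byte big-endian length prefix

  # The block is called once per complete message body.
  def initialize(&on_frame)
    @buffer = String.new(encoding: Encoding::BINARY)
    @on_frame = on_frame
  end

  # Feed in whatever bytes have arrived. Because the length is known up
  # front, we can tell right away whether a whole message is available.
  def receive_data(chunk)
    @buffer << chunk.b
    loop do
      break if @buffer.bytesize < HEADER_SIZE
      size = @buffer.byteslice(0, HEADER_SIZE).unpack1("N")
      break if @buffer.bytesize < HEADER_SIZE + size
      frame = @buffer.byteslice(HEADER_SIZE, size)
      @buffer = @buffer.byteslice(HEADER_SIZE + size, @buffer.bytesize)
      @on_frame.call(frame)
    end
  end
end

# Usage: two messages split across chunks at an arbitrary point.
frames = []
reader = FrameReader.new { |frame| frames << frame }
reader.receive_data("\x00\x00\x00\x03ab")
reader.receive_data("c\x00\x00\x00\x02hi")
# frames is now ["abc", "hi"]
```

With this shape, the protocol code only ever sees complete messages, so the 
existing blocking-style read calls could run against an in-memory buffer 
without ever needing to wait.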

Please let me know what you think about this proposal, and the possibility of 
using EventMachine.

