Sounds great!

Put a patch into:
https://issues.apache.org/jira/browse/AVRO-539
(We may rename that and modify the description to better describe your change; 
but it clearly fulfills the intention of the original JIRA)

If you'd like you can also put it in ReviewBoard, with a link to that inside 
the ticket.  But until there is a patch uploaded in JIRA granting license to 
Apache it can't be considered for committing.

We can discuss remaining details in the ticket.

Thanks!

On 6/1/11 10:22 PM, "James Baldassari" 
<[email protected]<mailto:[email protected]>> wrote:

I just finished a second attempt at the asynchronous RPC implementation 
incorporating Philip's feedback and some other ideas that I had.  I think it's 
easiest to explain how it works with an example.  So here's a simple IDL and 
schema:

IDL:
protocol Calculator {
  int add(int arg1, int arg2);
}

Schema:
{"protocol":"Calculator","messages":{
  "add":{
    "request":[{"name":"arg1","type":"int"},{"name":"arg2","type":"int"}],
    "response":"int"}}}

No changes are required to the IDL or schema to enable async RPCs.  The Avro 
Java compiler will generate two interfaces instead of one.  The first 
interface, Calculator, contains the standard synchronous methods.  The second 
interface, CalculatorClient, extends Calculator and adds asynchronous methods 
for all two-way messages.  The reason why the async methods are separated out 
into a separate interface is that the responder/server side doesn't need to 
know (and shouldn't know) about the client-side async methods.  So the 
Responder/server implements Calculator, and the Requestor/client can either use 
Calculator or CalculatorClient to invoke the RPCs.  For reference, here is what 
the two generated interfaces look like (without the PROTOCOL field and package 
names):

public interface Calculator {
  int add(int arg1, int arg2) throws AvroRemoteException;
}
public interface CalculatorClient extends Calculator {
  CallFuture<Integer> addAsync(int arg1, int arg2) throws IOException;
  CallFuture<Integer> addAsync(int arg1, int arg2, Callback<Integer> callback) 
throws IOException;
}

The CalculatorClient interface is the only new component.  It has two methods 
for each message, one that takes a Callback and one that does not.  Both 
methods return a CallFuture so that the client has the option of using either 
the Future or the Callback to obtain the result of the RPC.  Future.get() 
blocks until the RPC is complete, and either returns the result or throws an 
exception if one occurred during the RPC.  The Callback interface has two 
methods, handleResult(T result) and handleError(Exception error).  One or the 
other is always called depending on whether the RPC was successful or an 
Exception was thrown.

In addition to the compiler changes, I had to make some changes in the avro-ipc 
project to get the async plumbing to work correctly.  Most of these changes are 
in Requestor and NettyTransceiver.  As part of the changes I had to make to 
Requestor I ended up replacing a couple of large synchronized blocks with 
finer-grained critical sections protected by reentrant locks.  I think this 
change improved performance overall, at least in the case where multiple 
threads are using the same client.  I implemented a rudimentary performance 
test that spins up a bunch of threads, executes the same RPC 
(Simple.hello(String)) repeatedly for a fixed amount of time, and then 
calculates the average number of RPCs completed per second.  With Avro 1.5.1 I 
got 7,450 RPCs/sec, and with my modified version of trunk I got 19,050 
RPCs/sec.  That was a very simple test, but if there is a standard benchmark 
that the Avro team uses I'd be happy to rerun my tests using that.

So that's basically it.  All existing unit tests pass, and I wrote additional 
tests for all the new async functionality.  I've documented all public 
interfaces, and I think the changes are ready to be reviewed if any of the 
committers have time to take a look.  Should I post a patch somewhere?  
AVRO-539?  ReviewBoard?

-James


On Tue, May 31, 2011 at 9:09 PM, James Baldassari 
<[email protected]<mailto:[email protected]>> wrote:
Thanks for the helpful feedback!

After thinking about this more, I agree that it would be cleaner and simpler to 
remove the "async" keyword/property from the IDL and schema.  Instead I'll just 
generate the asynchronous companion methods for all two-way messages.

Regarding the passing of RPC results and exceptions/errors back to the client 
asynchronously, I'm not sure what the best approach is.  I had considered the 
use of both the future pattern and the callback pattern, but I think the 
callback pattern is more useful in general.  One option I had considered was 
having the async methods accept a Callback parameter and also return a Future 
so that the client could choose which pattern to use on a case-by-case basis.  
I think I may go with the combined callback/future approach as it provides 
clients with the most flexibility.

For Futures, error handling is straightforward: the get() method either returns 
the result of the callback or throws the exception if one occurred.  The 
correct solution for the callback approach is less obvious, but I'll tell you 
the approach that I took, and I'm open to comments.  The Callback interface has 
two methods, one for handling a successful RPC result, and one for handling 
exceptions that occur during RPC execution.  For each RPC, either one or the 
other will be called, but never both.  The methods look like this:

void handleCallback(T result);
void handleError(Exception error);

I'll work on making the changes you suggested, and hopefully I'll have a patch 
that's in decent shape by the end of the week.

-James



On Tue, May 31, 2011 at 4:40 PM, Philip Zeyliger 
<[email protected]<mailto:[email protected]>> wrote:
Hi James,

I think this is a fine approach.  You're correct that the place to do it is in 
the code generator.  It's not obvious to me that it belongs in the schema, 
however, since you might have two different pieces of software that want to use 
the same RPCs differently.  Is there any harm in always generating both types?

As for your API, you'll want to specify very explicitly what happens to the 
exceptions that an Avro RPC call may declare.  Future<V> is one mechanism.  As 
is your Callback<Foo> mechanism, if there's a way to get at exceptional states.


On Tue, May 31, 2011 at 12:08 AM, James Baldassari 
<[email protected]<mailto:[email protected]>> wrote:
Hi,

I recently started playing with Avro RPCs, and while it's great that Avro can 
use Netty for asynchronous I/O, all client-facing Java RPC interfaces are 
currently synchronous (as far as I can tell).  It would be nice to have 
asynchronous RPCs that use callbacks to take full advantage of Netty's 
asynchronous features.  I found at least one other request for this feature on 
the Avro list (http://bit.ly/iCD0ae), and I believe this enhancement is already 
documented as AVRO-539.

I took it on as a weekend project to add async Java RPCs to Avro, and I think I 
have it all working now including unit tests.  I'd love to contribute this 
patch once I've gotten some feedback, cleaned up the documentation, and written 
a few more tests.  I'll give a quick example demonstrating this new 
functionality.  Consider the following IDL and its associated schema which use 
the asynchronous feature:

IDL:
protocol Calculator {
  int add(int arg1, int arg2) async;
}

Schema:
{"protocol":"Calculator","messages":{
  "add":{
    "request":[{"name":"arg1","type":"int"},{"name":"arg2","type":"int"}],
    "response":"int",
    "async":true}}}

When the "async" keyword/property is present in a message, the Avro Java 
compiler generates two methods instead of one: the standard synchronous method 
and an asynchronous companion method.  For the example I gave above, the 
following interface would be generated (the static PROTOCOL field and package 
names are omitted for brevity):

public interface Calculator {
  int add(int arg1, int arg2) throws AvroRemoteException;
  void addAsync(int arg1, int arg2, Callback<Integer> callback) throws 
IOException;
}

The addAsync method returns immediately without blocking (except one special 
case which I'll describe shortly).  The callback provided as the last argument 
to addAsync will be invoked with the result of the RPC as soon as the server 
returns it.  There are a couple of caveats.  The first RPC, whether synchronous 
or asynchronous, must block until the client-server handshake has been 
completed.  Subsequent async RPCs will never block.  Also, only the 
NettyTransceiver currently works with async RPCs; all other transceivers throw 
UnsupportedOperationException.

So that's the basic overview of my changes.  Please let me know if there are 
any comments or questions.  Is this something people are interested in?  If so, 
should I post the patch in AVRO-539 or somewhere else?

Thanks,
James




Reply via email to