I'm actually considering using Thrift in a similar way: as a fast, cross-language serialization-and-transport mechanism between a bunch of different apps in a pub/sub architecture.

There are a number of possible message types -- 100? -- and it won't always be possible for each app to know which message types the others support, so I'd like to avoid each app having an RPC service for each possible message type; I'd rather hand them objects and let them just ignore the ones they don't care about.

I see a few different ways to do this:

1. Define only one struct with all possible fields for all possible messages, and a "type" field that lets you figure out what it is. It seems kinda stupid to do this, since one of the major reasons I'm interested in Thrift is type-awareness.

2. Modify the TService layer so that RPC arguments aren't statically typed: make it possible to declare an RPC call that accepts any struct. Feels like (void*), and probably also irritates purists who like the simple no-object-inheritance, no-function-overloading model Thrift uses today.

3. Build something custom at the TProcessor layer and skip TService altogether.

Both #2 and #3 require changing some guts. I think that would go something like this:

* Add a TMessageType (T_STRUCT?) that indicates "I'm sending you data, not calling a function!" Or is that what T_ONEWAY is for?
* Modify TBinaryProtocol so that writeStructBegin() and writeStructEnd() aren't no-ops -- otherwise the receiver doesn't know what he's receiving!
* Implement a TProcessor that can read the struct type, instantiate it, and do some sort of dispatch to the client app (rough sketch below).
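To make that last point concrete, here's a rough, untested Java sketch of the kind of dispatching TProcessor I'm imagining. The handler interface and registry are made-up names, and it assumes each incoming message frames exactly one struct whose type is carried in the message name -- none of this is existing Thrift machinery beyond TProcessor/TProtocol/TBase themselves:

import java.util.HashMap;
import java.util.Map;

import org.apache.thrift.TBase;
import org.apache.thrift.TException;
import org.apache.thrift.TProcessor;
import org.apache.thrift.protocol.TMessage;
import org.apache.thrift.protocol.TProtocol;

// Rough sketch only: dispatches incoming messages to the app by struct type,
// using the Thrift message name as the type key.
public class DispatchingProcessor implements TProcessor {

  // Hypothetical callback the app implements to receive whatever structs it cares about.
  public interface MessageHandler {
    void handle(TBase message);
  }

  // Maps a message name (e.g. "PositionUpdate") to the struct class to instantiate.
  private final Map<String, Class<? extends TBase>> registry =
      new HashMap<String, Class<? extends TBase>>();
  private final MessageHandler handler;

  public DispatchingProcessor(MessageHandler handler) {
    this.handler = handler;
  }

  public void register(String messageName, Class<? extends TBase> structClass) {
    registry.put(messageName, structClass);
  }

  public boolean process(TProtocol in, TProtocol out) throws TException {
    TMessage msg = in.readMessageBegin();              // message name tells us which struct follows
    Class<? extends TBase> structClass = registry.get(msg.name);
    if (structClass == null) {
      // Unknown type: a real implementation would need to skip the payload here.
      in.readMessageEnd();
      return true;
    }
    try {
      TBase struct = structClass.newInstance();        // instantiate the right struct...
      struct.read(in);                                 // ...and let it deserialize itself
      in.readMessageEnd();
      handler.handle(struct);                          // the app ignores types it doesn't care about
    } catch (InstantiationException e) {
      throw new TException("could not instantiate struct for " + msg.name);
    } catch (IllegalAccessException e) {
      throw new TException("could not instantiate struct for " + msg.name);
    }
    return true;
  }
}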

Any thoughts on this?  Has someone else already solved this?

Will


On Apr 3, 2009, at 6:22 PM, Brian Hammond wrote:

That's neat, Joel. However, does this scale? I mean, the underlying assumption here is that clients are using persistent connections to the service, and you [not so simply] are sending messages back to the client over that same connection. Thus, your service now has to handle a potentially large number of client connections. Unless you're using something like libev[ent] I don't see this scaling beyond, say, 20K connections. Two caveats: I could be missing something here, and this level of scalability is probably just fine for *many* types of services (perhaps not for a chat server though)!

I'm curious to hear other people's thoughts on this and how it could be made scalable, since, well, I'm planning on using polling in my project: I'm expecting a potentially very large number of simultaneous users of the service, and my servers can only handle so many connections.

Thanks for sharing this.

Brian

On Apr 3, 2009, at 7:50 PM, Joel Meyer wrote:

On Thu, Apr 2, 2009 at 4:17 PM, Joel Meyer <[email protected]> wrote:

On Tue, Mar 24, 2009 at 5:01 PM, Doug Daniels <[email protected]> wrote:

Ok, I definitely plan on giving the Async RPC methods a try tonight, but I figured I'd just throw out some questions before I get home to start hacking on this stuff.

The one-to-one message-to-RPC-call async solution will let a client send messages of any given type in my defined protocol, but how would a server respond to a client with a message that the client didn't request? For example, say I was trying to write an FPS like Quake and I want the server to send position updates for all clients to all clients: how would I model that as a client RPC request? With the async RPC solutions I could make an RPC call for Map<Integer, Position> getPositionUpdates(). Now say the client needs to request 50 other message types to be notified of. I guess the solution would be to make an async RPC call requesting those updates, respond when I receive the result asynchronously, and then reissue another async RPC call for the next set of updates. It just seems inefficient to make the client actively request data when the server could implicitly know that, when connected on this game protocol, it can just send these messages to the clients without them asking for it. Not to mention you'd have to make sure you don't "miss" sending a client a message if they finished their async call but haven't reestablished a new one.


I think I've done something similar to what you're trying to do, and as long as you can commit to using only async messages it's possible to pull it off without having to start a server on the client to accept RPCs from the server.

When your RPC is marked as async, the server doesn't send a response and the client doesn't try to read one. So, if all your RPC calls from the client to the server are async, you have effectively freed up the inbound half of the socket connection. That means you can use it for receiving async messages from the server - the only catch is that you have to start a new thread to read and dispatch the incoming async RPC calls.
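If it helps, that reader thread on the client ends up looking roughly like this (untested sketch; MyService and the handler are placeholders for whatever service you've defined):

import org.apache.thrift.TException;
import org.apache.thrift.TProcessor;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TTransport;

// Runs on the client, reading async RPCs the server pushes down the same socket
// the client uses for its own outgoing async calls.
public class IncomingCallReader implements Runnable {
  private final TProcessor processor;   // e.g. new MyService.Processor(new ClientSideHandler())
  private final TTransport transport;   // the same open transport your MyService.Client writes to

  public IncomingCallReader(TProcessor processor, TTransport transport) {
    this.processor = processor;
    this.transport = transport;
  }

  public void run() {
    TProtocol protocol = new TBinaryProtocol(transport);
    try {
      // process() blocks until a message arrives and dispatches it to the handler;
      // it throws when the connection drops, which is how you find out immediately
      // that the server went away.
      while (processor.process(protocol, protocol)) {
        // keep reading
      }
    } catch (TException e) {
      // connection closed or protocol error; clean up or reconnect as appropriate
    }
  }
}

You'd kick it off with something like new Thread(new IncomingCallReader(new MyService.Processor(handler), transport)).start() once the connection is open.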

In a typical Thrift RPC system you'd create a MyService.Processor on your server and a MyService.Client on your client. To do bidirectional async message sending you'll need to go a step further and create a MyService.Client on your server for each client that connects (this can be accomplished by providing your own TProcessorFactory), and then on each client you create a MyService.Processor. (This assumes that you've gone with a generic MyService definition like you described above that has a bunch of optional messages; another option would be to define separate service definitions for the client and server.) With two clients connected, the objects in existence would look something like this (a sketch of the TProcessorFactory piece follows the listing):

Server:
MyService.Processor mainProcessor - handles incoming async RPCs
MyService.Client clientA - used to send outgoing async RPCs to ClientA
MyService.Client clientB - used to send outgoing async RPCs to ClientB

ClientA:
MyService.Client - used to send messages to Server
MyService.Processor clientProcessor - used (by a separate thread) to process incoming async RPCs

ClientB:
MyService.Client - used to send messages to Server
MyService.Processor clientProcessor - used (by a separate thread) to process incoming async RPCs

Hopefully that explains the concept. If you need example code I can try and pull something together (it will be in Java). The nice thing about this method is that you don't have to establish two connections, so you can get around the firewall issues others have mentioned. I've been using this method on a service in production and haven't had any problems. When you have a separate thread in your client running a Processor you're basically blocking on a read, waiting for a message from the server. The benefit of this is that you're notified immediately when the server shuts down, instead of having to wait until you send a message and then finding out that the TCP connection was reset.

Cheers,
Joel


Thanks for the feedback. I've created a simple example in Java demonstrating this in action:
http://www.joelpm.com/wp-content/uploads/2009/04/bidimessages.tgz

Post with a few details on the implementation:
http://www.joelpm.com/2009/04/03/thrift-bidirectional-async-rpc/

Please add me to the list of people who think there's value in a full async transport that provides (optional?) synchronization at the API level using futures/deferreds/etc.

Cheers,
Joel





The biggest issue is that not all client requests will result in a single response (shooting a bullet, for example, may blow up an entity and damage all players in the area; those events are separate messages sent from the respective entities).

At a game development studio I used to work at, we developed a cross-language IDL network protocol definition (C++, Java) very similar to Protocol Buffers and Thrift, without some of the more mature features like protocol versioning or being transport agnostic (we explicitly built it for binary TCP socket transport). The stream of packets contained, as the first 32 bits, a message ID that keyed into a map of Message classes, each with methods to read that message type in from a byte[] stream.
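The dispatch loop was conceptually something like this (names are made up here, just to show the shape of it):

import java.io.DataInputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Illustration of that old protocol's dispatch: the leading 32-bit message ID
// selects a reader that knows how to decode the rest of that message type.
public class MessageIdDispatcher {

  public interface MessageReader {
    Object read(DataInputStream in) throws IOException;
  }

  private final Map<Integer, MessageReader> readers = new HashMap<Integer, MessageReader>();

  public void register(int messageId, MessageReader reader) {
    readers.put(messageId, reader);
  }

  public Object readNext(DataInputStream in) throws IOException {
    int messageId = in.readInt();                 // first 32 bits of every packet
    MessageReader reader = readers.get(messageId);
    if (reader == null) {
      throw new IOException("unknown message ID: " + messageId);
    }
    return reader.read(in);                       // type-specific deserialization
  }
}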

Looking through the Thrift code, in TBinaryProtocol's writeMessageBegin it looks like it includes the name of the message being sent and its type (is the concept of a Message in Thrift the same as an RPC call?). If so, what's the corresponding code pathway for the client waiting for an RPC response? If I could just use this message name or type to key into what I need to deserialize off the network on both the client and server end, that would be perfect.



On Tue, Mar 24, 2009 at 1:51 PM, Ted Dunning <[email protected]> wrote:

I really think that using async service methods which are matched one-to-one with the message types that you want to send gives you exactly the semantics that are being requested, with very simple implementation cost.

It is important to not get toooo hung up on what RPC stands for. I use async methods all the time to stream data structures for logging and it works great. Moreover, it provides a really simple way of building extractors and processors for this data, since I have an interface sitting there that will tell me about all of the methods (data types) that I need to handle or explicitly ignore.

So the trick works and works really well.  Give it a try!
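For example, with a (hypothetical) service that declares one async method per message type, every consumer gets an interface listing exactly the data types in play, and handling or ignoring one is just a method body:

// Assuming a hypothetical service along the lines of:
//
//   service GameEvents {
//     async void positionUpdate(1: PositionUpdate update),
//     async void bulletFired(1: BulletFired event),
//     // ... one method per message type
//   }
//
// a consumer that only cares about positions just ignores the rest.
public class PositionLogger implements GameEvents.Iface {

  public void positionUpdate(PositionUpdate update) {
    // the one message type this consumer handles
    System.out.println("position update: " + update);
  }

  public void bulletFired(BulletFired event) {
    // explicitly ignored by this consumer
  }
}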

On Tue, Mar 24, 2009 at 8:23 AM, Bryan Duxbury <[email protected]> wrote:

Optional fields are not serialized onto the wire. There is a slight performance penalty at serialization time if you have a ton of unset fields, but that's it.

Am I overcomplicating things?


Personally, sounds like it to me. Why do you need this streaming behavior or whatnot? Hotwiring the RPC stack to let you send any message you want is going to be a ton of work and not really that much of a functionality improvement.
-Bryan




--
Ted Dunning, CTO
DeepDyve





