Chad Walters wrote:
Re-open that discussion and I imagine you might get some interested parties.

I think I just did, no?

Bumping up a level, rather than inventing a whole new set of Hadoop-specific RPC and serialization mechanisms

Whatever we use, we'd probably end up recycling much of Hadoop's client/server implementation, since it's been finely tuned for Hadoop's performance needs, and I've not yet seen a Thrift transport that looks appropriate. We also need to add authentication and authorization layers to Hadoop's RPC, which don't exist in Thrift either, as far as I can tell. So mostly what we'd use from Thrift directly is object serialization.

That said, if we use Thrift for object serialization, then we'd probably eventually contribute our transport, authentication, and authorization stuff to the Thrift project. We'd probably want to build it first in Hadoop, since it's critical kernel stuff for Hadoop, but, once it's stable, contribute it to Thrift if it seems useful to others.

As a serialization layer, Thrift lacks the self-describing stuff that I think is critical. If JSON will be the primary text format, then it looks to me that it would be easier and more natural to base a binary self-describing format on JSON schema than on Thrift IDL, but perhaps I can be convinced otherwise.
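
To make "self-describing" concrete, here is a minimal sketch of the idea, not a proposal for an actual wire format: the data carries its own schema as a JSON header, followed by the records in binary, so a reader needs no generated code or external IDL to decode it. The layout, the schema shape, the type names, and the function names (write_container, read_container) are all assumptions made up for illustration; nothing here is an existing Hadoop or Thrift API.

```python
import io
import json
import struct

# Hypothetical mapping from schema type names to big-endian struct codes.
_FIXED = {"int": ">i", "long": ">q", "double": ">d"}


def _write_value(out, ftype, value):
    """Encode one field; strings are length-prefixed UTF-8."""
    if ftype == "string":
        data = value.encode("utf-8")
        out.write(struct.pack(">I", len(data)))
        out.write(data)
    else:
        out.write(struct.pack(_FIXED[ftype], value))


def _read_value(inp, ftype):
    """Decode one field using only the type name found in the schema."""
    if ftype == "string":
        (length,) = struct.unpack(">I", inp.read(4))
        return inp.read(length).decode("utf-8")
    fmt = _FIXED[ftype]
    (value,) = struct.unpack(fmt, inp.read(struct.calcsize(fmt)))
    return value


def write_container(out, schema, records):
    """Write the JSON schema as a length-prefixed header, then the records."""
    header = json.dumps(schema).encode("utf-8")
    out.write(struct.pack(">I", len(header)))
    out.write(header)
    out.write(struct.pack(">I", len(records)))
    for record in records:
        for field in schema["fields"]:
            _write_value(out, field["type"], record[field["name"]])


def read_container(inp):
    """Recover the schema from the stream itself, then decode the records."""
    (header_len,) = struct.unpack(">I", inp.read(4))
    schema = json.loads(inp.read(header_len).decode("utf-8"))
    (count,) = struct.unpack(">I", inp.read(4))
    records = [
        {f["name"]: _read_value(inp, f["type"]) for f in schema["fields"]}
        for _ in range(count)
    ]
    return schema, records


# Usage: the reader is given only the bytes, never the schema.
schema = {
    "name": "Edge",
    "fields": [
        {"name": "src", "type": "string"},
        {"name": "weight", "type": "double"},
        {"name": "hops", "type": "long"},
    ],
}
buf = io.BytesIO()
write_container(buf, schema, [{"src": "a", "weight": 1.5, "hops": 3}])
buf.seek(0)
decoded_schema, decoded_records = read_container(buf)
```

Because the schema is already JSON, the same description could drive both the text encoding and the binary one, which is the appeal of starting from JSON rather than from an IDL.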

Doug
