[ 
https://issues.apache.org/jira/browse/HBASE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705388#action_12705388
 ] 

Bryan Duxbury commented on HBASE-794:
-------------------------------------

Wow, I guess I really should have been watching this issue. I'll try to address 
some of the points raised.

Returning null: Thrift methods can't return null *directly*, but they can 
return a non-null struct with none of its fields set, or a non-null struct with 
a flag set. This isn't anything new necessarily, but I should note that we do 
this all over the place at Rapleaf to get around this restriction. You 
definitely do not need to use exceptions to communicate "null". Moreover, using 
exceptions this way is probably worse than you think, as I *think* returning an 
exception causes the connection to close, at least in some libraries. Also, it 
might be possible to allow Thrift methods to return null _in general_, with 
only C++ unable to do so. If this is a do-or-die issue, please 
help us out by opening a ticket over on the Thrift JIRA so we can discuss 
solutions.
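
A minimal sketch of the "non-null struct with nothing set" approach, written 
as plain Java rather than Thrift-generated code. GetResult, NullWrapperDemo and 
their methods are hypothetical names for illustration only; a real generated 
struct would look different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical hand-written stand-in for a struct with one optional field.
final class GetResult {
    private byte[] value;      // the optional field
    private boolean valueSet;  // records whether the field was ever set

    static GetResult found(byte[] value) {
        GetResult r = new GetResult();
        r.value = value;
        r.valueSet = true;
        return r;
    }

    // The "null" answer: a non-null struct with none of its fields set.
    static GetResult notFound() {
        return new GetResult();
    }

    boolean isSetValue() {
        return valueSet;
    }

    byte[] getValue() {
        if (!valueSet) {
            throw new IllegalStateException("value is not set");
        }
        return value;
    }
}

final class NullWrapperDemo {
    // Hypothetical service method: always returns a struct, never null.
    static GetResult get(Map<String, byte[]> store, String key) {
        byte[] v = store.get(key);
        return v == null ? GetResult.notFound() : GetResult.found(v);
    }

    public static void main(String[] args) {
        Map<String, byte[]> store = new HashMap<>();
        store.put("row1", new byte[] {42});
        System.out.println("row1 set: " + get(store, "row1").isSetValue());
        System.out.println("missing set: " + get(store, "missing").isSetValue());
    }
}
```

The caller checks isSetValue() instead of comparing against null, so no 
exception is needed to signal a missing row.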

Thrift's Java RPC layer: I did in fact write a bunch of the server layer to use 
native Java NIO. This code lives in TNonblockingServer (single threaded) and 
THsHaServer (thread pool) respectively. Both server implementations also add 
some nice stuff like a fixed total read buffer size (to protect the server from 
overload). The code has been very robust in our use at Rapleaf so far, and I 
would recommend it on the strength of my experiences. 
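
To make the "fixed total read buffer size" idea concrete, here is a rough 
sketch of one way such a cap could work: a shared byte budget that every 
connection must reserve from before it allocates a buffer for an incoming 
message. This is not the Thrift server code; ReadBufferBudget and its methods 
are hypothetical names, and the compare-and-set loop is just one possible 
design.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical shared budget for read buffers across all connections.
final class ReadBufferBudget {
    private final long maxBytes;
    private final AtomicLong allocated = new AtomicLong();

    ReadBufferBudget(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Reserve room for one incoming message; false means "over budget,
    // do not allocate a buffer for it right now".
    boolean tryReserve(int messageSize) {
        while (true) {
            long current = allocated.get();
            long next = current + messageSize;
            if (next > maxBytes) {
                return false;
            }
            if (allocated.compareAndSet(current, next)) {
                return true;
            }
        }
    }

    // Give the bytes back once the message has been processed.
    void release(int messageSize) {
        allocated.addAndGet(-messageSize);
    }

    long allocatedBytes() {
        return allocated.get();
    }

    public static void main(String[] args) {
        ReadBufferBudget budget = new ReadBufferBudget(100);
        System.out.println(budget.tryReserve(60));  // fits within the cap
        System.out.println(budget.tryReserve(50));  // would exceed the cap
    }
}
```

A server using something like this would delay or reject reads when 
tryReserve fails, so a burst of large requests cannot exhaust the heap.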

Garbage/instantiation cost: Thrift objects are probably a little less 
memory-efficient than they need to be right now due to some slightly naive 
implementation decisions, but I've taken some steps toward reducing the 
overhead of an object. Additionally, you could probably reuse some instances of 
objects at the top level with almost no work. With a little work in the 
library, you could probably reuse most objects all the way down your instance's 
object tree, saving you memory. If you are interested in more detail on this 
point, shoot me an email and we can talk about it.
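
As a rough illustration of top-level instance reuse, the sketch below fills the 
same object for every incoming record instead of allocating a new one each 
time. CellRecord is a hypothetical stand-in for a generated struct, and the 
byte[][] "wire records" are an assumption made to keep the example 
self-contained.

```java
// Hypothetical mutable record standing in for a generated struct.
final class CellRecord {
    byte[] row;
    byte[] value;

    // Reset all fields so the instance can be filled again.
    void clear() {
        row = null;
        value = null;
    }
}

final class ReuseSketch {
    // Sums the value lengths of all records, loading each record into the
    // same CellRecord instance rather than creating one per record.
    static long totalValueBytes(byte[][][] wireRecords) {
        CellRecord cell = new CellRecord();  // allocated once, outside the loop
        long total = 0;
        for (byte[][] wire : wireRecords) {
            cell.clear();
            cell.row = wire[0];
            cell.value = wire[1];
            total += cell.value.length;
        }
        return total;
    }

    public static void main(String[] args) {
        byte[][][] records = {
            { {1}, {10, 11, 12} },
            { {2}, {20, 21} },
        };
        System.out.println(totalValueBytes(records));
    }
}
```

This only works if nothing holds on to the record after the loop body 
finishes, which is why reuse is easiest at the top level.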

Zero copy system: Right now, Thrift is not zero-copy. I think it would be very 
cool, though, to create the framework to make that happen. We'd probably only 
need to make a few transport interface changes. Maybe we should open a ticket?

Framed Transport: This is very effective at improving the performance of the 
Thrift IO stuff, especially if you're doing real IO without a buffer somewhere 
in between. It's also mandatory for using the nonblocking servers.
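
For anyone unfamiliar with framing, the idea is that each message goes on the 
wire as a 4-byte length followed by the payload, so the reader knows exactly 
how many bytes make up a complete message before handing it on. The sketch 
below shows that layout with plain java.io streams instead of the Thrift 
transport classes; FrameSketch and its methods are made-up names, and the 
maxFrameSize check is an assumption added for safety.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class FrameSketch {
    // Write one frame: 4-byte big-endian length, then the payload bytes.
    static void writeFrame(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length);
        out.write(payload);
        out.flush();
    }

    // Read one frame: the length first, then exactly that many bytes.
    static byte[] readFrame(DataInputStream in, int maxFrameSize) throws IOException {
        int length = in.readInt();
        if (length < 0 || length > maxFrameSize) {
            throw new IOException("bad frame length: " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload);
        return payload;
    }

    // Helper for the demo: frame a payload into memory and read it back.
    static byte[] roundTrip(byte[] payload) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            writeFrame(new DataOutputStream(wire), payload);
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
            return readFrame(in, 1 << 20);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] back = roundTrip(new byte[] {1, 2, 3});
        System.out.println("payload bytes read back: " + back.length);
    }
}
```

Because the length arrives first, a nonblocking server can keep accumulating 
bytes for a connection and only dispatch once the whole frame is in memory.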

Custom protocols: Certainly, if you wanted to, you could write your own Thrift 
protocol. However, I would say this defeats the purpose of Thrift, which is to 
give you a respectable cross-platform library out of the box. Further, protobuf as a 
Thrift protocol has been proposed before, and the two systems are not trivially 
compatible.

"Raw" RPC: If your goal is to avoid deserializing some stuff, Chad has 
previously suggested having the ability to specify that you don't want certain 
fields deserialized. I don't know if this is your objective. If your keys and 
values are actually just byte arrays on either side, then there isn't any 
serialization to speak of, beyond the byte[] copy off the wire. I could imagine 
doing something to make this a non-copy operation, though. (See comment above 
on zero-copy architecture.)

I think Andrew's idea of making a simulator is a great one. Otherwise it's 
going to mean a ton of work and a subjective evaluation. 

I also want to say that there are few things I would like to improve as much as 
Thrift performance. Thrift is a cornerstone at Rapleaf, so anything we can do 
to make it faster is a big win. I am eager and willing to work with anyone who 
can show me use cases that identify slowness in Thrift so that I can erase the 
problem. 

> Language neutral IPC as a first class component of HBase architecture
> ---------------------------------------------------------------------
>
>                 Key: HBASE-794
>                 URL: https://issues.apache.org/jira/browse/HBASE-794
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, ipc, master, regionserver
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Minor
>
> This issue considers making a language neutral IPC mechanism and wire format 
> a first class component of HBase architecture. Clients could talk to the 
> master and regionserver using this protocol instead of HRPC at their option.
> Options for language neutral IPC include:
> * Thrift: http://incubator.apache.org/thrift/
> * Protocol buffers: http://code.google.com/p/protobuf/
> * XDR: http://en.wikipedia.org/wiki/External_Data_Representation

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
