[
https://issues.apache.org/jira/browse/HBASE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705388#action_12705388
]
Bryan Duxbury commented on HBASE-794:
-------------------------------------
Wow, I guess I really should have been watching this issue. I'll try to address
some of the points raised.
Returning null: Thrift methods can't return null *directly*, but they can
return a non-null struct with none of its fields set, or a non-null struct with
a flag set. This isn't anything new necessarily, but I should note that we do
this all over the place at Rapleaf to get around this restriction. You
definitely do not need to use exceptions to communicate "null". Moreover, using
exceptions this way is probably worse than you think, as I *think* throwing an
exception causes the connection to close, at least in some libraries. Also, it
might be possible to allow Thrift methods to return null _in general_, with
only C++ being unable to do so. If this is a do-or-die issue, please
help us out by opening a ticket over on the Thrift JIRA so we can discuss
solutions.
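As a rough sketch of the "non-null struct with none of its fields set" approach
described above: the classes below are hypothetical, hand-written stand-ins for
what a Thrift-generated result struct and a service handler might look like.
CellResult, CellStore and their methods are assumptions made for illustration,
not Thrift or HBase API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a generated result struct with one optional field.
final class CellResult {
    private byte[] value; // left unset to signal "no such cell"

    static CellResult found(byte[] value) {
        CellResult result = new CellResult();
        result.value = value;
        return result;
    }

    static CellResult missing() {
        return new CellResult(); // non-null struct, no fields set
    }

    boolean isSetValue() {
        return value != null;
    }

    byte[] getValue() {
        return value;
    }
}

// Hypothetical handler: a miss is reported through the struct,
// never through a null return or an exception.
final class CellStore {
    private final Map<String, byte[]> cells = new HashMap<>();

    void put(String key, byte[] value) {
        cells.put(key, value);
    }

    CellResult get(String key) {
        byte[] value = cells.get(key);
        return value == null ? CellResult.missing() : CellResult.found(value);
    }
}
```

A caller checks isSetValue() before reading the value, so "not found" costs
one ordinary response and nothing more.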
Thrift's Java RPC layer: I did in fact write a bunch of the server layer to use
native Java NIO. This code lives in TNonblockingServer (single-threaded) and
THsHaServer (thread pool). Both server implementations also add some nice
features, such as a fixed cap on total read buffer size (to protect the server
from overload). The code has been very robust in our use at Rapleaf so far, and
I would recommend it on the strength of that experience.
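To illustrate the idea of a fixed cap on total read buffer size, here is a
minimal sketch of a shared byte budget. It is not the code inside
TNonblockingServer; ReadBufferBudget and its methods are hypothetical names, and
the policy shown (refuse the reservation when the cap would be exceeded) is an
assumption about how such a guard could work.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical guard: tracks bytes currently held in read buffers
// across all connections and refuses reservations beyond a fixed cap.
final class ReadBufferBudget {
    private final long maxBytes;
    private final AtomicLong inUse = new AtomicLong();

    ReadBufferBudget(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Returns true and records the reservation if it fits under the cap.
    boolean tryReserve(long frameBytes) {
        while (true) {
            long current = inUse.get();
            if (current + frameBytes > maxBytes) {
                return false; // caller should defer reading this frame
            }
            if (inUse.compareAndSet(current, current + frameBytes)) {
                return true;
            }
        }
    }

    // Called once the frame has been handled and its buffer dropped.
    void release(long frameBytes) {
        inUse.addAndGet(-frameBytes);
    }

    long bytesInUse() {
        return inUse.get();
    }
}
```

A server loop would call tryReserve() with the announced frame size before
allocating a buffer, and release() after the request has been processed.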
Garbage/instantiation cost: Thrift objects are probably a little less memory
efficient than they need to be right now due to some slightly naive
implementation decisions, but I've taken some steps toward reducing the
per-object overhead. Additionally, you could probably reuse some object
instances at the top level with almost no work. With a little work in the
library, you could probably reuse most objects all the way down your instance's
object tree, saving you memory. If you are interested in pursuing this point,
shoot me an email and we can talk about it in more detail.
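A minimal sketch of top-level instance reuse, under the assumption that the
caller finishes with each record before asking for the next one. RowRecord and
RecordReader are hypothetical classes written for this example; they are not
Thrift-generated code.

```java
// Hypothetical record type with a reset method so one instance can be reused.
final class RowRecord {
    byte[] key;
    long timestamp;

    void clear() {
        key = null;
        timestamp = 0L;
    }
}

// Hypothetical reader that fills the same top-level instance on every call
// instead of allocating a new record per request.
final class RecordReader {
    private final RowRecord scratch = new RowRecord();

    RowRecord readInto(byte[] key, long timestamp) {
        scratch.clear();
        scratch.key = key;
        scratch.timestamp = timestamp;
        return scratch; // only valid until the next readInto() call
    }
}
```

The trade-off is that the returned object must not be retained across calls;
anything that needs to outlive the next read has to be copied out first.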
Zero copy system: Right now, Thrift is not zero-copy. I think it would be very
cool, though, to create the framework to make that happen. We'd probably only
need to make a few transport interface changes. Maybe we should open a ticket?
Framed Transport: This is very effective at improving the performance of
Thrift I/O, especially if you're doing real I/O without a buffer somewhere
in between. It's also mandatory for using the nonblocking servers.
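For readers unfamiliar with framing, the sketch below shows the general
technique: each message is written as a 4-byte big-endian length followed by
the payload, so the reader knows up front how many bytes to expect. FrameCodec
is a hypothetical helper built on plain java.io streams, and the 1 MiB limit in
roundTrip() is an arbitrary assumption; this is not Thrift's own framed
transport class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class FrameCodec {
    // Writes one frame: length prefix first, then the payload bytes.
    static void writeFrame(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length); // 4 bytes, big-endian
        out.write(payload);
        out.flush();
    }

    // Reads one frame, rejecting lengths outside [0, maxFrameSize].
    static byte[] readFrame(DataInputStream in, int maxFrameSize) throws IOException {
        int length = in.readInt();
        if (length < 0 || length > maxFrameSize) {
            throw new IOException("bad frame length: " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload); // blocks until the whole frame has arrived
        return payload;
    }

    // In-memory round trip, useful for trying the codec without a socket.
    static byte[] roundTrip(byte[] payload) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            writeFrame(new DataOutputStream(wire), payload);
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
            return readFrame(in, 1 << 20);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Knowing the frame length in advance is what lets a nonblocking server
accumulate a complete request before handing it to a worker.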
Custom protocols: Certainly, if you wanted to, you could write your own Thrift
protocol. However, I would say this defeats the purpose of Thrift, which is to
give you a respectable cross-platform library out of the box. Further, protobuf
as a Thrift protocol has been proposed before, and the two systems are not
trivially compatible.
"Raw" RPC: If your goal is to avoid deserializing some stuff, Chad has
previously suggested having the ability to specify that you don't want certain
fields deserialized. I don't know if this is your objective. If your keys and
values are actually just byte arrays on either side, then there isn't any
serialization to speak of, beyond the byte[] copy off the wire. I could imagine
doing something to make this a non-copy operation, though. (See comment above
on zero-copy architecture.)
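To make the difference between the byte[] copy and a non-copy read concrete,
here is a small sketch using java.nio.ByteBuffer. WireSlices and its two
methods are hypothetical; the example only contrasts the two techniques and
does not reflect how Thrift transports are implemented.

```java
import java.nio.ByteBuffer;

final class WireSlices {
    // Copying read: allocates a fresh array and copies the bytes into it.
    static byte[] copyOut(ByteBuffer wire, int offset, int length) {
        byte[] out = new byte[length];
        ByteBuffer view = wire.duplicate(); // leave the caller's position alone
        view.position(offset);
        view.get(out, 0, length);
        return out;
    }

    // Non-copying read: returns a view that shares the wire buffer's memory.
    static ByteBuffer sliceOut(ByteBuffer wire, int offset, int length) {
        ByteBuffer view = wire.duplicate();
        view.limit(offset + length);
        view.position(offset);
        return view.slice();
    }
}
```

The slice avoids the allocation and copy, but it pins the underlying buffer:
the wire buffer cannot be recycled while any view of it is still in use.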
I think Andrew's idea of making a simulator is a great one. Otherwise it's
going to mean a ton of work and a subjective evaluation.
I also want to say that there are few things I would like to improve as much as
Thrift performance. Thrift is a cornerstone at Rapleaf, so anything we can do
to make it faster is a big win. I am eager and willing to work with anyone who
can show me use cases that identify slowness in Thrift so that I can erase the
problem.
> Language neutral IPC as a first class component of HBase architecture
> ---------------------------------------------------------------------
>
> Key: HBASE-794
> URL: https://issues.apache.org/jira/browse/HBASE-794
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client, ipc, master, regionserver
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Minor
>
> This issue considers making a language neutral IPC mechanism and wire format
> a first class component of HBase architecture. Clients could talk to the
> master and regionserver using this protocol instead of HRPC at their option.
> Options for language neutral IPC include:
> * Thrift: http://incubator.apache.org/thrift/
> * Protocol buffers: http://code.google.com/p/protobuf/
> * XDR: http://en.wikipedia.org/wiki/External_Data_Representation