Replace forked HBase RPC with Hadoop RPC
----------------------------------------

                 Key: HBASE-2742
                 URL: https://issues.apache.org/jira/browse/HBASE-2742
             Project: HBase
          Issue Type: Improvement
          Components: ipc
            Reporter: Gary Helmling


The HBase RPC code (org.apache.hadoop.hbase.ipc.*) was originally forked from
the Hadoop RPC classes, with some performance tweaks added.  Those
optimizations have made it costly to keep up with Hadoop RPC changes, however,
both bug fixes and improvements/new features.

In particular, this impacts how we implement security features in HBase (see
HBASE-1697 and HBASE-2016).  The secure Hadoop implementation (HADOOP-4487)
relies heavily on RPC changes to support client authentication via Kerberos,
and to secure and mutually authenticate client/server connections via SASL.
Using the built-in Hadoop RPC classes would give us these pieces for free in a
secure HBase.

So, I'm proposing that we drop the HBase forked version of RPC and convert to 
direct use of Hadoop RPC, while working to contribute important fixes back 
upstream to Hadoop core.  Based on a review of the HBase RPC changes, the key 
divergences seem to be:

HBaseClient:
 - added use of TCP keepalive (HBASE-1754)
 - made connection retries and sleep configurable (HBASE-1815)
 - prevent NPE if socket == null due to creation failure (HBASE-2443)
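
For illustration only, the three client-side changes above could look roughly
like the sketch below.  This is not the HBaseClient code: the class name
RetryingConnector, its constructor parameters, and closeQuietly() are all
hypothetical, and in practice the retry count and sleep would come from
configuration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper, not the actual HBaseClient implementation. */
final class RetryingConnector {
    private final int maxRetries;    // assumed to be read from configuration
    private final long sleepMillis;  // assumed to be read from configuration
    private final boolean keepAlive; // whether to enable TCP keepalive

    RetryingConnector(int maxRetries, long sleepMillis, boolean keepAlive) {
        this.maxRetries = Math.max(0, maxRetries);
        this.sleepMillis = Math.max(0L, sleepMillis);
        this.keepAlive = keepAlive;
    }

    /** Tries to connect, sleeping between attempts, up to maxRetries retries. */
    Socket connect(InetSocketAddress addr, int timeoutMillis) throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Socket socket = null;
            try {
                socket = new Socket();
                socket.setKeepAlive(keepAlive);
                socket.connect(addr, timeoutMillis);
                return socket;
            } catch (IOException e) {
                last = e;
                closeQuietly(socket); // socket may still be null here
                if (attempt < maxRetries) {
                    try {
                        Thread.sleep(sleepMillis);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw new IOException("Interrupted while retrying", ie);
                    }
                }
            }
        }
        throw last;
    }

    /** Null-safe close, so a failed socket creation cannot cause an NPE. */
    static void closeQuietly(Socket socket) {
        if (socket == null) {
            return;
        }
        try {
            socket.close();
        } catch (IOException ignored) {
            // nothing useful to do on a failed close
        }
    }
}
```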

HBaseRPC:
 - mapping of method names <-> codes (removed in HBASE-2219)

HBaseServer:
 - added use of TCP keepalive (HBASE-1754)
 - fix for OOME in server not triggering abort (HBASE-1198)
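
As a rough sketch of the abort-on-OOME idea, assuming the server exposes some
kind of abort hook: the names AbortingHandler, AbortHook and runCall below are
made up for illustration and are not taken from HBaseServer.

```java
/** Hypothetical sketch, not the actual HBaseServer handler code. */
final class AbortingHandler {

    /** Assumed callback that shuts the server down. */
    interface AbortHook {
        void abort(String why, Throwable cause);
    }

    /**
     * Runs one call. An OutOfMemoryError is routed to the abort hook instead
     * of being swallowed. Returns true if the call completed normally.
     */
    static boolean runCall(Runnable call, AbortHook server) {
        try {
            call.run();
            return true;
        } catch (OutOfMemoryError e) {
            server.abort("OutOfMemoryError in RPC handler", e);
            return false;
        }
    }
}
```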

HbaseObjectWritable:
 - allows List<> serialization
 - includes its own class <-> code mapping (HBASE-328)
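
To make the two points above concrete, here is a minimal sketch of a
class <-> code mapping combined with List serialization.  It is not
HbaseObjectWritable itself: the class name CodedObjectWriter, the code values,
and the set of supported types are assumptions chosen for the example.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not the actual HbaseObjectWritable. */
final class CodedObjectWriter {
    private static final Map<Class<?>, Byte> CLASS_TO_CODE = new HashMap<>();
    private static final Map<Byte, Class<?>> CODE_TO_CLASS = new HashMap<>();

    static {
        // Illustrative codes only; a one-byte code replaces the class name on the wire.
        register(String.class, 1);
        register(Integer.class, 2);
        register(List.class, 3);
    }

    private static void register(Class<?> type, int code) {
        CLASS_TO_CODE.put(type, (byte) code);
        CODE_TO_CLASS.put((byte) code, type);
    }

    /** Writes the type code, then the value; lists are written recursively. */
    static void writeObject(DataOutput out, Object value) throws IOException {
        if (value instanceof List) {
            List<?> list = (List<?>) value;
            out.writeByte(CLASS_TO_CODE.get(List.class));
            out.writeInt(list.size());
            for (Object element : list) {
                writeObject(out, element);
            }
        } else if (value instanceof String) {
            out.writeByte(CLASS_TO_CODE.get(String.class));
            out.writeUTF((String) value);
        } else if (value instanceof Integer) {
            out.writeByte(CLASS_TO_CODE.get(Integer.class));
            out.writeInt((Integer) value);
        } else {
            throw new IOException("No code registered for value: " + value);
        }
    }

    /** Reads the type code and dispatches on the mapped class. */
    static Object readObject(DataInput in) throws IOException {
        byte code = in.readByte();
        Class<?> type = CODE_TO_CLASS.get(code);
        if (type == null) {
            throw new IOException("Unknown type code: " + code);
        }
        if (type == List.class) {
            int size = in.readInt();
            if (size < 0) {
                throw new IOException("Negative list size: " + size);
            }
            List<Object> list = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                list.add(readObject(in));
            }
            return list;
        }
        if (type == String.class) {
            return in.readUTF();
        }
        return in.readInt();
    }
}
```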


Proposed process is:

1. open issues with patches on Hadoop core for important fixes/adjustments from 
HBase RPC (HBASE-1198, HBASE-1815, HBASE-1754, HBASE-2443, plus a pluggable 
ObjectWritable implementation in RPC.Invocation to allow use of 
HbaseObjectWritable).

2. ship a Hadoop version with RPC patches applied -- ideally we should avoid
another copy-n-paste code fork, provided we can keep the changes from
affecting Hadoop's internal RPC wire formats

3. if all Hadoop core patches are applied upstream, we can drop back to a
plain vanilla Hadoop version


I realize there are many different opinions on how to proceed with HBase RPC, 
so I'm hoping this issue will kick off a discussion on what the best approach 
might be.  My own motivation is maximizing re-use of the authentication and 
connection security work that's already gone into Hadoop core.  I'll put 
together a set of patches around #1 and #2, but obviously we need some 
consensus around this to move forward.  If I'm missing other differences
between HBase and Hadoop RPC, please list them as well.  Discuss!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.