Replace forked HBase RPC with Hadoop RPC
----------------------------------------
Key: HBASE-2742
URL: https://issues.apache.org/jira/browse/HBASE-2742
Project: HBase
Issue Type: Improvement
Components: ipc
Reporter: Gary Helmling
The HBase RPC code (org.apache.hadoop.hbase.ipc.*) was originally forked from
the Hadoop RPC classes, with some performance tweaks added. Those optimizations
have made it costly to keep up with Hadoop RPC changes, however, including both
bug fixes and improvements/new features.
In particular, this impacts how we implement security features in HBase (see
HBASE-1697 and HBASE-2016). The secure Hadoop implementation (HADOOP-4487)
relies heavily on RPC changes to support client authentication via Kerberos and
to secure and mutually authenticate client/server connections via SASL.
Making use of the built-in Hadoop RPC classes will gain us these pieces for
free in a secure HBase.
So, I'm proposing that we drop the HBase forked version of RPC and convert to
direct use of Hadoop RPC, while working to contribute important fixes back
upstream to Hadoop core. Based on a review of the HBase RPC changes, the key
divergences seem to be:
HBaseClient:
- added use of TCP keepalive (HBASE-1754)
- made connection retries and sleep configurable (HBASE-1815)
- prevent NPE if socket == null due to creation failure (HBASE-2443)
HBaseRPC:
- mapping of method names <-> codes (removed in HBASE-2219)
HBaseServer:
- added use of TCP keepalive (HBASE-1754)
- OOME in server now triggers abort (HBASE-1198)
HbaseObjectWritable:
- allows List<> serialization
- includes its own class <-> code mapping (HBASE-328)
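To make the HBaseClient items above concrete, here is a minimal sketch of the
three behaviors (TCP keepalive, configurable retries and sleep, and a null
guard on the socket) using only java.net. This is not the actual HBaseClient
code: the class name KeepAliveConnector and its constructor parameters are
hypothetical stand-ins for whatever configuration the real client reads.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper, for illustration only; not part of HBase or Hadoop. */
final class KeepAliveConnector {
    private final int maxRetries;       // retries after the first attempt (assumed setting)
    private final long sleepMillis;     // pause between attempts (assumed setting)
    private final boolean tcpKeepAlive; // whether to enable SO_KEEPALIVE (assumed setting)

    KeepAliveConnector(int maxRetries, long sleepMillis, boolean tcpKeepAlive) {
        if (maxRetries < 0 || sleepMillis < 0) {
            throw new IllegalArgumentException("maxRetries and sleepMillis must be >= 0");
        }
        this.maxRetries = maxRetries;
        this.sleepMillis = sleepMillis;
        this.tcpKeepAlive = tcpKeepAlive;
    }

    /** Tries to connect up to maxRetries + 1 times, sleeping between failures. */
    Socket connect(InetSocketAddress addr, int timeoutMillis) throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Socket socket = null;
            try {
                socket = new Socket();
                socket.setKeepAlive(tcpKeepAlive); // in the spirit of HBASE-1754
                socket.connect(addr, timeoutMillis);
                return socket;
            } catch (IOException e) {
                last = e;
                closeQuietly(socket); // socket may be null if creation failed
            }
            if (attempt < maxRetries) {
                try {
                    Thread.sleep(sleepMillis); // configurable, in the spirit of HBASE-1815
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IOException("interrupted while waiting to retry", ie);
                }
            }
        }
        throw last; // non-null: the loop ran at least once without returning
    }

    /** Closes the socket if there is one; guards against null as in HBASE-2443. */
    static void closeQuietly(Socket socket) {
        if (socket == null) {
            return;
        }
        try {
            socket.close();
        } catch (IOException ignored) {
            // nothing useful to do on close failure
        }
    }
}
```

None of this is HBase specific, which is part of why these look like good
candidates to contribute upstream.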
Proposed process is:
1. open issues with patches on Hadoop core for important fixes/adjustments from
HBase RPC (HBASE-1198, HBASE-1815, HBASE-1754, HBASE-2443, plus a pluggable
ObjectWritable implementation in RPC.Invocation to allow use of
HbaseObjectWritable).
2. ship a Hadoop version with RPC patches applied -- ideally we should avoid
another copy-n-paste code fork, provided we can keep the changes from
affecting Hadoop's internal RPC wire formats
3. once all Hadoop core patches are applied upstream, we can drop back to a
plain vanilla Hadoop version
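As a rough illustration of what "pluggable" could mean in step 1, the sketch
below isolates one piece of the serialization: how a parameter's class is
identified on the wire. The ClassCodec interface and both implementations are
hypothetical and are not existing Hadoop or HBase APIs. One codec writes the
class name (roughly the ObjectWritable approach), the other writes a one-byte
code from a class <-> code table (roughly the HbaseObjectWritable approach).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical extension point: how a class is identified on the wire. */
interface ClassCodec {
    void writeClass(DataOutput out, Class<?> cls) throws IOException;
    Class<?> readClass(DataInput in) throws IOException;
}

/** Writes the fully qualified class name. */
final class NameClassCodec implements ClassCodec {
    public void writeClass(DataOutput out, Class<?> cls) throws IOException {
        out.writeUTF(cls.getName());
    }

    public Class<?> readClass(DataInput in) throws IOException {
        String name = in.readUTF();
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new IOException("unknown class " + name, e);
        }
    }
}

/** Writes a one-byte code looked up in a fixed class <-> code table. */
final class CodeClassCodec implements ClassCodec {
    private final Map<Class<?>, Byte> classToCode = new HashMap<>();
    private final Map<Byte, Class<?>> codeToClass = new HashMap<>();

    /** Registers a mapping; both sides of a connection need the same table. */
    CodeClassCodec register(int code, Class<?> cls) {
        if (code < Byte.MIN_VALUE || code > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("code does not fit in a byte: " + code);
        }
        Byte key = Byte.valueOf((byte) code);
        if (codeToClass.containsKey(key) || classToCode.containsKey(cls)) {
            throw new IllegalArgumentException("duplicate mapping for " + cls.getName());
        }
        codeToClass.put(key, cls);
        classToCode.put(cls, key);
        return this;
    }

    public void writeClass(DataOutput out, Class<?> cls) throws IOException {
        Byte code = classToCode.get(cls);
        if (code == null) {
            throw new IOException("no code registered for " + cls.getName());
        }
        out.writeByte(code.byteValue());
    }

    public Class<?> readClass(DataInput in) throws IOException {
        byte code = in.readByte();
        Class<?> cls = codeToClass.get(Byte.valueOf(code));
        if (cls == null) {
            throw new IOException("unknown class code " + code);
        }
        return cls;
    }
}

/** Small helpers to exercise a codec against in-memory buffers. */
final class CodecDemo {
    static byte[] encode(ClassCodec codec, Class<?> cls) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        codec.writeClass(new DataOutputStream(bytes), cls);
        return bytes.toByteArray();
    }

    static Class<?> decode(ClassCodec codec, byte[] data) throws IOException {
        return codec.readClass(new DataInputStream(new ByteArrayInputStream(data)));
    }
}
```

With a hook along these lines, the choice of codec would stay on the HBase
side, which is one way to keep Hadoop's internal RPC wire formats untouched.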
I realize there are many different opinions on how to proceed with HBase RPC,
so I'm hoping this issue will kick off a discussion on what the best approach
might be. My own motivation is maximizing re-use of the authentication and
connection security work that's already gone into Hadoop core. I'll put
together a set of patches around #1 and #2, but obviously we need some
consensus around this to move forward. If I'm missing other differences
between HBase and Hadoop RPC, please list them as well. Discuss!