[ https://issues.apache.org/jira/browse/HBASE-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879800#action_12879800 ]

Gary Helmling commented on HBASE-2742:
--------------------------------------

On the issues listed:
HBASE-2443 looks to be already present in Hadoop trunk.  HBASE-1198 (OOME) is 
really about the RS not getting notified of an OOME happening in the RPC 
layer.  Swallowing the error seems like bad practice to me, and the fix is 
important for us, but it may not be an issue for the Hadoop daemons, or they 
may handle it differently.  The other two are not applied, but are maybe not 
high priority for Hadoop core (rough sketches of all three follow the list):
* HBASE-1815: configurable max retries and sleep for connection timeouts
* HBASE-1754: allow use of TCP keep alive
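
For reference, the client side of those two is small.  A rough sketch of what 
they amount to -- the config key names here are illustrative placeholders, not 
the actual keys from the patches:

{code:java}
// Sketch of the two client-side tweaks (key names are placeholders).
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

import org.apache.hadoop.conf.Configuration;

public class ClientConnectSketch {
  public static Socket connect(Configuration conf, InetSocketAddress addr)
      throws IOException, InterruptedException {
    // HBASE-1815: max retries and sleep come from config, not hard-coded.
    int maxRetries = conf.getInt("ipc.client.connect.max.retries", 5);
    long sleepMs = conf.getLong("ipc.client.connect.retry.sleep.ms", 1000L);

    IOException lastFailure = null;
    for (int attempt = 0; attempt < maxRetries; attempt++) {
      Socket s = new Socket();
      try {
        s.setKeepAlive(true);   // HBASE-1754: enable TCP keepalive
        s.connect(addr, 20000); // 20s connect timeout
        return s;
      } catch (IOException ioe) {
        try { s.close(); } catch (IOException ignored) { }
        lastFailure = ioe;
        Thread.sleep(sleepMs);
      }
    }
    if (lastFailure == null) {
      lastFailure = new IOException("retries exhausted before any attempt");
    }
    throw lastFailure;
  }
}
{code}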
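
And to make the HBASE-1198 point concrete: the danger is a handler thread 
catching the OOME, logging it, and carrying on, so the RS never learns the 
process is in trouble.  A hypothetical shape for the fix -- the listener 
interface is invented here for illustration, not the actual server code:

{code:java}
// Hypothetical: surface a handler OOME to the region server instead of
// swallowing it.  The listener interface is made up for this sketch.
public class HandlerOomSketch {
  /** Hook the region server registers to hear about handler OOMEs. */
  public interface OomListener {
    void onOutOfMemory(OutOfMemoryError e);
  }

  private final OomListener listener;

  public HandlerOomSketch(OomListener listener) {
    this.listener = listener;
  }

  /** Runs one RPC call; any OOME is reported before being rethrown. */
  void runOneCall(Runnable call) {
    try {
      call.run();
    } catch (OutOfMemoryError oome) {
      if (listener != null) {
        listener.onOutOfMemory(oome); // e.g. RS aborts cleanly
      }
      throw oome; // never swallow it
    }
  }
}
{code}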

@Todd
Understood about the long timeline to get any of these changes committed.  
Based on that and the discussion on IRC, maybe a reasonable alternative would 
be to just update our copy-paste fork with the security-enabled code.  I can 
still file the issues against core, since they seem generally useful; in the 
future, either core integrates them or we move to our own alternate RPC 
(Avro), either way potentially allowing us to drop the fork later on.

@Stack
I'm generally pro an alternate RPC for HBase-internal purposes.  But I'm 
hesitant to make the security implementation (already a big project) depend on 
replacing the internal RPC with Avro (another big project).  I just envision 
too many complications with one bleeding into the other.  I also think that 
piggy-backing on the client user authentication and SASL setup code already in 
the secure Hadoop RPC will give us a good head start on a secure HBase.
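
For what it's worth, the SASL piece we'd be reusing boils down to the standard 
javax.security.sasl handshake.  Roughly the following -- the principal parts 
are placeholders, and the real Hadoop code runs this under the logged-in 
user's Subject:

{code:java}
// Rough sketch of the Kerberos (GSSAPI) SASL client setup that secure
// Hadoop RPC performs; arguments are placeholders.
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

public class SaslHandshakeSketch {
  public static SaslClient newKerberosClient(String servicePart,
      String serverHost) throws SaslException {
    // servicePart + "/" + serverHost names the server's Kerberos
    // principal, e.g. "hbase" + "/" + "rs1.example.com"
    return Sasl.createSaslClient(
        new String[] { "GSSAPI" }, // Kerberos mechanism
        null,                      // no separate authorization id
        servicePart,               // protocol part of server principal
        serverHost,                // host part of server principal
        null,                      // mechanism properties
        null);                     // GSSAPI needs no callback handler
  }
  // Connection setup then loops evaluateChallenge()/response over the wire
  // until isComplete() is true; the stream can afterwards be wrapped for
  // integrity/confidentiality if that was negotiated.
}
{code}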

In any case, for a full security implementation, we will need hooks into the 
new Avro connector, Thrift, Stargate/REST, etc. in order to secure all 
endpoints.  But it seems wiser to leave those as separate tasks and off the 
critical path.  And work done there could then be used to enable an Avro-based 
internal RPC.

For the HBase authentication implementation, the plan for the first stage is to 
have HBase authenticate clients and perform access control, but have HBase 
interact with HDFS as a single server principal -- so all of /hbase is owned by 
an "hbase" user, say.  There are some issues to work out there with bulk 
loading mechanisms, but it's much simpler than passing ownership/credentials 
all the way down to HDFS.  We've discussed a future stage that maps users to 
file ownership for isolation all the way down, but it's not clear we will wind 
up going there, or that doing so even provides better security.  So passing 
client credentials through to the NN may not be necessary at all.
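
Concretely, the single-principal model just means one keytab login at server 
startup, using the UserGroupInformation keytab login from secure Hadoop.  
Something like the sketch below, where the config key names are ones I'm 
making up for illustration:

{code:java}
// Sketch of the single server principal model: the region server logs in
// once from a keytab, and every HDFS operation it performs afterwards
// (flushes, compactions, splits) runs as the "hbase" user, so /hbase can
// simply be owned by hbase on HDFS.  Config key names are made up.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class ServerLoginSketch {
  public static void loginAsHBase(Configuration conf) throws IOException {
    String principal = conf.get("hbase.regionserver.kerberos.principal");
    String keytab = conf.get("hbase.regionserver.keytab.file");
    // After this call the process-wide login user is the hbase service
    // principal; all FileSystem access happens as that user.
    UserGroupInformation.loginUserFromKeytab(principal, keytab);
  }
}
{code}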

Based on all of the above, I guess I'm leaning towards updating our copy-paste 
fork as part of the security implementation, at least as a first step.  The 
original proposal to ship a patched Hadoop seems a non-starter if the changes 
won't be quickly integrated and we want to support multiple versions.

> Replace forked HBase RPC with Hadoop RPC
> ----------------------------------------
>
>                 Key: HBASE-2742
>                 URL: https://issues.apache.org/jira/browse/HBASE-2742
>             Project: HBase
>          Issue Type: Improvement
>          Components: ipc
>            Reporter: Gary Helmling
>
> The HBase RPC code (org.apache.hadoop.hbase.ipc.*) was originally forked off 
> of the Hadoop RPC classes, with some performance tweaks added.  Those 
> optimizations have come at a cost, however, in keeping up with Hadoop RPC 
> changes -- both bug fixes and improvements/new features.  
> In particular, this impacts how we implement security features in HBase (see 
> HBASE-1697 and HBASE-2016).  The secure Hadoop implementation (HADOOP-4487) 
> relies heavily on RPC changes to support client authentication via Kerberos, 
> plus securing and mutual authentication of client/server connections via 
> SASL.  Making use of the built-in Hadoop RPC classes will gain us these 
> pieces for free in a secure HBase.
> So, I'm proposing that we drop the HBase forked version of RPC and convert to 
> direct use of Hadoop RPC, while working to contribute important fixes back 
> upstream to Hadoop core.  Based on a review of the HBase RPC changes, the key 
> divergences seem to be:
> HBaseClient:
>  - added use of TCP keepalive (HBASE-1754)
>  - made connection retries and sleep configurable (HBASE-1815)
>  - prevent NPE if socket == null due to creation failure (HBASE-2443)
> HBaseRPC:
>  - mapping of method names <-> codes (removed in HBASE-2219)
> HBaseServer:
>  - use of TCP keep alives (HBASE-1754)
>  - OOME in server does not trigger abort (HBASE-1198)
> HbaseObjectWritable:
>  - allows List<> serialization
>  - includes its own class <-> code mapping (HBASE-328)
> Proposed process is:
> 1. open issues with patches on Hadoop core for important fixes/adjustments 
> from HBase RPC (HBASE-1198, HBASE-1815, HBASE-1754, HBASE-2443, plus a 
> pluggable ObjectWritable implementation in RPC.Invocation to allow use of 
> HbaseObjectWritable).
> 2. ship a Hadoop version with RPC patches applied -- ideally we should avoid 
> another copy-n-paste code fork, subject to our ability to keep the changes 
> from impacting Hadoop's internal RPC wire formats
> 3. if all Hadoop core patches are applied we can drop back to a plain vanilla 
> Hadoop version
> I realize there are many different opinions on how to proceed with HBase RPC, 
> so I'm hoping this issue will kick off a discussion on what the best approach 
> might be.  My own motivation is maximizing re-use of the authentication and 
> connection security work that's already gone into Hadoop core.  I'll put 
> together a set of patches around #1 and #2, but obviously we need some 
> consensus around this to move forward.  If I'm missing other differences 
> between HBase and Hadoop RPC, please list them as well.  Discuss!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
