Hi folks,

With Yahoo's latest security release on github (
http://github.com/yahoo/hadoop-common/tree/yahoo-hadoop-0.20.104), it looks
like we now have a real-world usable version of secure Hadoop, based on
0.20.  This is exciting stuff, because now we have something solid to start
working towards implementing similar security controls in HBase (HBASE-1697,
HBASE-2014, HBASE-2016, HBASE-2420)!

However, this is going to be a large undertaking, with a strong dependency
on the secure Hadoop branch (more on that in a bit --  unfortunately the
fragmented hadoop-0.20 world is already leaking through).  So I'd like to
propose a feature branch in the HBase svn repo for security work, to:

1) ensure that changes towards implementing secure HBase have an ASF home
2) provide more visibility and granularity for review (esp. JIRA &
reviewboard usage)
3) ease interaction/integration with other branched changes underway (master
rewrite)

I've already started pushing some preliminary changes up to github (
http://github.com/ghelmling/hbase/tree/security), and will continue to do
so, but I'd like to avoid both massive patch sets accumulating too many
changes and making interested committers & contributors go digging to see
what the current state is.

On the secure Hadoop branch dependency -- I've integrated the
org.apache.hadoop.ipc changes into o.a.h.hbase.ipc.* (HBASE-2742) and run
into a couple of complications:

* Hadoop RPC version rolled from 3 to 4 (apparently 0.20-append also does
this!)
* various bits in the updated HBaseClient, HBaseServer, etc. now depend on
the security implementation, so building and running on top of non-secure
Hadoop will not be possible.
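
To make the first complication concrete, here's a minimal sketch of why a
version roll breaks cross-branch compatibility. It assumes the connection
preamble is a short magic string followed by a one-byte version that the
server compares for exact equality. The class name, method names, and the
"hrpc" magic value are illustrative assumptions, not the actual Hadoop or
HBase code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RpcVersionCheck {
    // Assumed magic bytes for illustration only.
    static final byte[] MAGIC = "hrpc".getBytes(StandardCharsets.US_ASCII);

    // Hypothetical client side: magic bytes followed by a one-byte version.
    static byte[] clientPreamble(int clientVersion) {
        byte[] preamble = Arrays.copyOf(MAGIC, MAGIC.length + 1);
        preamble[MAGIC.length] = (byte) clientVersion;
        return preamble;
    }

    // Hypothetical server side: accept only an exact version match.
    static boolean serverAccepts(byte[] preamble, int serverVersion) {
        if (preamble.length != MAGIC.length + 1) {
            return false;
        }
        boolean magicOk =
            Arrays.equals(Arrays.copyOf(preamble, MAGIC.length), MAGIC);
        return magicOk && preamble[MAGIC.length] == (byte) serverVersion;
    }

    public static void main(String[] args) {
        // Client built against version 3, server on version 4.
        System.out.println(serverAccepts(clientPreamble(3), 4)); // false
        // Client and server both on version 4.
        System.out.println(serverAccepts(clientPreamble(4), 4)); // true
    }
}
```

With an exact-match check like this, a client built against one branch
can't talk to a server built against another once the version differs,
which is the incompatibility I'm worried about.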

I'd like to post the diff on review.hbase.org for more review and feedback,
but that raises the question of where the changes should go.

Longer term, I think we need to dump Hadoop RPC (AVRO-405 seems promising
for this) so that HBase internals aren't so intertwined with Hadoop
implementation details, but that's its own large-scale project, which we
shouldn't couple to security.

So, to sum up, thoughts on:

a) creating a "security" feature branch in svn?
b) RPC-related changes, specifically the incompatibility across Hadoop
branches due to the version increment and the Hadoop security dependencies?

Thanks,
Gary
