Hi Gary and Ryan,

Thank you for your replies. Based on your comments, I would rather not take on
the RegionServer (RS) issues for now. I think the best way for me to go about
this is to implement the encryption at the HDFS layer. Because time is short,
access control and authentication are not on my list right now. However,
another person on my team is working on these issues. I will direct him to the
links in this thread.


I will let you know if I come up with something interesting. 

Regards,
Preetam



From: Gary Helmling <[email protected]>
To: [email protected]; Preetam Joshi <[email protected]>
Cc: 
Sent: Tuesday, November 16, 2010 7:03:22 PM
Subject: Re: Hbase security: Encryption of Data before storage on physical disk


Hi Preetam,

A group of us at Trend Micro have been working on adding some security features 
to HBase.  We've been focused on authentication and access control though, as 
opposed to data encryption.

HBase sits on top of HDFS and relies on it as the underlying filesystem.  HBase
implements its own file format (HFile) for its data files, but just writes
them out to a directory tree in HDFS.

So one approach would be to encrypt the underlying data files at the HDFS
layer.  I believe you could do this by implementing a CompressionCodec that
does the encryption.  I found this mailing list thread discussing the same
topic:
http://www.mail-archive.com/[email protected]/msg06222.html
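
To make the codec idea a bit more concrete, here is a minimal sketch of the
stream wrapping such a codec would need to do.  It uses only the JDK and is not
an actual Hadoop CompressionCodec: the class name EncryptingStreams and its
methods are hypothetical, and the choice of AES in CTR mode is an assumption
made for illustration.  A real codec would do the equivalent wrapping in its
createOutputStream and createInputStream methods, and would also have to solve
key and IV handling, which this sketch ignores.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, JDK only; not part of Hadoop or HBase.
final class EncryptingStreams {
    // Assumption: AES in CTR mode. It keeps the ciphertext the same length as
    // the plaintext, but gives no integrity protection, and an IV must never
    // be reused with the same key.
    private static final String TRANSFORMATION = "AES/CTR/NoPadding";

    private static Cipher cipher(int mode, byte[] key, byte[] iv)
            throws GeneralSecurityException {
        Cipher c = Cipher.getInstance(TRANSFORMATION);
        c.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return c;
    }

    // Everything written to the returned stream reaches 'raw' encrypted.
    static OutputStream wrapForWrite(OutputStream raw, byte[] key, byte[] iv)
            throws GeneralSecurityException {
        return new CipherOutputStream(raw, cipher(Cipher.ENCRYPT_MODE, key, iv));
    }

    // Everything read from the returned stream is decrypted from 'raw'.
    static InputStream wrapForRead(InputStream raw, byte[] key, byte[] iv)
            throws GeneralSecurityException {
        return new CipherInputStream(raw, cipher(Cipher.DECRYPT_MODE, key, iv));
    }

    // Convenience: encrypt a whole buffer through the wrapping stream.
    static byte[] encrypt(byte[] plain, byte[] key, byte[] iv) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = wrapForWrite(sink, key, iv)) {
            out.write(plain);
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
        return sink.toByteArray();
    }

    // Convenience: decrypt a whole buffer through the wrapping stream.
    static byte[] decrypt(byte[] stored, byte[] key, byte[] iv) {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream in = wrapForRead(new ByteArrayInputStream(stored), key, iv)) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                plain.write(buf, 0, n);
            }
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
        return plain.toByteArray();
    }
}
```

The caller supplies a 16-byte AES key and a 16-byte IV.  Where those come from,
and who is allowed to read them, is the key management question and is left
open here.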

Alternatively, if you wanted to encrypt individual values as they're stored,
you may be able to use the coprocessor framework that we've been building.
This would essentially let you run a listener on the server side that encrypts
values on new writes and decrypts them on reads.  The problem is that you can't
easily encrypt the row and column keys used by HBase while still preserving the
data ordering characteristics that are central to the Bigtable architecture.
Since row keys and column qualifiers often represent "data" in HBase usage,
you'd potentially be leaking sensitive information.
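
As a rough illustration of that trade-off, the sketch below encrypts only the
values and leaves the row keys in plaintext so the store stays sorted.  It does
not use the coprocessor API or any other HBase class: EncryptedValueStore, put
and get are hypothetical stand-ins for a server-side write hook and read hook,
a TreeMap stands in for the sorted table, and AES-GCM with a random 12-byte IV
per value is an assumed choice.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.TreeMap;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical stand-in for server-side write/read hooks; no HBase API is used.
final class EncryptedValueStore {
    private static final int IV_BYTES = 12;   // commonly recommended IV size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    private final SecretKeySpec key;
    private final SecureRandom random = new SecureRandom();
    // Sorted by row key, as an HBase table is; the keys stay in plaintext.
    private final TreeMap<String, byte[]> rows = new TreeMap<>();

    EncryptedValueStore(byte[] aesKey) {
        this.key = new SecretKeySpec(aesKey, "AES");
    }

    // Write path: encrypt the value and store it as IV || ciphertext || tag.
    void put(String rowKey, byte[] value) {
        try {
            byte[] iv = new byte[IV_BYTES];
            random.nextBytes(iv);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = c.doFinal(value);
            rows.put(rowKey,
                    ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Read path: split off the IV, then decrypt and verify the rest.
    byte[] get(String rowKey) {
        byte[] stored = rows.get(rowKey);
        if (stored == null) {
            return null;
        }
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
            return c.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // What would actually sit on disk for this row (a copy).
    byte[] rawStored(String rowKey) {
        byte[] stored = rows.get(rowKey);
        return stored == null ? null : stored.clone();
    }

    // Row keys remain readable and ordered; this is the leak described above.
    String firstRowKey() {
        return rows.firstKey();
    }
}
```

Scans by key range still work because the keys are untouched, but anyone who
can read the stored data sees every row key.  Encrypting the keys the same way
would hide them at the cost of scrambling the sort order.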

As Ryan mentions, a big thing to consider is key management.  The RS
(RegionServer) needs to be able to decrypt the data, without arbitrary users
being able to do the same.

For more info on the coprocessor framework, see:
https://issues.apache.org/jira/browse/HBASE-2000
https://issues.apache.org/jira/browse/HBASE-2001

For more info on the current security work, see:
https://issues.apache.org/jira/browse/HBASE-1697
https://issues.apache.org/jira/browse/HBASE-3025
http://hbaseblog.com/2010/10/11/secure-hbase-access-controls/

Hope this helps.  I'm interested to hear more about what you're planning.


Gary



On Tue, Nov 16, 2010 at 3:16 PM, Preetam Joshi <[email protected]> wrote:

> Hi,
>
> I am a graduate student and I am working on implementing a few security
> features for HBase, one of which is described as follows:
>
> => Before the data is stored into the actual physical disk, I would want to
> encrypt the data before storing it. I would like to do it on the server side.
>
> Could anyone tell me which particular module I should look at to achieve this?
>
> Thanks in advance.
>
> Regards,
> Preetam