Hi Gary and Ryan,

Thank you for your replies. Judging by your comments, I do not want to get into the issues with the RS for now. I think the best approach for me is to implement the encryption at the HDFS layer. Because time is short, access control and authentication are not on my list right now. However, another person on my team is working on these issues, and I will point him to the links in this thread.

I will let you know if I come up with something interesting.

Regards,
Preetam

From: Gary Helmling <[email protected]>
To: [email protected]; Preetam Joshi <[email protected]>
Cc:
Sent: Tuesday, November 16, 2010 7:03:22 PM
Subject: Re: Hbase security: Encryption of Data before storage on physical disk

Hi Preetam,

A group of us at Trend Micro have been working on adding some security features to HBase. We've been focused on authentication and access control, though, as opposed to data encryption.

HBase sits on top of HDFS and relies on it as the underlying filesystem. HBase implements its own file format (HFile) for its data files, but just writes them out to a directory tree in HDFS. So one approach would be to encrypt the underlying data files at the HDFS layer. I believe you could do this by implementing a CompressionCodec that does the encryption. I found this mailing list thread discussing the same topic:

http://www.mail-archive.com/[email protected]/msg06222.html

Alternatively, if you wanted to encrypt individual values as they're stored, you may be able to use the coprocessor framework that we've been building. This would essentially allow you to have a listener on the server side that encrypts values on new writes and decrypts values on reads. The problem with this is that you can't easily encrypt the row and column keys used by HBase while still preserving the data ordering characteristics that are central to the Bigtable architecture. Since row keys and column qualifiers often represent "data" in HBase usage, you'd potentially be leaking sensitive information.

As Ryan mentions, a big thing to consider is key management. The RS (region server) needs to be able to decrypt the data without arbitrary users being able to do the same.

For more info on the coprocessor framework, see:

https://issues.apache.org/jira/browse/HBASE-2000
https://issues.apache.org/jira/browse/HBASE-2001

For more info on the current security work, see:

https://issues.apache.org/jira/browse/HBASE-1697
https://issues.apache.org/jira/browse/HBASE-3025
http://hbaseblog.com/2010/10/11/secure-hbase-access-controls/

Hope this helps. I'm interested to hear more about what you're planning.

Gary

On Tue, Nov 16, 2010 at 3:16 PM, Preetam Joshi <[email protected]> wrote:

> Hi,
>
> I am a graduate student and I am working on implementing a few security
> features for HBase, one of which is described as follows:
>
> => Before the data is stored on the actual physical disk, I would want to
> encrypt it. I would like to do this on the server side.
>
> Could anyone tell me which particular module I should look at to achieve this?
>
> Thanks in advance.
>
> Regards,
> Preetam
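
Editorial illustration of Gary's point about row keys and ordering: the sketch below is not HBase code and is not taken from the thread. It only shows, in plain Python, why transforming row keys with a keyed function breaks the sorted layout that range scans rely on. Everything in it is an assumption made for illustration: the key, the row key format, and the helper names `opaque_row_key` and `scan_order` are hypothetical, and HMAC-SHA256 is used as a stand-in for a deterministic cipher because the Python standard library has no AES. Unlike a cipher, HMAC cannot be reversed, but a cipher that is not order-preserving would be expected to scramble the order in a similar way.

```python
import hashlib
import hmac

# Hypothetical key, for illustration only. Real key management is the hard part.
SECRET_KEY = b"demo-key-not-for-production"


def opaque_row_key(row_key: bytes, key: bytes = SECRET_KEY) -> bytes:
    """Map a row key to an opaque, deterministic 32-byte token.

    HMAC-SHA256 stands in for a deterministic cipher; it is one-way,
    so this sketch cannot recover the original key from the token.
    """
    return hmac.new(key, row_key, hashlib.sha256).digest()


def scan_order(keys):
    """Return keys in byte-lexicographic order, the order rows are stored in."""
    return sorted(keys)


# Hypothetical row keys: a user id plus a zero-padded sequence number.
plain_keys = [b"user42-%04d" % i for i in range(50)]

# With plaintext keys, the rows for user42 are stored in sequence order,
# so a range scan over "user42-0000".."user42-0049" reads them contiguously.
stored_plain = scan_order(plain_keys)

# With transformed keys, rows are stored in token order instead.
token_to_plain = {opaque_row_key(k): k for k in plain_keys}
stored_opaque = [token_to_plain[t] for t in scan_order(token_to_plain)]

print("plaintext layout:", stored_plain[:3])
print("opaque layout:   ", stored_opaque[:3])
```

The same rows are present in both layouts, but in the second one neighbouring sequence numbers no longer sit next to each other, so a prefix or range scan on the original keys is no longer possible. Leaving keys in plaintext keeps scans working but exposes whatever information the keys carry, which is the trade-off described above.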
