Hi all, I am investigating how HBase can be used to store sensitive/confidential information. This research is part of my master's thesis in computing science at a university.
The research mostly concerns confidentiality, for example:

- Describing the location of the data within the distributed system
- Role-based access control
- Fine-grained access control (at column/row level)
- Built-in encryption based on the role
- The impact on performance, and validation of the above security features

My questions are:

1) Are the above features interesting for HBase?
2) Should I propose my changes and results in the HBase Jira?

This research assumes that the data is so sensitive that nobody, including administrators, developers, or malicious accessors, may see it unless they have an authorized role.

If I observed correctly (correct me if I am wrong), security in HBase currently focuses primarily on authentication and discretionary access control, and assumes that no malicious user has access to the underlying system (for example HDFS, the hard drive, or a shell), because the data can still be read that way. My research focuses on extending HBase security with more authorization and confidentiality features.

Thanks in advance!

Kind regards,
erwinx
