[jira] [Updated] (HIVE-5207) Support data encryption for Hive tables

2013-09-26 Thread Jerry Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Chen updated HIVE-5207:
-

Attachment: HIVE-5207.patch

Correct the typo pointed out by Larry.
Thanks Larry.

 Support data encryption for Hive tables
 ---

 Key: HIVE-5207
 URL: https://issues.apache.org/jira/browse/HIVE-5207
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.12.0
Reporter: Jerry Chen
  Labels: Rhino
 Attachments: HIVE-5207.patch, HIVE-5207.patch

   Original Estimate: 504h
  Remaining Estimate: 504h

 For sensitive and legally protected data such as personal information, it is 
 common practice to store the data encrypted in the file system. Enabling 
 Hive to store and query encrypted data is crucial for Hive data analysis in 
 the enterprise. 
  
 When creating a table, the user can specify whether it is an encrypted table 
 by setting a property in TBLPROPERTIES. Once an encrypted table is created, 
 queries on it are transparent as long as the corresponding key management 
 facilities are configured in the environment where the query runs. We can use 
 the Hadoop crypto support provided by HADOOP-9331 for the underlying data 
 encryption and decryption. 
  
 As to key management, we would support several common key management use 
 cases. First, the table key (data key) can be stored in the Hive metastore, 
 associated with the table in its properties. The table key can be explicitly 
 specified or auto-generated, and will be encrypted with a master key. There 
 are also cases where the data being processed is generated by other 
 applications, so we need to support externally managed or imported table 
 keys. Finally, the data generated by Hive may be consumed by other 
 applications in the system, so we need a tool or command for exporting the 
 table key to a Java keystore for external use.
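
 The key handling described above can be illustrated with a minimal Java 
 sketch. It is not taken from the attached patch: the class name, method 
 names, the choice of the JDK "AESWrap" cipher for encrypting the table key 
 under the master key, and the use of a JCEKS keystore for export are all 
 assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.security.KeyStore;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical helper, not part of Hive or of the HIVE-5207 patch.
final class TableKeySketch {

    // Auto-generate an AES key (usable as a table key or a master key).
    static SecretKey generateAesKey(int bits) throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(bits);
        return generator.generateKey();
    }

    // Encrypt (wrap) the table key with the master key. The wrapped bytes are
    // what would be stored alongside the table, e.g. in its properties.
    static byte[] wrapTableKey(SecretKey masterKey, SecretKey tableKey) throws Exception {
        Cipher cipher = Cipher.getInstance("AESWrap"); // assumption: JDK AES key wrap
        cipher.init(Cipher.WRAP_MODE, masterKey);
        return cipher.wrap(tableKey);
    }

    // Recover the table key from its wrapped form using the master key.
    static SecretKey unwrapTableKey(SecretKey masterKey, byte[] wrapped) throws Exception {
        Cipher cipher = Cipher.getInstance("AESWrap");
        cipher.init(Cipher.UNWRAP_MODE, masterKey);
        return (SecretKey) cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
    }

    // Export the table key to an in-memory Java keystore (JCEKS supports
    // secret-key entries) so that another application can load it.
    static byte[] exportToKeystore(SecretKey tableKey, String alias, char[] password)
            throws Exception {
        KeyStore keystore = KeyStore.getInstance("JCEKS");
        keystore.load(null, password); // start from an empty keystore
        keystore.setEntry(alias,
                new KeyStore.SecretKeyEntry(tableKey),
                new KeyStore.PasswordProtection(password));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        keystore.store(out, password);
        return out.toByteArray();
    }

    // Import a table key from keystore bytes produced by exportToKeystore.
    static SecretKey importFromKeystore(byte[] keystoreBytes, String alias, char[] password)
            throws Exception {
        KeyStore keystore = KeyStore.getInstance("JCEKS");
        keystore.load(new ByteArrayInputStream(keystoreBytes), password);
        return (SecretKey) keystore.getKey(alias, password);
    }
}
```

 In this sketch only the wrapped key bytes would ever be persisted with the 
 table, and the master key stays with whatever key management facility the 
 query environment provides. An externally managed table key would enter 
 through the import path instead of generateAesKey.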
  
 To handle versions of Hadoop that do not have crypto support, we can avoid 
 compilation problems by segregating crypto API usage into separate files 
 (shims) to be included only if a flag is defined on the Ant command line 
 (something like -Dcrypto=true).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5207) Support data encryption for Hive tables

2013-09-25 Thread Jerry Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Chen updated HIVE-5207:
-

Attachment: HIVE-5207.patch

Attached the patch for reference. It depends on the Hadoop crypto feature.

 Support data encryption for Hive tables
 ---

 Key: HIVE-5207
 URL: https://issues.apache.org/jira/browse/HIVE-5207
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.12.0
Reporter: Jerry Chen
  Labels: Rhino
 Attachments: HIVE-5207.patch

   Original Estimate: 504h
  Remaining Estimate: 504h

