[ https://issues.apache.org/jira/browse/PIG-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jerry Chen updated PIG-3289:
----------------------------
Attachment: PIG-3289.patch
Attaching the patch for reference. It depends on HADOOP-9331, MAPREDUCE-5025,
and a few others.
> Encryption aware load and store functions
> -----------------------------------------
>
> Key: PIG-3289
> URL: https://issues.apache.org/jira/browse/PIG-3289
> Project: Pig
> Issue Type: New Feature
> Components: data
> Affects Versions: 0.11.2
> Reporter: Jerry Chen
> Attachments: PIG-3289.patch
>
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> With HADOOP-9331 and MAPREDUCE-5025 in place, MapReduce jobs can process and
> output encrypted data. For Pig users to take advantage of this capability,
> Pig should be able to accept the key and pass it to MapReduce, so that
> MapReduce can do the job on behalf of Pig. The scope of this Jira is limited
> to passing the key to MapReduce and taking advantage of HADOOP-9331 and
> MAPREDUCE-5025 without breaking Pig.
> To achieve that, the file input format and file output format interfaces
> will be modified to handle CryptoCodec, set the context properly, and provide
> key facilities.
> File [input/output] formats that do not support compression (via
> CompressionCodec) cannot be addressed by this work, because the encryption
> feature (HADOOP-9331 and related) is based on CompressionCodec.
> With this change, Pig can cover the following use cases:
> a. Pig users can run a query on encrypted data
> b. Pig users can store encrypted data
> c. Pig users can output encrypted data
> Access to encrypted HBase storage/tables, or to any other encrypted storage
> format that Pig can query, should be addressed in separate Jiras if needed,
> because HBase and other systems might have their own key management
> mechanisms or ways of interfacing with Pig.
> To handle versions of Hadoop that do not have crypto support, we can avoid
> compilation problems by segregating crypto API usage into separate files to
> be included only if a flag is defined on the Ant command line (something like
> -Dcrypto).