[ 
https://issues.apache.org/jira/browse/PIG-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Chen updated PIG-3289:
----------------------------

    Attachment: PIG-3289.patch

Attaching the patch for reference. It depends on HADOOP-9331, MAPREDUCE-5025, 
and a few others.
                
> Encryption aware load and store functions
> -----------------------------------------
>
>                 Key: PIG-3289
>                 URL: https://issues.apache.org/jira/browse/PIG-3289
>             Project: Pig
>          Issue Type: New Feature
>          Components: data
>    Affects Versions: 0.11.2
>            Reporter: Jerry Chen
>         Attachments: PIG-3289.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> With HADOOP-9331 and MAPREDUCE-5025 in place, MapReduce jobs can process and 
> output encrypted data. For Pig users to take advantage of this capability, 
> Pig should be able to accept the key and pass it to MapReduce, so that 
> MapReduce can do the job on behalf of Pig. The scope of this Jira is limited 
> to passing the key to MapReduce and taking advantage of HADOOP-9331 and 
> MAPREDUCE-5025 without breaking Pig. 
> To achieve that, the file input format and file output format interfaces 
> will be modified to handle CryptoCodec, set the context properly, and 
> provide key facilities.
> File input/output formats that do not support compression (through 
> CompressionCodec) cannot be addressed by this work, because the encryption 
> feature (HADOOP-9331 and related) is based on CompressionCodec. 
> With this change, Pig can cover the following use cases:
> a. Pig users can run a query on encrypted data
> b. Pig users can store encrypted data
> c. Pig users can output encrypted data
> Access to encrypted HBase storage/tables, or to any other encrypted storage 
> format that Pig can query, should be addressed in separate Jiras if needed, 
> because HBase and other systems might have their own key management 
> mechanisms or their own ways of interfacing with Pig.
> To handle versions of Hadoop that do not have crypto support, we can avoid 
> compilation problems by segregating crypto API usage into separate files to 
> be included only if a flag is defined on the Ant command line (something like 
> -Dcrypto).
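
The description proposes keeping crypto API usage in separate source files that are compiled only when an Ant flag is set. One way the rest of the code could tolerate those files being absent is sketched below in Java. This is only an illustration under stated assumptions: the KeyPasser interface, the NoCryptoKeyPasser fallback, and the class name org.example.pig.crypto.CryptoKeyPasser are hypothetical and are not taken from the patch, from Pig, or from Hadoop. The reflection-based lookup is also an assumption about how the optional code might be located at runtime; the issue itself only mentions the compile-time flag.

```java
public class CryptoSupportSketch {

    /** Hypothetical hook that core code compiles against; not a real Pig or Hadoop API. */
    interface KeyPasser {
        String describe();
    }

    /** Fallback used when the optional crypto sources were left out of the build. */
    static final class NoCryptoKeyPasser implements KeyPasser {
        @Override
        public String describe() {
            return "crypto support not compiled in";
        }
    }

    // Hypothetical name of a class that would live in the separately compiled crypto files.
    static final String OPTIONAL_IMPL = "org.example.pig.crypto.CryptoKeyPasser";

    /**
     * Looks up the optional implementation by name so that this file has no
     * compile-time dependency on it. Any failure to load, instantiate, or cast
     * the class results in the no-op fallback.
     */
    static KeyPasser load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return (KeyPasser) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException | LinkageError e) {
            return new NoCryptoKeyPasser();
        }
    }

    public static void main(String[] args) {
        System.out.println(load(OPTIONAL_IMPL).describe());
    }
}
```

In a build without the optional sources, load returns the fallback, so callers never reference a crypto class directly and compile the same way on either kind of Hadoop version.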

