[jira] [Resolved] (MAPREDUCE-5589) MapReduce Job setup error leaves no useful info to users (when LinuxTaskController is used)

2017-10-16 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-5589.
-
Resolution: Won't Do

Cleaning up JIRAs that are no longer relevant.

> MapReduce Job setup error leaves no useful info to users  (when 
> LinuxTaskController is used)
> 
>
> Key: MAPREDUCE-5589
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5589
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: 1.2.1
> Environment: I found that a significant portion of the main() in 
> JobLocalizer will not have its output displayed in the tasklogs.
> Before fix :
> Error initializing attempt_201310122204_0002_m_02_0:
> java.io.IOException: Job initialization failed (1) with output: Reading task 
> controller config from /etc/hadoop/taskcontroller.cfg
> main : command provided 0
> main : user is bantony
> Good mapred-local-dirs are 
> /hadoop12/scratch,/hadoop04/scratch,/hadoop09/scratch,/hadoop03/scratch,/hadoop05/scratch,/hadoop01/scratch,/hadoop11/scratch,/hadoop10/scratch,/hadoop08/scratch,/hadoop06/scratch,/hadoop07/scratch,/hadoop02/scratch
>   at 
> org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:193)
>   at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1340)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at 
> org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1315)
>   at 
> org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1230)
>   at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2641)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.hadoop.util.Shell$ExitCodeException: 
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
>   at org.apache.hadoop.util.Shell.run(Shell.java:182)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
>   at 
> org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:186)
>   ... 8 more
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Fix For: 1.3.0
>
> Attachments: MAPREDUCE-5368.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Resolved] (MAPREDUCE-4552) Encryption: Add support for PGP Encryption

2017-10-16 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4552.
-
Resolution: Won't Do

Cleaning up JIRAs that are no longer relevant.

> Encryption:  Add support for PGP Encryption
> ---
>
> Key: MAPREDUCE-4552
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4552
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: security
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: MR_4552_1_1.patch, MR_4552_trunk.patch
>
>
> Provide support for PGP encryption by implementing the Encrypter and
> Decrypter interfaces defined in MAPREDUCE-4550. The cluster can use this to
> protect job secrets, and MapReduce jobs can use it to encrypt and decrypt
> data.
> Add PGPCodec as a CompressionCodec so that encrypted data can be processed
> transparently, like compressed data. The aliases of the keys can be specified
> as part of the job.
> Based on PGPCodec, a number of utilities are provided to encrypt and decrypt
> data in the cluster. They include:
> 1. DistributedSplitter – split an encrypted file into smaller files.
> 2. DistributedEncrypter – encrypt files in a cluster.
> 3. DistributedDecrypter – decrypt encrypted files in a cluster.
> 4. DistributedRecrypter – decrypt an encrypted file and encrypt it with
> another key.
> Utilities are also added to encrypt and decrypt files in the local file system:
> 1. Genkey – generate an asymmetric key pair (public and private keys) of a
> specified strength.
> 2. Encrypt – encrypt a file.
> 3. Decrypt – decrypt a file.
> Added as a contrib project: hadoop-crypto.
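
To illustrate what processing encrypted data "transparently like compressed data" can look like, here is a minimal sketch that wraps ordinary streams in encrypting and decrypting streams, much as a codec wraps streams for compression. It is not the PGPCodec from the attached patches: AES/CBC from the JDK stands in for PGP, and the class `StreamCryptoSketch` and all of its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical codec-style wrapper. AES/CBC is an assumption standing in for PGP.
class StreamCryptoSketch {
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    static byte[] newIv() {
        byte[] iv = new byte[16]; // one AES block
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    // Callers write plaintext; ciphertext reaches the underlying stream.
    static OutputStream encrypting(OutputStream raw, SecretKey key, byte[] iv)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        return new CipherOutputStream(raw, cipher);
    }

    // Callers read plaintext; ciphertext is pulled from the underlying stream.
    static InputStream decrypting(InputStream raw, SecretKey key, byte[] iv)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        return new CipherInputStream(raw, cipher);
    }

    static byte[] encrypt(byte[] plain, SecretKey key, byte[] iv)
            throws GeneralSecurityException, IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = encrypting(sink, key, iv)) {
            out.write(plain);
        }
        return sink.toByteArray();
    }

    static byte[] decrypt(byte[] encrypted, SecretKey key, byte[] iv)
            throws GeneralSecurityException, IOException {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream in = decrypting(new ByteArrayInputStream(encrypted), key, iv)) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                plain.write(buffer, 0, n);
            }
        }
        return plain.toByteArray();
    }
}
```

Code that consumes the stream returned by `decrypting` never sees ciphertext, which is the property the codec approach relies on.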






[jira] [Resolved] (MAPREDUCE-4491) Encryption and Key Protection

2017-10-16 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4491.
-
Resolution: Won't Do

Cleaning up JIRAs that are no longer relevant.

> Encryption and Key Protection
> -
>
> Key: MAPREDUCE-4491
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: documentation, security, task-controller, tasktracker
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf, 
> crypto_abstractions.zip
>
>
> When dealing with sensitive data, the data must be kept encrypted wherever
> it is stored. A common use case is to pull encrypted data out of a data
> source and store it in HDFS for analysis. The keys are stored in an external
> keystore.
> The feature adds a customizable framework to integrate different types of
> keystores, support for Java KeyStore, reading keys from keystores, and
> transporting keys from JobClient to Tasks.
> The feature adds PGP encryption as a codec and additional utilities to
> perform encryption-related steps.
> The design document is attached. It explains the requirements, design and use
> cases.
> Kindly review and comment. Collaboration is very much welcome.
> I have a tested patch for this for 1.1 and will upload it soon as initial
> work for further refinement.
> Update: The patches are uploaded to subtasks.






[jira] [Resolved] (MAPREDUCE-4551) Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers

2017-10-16 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4551.
-
Resolution: Won't Do

Cleaning up JIRAs that are no longer relevant.

> Key Protection :  Add ability to read keys and protect keys  in  JobClient 
> and TTS/NodeManagers
> ---
>
> Key: MAPREDUCE-4551
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4551
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: job submission, security
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: MR_4551_1_1.patch, MR_4551_trunk.patch
>
>
> Based on the cluster configuration, NodeManagers/TaskTrackers set up Decrypters
> to decrypt the job's secrets.
> Based on the job configuration, JobClient reads secrets from a KeyStore using a
> KeyProvider implementation and encrypts them using the cluster's public key.
> The encrypted secrets are stored in the job credentials.
> The task addresses the following requirements:
> • Plug in different key store mechanisms.
> • Retrieve specified keys from a configured keystore as part of job
> submission.
> • Protect keys during their transport through the cluster.
> • Make sure that keys are handed over only to the tasks of the correct
> job.
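
The transport step described above (encrypt with the cluster's public key at submission, decrypt on the node) can be sketched with plain JDK classes. This is only an illustration under assumptions: the attached patches are not reproduced here, RSA with OAEP padding is an assumed choice of cipher, and `SecretTransportSketch` and its method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

// Hypothetical helper; RSA-OAEP is an assumption, not taken from the patch.
class SecretTransportSketch {
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // Stands in for the cluster's key pair; a real cluster would load it from configuration.
    static KeyPair newClusterKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    // Client side: protect a small secret before it travels with the job.
    // RSA-OAEP with a 2048-bit key only fits short payloads (at most 190 bytes here).
    static byte[] encryptForCluster(byte[] secret, PublicKey clusterPublicKey)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, clusterPublicKey);
        return cipher.doFinal(secret);
    }

    // Node side: recover the secret before handing it to the job's tasks.
    static byte[] decryptOnNode(byte[] encrypted, PrivateKey clusterPrivateKey)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, clusterPrivateKey);
        return cipher.doFinal(encrypted);
    }
}
```

Only holders of the cluster's private key can recover the secret, so it stays protected while it is stored with the job and moved between hosts.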






[jira] [Resolved] (MAPREDUCE-4553) Key Protection : Implement KeyProvider to read key from a WebService Based KeyStore

2017-10-16 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4553.
-
Resolution: Won't Do

Cleaning up JIRAs that are no longer relevant.

> Key Protection :  Implement KeyProvider to read key from a WebService Based 
> KeyStore
> 
>
> Key: MAPREDUCE-4553
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4553
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: job submission, security
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: MR_4553_1_1.patch, MR_4553_trunk.patch
>
>
> Normally, keys have to be stored in a central location using a custom key
> management system. Organizations can implement KeyProvider to integrate
> their custom key management system with Hadoop. This interface is specified in
> MAPREDUCE-4550.
> Optionally, developers can use Safe to integrate a custom key management
> system with Hadoop.
> Safe is an open source, web-service-based keystore for securely storing secret
> keys and passwords.
> Safe authenticates the user using SPNEGO, checks whether the user is
> authorized to read the secret, and returns the secret.
> It is easy to plug in different mechanisms for authentication, authorization
> and key storage.
> Safe is kept as a separate open source project at
> (http://benoyantony.github.com/safe/)
> The Hadoop proxy to Safe is added as a contrib project: hadoop-safe.






[jira] [Resolved] (MAPREDUCE-4550) Key Protection : Define Encryption and Key Protection interfaces and default implementations

2017-10-16 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4550.
-
Resolution: Won't Do

Cleaning up JIRAs that are no longer relevant.

> Key Protection : Define Encryption and Key Protection interfaces and default 
> implementations
> 
>
> Key: MAPREDUCE-4550
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4550
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: security
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: MR_4550_1_1.patch, MR_4550_trunk.patch
>
>
> A secret key is read from a KeyStore and then encrypted during transport
> between JobClient and Task. The TaskTrackers/NodeManagers decrypt the secrets
> and provide them to the child tasks that are part of the job.
> This JIRA defines the interfaces to accomplish the above:
> 1) KeyProvider - to read keys from a KeyStore
> 2) Encrypter and Decrypter - to encrypt and decrypt secrets/data.
> The default/dummy implementations will also be added. This includes a
> KeyProvider implementation to read keys from a Java KeyStore.
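
As a rough picture of how such interfaces could fit together, the sketch below declares the three roles and backs `KeyProvider` with a `java.security.KeyStore`. The interface names come from the description above, but every method signature, the pass-through `NoOpCrypter`, and the demo class are assumptions made for illustration. They are not the contents of the attached patches.

```java
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyStore;
import java.security.KeyStoreException;
import javax.crypto.spec.SecretKeySpec;

// Method signatures below are hypothetical; only the interface names appear in the issue.
interface KeyProvider {
    byte[] getKey(String alias) throws GeneralSecurityException;
}

interface Encrypter {
    byte[] encrypt(byte[] plain) throws GeneralSecurityException;
}

interface Decrypter {
    byte[] decrypt(byte[] encrypted) throws GeneralSecurityException;
}

// A "dummy" implementation that passes data through unchanged.
class NoOpCrypter implements Encrypter, Decrypter {
    public byte[] encrypt(byte[] plain) { return plain.clone(); }
    public byte[] decrypt(byte[] encrypted) { return encrypted.clone(); }
}

// Reads raw key material for an alias from an already loaded Java KeyStore.
class JavaKeyStoreProvider implements KeyProvider {
    private final KeyStore store;
    private final char[] password;

    JavaKeyStoreProvider(KeyStore store, char[] password) {
        this.store = store;
        this.password = password.clone();
    }

    public byte[] getKey(String alias) throws GeneralSecurityException {
        Key key = store.getKey(alias, password);
        if (key == null) {
            throw new KeyStoreException("No key found for alias: " + alias);
        }
        return key.getEncoded();
    }
}

class KeyProviderDemo {
    // Stores one AES key in an in-memory keystore and reads it back through the provider.
    static byte[] storeAndRead(String alias, byte[] keyBytes)
            throws GeneralSecurityException, IOException {
        char[] password = "changeit".toCharArray(); // demo-only password
        KeyStore store = KeyStore.getInstance("PKCS12");
        store.load(null, null); // start from an empty in-memory keystore
        store.setKeyEntry(alias, new SecretKeySpec(keyBytes, "AES"), password, null);
        return new JavaKeyStoreProvider(store, password).getKey(alias);
    }
}
```

Keeping `KeyProvider` this narrow is what would let a different backend, such as a web-service keystore, be swapped in without touching the callers.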






[jira] [Resolved] (MAPREDUCE-5378) Enable ApplicationMaster to support different QOP for client and server communications

2014-03-31 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-5378.
-

  Resolution: Duplicate
Release Note: This is handled at the RPC layer via HADOOP-10221

 Enable ApplicationMaster to support different QOP for client and server 
 communications
 --

 Key: MAPREDUCE-5378
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5378
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: mr-5378.patch


 Currently, the ApplicationMaster's QOP (quality of protection) is derived from
 the client's config. If the client uses privacy, all communication done by
 the ApplicationMaster is set to privacy.
 As part of the feature to support multiple QOPs (HADOOP-9709), the
 ApplicationMaster is modified so that it can use different QOPs for its
 communication with the client and for its communication with other Hadoop
 components.
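
For readers unfamiliar with the term: Hadoop's three protection levels (authentication, integrity, privacy) correspond to the SASL QOP values `auth`, `auth-int` and `auth-conf`. The sketch below shows that mapping and a simple "first common level" choice between two sides. The names `Qop` and `QopSketch` are hypothetical, and this is not the code from mr-5378.patch or HADOOP-10221.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import javax.security.sasl.Sasl;

// Protection levels and their SASL names.
enum Qop {
    AUTHENTICATION("auth"),
    INTEGRITY("auth-int"),
    PRIVACY("auth-conf");

    final String saslName;

    Qop(String saslName) {
        this.saslName = saslName;
    }
}

// Hypothetical helper, for illustration only.
class QopSketch {
    // Builds SASL properties offering several levels in order of preference.
    static Map<String, String> saslProperties(List<Qop> acceptable) {
        String joined = acceptable.stream()
                .map(q -> q.saslName)
                .collect(Collectors.joining(","));
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, joined);
        return props;
    }

    // Returns the first level the client prefers that the server also accepts,
    // or null when there is no common level (the handshake would then fail).
    static Qop firstCommon(List<Qop> clientPreference, List<Qop> serverAccepted) {
        for (Qop q : clientPreference) {
            if (serverAccepted.contains(q)) {
                return q;
            }
        }
        return null;
    }
}
```

Offering more than one level per connection type is what would allow one peer to talk `auth-conf` to external clients and a cheaper level to internal components.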





[jira] [Updated] (MAPREDUCE-5378) Enable ApplicationMaster to support different QOP for client and server communications

2014-03-31 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-5378:


Release Note:   (was: This is handled at the RPC layer via HADOOP-10221)

 Enable ApplicationMaster to support different QOP for client and server 
 communications
 --

 Key: MAPREDUCE-5378
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5378
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: mr-5378.patch


 Currently, the ApplicationMaster's QOP (quality of protection) is derived from
 the client's config. If the client uses privacy, all communication done by
 the ApplicationMaster is set to privacy.
 As part of the feature to support multiple QOPs (HADOOP-9709), the
 ApplicationMaster is modified so that it can use different QOPs for its
 communication with the client and for its communication with other Hadoop
 components.





[jira] [Commented] (MAPREDUCE-5378) Enable ApplicationMaster to support different QOP for client and server communications

2014-03-31 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955888#comment-13955888
 ] 

Benoy Antony commented on MAPREDUCE-5378:
-

Resolved at the RPC layer and in the data transfer protocol via HADOOP-10221 and 
HDFS-5910.


 Enable ApplicationMaster to support different QOP for client and server 
 communications
 --

 Key: MAPREDUCE-5378
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5378
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: mr-5378.patch


 Currently, the ApplicationMaster's QOP (quality of protection) is derived from
 the client's config. If the client uses privacy, all communication done by
 the ApplicationMaster is set to privacy.
 As part of the feature to support multiple QOPs (HADOOP-9709), the
 ApplicationMaster is modified so that it can use different QOPs for its
 communication with the client and for its communication with other Hadoop
 components.





[jira] [Resolved] (MAPREDUCE-2353) Make the MR changes to reflect the API changes in SecureIO library

2013-12-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-2353.
-

Resolution: Fixed

 Make the MR changes to reflect the API changes in SecureIO library
 --

 Key: MAPREDUCE-2353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2353
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: security, task, tasktracker
Affects Versions: 0.22.0
Reporter: Devaraj Das
Assignee: Benoy Antony
 Fix For: 0.22.1

 Attachments: MR-2353.patch, mr-2353-0.22.patch


 Make the MR changes to reflect the API changes in SecureIO library. 
 Specifically, the 'group' argument is never used in the SecureIO library, and 
 hence the API changes.





[jira] [Created] (MAPREDUCE-5589) MapReduce Job setup error leaves no useful info to users (when LinuxTaskController is used)

2013-10-17 Thread Benoy Antony (JIRA)
Benoy Antony created MAPREDUCE-5589:
---

 Summary: MapReduce Job setup error leaves no useful info to users  
(when LinuxTaskController is used)
 Key: MAPREDUCE-5589
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5589
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.2.1
 Environment: I found that a significant portion of the main() in 
JobLocalizer will not have its output displayed in the tasklogs.


Before fix :
Error initializing attempt_201310122204_0002_m_02_0:
java.io.IOException: Job initialization failed (1) with output: Reading task 
controller config from /etc/hadoop/taskcontroller.cfg
main : command provided 0
main : user is bantony
Good mapred-local-dirs are 
/hadoop12/scratch,/hadoop04/scratch,/hadoop09/scratch,/hadoop03/scratch,/hadoop05/scratch,/hadoop01/scratch,/hadoop11/scratch,/hadoop10/scratch,/hadoop08/scratch,/hadoop06/scratch,/hadoop07/scratch,/hadoop02/scratch

at 
org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:193)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1340)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at 
org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1315)
at 
org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1230)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2641)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.util.Shell$ExitCodeException: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
at org.apache.hadoop.util.Shell.run(Shell.java:182)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
at 
org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:186)
... 8 more

Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: 1.3.0








[jira] [Updated] (MAPREDUCE-5589) MapReduce Job setup error leaves no useful info to users (when LinuxTaskController is used)

2013-10-17 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-5589:


Attachment: MAPREDUCE-5368.patch

Patch for branch-1
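
The patch itself is attached to the issue and is not reproduced in this message. As a rough illustration of the general technique only, the sketch below runs a child process and includes everything it printed in the exception message when it exits with a non-zero code. It assumes the missing diagnostics were written to the child's standard error, which the issue does not state explicitly. `ChildOutputSketch` is a hypothetical name and does not use Hadoop's `Shell` class.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper, not the LinuxTaskController code.
class ChildOutputSketch {
    static String runAndCapture(List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so stack traces printed by the child are not lost.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            // Surface both the exit code and the full child output to the caller.
            throw new IOException("Job initialization failed (" + exitCode
                    + ") with output: " + output);
        }
        return output.toString();
    }
}
```

With the streams merged, whatever the child printed before exiting ends up in the message that the user sees.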

 MapReduce Job setup error leaves no useful info to users  (when 
 LinuxTaskController is used)
 

 Key: MAPREDUCE-5589
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5589
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.2.1
 Environment: I found that a significant portion of the main() in 
 JobLocalizer will not have its output displayed in the tasklogs.
 Before fix :
 Error initializing attempt_201310122204_0002_m_02_0:
 java.io.IOException: Job initialization failed (1) with output: Reading task 
 controller config from /etc/hadoop/taskcontroller.cfg
 main : command provided 0
 main : user is bantony
 Good mapred-local-dirs are 
 /hadoop12/scratch,/hadoop04/scratch,/hadoop09/scratch,/hadoop03/scratch,/hadoop05/scratch,/hadoop01/scratch,/hadoop11/scratch,/hadoop10/scratch,/hadoop08/scratch,/hadoop06/scratch,/hadoop07/scratch,/hadoop02/scratch
   at 
 org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:193)
   at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1340)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
   at 
 org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1315)
   at 
 org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1230)
   at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2641)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.hadoop.util.Shell$ExitCodeException: 
   at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
   at org.apache.hadoop.util.Shell.run(Shell.java:182)
   at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
   at 
 org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:186)
   ... 8 more
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5368.patch








[jira] [Commented] (MAPREDUCE-5589) MapReduce Job setup error leaves no useful info to users (when LinuxTaskController is used)

2013-10-17 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13798169#comment-13798169
 ] 

Benoy Antony commented on MAPREDUCE-5589:
-

Proper error output by JobLocalizer is displayed when the patch is applied.

Error initializing attempt_201310140952_0001_m_02_0:
java.io.IOException: Job initialization failed (255) with output: Reading task 
controller config from /etc/hadoop/taskcontroller.cfg
main : command provided 0
main : user is bantony
Good mapred-local-dirs are 
/hadoop12/scratch,/hadoop04/scratch,/hadoop09/scratch,/hadoop05/scratch,/hadoop01/scratch,/hadoop11/scratch,/hadoop10/scratch,/hadoop08/scratch,/hadoop06/scratch,/hadoop07/scratch,/hadoop02/scratch
java.io.IOException: Call to localhost/127.0.0.1:60487 failed on local 
exception: java.io.IOException: javax.security.sasl.SaslException: DIGEST-MD5: 
No common protection layer between client and server
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1155)
at org.apache.hadoop.ipc.Client.call(Client.java:1123)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at $Proxy5.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.checkVersion(RPC.java:422)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:414)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:392)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:374)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:444)
at org.apache.hadoop.mapred.JobLocalizer$2.run(JobLocalizer.java:529)
at org.apache.hadoop.mapred.JobLocalizer$2.run(JobLocalizer.java:527)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.JobLocalizer.main(JobLocalizer.java:526)
Caused by: java.io.IOException: javax.security.sasl.SaslException: DIGEST-MD5: 
No common protection layer between client and server
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:568)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at 
org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:513)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:616)
at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:203)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1254)
at org.apache.hadoop.ipc.Client.call(Client.java:1098)
... 13 more
Caused by: javax.security.sasl.SaslException: DIGEST-MD5: No common protection 
layer between client and server
at 
com.sun.security.sasl.digest.DigestMD5Client.checkQopSupport(DigestMD5Client.java:396)
at 
com.sun.security.sasl.digest.DigestMD5Client.evaluateChallenge(DigestMD5Client.java:208)
at 
org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:168)
at 
org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:410)
at org.apache.hadoop.ipc.Client$Connection.access$1300(Client.java:203)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:609)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:606)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:605)
... 16 more

at 
org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:193)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1340)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at 
org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1315)
at 
org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1230)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2641)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.util.Shell$ExitCodeException: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
at org.apache.hadoop.util.Shell.run(Shell.java:182)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
at 

[jira] [Commented] (MAPREDUCE-5541) Improved algorithm for whether need speculative task

2013-10-15 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13796429#comment-13796429
 ] 

Benoy Antony commented on MAPREDUCE-5541:
-

+1.
Looks good. Solves a problem seen in our clusters.
Please review and commit.
John, is there a trunk version of this improvement?


 Improved algorithm for whether need speculative task
 

 Key: MAPREDUCE-5541
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5541
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.2.1
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Fix For: 1.2.2

 Attachments: MAPREDUCE-5541-branch-1.2.patch, 
 MAPREDUCE-5541-branch-1.2.patch


 Most of the time, tasks do not start running at the same time.
 In this case, hasSpeculativeTask in TaskInProgress does not work very well.
 Sometimes tasks have only just started running and the scheduler already
 decides that they need a speculative task.
 This wastes a lot of resources.
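
The problem described here can be made concrete with a small sketch of a speculation check that measures running time from each task's own start time, so that a task that has only just started is never picked. The attached patch is not reproduced here. The two thresholds `minRunMillis` and `progressGap` are hypothetical and are not claimed to match the parameters of the patch.

```java
// Hypothetical check, for illustration only.
class SpeculationSketch {
    private final long minRunMillis;   // how long a task must have run before it is considered
    private final double progressGap;  // how far behind the average progress it must be

    SpeculationSketch(long minRunMillis, double progressGap) {
        this.minRunMillis = minRunMillis;
        this.progressGap = progressGap;
    }

    // True only for tasks that have run long enough AND clearly trail the average.
    boolean needsSpeculation(long nowMillis, long taskStartMillis,
                             double taskProgress, double averageProgress) {
        boolean ranLongEnough = nowMillis - taskStartMillis >= minRunMillis;
        boolean lagging = averageProgress - taskProgress >= progressGap;
        return ranLongEnough && lagging;
    }
}
```

For example, with `new SpeculationSketch(60_000, 0.2)`, a task that started five seconds ago is skipped even at zero progress, while a task that has run for a hundred seconds and sits at 10% against a 90% average is selected.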





[jira] [Commented] (MAPREDUCE-5541) Improved algorithm for whether need speculative task

2013-10-11 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13792786#comment-13792786
 ] 

Benoy Antony commented on MAPREDUCE-5541:
-

John,

1. Could you please make the parameters { SPECULATIVE_PROGRESS,
SPECULATIVE_FACTOR } configurable?
2. Could you please share some test results indicating the improvement?




 Improved algorithm for whether need speculative task
 

 Key: MAPREDUCE-5541
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5541
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.2.1
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Fix For: 1.2.2

 Attachments: MAPREDUCE-5541-branch-1.2.patch


 Most of the time, tasks do not start running at the same time.
 In this case, hasSpeculativeTask in TaskInProgress does not work very well.
 Sometimes tasks have only just started running and the scheduler already
 decides that they need a speculative task.
 This wastes a lot of resources.





[jira] [Created] (MAPREDUCE-5378) Make the QOP for application master to be derived based on

2013-07-08 Thread Benoy Antony (JIRA)
Benoy Antony created MAPREDUCE-5378:
---

 Summary: Make the QOP for application master to be derived based 
on 
 Key: MAPREDUCE-5378
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5378
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Benoy Antony
Assignee: Benoy Antony


Currently, the ApplicationMaster's QOP (quality of protection) is derived from the 
client's config. If the client uses privacy, all communication done by the 
ApplicationMaster is set to privacy. 

As part of the feature to support multiple QOPs (HADOOP-9709), the 
ApplicationMaster is modified so that it can use different QOPs for its 
communication with the client and for its communication with other Hadoop components.



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5378) Make the QOP for application master to be derived based on

2013-07-08 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-5378:


Attachment: mr-5378.patch

I'll add the test cases once the approach is reviewed.

 Make the QOP for application master to be derived based on 
 ---

 Key: MAPREDUCE-5378
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5378
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: mr-5378.patch


 Currently, the ApplicationMaster's QOP (quality of protection) is derived from
 the client's config. If the client uses privacy, all communication done by
 the ApplicationMaster is set to privacy.
 As part of the feature to support multiple QOPs (HADOOP-9709), the
 ApplicationMaster is modified so that it can use different QOPs for its
 communication with the client and for its communication with other Hadoop
 components.



[jira] [Updated] (MAPREDUCE-5378) Enable ApplicationMaster to support different QOP for client and server communications

2013-07-08 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-5378:


Summary: Enable ApplicationMaster to support different QOP for client and 
server communications  (was: Make the QOP for application master to be derived 
based on )

 Enable ApplicationMaster to support different QOP for client and server 
 communications
 --

 Key: MAPREDUCE-5378
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5378
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: mr-5378.patch


 Currently, the ApplicationMaster's QOP (quality of protection) is derived from
 the client's config. If the client uses privacy, all communication done by
 the ApplicationMaster is set to privacy.
 As part of the feature to support multiple QOPs (HADOOP-9709), the
 ApplicationMaster is modified so that it can use different QOPs for its
 communication with the client and for its communication with other Hadoop
 components.



[jira] [Assigned] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state

2013-05-20 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony reassigned MAPREDUCE-5260:
---

Assignee: zhaoyunjiong

 Job failed because of JvmManager running into inconsistent state
 

 Key: MAPREDUCE-5260
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5260
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.2
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Fix For: 1.1.3

 Attachments: MAPREDUCE-5260-branch-1.1.patch


 In our cluster, jobs failed because task initialization failed at random: 
 JvmManager ran into an inconsistent state, and the TaskTracker failed 
 to exit:
 java.lang.Throwable: Child Error
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
   at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
   at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
 ---
 java.lang.Throwable: Child Error
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
   at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
   at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)



[jira] [Commented] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state

2013-05-20 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662080#comment-13662080
 ] 

Benoy Antony commented on MAPREDUCE-5260:
-

Reviewed. +1

 Job failed because of JvmManager running into inconsistent state
 

 Key: MAPREDUCE-5260
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5260
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.2
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Fix For: 1.1.3

 Attachments: MAPREDUCE-5260-branch-1.1.patch


 In our cluster, jobs failed because task initialization failed at random: 
 JvmManager ran into an inconsistent state, and the TaskTracker failed 
 to exit:
 java.lang.Throwable: Child Error
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
   at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
   at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
 ---
 java.lang.Throwable: Child Error
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
   at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
   at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)



[jira] [Updated] (MAPREDUCE-4553) Key Protection : Implement KeyProvider to read key from a WebService Based KeyStore

2013-02-18 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4553:


Description: 
Normally keys have to be stored in a central location using a custom key 
management system. Organizations can implement KeyProvider to integrate their 
custom key management system with Hadoop. This interface is specified in 
MAPREDUCE-4550.

Optionally, developers can use Safe to integrate a custom key management system 
with Hadoop. 
Safe is an open source, web-service-based keystore that securely stores secret keys 
and passwords. 
Safe authenticates the user using SPNEGO, checks whether the user is authorized 
to read the secret, and returns the secret. 
It is easy to plug in different mechanisms for authentication, authorization, and 
key storage. 
Safe is kept as a separate open source project at 
(http://benoyantony.github.com/safe/).

The Hadoop proxy to Safe is added as a contrib project, hadoop-safe. 


  was:
Normally keys have to be stored in a central location suing custom key 
management system.  organizations can implement KeyProvider to integrate their 
custom key management system to Hadoop. This interface is specified in 
MAPREDUCE-4550

Optionally , developers can use Safe to integrate custom key management system 
with Hadoop. 
Safe is an open source web service based keystore to securely store secret keys 
and passwords. 
Safe authenticates the user using SPNego, checks whether the user is authorized 
to read the secret and returns the secret. 
It is easy to plug in different mechanisms for authentication,authorization and 
Key storage. 
Safe is kept as a separate open source project at 
(http://benoyantony.github.com/safe/)

The hadoop proxy to safe is added as a contrib project -  hadoop-safe. 



 Key Protection :  Implement KeyProvider to read key from a WebService Based 
 KeyStore
 

 Key: MAPREDUCE-4553
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4553
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4553_1_1.patch, MR_4553_trunk.patch


 Normally keys have to be stored in a central location using a custom key 
 management system. Organizations can implement KeyProvider to integrate 
 their custom key management system with Hadoop. This interface is specified in 
 MAPREDUCE-4550.
 Optionally, developers can use Safe to integrate a custom key management 
 system with Hadoop. 
 Safe is an open source, web-service-based keystore that securely stores secret 
 keys and passwords. 
 Safe authenticates the user using SPNEGO, checks whether the user is 
 authorized to read the secret, and returns the secret. 
 It is easy to plug in different mechanisms for authentication, authorization, 
 and key storage. 
 Safe is kept as a separate open source project at 
 (http://benoyantony.github.com/safe/).
 The Hadoop proxy to Safe is added as a contrib project, hadoop-safe. 
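
 The read path described above (authenticate, check authorization, return the secret) can be illustrated with a small in-memory sketch. Everything in it is hypothetical: SecretStoreSketch and its methods are not the KeyProvider interface from MAPREDUCE-4550 and not Safe's actual API, and the sketch assumes the caller has already been authenticated (Safe uses SPNEGO for that step).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: not the KeyProvider interface from MAPREDUCE-4550
// and not Safe's real API.
final class SecretStoreSketch {
    private final Map<String, String> secretsByAlias = new HashMap<>();
    private final Map<String, Set<String>> readersByAlias = new HashMap<>();

    // Registers a secret under an alias together with the users allowed to read it.
    void store(String alias, String secret, String... allowedReaders) {
        secretsByAlias.put(alias, secret);
        readersByAlias.put(alias, new HashSet<>(Arrays.asList(allowedReaders)));
    }

    // Assumes authenticatedUser was already verified (SPNEGO in Safe's case).
    // Returns the secret only if that user is authorized to read the alias.
    Optional<String> readSecret(String authenticatedUser, String alias) {
        Set<String> readers = readersByAlias.get(alias);
        if (readers == null || !readers.contains(authenticatedUser)) {
            return Optional.empty();
        }
        return Optional.ofNullable(secretsByAlias.get(alias));
    }

    public static void main(String[] args) {
        SecretStoreSketch store = new SecretStoreSketch();
        store.store("job-key", "s3cr3t", "alice");
        System.out.println("alice can read: " + store.readSecret("alice", "job-key").isPresent());
        System.out.println("bob can read:   " + store.readSecret("bob", "job-key").isPresent());
    }
}
```

 An unknown alias and an unauthorized user both yield an empty result, so a caller cannot tell from the response whether a secret exists.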



[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection

2013-01-31 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1356#comment-1356
 ] 

Benoy Antony commented on MAPREDUCE-4491:
-

Yes, that makes sense.

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: crypto_abstractions.zip, Hadoop_Encryption.pdf, 
 Hadoop_Encryption.pdf


 When dealing with sensitive data, the data must be kept encrypted 
 wherever it is stored. A common use case is to pull encrypted data out of a 
 data source and store it in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, reading keys from keystores, and transporting 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption-related steps.
 The design document is attached. It explains the requirements, design, and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as initial 
 work for further refinement.
 Update: The patches are uploaded to subtasks. 



[jira] [Commented] (MAPREDUCE-4552) Encryption: Add support for PGP Encryption

2013-01-31 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13568325#comment-13568325
 ] 

Benoy Antony commented on MAPREDUCE-4552:
-

Sure. I'll decompose this into smaller patches. 

What you mentioned about the directory structure is true. If that's going to 
change, then this feature is going to break. I would need some guidance on 
this. Once I break this into smaller patches, we will review that piece 
separately.

 Encryption:  Add support for PGP Encryption
 ---

 Key: MAPREDUCE-4552
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4552
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4552_1_1.patch, MR_4552_trunk.patch


 Provide support for PGP encryption by implementing the Encrypter and Decrypter 
 interfaces defined in MAPREDUCE-4450. This can be used by the cluster to 
 protect the job secrets. It can also be used by MapReduce jobs to encrypt/decrypt 
 data. 
 Add PGPCodec as a CompressionCodec so that encrypted data can be processed 
 transparently, like compressed data. The aliases to the keys can be specified 
 as part of the Job. 
 Based on PGPCodec, a number of utilities are provided to encrypt and decrypt 
 data in the cluster. They include:
 1. DistributedSplitter - split an encrypted file into smaller files.
 2. DistributedEncrypter - encrypt files in a cluster.
 3. DistributedDecrypter - decrypt encrypted files in a cluster.
 4. DistributedRecrypter - decrypt an encrypted file and encrypt it with 
 another key.
 Utilities are added to encrypt/decrypt files in the local file system:
 1. Genkey - generate an asymmetric key pair (public and private keys) of a 
 specified strength.
 2. Encrypt - encrypt a file.
 3. Decrypt - decrypt a file.
 Added as a contrib project, hadoop-crypto.



[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection

2013-01-28 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564819#comment-13564819
 ] 

Benoy Antony commented on MAPREDUCE-4491:
-

I'll continue working on this jira. I'll start by incorporating Jerry's 
framework changes.

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: crypto_abstractions.zip, Hadoop_Encryption.pdf, 
 Hadoop_Encryption.pdf


 When dealing with sensitive data, the data must be kept encrypted 
 wherever it is stored. A common use case is to pull encrypted data out of a 
 data source and store it in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, reading keys from keystores, and transporting 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption-related steps.
 The design document is attached. It explains the requirements, design, and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as initial 
 work for further refinement.
 Update: The patches are uploaded to subtasks. 



[jira] [Commented] (MAPREDUCE-4661) Add HTTPS for WebUIs on Branch-1

2012-10-17 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478017#comment-13478017
 ] 

Benoy Antony commented on MAPREDUCE-4661:
-

Would you be able to provide a patch for Hadoop 1?

 Add HTTPS for WebUIs on Branch-1
 

 Key: MAPREDUCE-4661
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4661
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: security, webapps
Affects Versions: 1.0.3
Reporter: Plamen Jeliazkov
Assignee: Plamen Jeliazkov
 Fix For: 1.0.4

 Attachments: https.patch, MAPREDUCE-4461.patch, MAPREDUCE-4661.patch, 
 MAPREDUCE-4661.patch, MAPREDUCE-4661.patch


 After investigating the methodology used to add HTTPS support in branch-2, I 
 feel that the same approach should be back-ported to branch-1. I have taken 
 many of the patches used for branch-2 and merged them in.
 I was working on top of HDP 1 at the time - I will provide a patch for trunk 
 soon, once I can confirm I am adding only the necessities for supporting HTTPS 
 on the webUIs.
 As an added benefit, this patch also provides an HTTPS webUI to HBase by 
 extension, if you take a hadoop-core jar compiled with this patch, put it 
 into the hbase/lib directory, and apply the necessary configs to hbase/conf.
 == OLD IDEA(s) BEHIND ADDING HTTPS (look @ Sept 17th patch) ==
 In order to provide full security around the cluster, the webUI should also 
 be secure, if desired, to prevent cookie theft and user masquerading. 
 Here is my proposed work. Currently I can only add HTTPS support. I do not 
 know how to switch reliance of the HttpServer from HTTP to HTTPS fully.
 In order to facilitate this change I propose the following configuration 
 additions:
 CONFIG PROPERTY - DEFAULT VALUE
 mapred.https.enable - false
 mapred.https.need.client.auth - false
 mapred.https.server.keystore.resource - ssl-server.xml
 mapred.job.tracker.https.port - 50035
 mapred.job.tracker.https.address - IP_ADDR:50035
 mapred.task.tracker.https.port - 50065
 mapred.task.tracker.https.address - IP_ADDR:50065
 I tested this on my local box after using keytool to generate an SSL 
 certificate. You will need to change ssl-server.xml to point to the .keystore 
 file afterwards. A truststore may not be necessary; you can just point it to the 
 keystore.
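
 To make the proposed defaults concrete, here is a minimal sketch of how a site setting would layer over them. The property names and default values are the ones listed in the proposal above; the description labels that proposal as an old idea, so the final patch may not use them. The class WebUiHttpsSettings and its methods are hypothetical, and plain java.util.Properties stands in for Hadoop's own configuration class.

```java
import java.util.Properties;

// Sketch only: plain java.util.Properties stands in for Hadoop's configuration.
final class WebUiHttpsSettings {
    // Defaults copied from the proposal's table. The *.https.address entries are
    // left out because their default depends on the host (IP_ADDR).
    static Properties proposedDefaults() {
        Properties defaults = new Properties();
        defaults.setProperty("mapred.https.enable", "false");
        defaults.setProperty("mapred.https.need.client.auth", "false");
        defaults.setProperty("mapred.https.server.keystore.resource", "ssl-server.xml");
        defaults.setProperty("mapred.job.tracker.https.port", "50035");
        defaults.setProperty("mapred.task.tracker.https.port", "50065");
        return defaults;
    }

    // Site settings layered over the proposed defaults.
    static Properties withOverrides(Properties site) {
        Properties effective = new Properties(proposedDefaults());
        for (String name : site.stringPropertyNames()) {
            effective.setProperty(name, site.getProperty(name));
        }
        return effective;
    }

    static boolean httpsEnabled(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty("mapred.https.enable"));
    }

    static int jobTrackerHttpsPort(Properties conf) {
        return Integer.parseInt(conf.getProperty("mapred.job.tracker.https.port"));
    }

    public static void main(String[] args) {
        Properties site = new Properties();
        site.setProperty("mapred.https.enable", "true"); // opt in; everything else stays default
        Properties conf = withOverrides(site);
        System.out.println("https enabled: " + httpsEnabled(conf));
        System.out.println("JobTracker https port: " + jobTrackerHttpsPort(conf));
    }
}
```

 Because mapred.https.enable defaults to false, a cluster that sets nothing keeps its current HTTP-only behavior.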



[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection

2012-10-17 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478230#comment-13478230
 ] 

Benoy Antony commented on MAPREDUCE-4491:
-

+1. I agree. A more generic framework would be useful for addressing encryption in 
components other than MR. Let us work on it together.

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: crypto_abstractions.zip, Hadoop_Encryption.pdf, 
 Hadoop_Encryption.pdf


 When dealing with sensitive data, the data must be kept encrypted 
 wherever it is stored. A common use case is to pull encrypted data out of a 
 data source and store it in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, reading keys from keystores, and transporting 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption-related steps.
 The design document is attached. It explains the requirements, design, and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as initial 
 work for further refinement.
 Update: The patches are uploaded to subtasks. 



[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-10-03 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Status: Open  (was: Patch Available)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted with the job and made available 
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to the task processes 
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is 
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.



[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-10-03 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Attachment: MR_4554_trunk.patch

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted with the job and made available 
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to the task processes 
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is 
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.



[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-10-03 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Status: Patch Available  (was: Open)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted with the job and made available 
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to the task processes 
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is 
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.



[jira] [Commented] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-10-03 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468649#comment-13468649
 ] 

Benoy Antony commented on MAPREDUCE-4554:
-

Attached the patch that addresses Robert's comments regarding sleep-job-related 
names and comments in the test class. 
It also passed all the unit tests and received a +1 from Jenkins. 

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted with the job and made available 
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to the task processes 
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is 
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.



[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-10-01 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Status: Open  (was: Patch Available)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted with the job and made available 
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to the task processes 
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is 
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.



[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-10-01 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Status: Patch Available  (was: Open)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted with the job and made available 
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to the task processes 
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is 
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.



[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-10-01 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Attachment: MR_4554_trunk.patch

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted with the job and made available 
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to the task processes 
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is 
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.



[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-10-01 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Status: Open  (was: Patch Available)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted with the job and made available 
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to the task processes 
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is 
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.



[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-10-01 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Status: Patch Available  (was: Open)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-28 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Status: Open  (was: Patch Available)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-28 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Attachment: MR_4554_trunk.patch

fixing the spacing issue on the new files

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-28 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Status: Patch Available  (was: Open)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch




[jira] [Commented] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-21 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460629#comment-13460629
 ] 

Benoy Antony commented on MAPREDUCE-4554:
-

Thanks Robert.  I'll fix the tab issue.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-07 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Status: Open  (was: Patch Available)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-07 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Status: Patch Available  (was: Open)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-07 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Attachment: MR_4554_trunk.patch

Attaching patch with the following changes 

1) Fixed TestStagingCleanup to fix the test failures.

2) Fixed TestMRCredentials to fix the deprecation warnings

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch




[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection

2012-09-07 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451050#comment-13451050
 ] 

Benoy Antony commented on MAPREDUCE-4491:
-

Key Protection is simple to explain.
The JobClient retrieves keys from a configured keystore, encrypts the keys along 
with the JobId using the cluster public key, and submits the encrypted blob 
as part of the job credentials. 
The TaskTracker decrypts the encrypted blob using the cluster private key during job 
localization and verifies that the JobId inside the encrypted blob matches the JobId 
of the task. During task launch, the keys are made available to the child 
(task) process as an environment variable.

Since the JobId is part of the encrypted blob, the JobId verification prevents 
replay attacks. It is also easy to add integrity protection.
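
As a rough illustration of the seal-and-verify steps described above, here is a minimal Java sketch that uses only the JDK. It is not the code from the patch: the class and method names, the blob layout (a length byte, the JobId, then the key) and the RSA-OAEP transformation are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Arrays;
import javax.crypto.Cipher;

// Hypothetical sketch: none of these names come from the actual patch.
final class JobKeyBlobSketch {
    // Assumed transformation; the thread does not name the cipher that is used.
    private static final String RSA_OAEP = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // Stand-in for the cluster key pair, generated here only so the example runs.
    static KeyPair newClusterKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Client side: bind the secret to the JobId, then encrypt with the cluster public key.
    static byte[] seal(String jobId, byte[] secret, PublicKey clusterPublic) {
        byte[] id = jobId.getBytes(StandardCharsets.UTF_8);
        if (id.length > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("JobId too long for the one-byte length prefix");
        }
        // Assumed layout: [id length][id bytes][secret bytes]
        byte[] plain = new byte[1 + id.length + secret.length];
        plain[0] = (byte) id.length;
        System.arraycopy(id, 0, plain, 1, id.length);
        System.arraycopy(secret, 0, plain, 1 + id.length, secret.length);
        try {
            Cipher cipher = Cipher.getInstance(RSA_OAEP);
            cipher.init(Cipher.ENCRYPT_MODE, clusterPublic);
            return cipher.doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Node side: decrypt, then release the secret only if the embedded JobId matches.
    static byte[] open(byte[] blob, String expectedJobId, PrivateKey clusterPrivate) {
        byte[] plain;
        try {
            Cipher cipher = Cipher.getInstance(RSA_OAEP);
            cipher.init(Cipher.DECRYPT_MODE, clusterPrivate);
            plain = cipher.doFinal(blob);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
        int idLength = plain[0];
        String jobId = new String(plain, 1, idLength, StandardCharsets.UTF_8);
        if (!jobId.equals(expectedJobId)) {
            throw new SecurityException("encrypted blob was issued for a different job");
        }
        return Arrays.copyOfRange(plain, 1 + idLength, plain.length);
    }
}
```

With a 2048-bit RSA key, this direct encryption only fits payloads of up to roughly 190 bytes, so a real implementation would more likely wrap a symmetric key. The sketch skips that step to keep the JobId check visible.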

Now, the scheme was designed to be used in a secure cluster. It is good to 
explore whether it can be used in a non-secure cluster. 

One issue was with the cluster private key. It should be made accessible only 
to the TaskTracker process. If the access is determined by the user's permissions, 
then tasks should be run as a different user. But it need not be the job owner. 
It can be a fixed user. 

I believe you are bringing up another issue in this regard. 
If a rogue task can make a TT launch another rogue task with a JobId matching 
the one inside the encrypted blob, then the keys are available to the newly 
launched rogue task.
That's a good point. Basically, the rogue task would be acting as a JT/AppMaster. I am 
not sure whether that is possible. Even if it is possible, there should be ways 
to detect it. 





 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf


 When dealing with sensitive data, it is required to keep the data encrypted 
 wherever it is stored. A common use case is to pull encrypted data out of a 
 data source and store it in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement.
 Update: The patches are uploaded to subtasks. 



[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-05 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Attachment: (was: MR_4554_1_1.patch)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-05 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Attachment: (was: MR_4554_trunk_testonly.patch)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-05 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Fix Version/s: trunk
 Hadoop Flags: Reviewed
   Status: Patch Available  (was: In Progress)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-05 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Status: Open  (was: Patch Available)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-05 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Status: Patch Available  (was: Open)

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-09-05 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Attachment: MR_4554_trunk.patch

Suppressing the deprecation warning due to the use of MiniMRCluster

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: trunk

 Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-31 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Attachment: MR_4554_trunk.patch

Attaching latest patch after merging changes from trunk.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk_testonly.patch




[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-31 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Attachment: MR_4554_trunk.patch

Attaching the new patch with Daryn's suggested improvement.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk_testonly.patch




[jira] [Work started] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-31 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-4554 started by Benoy Antony.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
 MR_4554_trunk_testonly.patch




[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-29 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443864#comment-13443864
 ] 

Benoy Antony commented on MAPREDUCE-4491:
-

1.  Do I understand correctly that your approach can be used to securely 
store (encrypt) data even on non-secure (security=simple) clusters?   

You are right! If the TaskTracker and Task processes are owned by different users, 
then it is possible to use this approach to encrypt/decrypt data in a 
non-secure cluster. This does not require each task to run as the job owner; 
a fixed user other than the TT user is sufficient. The cluster private key 
can be made readable only by the TaskTracker user, so that the 
tasks cannot get hold of it. This does require the use of 
LinuxTaskController to spawn tasks as a different user. It also requires some 
code changes to enable this via configuration. 
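
The file-permission side of this can be pictured with a small JDK-only sketch. It assumes a POSIX file system and that the process running it is the TaskTracker user. The class name, method name and the owner-read-only mode are illustrative choices, not something taken from the patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper: restricts a keystore file to its owner.
final class KeystorePermissionsSketch {
    // Sets mode 0400 (owner read only) and returns the permissions now in effect.
    // Works on POSIX file systems only; elsewhere the JDK throws
    // UnsupportedOperationException.
    static Set<PosixFilePermission> restrictToOwner(Path keystore) {
        try {
            Files.setPosixFilePermissions(
                    keystore, PosixFilePermissions.fromString("r--------"));
            return Files.getPosixFilePermissions(keystore);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Tasks spawned as any other user would then fail to open the file, which is the property the scheme relies on.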

2.  So JobClient uses current user credentials to obtain keys from the 
KeyStore, encrypts them with cluster-public-key and sends to the cluster along 
with the user credentials. JobTracker has nothing to do with the keys and 
passes the encrypted blob over to TaskTrackers scheduled to execute the tasks. 
TT decrypts the user keys using private-cluster-key and handles them to the 
local tasks, which is secure as keys don't travel over the wires. Is it right 
so far?

That is correct. It's a clear and concise explanation of this straightforward 
approach. Please note that though the design is described in terms of 
TaskTrackers and TaskControllers (1.0 terminology), the implementation is 
available for both 1.0 and 2.0.

3.  TT should be using user credentials to decrypt the blob of keys 
somehow? Or does it authenticate the user and then decrypts if authentication 
passes? I did not find it in your document.

This is an important point, as we do not want the TaskTracker to decrypt the 
blob of keys and blindly hand them over to Tasks. The JobClient stores the 
JobId along with the keys as part of the encrypted blob. The TaskTracker 
decrypts the blob and verifies that the JobId inside it matches the JobId of 
the task. The keys are handed over to Tasks only if the JobId verification 
succeeds. This ensures that keys are handed over to the correct tasks.
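
The JobId check described above can be sketched as follows. This is only an 
illustration under stated assumptions, not the code in the attached patches: 
the class name JobKeyBlob, its method names, the blob layout (job id followed 
by the key bytes) and the choice of RSA with OAEP padding are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

// Hypothetical helper for illustration; names and blob layout are assumptions.
final class JobKeyBlob {
    // RSA with OAEP padding; the payload (job id + a short key) must fit in one RSA block.
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-1AndMGF1Padding";

    private JobKeyBlob() {}

    /** Client side: bind the user key to the job id, then encrypt with the cluster public key. */
    static byte[] seal(String jobId, byte[] userKey, PublicKey clusterPublicKey) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeUTF(jobId);
            out.writeInt(userKey.length);
            out.write(userKey);
            out.flush();
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, clusterPublicKey);
            return cipher.doFinal(buffer.toByteArray());
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException("could not seal key blob", e);
        }
    }

    /** TaskTracker side: decrypt, then release the key only if the job id matches. */
    static byte[] open(byte[] blob, String expectedJobId, PrivateKey clusterPrivateKey) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, clusterPrivateKey);
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(cipher.doFinal(blob)));
            String jobId = in.readUTF();
            if (!jobId.equals(expectedJobId)) {
                // Refuse to hand the key to a task that belongs to another job.
                throw new SecurityException("blob was issued for a different job");
            }
            byte[] userKey = new byte[in.readInt()];
            in.readFully(userKey);
            return userKey;
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException("could not open key blob", e);
        }
    }

    /** Convenience for experiments: generate a throwaway 2048-bit RSA key pair. */
    static KeyPair newClusterKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA is not available", e);
        }
    }
}
```

In this sketch, opening a blob with a JobId other than the one it was sealed 
with raises a SecurityException, so the key never reaches the wrong task.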

4.  How is the cluster-private-key delivered to TTs?

The TTs can use an implementation of the KeyProvider interface to retrieve 
keys. The implementation is selected through cluster configuration. The 
default key provider is a Java keystore based provider, in which the private 
key is stored in a Java keystore file on the TT machines. This is the same 
scheme web servers use to store their private keys. It is possible to plug in 
more complex key storage mechanisms via configuration.
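
A keystore-backed provider of this kind can be sketched with the standard 
java.security.KeyStore API. The class name JavaKeyStoreKeyProvider and its 
constructor and method signatures are assumptions made for illustration; the 
provider in the patches may be shaped differently.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyStore;

// Hypothetical class for illustration; not the provider from the patches.
final class JavaKeyStoreKeyProvider {
    private final KeyStore keyStore;

    /** Loads a keystore file of the given type, for example "JKS", "JCEKS" or "PKCS12". */
    JavaKeyStoreKeyProvider(Path file, String type, char[] storePassword) {
        try (InputStream in = Files.newInputStream(file)) {
            keyStore = KeyStore.getInstance(type);
            keyStore.load(in, storePassword);
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException("could not load keystore " + file, e);
        }
    }

    /** Returns the key stored under the alias; fails if there is no such key entry. */
    Key getKey(String alias, char[] keyPassword) {
        try {
            Key key = keyStore.getKey(alias, keyPassword);
            if (key == null) {
                throw new IllegalArgumentException("no key entry for alias " + alias);
            }
            return key;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("could not read key " + alias, e);
        }
    }
}
```

For a private-key entry, getKey returns a java.security.PrivateKey. Keeping 
the keystore file readable only by the TT user is what keeps that key away 
from the tasks.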

5. I think the naming of the configuration parameters needs some changes. 
They should not start with mapreduce.job. Based on your examples you can just 
encrypt an HDFS file without spawning any actual jobs. In this case seeing 
mapreduce.job.* seems confusing. My suggestion is to simply prefix all 
parameters with hadoop.crypto.* Then you can use e.g. the full word keystore 
instead of ks.

The distributed utility to encrypt/decrypt an HDFS file actually spawns map 
jobs. Irrespective of that, I think it makes perfect sense to rename the 
configurations to hadoop.crypto.*, as this approach is useful in non-MapReduce 
situations. I'll change the configuration names.

I plan to get into reviewing the implementation soon.  

Thanks and please post your comments.

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf


 When dealing with sensitive data, the data must be kept encrypted wherever 
 it is stored. A common use case is to pull encrypted data out of a data 
 source and store it in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, reading keys from keystores, and 
 transporting keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption-related steps.
 The design document is attached. It explains the requirements, design and 
 use cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as initial 
 work for further refinement.
 Update: The patches are 

[jira] [Commented] (MAPREDUCE-4550) Key Protection : Define Encryption and Key Protection interfaces and default implementations

2012-08-29 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443869#comment-13443869
 ] 

Benoy Antony commented on MAPREDUCE-4550:
-

I was following the pattern of the Compressor/Decompressor interfaces. But 
that is more of an implementation detail, so I'll try to combine them and see 
if it holds up.

 Key Protection : Define Encryption and Key Protection interfaces and default 
 implementations
 

 Key: MAPREDUCE-4550
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4550
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4550_1_1.patch, MR_4550_trunk.patch


 A secret key is read from a KeyStore and then encrypted during transport 
 between JobClient and Task. The tasktrackers/nodemanagers decrypt the secrets 
 and provide them to the child tasks which are part of the job.
 This jira defines the interfaces to accomplish the above:
 1) KeyProvider - to read keys from a KeyStore
 2) Encrypter and Decrypter - to encrypt and decrypt secrets/data.
 The default/dummy implementations will also be added. This includes a 
 KeyProvider implementation to read keys from a Java KeyStore.
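
As a rough picture of how the three interfaces could fit together, here is a 
minimal sketch. Only the interface names come from the description above; the 
method signatures, the InMemoryKeyProvider and AesGcmCodec classes, and the use 
of AES-GCM are assumptions for illustration and may not match the patches.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical signatures; only the interface names appear in the issue text.
interface KeyProvider { SecretKey getKey(String alias); }
interface Encrypter { byte[] encrypt(byte[] plaintext); }
interface Decrypter { byte[] decrypt(byte[] ciphertext); }

/** Dummy provider that serves keys from a map instead of a real keystore. */
final class InMemoryKeyProvider implements KeyProvider {
    private final Map<String, SecretKey> keys;

    InMemoryKeyProvider(Map<String, SecretKey> keys) {
        this.keys = new HashMap<>(keys);
    }

    public SecretKey getKey(String alias) {
        SecretKey key = keys.get(alias);
        if (key == null) {
            throw new IllegalArgumentException("unknown alias " + alias);
        }
        return key;
    }
}

/** One class implementing both roles; output layout is IV followed by ciphertext and tag. */
final class AesGcmCodec implements Encrypter, Decrypter {
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;
    private final SecretKey key;
    private final SecureRandom random = new SecureRandom();

    AesGcmCodec(SecretKey key) {
        this.key = key;
    }

    public byte[] encrypt(byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            random.nextBytes(iv); // fresh IV for every message
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = cipher.doFinal(plaintext);
            byte[] out = new byte[IV_BYTES + body.length];
            System.arraycopy(iv, 0, out, 0, IV_BYTES);
            System.arraycopy(body, 0, out, IV_BYTES, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    public byte[] decrypt(byte[] ciphertext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            // The IV is read back from the first IV_BYTES of the message.
            cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, ciphertext, 0, IV_BYTES));
            return cipher.doFinal(ciphertext, IV_BYTES, ciphertext.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }
}
```

A caller would fetch a key through the KeyProvider and hand it to the codec; 
swapping in a keystore-backed provider would not change the calling code.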

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-27 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442537#comment-13442537
 ] 

Benoy Antony commented on MAPREDUCE-4554:
-

It works in MR 1.0, but does not work in trunk, as mentioned in the description.

HADOOP-8225 did not introduce the problem mentioned in this issue. 
HADOOP-8225 introduced the problem mentioned in HADOOP-8276.



 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch, 
 MR_4554_trunk.patch, MR_4554_trunk_testonly.patch


 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials get submitted during job submission and are made available 
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to task processes 
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is 
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.



[jira] [Commented] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-27 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442546#comment-13442546
 ] 

Benoy Antony commented on MAPREDUCE-4554:
-

The included test passes in 1.0, but fails on trunk. 
The attached patch for 1.1 is a test-only patch.
The patch for trunk includes the test plus the required fix.




[jira] [Commented] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-27 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442551#comment-13442551
 ] 

Benoy Antony commented on MAPREDUCE-4554:
-

A correction: HADOOP-8225 introduced the problem mentioned in HADOOP-8726.




[jira] [Resolved] (MAPREDUCE-4526) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4526.
-

Resolution: Duplicate

Duplicate of MAPREDUCE-4554.
Created when Jira was messed up.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4526
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4526
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off .
  





[jira] [Resolved] (MAPREDUCE-4527) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4527.
-

Resolution: Duplicate

Duplicate of MAPREDUCE-4554.
Created when Jira was messed up.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4527
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4527
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.
 This should be fixed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off .
  





[jira] [Resolved] (MAPREDUCE-4528) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4528.
-

Resolution: Duplicate

Duplicate of MAPREDUCE-4554.
Created when Jira was messed up.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4528
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4528
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.
 This should be fixed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off .
  





[jira] [Resolved] (MAPREDUCE-4529) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4529.
-

Resolution: Duplicate

Duplicate of MAPREDUCE-4554.
Created when Jira was messed up.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4529
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4529
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.
 This should be fixed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off .





[jira] [Resolved] (MAPREDUCE-4530) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4530.
-

Resolution: Duplicate

Duplicate of MAPREDUCE-4554.
Created when Jira was messed up.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4530
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4530
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

  Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.
 This should be fixed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off .





[jira] [Resolved] (MAPREDUCE-4531) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4531.
-

Resolution: Duplicate

Duplicate of MAPREDUCE-4554.
Created when Jira was messed up.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4531
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4531
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

  Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.
 This should be fixed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off .





[jira] [Resolved] (MAPREDUCE-4532) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4532.
-

Resolution: Duplicate

Duplicate of MAPREDUCE-4554.
Created when Jira was messed up.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4532
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4532
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

  Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.
 This should be fixed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off .





[jira] [Resolved] (MAPREDUCE-4537) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4537.
-

Resolution: Duplicate

Accidentally created duplicate of MAPREDUCE-4554. Must be closed.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4537
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4537
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.
 This should be fixed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off .





[jira] [Resolved] (MAPREDUCE-4536) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4536.
-

Resolution: Duplicate

Accidentally created duplicate of MAPREDUCE-4554. Must be closed.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4536
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4536
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony

 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.
 This should be fixed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off .





[jira] [Resolved] (MAPREDUCE-4540) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4540.
-

Resolution: Duplicate

Accidentally created duplicate of MAPREDUCE-4554. Must be closed.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4540
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4540
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

 Credentials (secret keys) can be passed as part of a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2, these credentials are transmitted only when security is turned 
 on.
 This should be changed in HADOOP 2 for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.





[jira] [Resolved] (MAPREDUCE-4542) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4542.
-

Resolution: Duplicate

Accidentally created duplicate of MAPREDUCE-4554. Must be closed.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4542
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4542
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

 Credentials (secret keys) can be passed as part of a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2, these credentials are transmitted only when security is turned 
 on.
 This should be changed in HADOOP 2 for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.





[jira] [Resolved] (MAPREDUCE-4544) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4544.
-

Resolution: Duplicate

Accidentally created duplicate of MAPREDUCE-4554. Must be closed.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4544
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4544
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.
 This should be fixed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off .





[jira] [Resolved] (MAPREDUCE-4543) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4543.
-

Resolution: Duplicate

Accidentally created duplicate of MAPREDUCE-4554. Must be closed.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4543
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4543
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.
 This should be fixed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off .





[jira] [Resolved] (MAPREDUCE-4547) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4547.
-

Resolution: Duplicate

Accidentally created duplicate of MAPREDUCE-4554. Must be closed.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4547
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4547
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.
 This should be fixed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off .





[jira] [Resolved] (MAPREDUCE-4546) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4546.
-

Resolution: Duplicate

Accidentally created duplicate of MAPREDUCE-4554. Must be closed.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4546
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4546
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony

 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or  mapreduce.job.credentials.binary .
 In HADOOP 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In HADOOP 2 , these credentials are transmitted only when the security is 
 turned on.





[jira] [Resolved] (MAPREDUCE-4545) Job Credentials are not transmitted if security is turned off

2012-08-14 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4545.
-

Resolution: Fixed

Accidentally created a duplicate of MAPREDUCE-4554. Must be closed.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4545
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4545
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony

 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials get submitted during job submission and are made available 
 to the task processes. 
 In Hadoop 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In Hadoop 2, these credentials are transmitted only when security is 
 turned on.
 This should be fixed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.





[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-13 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433237#comment-13433237
 ] 

Benoy Antony commented on MAPREDUCE-4491:
-

To make reviewing this patch easier, I am dividing it into smaller 
patches. I am opening sub-tasks under this jira issue and attaching the patches 
to those jiras.

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf, 
 MR_4491_1.1.patch, MR_4491_trunk.patch


 When dealing with sensitive data, it is required to keep the data encrypted 
 wherever it is stored. A common use case is to pull encrypted data out of a 
 datasource and store it in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement. 





[jira] [Created] (MAPREDUCE-4550) Key Protection : Define Encryption and Key Protection interfaces and default implementations

2012-08-13 Thread Benoy Antony (JIRA)
Benoy Antony created MAPREDUCE-4550:
---

 Summary: Key Protection : Define Encryption and Key Protection 
interfaces and default implementations
 Key: MAPREDUCE-4550
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4550
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony


A secret key is read from a Key Store and then encrypted during transport 
between JobClient and Task. The tasktrackers/nodemanagers decrypt the secrets 
and provide them to the child tasks which are part of the job.

This jira defines the interfaces to accomplish the above:

1) KeyProvider - to read keys from a KeyStore

2) Encrypter and Decrypter - to encrypt and decrypt secrets/data.

The default/dummy implementations will also be added. This includes a 
KeyProvider implementation to read keys from a Java KeyStore.
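
As a rough illustration of the interfaces described above, the following is a minimal sketch and not the code from the attached patches. The method signatures, the class names KeyProtectionSketch, JavaKeyStoreProvider and PassThrough, and the choice of returning raw key bytes are all assumptions made for this example; only the interface names KeyProvider, Encrypter and Decrypter come from the description.

```java
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyStore;

final class KeyProtectionSketch {

    /** Hypothetical shape of KeyProvider: look up key material by alias. */
    interface KeyProvider {
        byte[] getKey(String alias) throws GeneralSecurityException;
    }

    /** Hypothetical shape of Encrypter: protect a secret for transport. */
    interface Encrypter {
        byte[] encrypt(byte[] plain) throws GeneralSecurityException;
    }

    /** Hypothetical shape of Decrypter: recover a secret on the node. */
    interface Decrypter {
        byte[] decrypt(byte[] cipherText) throws GeneralSecurityException;
    }

    /** KeyProvider backed by an already loaded java.security.KeyStore. */
    static final class JavaKeyStoreProvider implements KeyProvider {
        private final KeyStore store;
        private final char[] password;

        JavaKeyStoreProvider(KeyStore store, char[] password) {
            this.store = store;
            this.password = password.clone();
        }

        @Override
        public byte[] getKey(String alias) throws GeneralSecurityException {
            // KeyStore.getKey returns null when the alias is unknown.
            Key key = store.getKey(alias, password);
            if (key == null) {
                throw new IllegalArgumentException("No key for alias: " + alias);
            }
            return key.getEncoded();
        }
    }

    /** Dummy implementation that copies bytes unchanged (no protection). */
    static final class PassThrough implements Encrypter, Decrypter {
        @Override
        public byte[] encrypt(byte[] plain) {
            return plain.clone();
        }

        @Override
        public byte[] decrypt(byte[] cipherText) {
            return cipherText.clone();
        }
    }
}
```

Keeping the provider and the encrypter/decrypter as separate interfaces lets a cluster swap the key store without touching how secrets are protected in transit, which is the separation the description suggests.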





[jira] [Updated] (MAPREDUCE-4550) Key Protection : Define Encryption and Key Protection interfaces and default implementations

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4550:


Attachment: MR_4550_1_1.patch
MR_4550_trunk.patch

 Key Protection : Define Encryption and Key Protection interfaces and default 
 implementations
 

 Key: MAPREDUCE-4550
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4550
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4550_1_1.patch, MR_4550_trunk.patch


 A secret key is read from a Key Store and then encrypted during transport 
 between JobClient and Task. The tasktrackers/nodemanagers decrypt the secrets 
 and provide them to the child tasks which are part of the job.
 This jira defines the interfaces to accomplish the above:
 1) KeyProvider - to read keys from a KeyStore
 2) Encrypter and Decrypter - to encrypt and decrypt secrets/data.
 The default/dummy implementations will also be added. This includes a 
 KeyProvider implementation to read keys from a Java KeyStore.





[jira] [Created] (MAPREDUCE-4551) Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers

2012-08-13 Thread Benoy Antony (JIRA)
Benoy Antony created MAPREDUCE-4551:
---

 Summary: Key Protection :  Add ability to read keys and protect 
keys  in  JobClient and TTS/NodeManagers
 Key: MAPREDUCE-4551
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4551
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony


The following requirements are addressed.

•   Plug in different key store mechanisms.
•   Retrieve specified keys from a configured keystore as part of job 
submission
•   Protect keys during their transport through the cluster.
•   Make sure that keys are handed over only to the tasks of the correct 
job.

Based on Cluster configuration, NodeManager/TaskTrackers set up Decrypters to 
decrypt the job's secrets.
Based on Job configuration, JobClient reads secrets from a KeyStore using a 
KeyProvider implementation and encrypts them using the cluster's public key.

The encrypted secrets are stored in Job Credentials.
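
The flow above (encrypt on the client with the cluster's public key, decrypt on the node with the matching private key) can be sketched as follows. This is only an illustration under stated assumptions: it uses the JDK's RSA-OAEP cipher rather than the PGP implementation proposed in the sub-tasks, it models Job Credentials as a plain map, and the names SecretTransportSketch, protectSecrets, encryptForCluster and decryptOnNode are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;

final class SecretTransportSketch {

    // RSA-OAEP is used here only because it ships with the JDK.
    // It is suitable for short secrets only (about 190 bytes with a 2048-bit key).
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Client side: encrypt one secret with the cluster's public key. */
    static byte[] encryptForCluster(PublicKey clusterKey, byte[] secret)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, clusterKey);
        return cipher.doFinal(secret);
    }

    /** Node side: decrypt one secret with the cluster's private key. */
    static byte[] decryptOnNode(PrivateKey clusterKey, byte[] encrypted)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, clusterKey);
        return cipher.doFinal(encrypted);
    }

    /** Client side: build the alias-to-ciphertext map that would travel with the job. */
    static Map<String, byte[]> protectSecrets(Map<String, byte[]> secrets, PublicKey clusterKey)
            throws GeneralSecurityException {
        Map<String, byte[]> protectedSecrets = new HashMap<>();
        for (Map.Entry<String, byte[]> entry : secrets.entrySet()) {
            protectedSecrets.put(entry.getKey(), encryptForCluster(clusterKey, entry.getValue()));
        }
        return protectedSecrets;
    }
}
```

Because only the nodes hold the private key, the secrets stay opaque to anything that merely relays or stores the job's credentials.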





[jira] [Updated] (MAPREDUCE-4551) Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4551:


Description: 
Based on Cluster configuration, NodeManager/TaskTrackers set up Decrypters to 
decrypt the job's secrets.
Based on Job configuration, JobClient reads secrets from a KeyStore using a 
KeyProvider implementation and encrypts them using the cluster's public key.

The encrypted secrets are stored in Job Credentials.

The task addresses the following requirements:


•   Plug in different key store mechanisms.
•   Retrieve specified keys from a configured keystore as part of job 
submission
•   Protect keys during their transport through the cluster.
•   Make sure that keys are handed over only to the tasks of the correct 
job.


  was:
The following requirements are addressed.

•   Plug in different key store mechanisms.
•   Retrieve specified keys from a configured keystore as part of job 
submission
•   Protect keys during their transport through the cluster.
•   Make sure that keys are handed over only to the tasks of the correct 
job.

Based on Cluster configuration, NodeManager/TaskTrackers set up Decrypters to 
decrypt the job's secrets.
Based on Job configuration, JobClient reads secrets from a KeyStore using a 
KeyProvider implementation and encrypts them using the cluster's public key.

The encrypted secrets are stored in Job Credentials.


 Key Protection :  Add ability to read keys and protect keys  in  JobClient 
 and TTS/NodeManagers
 ---

 Key: MAPREDUCE-4551
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4551
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4551_1_1.patch, MR_4551_trunk.patch


 Based on Cluster configuration, NodeManager/TaskTrackers set up Decrypters 
 to decrypt the job's secrets.
 Based on Job configuration, JobClient reads secrets from a KeyStore using a 
 KeyProvider implementation and encrypts them using the cluster's public key.
 The encrypted secrets are stored in Job Credentials.
 The task addresses the following requirements:
 • Plug in different key store mechanisms.
 • Retrieve specified keys from a configured keystore as part of job 
 submission
 • Protect keys during their transport through the cluster.
 • Make sure that keys are handed over only to the tasks of the correct 
 job.





[jira] [Updated] (MAPREDUCE-4551) Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4551:


Attachment: MR_4551_trunk.patch
MR_4551_1_1.patch

 Key Protection :  Add ability to read keys and protect keys  in  JobClient 
 and TTS/NodeManagers
 ---

 Key: MAPREDUCE-4551
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4551
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4551_1_1.patch, MR_4551_trunk.patch


 Based on Cluster configuration, NodeManager/TaskTrackers set up Decrypters 
 to decrypt the job's secrets.
 Based on Job configuration, JobClient reads secrets from a KeyStore using a 
 KeyProvider implementation and encrypts them using the cluster's public key.
 The encrypted secrets are stored in Job Credentials.
 The task addresses the following requirements:
 • Plug in different key store mechanisms.
 • Retrieve specified keys from a configured keystore as part of job 
 submission
 • Protect keys during their transport through the cluster.
 • Make sure that keys are handed over only to the tasks of the correct 
 job.





[jira] [Created] (MAPREDUCE-4552) Encryption: Add support for PGP Encryption

2012-08-13 Thread Benoy Antony (JIRA)
Benoy Antony created MAPREDUCE-4552:
---

 Summary: Encryption:  Add support for PGP Encryption
 Key: MAPREDUCE-4552
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4552
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony


Provide support for PGP encryption by implementing the Encrypter and Decrypter 
interfaces defined in MAPREDUCE-4550. This can be used by the cluster to 
protect the job secrets. It can also be used by MapReduce jobs to 
encrypt/decrypt data. 

Add PGPCodec as a CompressionCodec so that encrypted data can be processed 
transparently, like compressed data. The aliases to the keys can be specified 
as part of the job. 

Based on PGPCodec, a number of utilities are provided to encrypt and decrypt 
data in the cluster. They include:

1.  DistributedSplitter – Split an encrypted file into smaller files.
2.  DistributedEncrypter – encrypt files in a cluster.
3.  DistributedDecrypter – decrypt encrypted files in a cluster.
4.  DistributedRecrypter – decrypt an encrypted file and encrypt it with 
another key.

Utilities are also added to encrypt/decrypt files in the local file system:

1.  Genkey - Generate an asymmetric key pair (public and private keys) of a 
specified strength
2.  Encrypt - Encrypt a file 
3.  Decrypt – Decrypt a file

Added as a contrib project -  hadoop-crypto.
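
The idea of treating encrypted data like compressed data can be pictured as a codec that wraps raw streams, in the same way a compression codec wraps them. The sketch below is not PGPCodec and does not use the Hadoop CompressionCodec API. It is a stand-in built on the JDK's AES/CTR cipher, with the hypothetical class name EncryptingCodecSketch and a caller-supplied IV that a real codec would have to generate fresh for every file.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

final class EncryptingCodecSketch {

    // AES in CTR mode stands in for PGP so the example needs only the JDK.
    private static final String TRANSFORMATION = "AES/CTR/NoPadding";

    private final SecretKey key;
    private final byte[] iv; // 16 bytes; must never be reused with the same key in real use

    EncryptingCodecSketch(SecretKey key, byte[] iv) {
        this.key = key;
        this.iv = iv.clone();
    }

    /** Wrap a raw stream so that everything written to it is encrypted. */
    OutputStream createOutputStream(OutputStream raw) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        return new CipherOutputStream(raw, cipher);
    }

    /** Wrap a raw stream so that everything read from it is decrypted. */
    InputStream createInputStream(InputStream raw) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        return new CipherInputStream(raw, cipher);
    }
}
```

A reader or writer that only sees the wrapped streams does not need to know whether the underlying bytes are encrypted, which is what makes the processing transparent to the job.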






[jira] [Updated] (MAPREDUCE-4552) Encryption: Add support for PGP Encryption

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4552:


Attachment: MR_4552_1_1.patch
MR_4552_trunk.patch

 Encryption:  Add support for PGP Encryption
 ---

 Key: MAPREDUCE-4552
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4552
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4552_1_1.patch, MR_4552_trunk.patch


 Provide support for PGP encryption by implementing the Encrypter and Decrypter 
 interfaces defined in MAPREDUCE-4550. This can be used by the cluster to 
 protect the job secrets. It can also be used by MapReduce jobs to 
 encrypt/decrypt data. 
 Add PGPCodec as a CompressionCodec so that encrypted data can be processed 
 transparently, like compressed data. The aliases to the keys can be specified 
 as part of the job. 
 Based on PGPCodec, a number of utilities are provided to encrypt and decrypt 
 data in the cluster. They include:
 1. DistributedSplitter – Split an encrypted file into smaller files.
 2. DistributedEncrypter – encrypt files in a cluster.
 3. DistributedDecrypter – decrypt encrypted files in a cluster.
 4. DistributedRecrypter – decrypt an encrypted file and encrypt it with 
 another key.
 Utilities are also added to encrypt/decrypt files in the local file system:
 1. Genkey - Generate an asymmetric key pair (public and private keys) of a 
 specified strength
 2. Encrypt - Encrypt a file 
 3. Decrypt – Decrypt a file
 Added as a contrib project - hadoop-crypto.





[jira] [Created] (MAPREDUCE-4553) Key Protection : Implement KeyProvider to read key from a WebService Based KeyStore

2012-08-13 Thread Benoy Antony (JIRA)
Benoy Antony created MAPREDUCE-4553:
---

 Summary: Key Protection :  Implement KeyProvider to read key from 
a WebService Based KeyStore
 Key: MAPREDUCE-4553
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4553
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony


Normally keys have to be stored in a central location using a custom key 
management system. Organizations can implement KeyProvider to integrate their 
custom key management system with Hadoop. This interface is specified in 
MAPREDUCE-4550.

Optionally, developers can use Safe to integrate a custom key management system 
with Hadoop. 
Safe is an open source, web service based keystore to securely store secret keys 
and passwords. 
Safe authenticates the user using SPNego, checks whether the user is authorized 
to read the secret and returns the secret. 
It is easy to plug in different mechanisms for authentication, authorization and 
key storage. 
Safe is kept as a separate open source project at 
(http://benoyantony.github.com/safe/)

The hadoop proxy to safe is added as a contrib project -  hadoop-safe. 






[jira] [Updated] (MAPREDUCE-4553) Key Protection : Implement KeyProvider to read key from a WebService Based KeyStore

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4553:


Attachment: MR_4553_trunk.patch
MR_4553_1_1.patch

 Key Protection :  Implement KeyProvider to read key from a WebService Based 
 KeyStore
 

 Key: MAPREDUCE-4553
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4553
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4553_1_1.patch, MR_4553_trunk.patch


 Normally keys have to be stored in a central location using a custom key 
 management system. Organizations can implement KeyProvider to integrate 
 their custom key management system with Hadoop. This interface is specified in 
 MAPREDUCE-4550.
 Optionally, developers can use Safe to integrate a custom key management 
 system with Hadoop. 
 Safe is an open source, web service based keystore to securely store secret 
 keys and passwords. 
 Safe authenticates the user using SPNego, checks whether the user is 
 authorized to read the secret and returns the secret. 
 It is easy to plug in different mechanisms for authentication, authorization 
 and key storage. 
 Safe is kept as a separate open source project at 
 (http://benoyantony.github.com/safe/)
 The hadoop proxy to safe is added as a contrib project -  hadoop-safe. 





[jira] [Updated] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4491:


Attachment: (was: MR_4491_1.1.patch)

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf


 When dealing with sensitive data, it is required to keep the data encrypted 
 wherever it is stored. A common use case is to pull encrypted data out of a 
 datasource and store it in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement. 





[jira] [Updated] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4491:


Attachment: (was: MR_4491_trunk.patch)

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf


 When dealing with sensitive data, it is required to keep the data encrypted 
 wherever it is stored. A common use case is to pull encrypted data out of a 
 datasource and store it in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement. 





[jira] [Updated] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4491:


Description: 
When dealing with sensitive data, it is required to keep the data encrypted 
wherever it is stored. A common use case is to pull encrypted data out of a 
datasource and store it in HDFS for analysis. The keys are stored in an external 
keystore. 

The feature adds a customizable framework to integrate different types of 
keystores, support for Java KeyStore, read keys from keystores, and transport 
keys from JobClient to Tasks.
The feature adds PGP encryption as a codec and additional utilities to perform 
encryption related steps.


The design document is attached. It explains the requirement, design and use 
cases.
Kindly review and comment. Collaboration is very much welcome.

I have a tested patch for this for 1.1 and will upload it soon as an initial 
work for further refinement.

Update: The patches are uploaded to subtasks. 








  was:
When dealing with sensitive data, it is required to keep the data encrypted 
wherever it is stored. A common use case is to pull encrypted data out of a 
datasource and store it in HDFS for analysis. The keys are stored in an external 
keystore. 

The feature adds a customizable framework to integrate different types of 
keystores, support for Java KeyStore, read keys from keystores, and transport 
keys from JobClient to Tasks.
The feature adds PGP encryption as a codec and additional utilities to perform 
encryption related steps.


The design document is attached. It explains the requirement, design and use 
cases.
Kindly review and comment. Collaboration is very much welcome.

I have a tested patch for this for 1.1 and will upload it soon as an initial 
work for further refinement. 









 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf


 When dealing with sensitive data, it is required to keep the data encrypted 
 wherever it is stored. A common use case is to pull encrypted data out of a 
 datasource and store it in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement.
 Update: The patches are uploaded to subtasks. 





[jira] [Created] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-13 Thread Benoy Antony (JIRA)
Benoy Antony created MAPREDUCE-4554:
---

 Summary: Job Credentials are not transmitted if security is turned 
off
 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony


Credentials (secret keys) can be passed to a job via 
mapreduce.job.credentials.json or mapreduce.job.credentials.binary.

These credentials get submitted during job submission and are made available to 
the task processes.

In Hadoop 1, these credentials get submitted and routed to task processes even 
if security was off.
In Hadoop 2, these credentials are transmitted only when security is 
turned on.

This should be changed for two reasons:
1) It is not backward compatible.

2) Credentials should be passed even if security is turned off.






[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Attachment: MR_4554_1_1.patch
MR_4554_trunk.patch

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials get submitted during job submission and are made available 
 to the task processes.
 In Hadoop 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In Hadoop 2, these credentials are transmitted only when security is 
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.

 2) Credentials should be passed even if security is turned off.





[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Affects Version/s: 2.0.0-alpha

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via 
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials get submitted during job submission and are made available 
 to the task processes.
 In Hadoop 1, these credentials get submitted and routed to task processes 
 even if security was off.
 In Hadoop 2, these credentials are transmitted only when security is 
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.

 2) Credentials should be passed even if security is turned off.





[jira] [Commented] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-13 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433301#comment-13433301
 ] 

Benoy Antony commented on MAPREDUCE-4554:
-

The patch adds a test case for 1.1 and 2.0.
It also removes security on/off checks when transmitting credentials.
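
The behavioural change can be pictured with a toy model. This is not Hadoop code and not the patch itself: the class CredentialRoutingSketch, the method credentialsForTask and the gateOnSecurity flag are hypothetical names that only contrast the two behaviours described in this issue.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class CredentialRoutingSketch {

    /**
     * Toy model of what a task receives.
     * gateOnSecurity = true models the behaviour reported for 2.0.0-alpha:
     * credentials are dropped when security is off.
     * gateOnSecurity = false models the behaviour after the check is removed:
     * credentials are always passed on.
     */
    static Map<String, byte[]> credentialsForTask(Map<String, byte[]> jobCredentials,
                                                  boolean securityEnabled,
                                                  boolean gateOnSecurity) {
        if (gateOnSecurity && !securityEnabled) {
            return Collections.emptyMap();
        }
        // Copy so the task cannot modify the job's own map.
        return new HashMap<>(jobCredentials);
    }
}
```

With the gate removed, a job that relies on a secret key behaves the same on a secure and a non-secure cluster, which is the backward compatible behaviour the issue asks for.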

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted with the job and made available
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to task processes
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.

 2) Credentials should be passed even if security is turned off.
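
As a rough illustration of the JSON route, the sketch below reads a credentials file into alias/secret pairs. It assumes the file is a flat JSON object mapping alias names to string secrets. That layout, the function name load_secret_keys and the sample aliases are assumptions made for illustration; they are not taken from this issue.

```python
import json


def load_secret_keys(json_text):
    """Parse a flat JSON object of alias -> secret into alias -> bytes.

    Assumption: the credentials file is a single JSON object whose values
    are strings. This is a stand-alone sketch, not Hadoop's own loader.
    """
    raw = json.loads(json_text)
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object of alias -> secret")
    secrets = {}
    for alias, secret in raw.items():
        if not isinstance(secret, str):
            raise ValueError("secret for %r must be a string" % alias)
        # Secrets are carried as raw bytes once loaded.
        secrets[alias] = secret.encode("utf-8")
    return secrets


if __name__ == "__main__":
    sample = '{"db.password": "s3cret", "api.token": "abc123"}'
    for alias, secret in load_secret_keys(sample).items():
        print(alias, len(secret), "bytes")
```

Nothing in this loading step depends on whether security is enabled, which is the behaviour the issue asks to restore for transmission as well.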





[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-13 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433401#comment-13433401
 ] 

Benoy Antony commented on MAPREDUCE-4491:
-

One of the goals of this feature is to encrypt files both in transit
and at rest (when stored on disk). One way to achieve this goal is to rely on
software or hardware that provides encryption in the local file system, together
with HDFS-3637 and MR shuffle encryption.

This jira explores an alternative approach to the problem that does not depend on
special software for local file system encryption.

The key advantages of this approach over the local file system encryption
approach are:

1) A file can be decrypted only if the user provides the correct key. Even
if someone manages to read the file, they cannot read its contents without the key.
The user must possess the key in addition to having read permission,
so there are two levels of protection.

A user might accidentally grant read permission to
everyone, or a superuser might read the file. In both cases, this
scheme still protects the data.

2) No dependency on local file system encryption software. This approach
allows encryption without such special setup.

3) A file is decrypted/encrypted only during processing and not every time it is
read. This results in fewer encryption/decryption operations.


Other key points:

1) Encrypted and plain-text files can coexist in a normal file system.

2) Developers can plug in other encryption algorithms/standards (CMS, AES,
custom encryption) and thus have more flexibility.

3) Keys, passwords and tokens can be transported from JobClient to tasks for use
cases other than encryption, such as connecting to a web service. MAPREDUCE-4491
adds key protection, and encryption uses it.

4) Keys can be managed in one central location. JobClient gets them on behalf of the user
like any other application.

If we look at these two approaches from a higher level, we can see that the
local file system approach is an internal approach to encryption and the
MAPREDUCE-4491 approach is an external one. The same two choices are
available in normal (non-distributed) application development, where
developers can rely on the file system to provide encryption or do the encryption
themselves. Both approaches have tradeoffs and flexibilities,
and the choice depends on use cases and needs. So I believe we should
provide both alternatives in Hadoop.

In addition, this feature provides key protection in general, which can be used
for purposes other than encryption. The keys are also encrypted when stored
on disk and decrypted only in memory.


 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf


 When dealing with sensitive data, it is required to keep the data encrypted
 wherever it is stored. A common use case is to pull encrypted data out of a
 datasource and store it in HDFS for analysis. The keys are stored in an external
 keystore.
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement.
 Update: The patches are uploaded to subtasks. 





[jira] [Updated] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-09 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4491:


Attachment: Hadoop_Encryption.pdf
MR_4491_1.1.patch
MR_4491_trunk.patch

Attaching the initial patches for trunk and branch-1.1. Please review and let
me know your comments.

Made minor updates to the design document.

One of the test cases in the patch depends on a test class which will be part
of another jira (yet to be filed due to the ASF Jira problem).

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf, 
 MR_4491_1.1.patch, MR_4491_trunk.patch


 When dealing with sensitive data, it is required to keep the data encrypted
 wherever it is stored. A common use case is to pull encrypted data out of a
 datasource and store it in HDFS for analysis. The keys are stored in an external
 keystore.
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement. 





[jira] [Updated] (MAPREDUCE-4526) Job Credentials are not transmitted if security is turned off

2012-08-09 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4526:


Description: 
Credentials (secret keys) can be passed to a job via
mapreduce.job.credentials.json or mapreduce.job.credentials.binary.

These credentials are submitted with the job and made available to
the task processes.

In Hadoop 1, these credentials are submitted and routed to task processes even
if security is off.

In Hadoop 2, these credentials are transmitted only when security is
turned on.

This should be changed for two reasons:

1) It is not backward compatible.
2) Credentials should be passed even if security is turned off.

 

  was:
Credentials (secret keys) can be passed to a job via
mapreduce.job.credentials.json or mapreduce.job.credentials.binary.

These credentials are submitted with the job and made available to
the task processes.

In Hadoop 1, these credentials are submitted and routed to task processes even
if security is off.

In Hadoop 2, these credentials are transmitted only when security is
turned on.

This should be fixed for two reasons:

1) It is not backward compatible.
2) Credentials should be passed even if security is turned off.

 


 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4526
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4526
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony

 Credentials (secret keys) can be passed to a job via
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted with the job and made available
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to task processes
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.
 2) Credentials should be passed even if security is turned off.
  





[jira] [Commented] (MAPREDUCE-4481) User Log Retention across TT restarts

2012-08-01 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426756#comment-13426756
 ] 

Benoy Antony commented on MAPREDUCE-4481:
-

This will not impact 0.22. MAPREDUCE-2415 was not ported to 0.22, so the
userlogs directory will not be under the scratch directories.


 User Log Retention across TT restarts
 -

 Key: MAPREDUCE-4481
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4481
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: tasktracker
Affects Versions: 0.22.0
Reporter: Benoy Antony
Priority: Minor

 The tasktrackers clean up the userlog directory when they restart.
 This happens regardless of the value of mapred.userlog.retain.hours.
 The proposal is to add a configurable option to respect
 mapred.userlog.retain.hours across TT restarts.
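
The requested behaviour can be sketched as a startup sweep that removes only the log directories older than the retention window instead of wiping the whole userlog directory. This is a minimal stand-alone sketch, not TaskTracker code: the function name purge_expired_userlogs, the one-directory-per-job layout and the use of directory modification time as the age are all assumptions for illustration.

```python
import os
import shutil
import tempfile
import time


def purge_expired_userlogs(userlog_root, retain_hours, now=None):
    """Remove sub-directories older than retain_hours; return their names.

    Assumptions: each job's logs live in one sub-directory of userlog_root,
    and the directory's modification time is a good enough proxy for age.
    """
    now = time.time() if now is None else now
    cutoff = now - retain_hours * 3600
    removed = []
    for name in sorted(os.listdir(userlog_root)):
        path = os.path.join(userlog_root, name)
        # Keep anything that is still inside the retention window.
        if os.path.isdir(path) and os.path.getmtime(path) < cutoff:
            shutil.rmtree(path)
            removed.append(name)
    return removed


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "job_recent"))
    print(purge_expired_userlogs(root, retain_hours=24))
    shutil.rmtree(root)
```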





[jira] [Commented] (MAPREDUCE-4481) User Log Retention across TT restarts

2012-08-01 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426894#comment-13426894
 ] 

Benoy Antony commented on MAPREDUCE-4481:
-

Good point about porting MAPREDUCE-2415 to 0.22.
A related question is whether to port MAPREDUCE-1213 to 1.1.

 User Log Retention across TT restarts
 -

 Key: MAPREDUCE-4481
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4481
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: tasktracker
Affects Versions: 0.22.0
Reporter: Benoy Antony
Priority: Minor

 The tasktrackers clean up the userlog directory when they restart.
 This happens regardless of the value of mapred.userlog.retain.hours.
 The proposal is to add a configurable option to respect
 mapred.userlog.retain.hours across TT restarts.





[jira] [Commented] (MAPREDUCE-4481) User Log Retention across TT restarts

2012-07-31 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425927#comment-13425927
 ] 

Benoy Antony commented on MAPREDUCE-4481:
-

This issue occurs only in those distributions where MAPREDUCE-2415 is applied
and MRAsyncDiskService is used to clean up the volumes during TT startup.
It is not applicable to 1.0 or 1.1 since MRAsyncDiskService is not present in
those.
I don't think it is applicable to trunk.

 User Log Retention across TT restarts
 -

 Key: MAPREDUCE-4481
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4481
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: tasktracker
Affects Versions: 1.0.0
Reporter: Benoy Antony
Assignee: Benoy Antony
Priority: Minor

 The tasktrackers clean up the userlog directory when they restart.
 This happens regardless of the value of mapred.userlog.retain.hours.
 The proposal is to add a configurable option to respect
 mapred.userlog.retain.hours across TT restarts.





[jira] [Updated] (MAPREDUCE-4481) User Log Retention across TT restarts

2012-07-31 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4481:


Affects Version/s: (was: 1.0.0)

 User Log Retention across TT restarts
 -

 Key: MAPREDUCE-4481
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4481
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
Priority: Minor

 The tasktrackers clean up the userlog directory when they restart.
 This happens regardless of the value of mapred.userlog.retain.hours.
 The proposal is to add a configurable option to respect
 mapred.userlog.retain.hours across TT restarts.





[jira] [Resolved] (MAPREDUCE-4481) User Log Retention across TT restarts

2012-07-31 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved MAPREDUCE-4481.
-

Resolution: Not A Problem

 User Log Retention across TT restarts
 -

 Key: MAPREDUCE-4481
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4481
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
Priority: Minor

 The tasktrackers clean up the userlog directory when they restart.
 This happens regardless of the value of mapred.userlog.retain.hours.
 The proposal is to add a configurable option to respect
 mapred.userlog.retain.hours across TT restarts.





[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection

2012-07-30 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425286#comment-13425286
 ] 

Benoy Antony commented on MAPREDUCE-4491:
-

To Rob's questions:

Different encryption keys for different files: At this point, the PGPCodec
supports only one secret key/key pair for all input files.
What we need is the ability to specify a secret key/key pair per input file.
Another enhancement would be to specify a secret key/key pair for each phase, such as
map output and reduce output.
As you mentioned, this mapping has to be specified via configuration.
I'll try to add these two enhancements.
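
One way such a per-file mapping might be expressed is an ordered list of path patterns and key aliases, resolved before a file is opened. The sketch below is only an illustration of that idea: the rule list, the alias names and the function key_alias_for are hypothetical and are not part of the PGPCodec or the patch.

```python
from fnmatch import fnmatchcase

# Hypothetical configuration: (path pattern, key alias). First match wins.
KEY_RULES = [
    ("/data/payments/*", "payments-key"),
    ("/data/hr/*.pgp", "hr-key"),
]
# Hypothetical fallback, matching today's single-key behaviour.
DEFAULT_KEY_ALIAS = "default-key"


def key_alias_for(path, rules=KEY_RULES, default=DEFAULT_KEY_ALIAS):
    """Return the alias of the key to use for the given input path."""
    for pattern, alias in rules:
        # fnmatchcase is case-sensitive on every platform; '*' also spans '/'.
        if fnmatchcase(path, pattern):
            return alias
    return default


if __name__ == "__main__":
    for p in ("/data/payments/2012/part-00000", "/data/logs/app.log"):
        print(p, "->", key_alias_for(p))
```

The same lookup could be keyed by phase name (map output, reduce output) instead of path to cover the second enhancement.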

Decryption/encryption of different columns within the same file: This is
left to the MapReduce programmer, who has to do the
decryption/encryption of the fields programmatically. The programmer can choose
to use different keys for different fields in the MapReduce program. Multiple
keys can be retrieved from the keystore, and these keys can be accessed in the
mapper/reducer using the credentials API.
In a higher-level interface like Hive, it may be possible to add additional
metadata to specify the key name. Another reviewer has also
recommended adding this capability to Hive: identifying an encrypted field and
specifying the key (by name) to be used to decrypt/encrypt it.

Thanks for the review and recommendations, Rob. Please let me know if I have
not answered the question correctly.

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf


 When dealing with sensitive data, it is required to keep the data encrypted
 wherever it is stored. A common use case is to pull encrypted data out of a
 datasource and store it in HDFS for analysis. The keys are stored in an external
 keystore.
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement. 





[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection

2012-07-30 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425320#comment-13425320
 ] 

Benoy Antony commented on MAPREDUCE-4491:
-

To Alejandro's questions:

1) If using a compression codec for encryption, are you losing the compression
capabilities when using encryption, or will it work as a composition?
What I have done is to first compress and then encrypt. Compression is currently
hardcoded to ZIP. I can expose this as a configuration option with a choice of
{UNCOMPRESSED, ZIP, ZLIB, BZIP2}. This is an enhancement that I can add.
I have also provided a DistributedSplitter so that files can be split into
smaller files.
I am not aware of a way to chain multiple compression codecs, though it
would have been a desirable capability in this case.
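
The ordering matters because encrypted output looks random and no longer compresses. The sketch below illustrates compress-then-encrypt using zlib and a toy XOR keystream built from SHA-256. The toy cipher is a stand-in chosen so the example runs with the standard library only; it is NOT PGP, it is not secure, and none of these function names come from the patch.

```python
import hashlib
import zlib


def _keystream(key, length):
    """Derive `length` pseudo-random bytes from key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_cipher(key, data):
    """XOR data with the keystream. Applying it twice restores the input.

    Illustration only: a stand-in for a real cipher, not safe for real data.
    """
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def compress_then_encrypt(key, plaintext):
    # Compress while the data still has structure, then encrypt the result.
    return toy_cipher(key, zlib.compress(plaintext))


def decrypt_then_decompress(key, blob):
    # Reverse order on the way back: decrypt first, then decompress.
    return zlib.decompress(toy_cipher(key, blob))


if __name__ == "__main__":
    key = b"demo-key"
    data = b"record,value\n" * 1000
    good = compress_then_encrypt(key, data)
    bad = zlib.compress(toy_cipher(key, data))  # encrypt first, then compress
    print(len(data), len(good), len(bad))
```

Running the script prints the original size next to the two orderings, which makes the cost of encrypting before compressing visible.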

2) For the keystores, are you proposing to store them in HDFS and use file system
permissions to protect them?

Actually, I am not proposing to store them in HDFS. The keystores themselves
are encrypted and a password is required to read keys from them.

In the use cases that I have encountered, the keystores were external to the
cluster. They were either on the CLI machine from which the jobs were submitted
or on a separate machine from which the keys were retrieved based on the user's
credentials. (Alfredo was used in this regard to fetch keys via a web service.)
So there are two schemes that I have supported:
  1) reading keys from a Java keystore
  2) reading keys from a web service based keystore (Safe)





 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf


 When dealing with sensitive data, it is required to keep the data encrypted
 wherever it is stored. A common use case is to pull encrypted data out of a
 datasource and store it in HDFS for analysis. The keys are stored in an external
 keystore.
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement. 




