[ 
https://issues.apache.org/jira/browse/HADOOP-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439792#comment-13439792
 ] 

Daryn Sharp commented on HADOOP-8225:
-------------------------------------

This is actually caused by multiple bugs:
# MR job submission requests tokens it already has
# MR job submission doesn't always pass all tokens (JHS, MR, HIVE, etc)
# Oozie is using a devious way to detect the exit code of an action

Details:
# The reported exception occurs when a task tries to get tokens it ALREADY has. 
 Job submission gets missing tokens for input/output paths and adds them to the 
UGI for RPC connections.  Job submission doesn't check the UGI, so it doesn't 
think it has the token and requests another.  The NN connection then uses the 
token that the job doesn't think it has!  The NN squawks that you can't use a 
token to get a token.
# Similarly, distcp also does some prep work to acquire tokens prior to job 
submission.  So again, a task tries and fails to get the tokens it already 
has.  Invoking a command like distcp directly will "work" (which masks the bug) 
because it uses the TGT to get another token even if it already has one in the 
UGI.
# Job submission doesn't appear to propagate non-FS/MR tokens from the task's 
UGI into the new job.
# Oozie uses a security manager to intercept an action's System.exit, throws a 
SecurityException containing the exit code, and later catches that exception to 
determine success/failure.  Devious!  Distcp calls System.exit(0) inside a try 
block which catches oozie's SecurityException, logs it, and then calls 
System.exit(-999), again generating an oozie SecurityException.  Due to the 
try/catch, distcp will ALWAYS appear to fail.
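
To illustrate the last item, here is a minimal sketch of the interception pattern. It is not the Oozie or DistCp source: interceptedExit, launch and buggyMain are hypothetical names, and the security manager is simulated by a method that throws, so the sketch runs without installing a real SecurityManager.

```java
// Sketch only: simulates an exit-intercepting launcher; all names are hypothetical.
class ExitInterceptSketch {
    // Stand-in for System.exit under an intercepting security manager:
    // instead of stopping the JVM it throws, carrying the exit code in the message.
    static void interceptedExit(int code) {
        throw new SecurityException("Intercepted System.exit(" + code + ")");
    }

    // Mirrors the described flow: the exit call sits INSIDE the try block.
    static void buggyMain() {
        try {
            // ... the copy work would happen here ...
            interceptedExit(0);
        } catch (Exception e) {
            // The launcher's SecurityException lands here, so the success code is lost.
            System.err.println("Couldn't complete operation: " + e);
            interceptedExit(-999);
        }
    }

    // Hypothetical launcher: runs the action and recovers the exit code from the exception.
    static int launch(Runnable action) {
        try {
            action.run();
            return 0; // the action returned without calling exit
        } catch (SecurityException e) {
            String m = e.getMessage();
            return Integer.parseInt(m.substring(m.indexOf('(') + 1, m.indexOf(')')));
        }
    }

    public static void main(String[] args) {
        System.out.println("buggy exit code: " + launch(ExitInterceptSketch::buggyMain));
    }
}
```

Under these assumptions the launcher sees -999 even though the work succeeded, which matches the "Intercepted System.exit(-999)" in the reported trace.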

Solutions:
* Details #1, #2, #3: Seed the Job with the existing UGI tokens
* Details #4: Distcp calls System.exit OUTSIDE of the try block
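
A rough sketch of the System.exit change, under the same kind of simulation (interceptedExit, launch and fixedMain are hypothetical names, not the patched DistCp code): the try block only computes the exit code, and the exit call happens once, after it.

```java
// Sketch only: the exit call is moved outside the try block; all names are hypothetical.
class ExitOutsideTrySketch {
    // Stand-in for System.exit under an intercepting security manager.
    static void interceptedExit(int code) {
        throw new SecurityException("Intercepted System.exit(" + code + ")");
    }

    static void fixedMain() {
        int exitCode;
        try {
            // ... the copy work would happen here ...
            exitCode = 0;
        } catch (Exception e) {
            System.err.println("Couldn't complete operation: " + e);
            exitCode = -999;
        }
        // No catch surrounds this call, so the launcher's exception propagates untouched.
        interceptedExit(exitCode);
    }

    // Hypothetical launcher: recovers the exit code from the exception message.
    static int launch(Runnable action) {
        try {
            action.run();
            return 0;
        } catch (SecurityException e) {
            String m = e.getMessage();
            return Integer.parseInt(m.substring(m.indexOf('(') + 1, m.indexOf(')')));
        }
    }

    public static void main(String[] args) {
        System.out.println("fixed exit code: " + launch(ExitOutsideTrySketch::fixedMain));
    }
}
```

With the call outside the try, the simulated launcher recovers exit code 0 for a successful run.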

I seeded the Job's credentials with the existing UGI tokens because it seems 
unreasonable to require all apps that launch jobs to be aware of running as a 
task.
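
The seeding idea can be sketched with plain maps standing in for the credential stores. This is an illustration only and does not use the Hadoop API: servicesNeedingTokens, seedFromUgi and the service names nn1/nn2 are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: plain maps (service name -> token) stand in for the job's and the UGI's credentials.
class TokenSeedingSketch {
    // Services for which job submission would request a new token.
    static List<String> servicesNeedingTokens(Map<String, String> jobCredentials,
                                              List<String> services) {
        List<String> missing = new ArrayList<>();
        for (String service : services) {
            if (!jobCredentials.containsKey(service)) {
                missing.add(service);
            }
        }
        return missing;
    }

    // Seeds the job's credentials with the tokens already held in the UGI.
    // Entries already present in the job's credentials are kept as they are.
    static Map<String, String> seedFromUgi(Map<String, String> jobCredentials,
                                           Map<String, String> ugiTokens) {
        Map<String, String> seeded = new LinkedHashMap<>(ugiTokens);
        seeded.putAll(jobCredentials);
        return seeded;
    }

    public static void main(String[] args) {
        Map<String, String> ugiTokens = new LinkedHashMap<>();
        ugiTokens.put("nn1", "token-for-nn1"); // the task already holds this token
        Map<String, String> jobCredentials = new LinkedHashMap<>();
        List<String> services = Arrays.asList("nn1", "nn2");

        // Unseeded: nn1 is requested again, which is the request the NN rejects.
        System.out.println("unseeded requests: "
                + servicesNeedingTokens(jobCredentials, services));
        // Seeded: only the genuinely missing token is requested.
        System.out.println("seeded requests: "
                + servicesNeedingTokens(seedFromUgi(jobCredentials, ugiTokens), services));
    }
}
```

The point of doing this inside job submission is that the launching app needs no special handling for the case where it is itself running as a task.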

                
> DistCp fails when invoked by Oozie
> ----------------------------------
>
>                 Key: HADOOP-8225
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8225
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.23.1
>            Reporter: Mithun Radhakrishnan
>            Assignee: Daryn Sharp
>         Attachments: HADOOP-8225.patch, HADOOP-8225.patch, HADOOP-8225.patch
>
>
> When DistCp is invoked through a proxy-user (e.g. through Oozie), the 
> delegation-token-store isn't picked up by DistCp correctly. One sees failures 
> such as:
> ERROR [main] org.apache.hadoop.tools.DistCp: Couldn't complete DistCp operation: 
> java.lang.SecurityException: Intercepted System.exit(-999)
>     at org.apache.oozie.action.hadoop.LauncherSecurityManager.checkExit(LauncherMapper.java:651)
>     at java.lang.Runtime.exit(Runtime.java:88)
>     at java.lang.System.exit(System.java:904)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:357)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:394)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:399)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:147)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:142)
> Looking over the DistCp code, one sees that HADOOP_TOKEN_FILE_LOCATION isn't 
> being copied to mapreduce.job.credentials.binary, in the job-conf. I'll post 
> a patch for this shortly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
