[ https://issues.apache.org/jira/browse/YARN-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zack Marsh updated YARN-3921:
-----------------------------
Description:
Prior to enabling Kerberos on the Hadoop cluster, I am able to run a simple MapReduce example as the Linux user 'tdatuser':

{code}
iripiri1:~ # su tdatuser
tdatuser@piripiri1:/root> yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.*.jar pi 16 10000
Number of Maps = 16
Samples per Map = 10000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15
Starting Job
15/07/13 17:02:31 INFO impl.TimelineClientImpl: Timeline service address: http:/ s/v1/timeline/
15/07/13 17:02:31 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to
15/07/13 17:02:31 INFO input.FileInputFormat: Total input paths to process : 16
15/07/13 17:02:31 INFO mapreduce.JobSubmitter: number of splits:16
15/07/13 17:02:31 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_14
15/07/13 17:02:32 INFO impl.YarnClientImpl: Submitted application application_14
15/07/13 17:02:32 INFO mapreduce.Job: The url to track the job: http://piripiri3 cation_1436821014431_0003/
15/07/13 17:02:32 INFO mapreduce.Job: Running job: job_1436821014431_0003
15/07/13 17:05:50 INFO mapreduce.Job: Job job_1436821014431_0003 running in uber mode : false
15/07/13 17:05:50 INFO mapreduce.Job: map 0% reduce 0%
15/07/13 17:05:56 INFO mapreduce.Job: map 6% reduce 0%
15/07/13 17:06:00 INFO mapreduce.Job: map 13% reduce 0%
15/07/13 17:06:01 INFO mapreduce.Job: map 38% reduce 0%
15/07/13 17:06:05 INFO mapreduce.Job: map 44% reduce 0%
15/07/13 17:06:07 INFO mapreduce.Job: map 63% reduce 0%
15/07/13 17:06:09 INFO mapreduce.Job: map 69% reduce 0%
15/07/13 17:06:11 INFO mapreduce.Job: map 75% reduce 0%
15/07/13 17:06:12 INFO mapreduce.Job: map 81% reduce 0%
15/07/13 17:06:13 INFO mapreduce.Job: map 81% reduce 25%
15/07/13 17:06:14 INFO mapreduce.Job: map 94% reduce 25%
15/07/13 17:06:16 INFO mapreduce.Job: map 100% reduce 31%
15/07/13 17:06:17 INFO mapreduce.Job: map 100% reduce 100%
15/07/13 17:06:17 INFO mapreduce.Job: Job job_1436821014431_0003 completed successfully
15/07/13 17:06:17 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=358
                FILE: Number of bytes written=2249017
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=4198
                HDFS: Number of bytes written=215
                HDFS: Number of read operations=67
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters
                Launched map tasks=16
                Launched reduce tasks=1
                Data-local map tasks=16
                Total time spent by all maps in occupied slots (ms)=160498
                Total time spent by all reduces in occupied slots (ms)=27302
                Total time spent by all map tasks (ms)=80249
                Total time spent by all reduce tasks (ms)=13651
                Total vcore-seconds taken by all map tasks=80249
                Total vcore-seconds taken by all reduce tasks=13651
                Total megabyte-seconds taken by all map tasks=246524928
                Total megabyte-seconds taken by all reduce tasks=41935872
        Map-Reduce Framework
                Map input records=16
                Map output records=32
                Map output bytes=288
                Map output materialized bytes=448
                Input split bytes=2310
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=448
                Reduce input records=32
                Reduce output records=0
                Spilled Records=64
                Shuffled Maps =16
                Failed Shuffles=0
                Merged Map outputs=16
                GC time elapsed (ms)=1501
                CPU time spent (ms)=13670
                Physical memory (bytes) snapshot=13480296448
                Virtual memory (bytes) snapshot=72598511616
                Total committed heap usage (bytes)=12508463104
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=1888
        File Output Format Counters
                Bytes Written=97
Job Finished in 226.813 seconds
Estimated value of Pi is 3.14127500000000000000
{code}

However, after enabling Kerberos, the job fails:

{code}
tdatuser@piripiri1:/root> kinit -kt /etc/security/keytabs/tdatuser.headless.keytab tdatuser
tdatuser@piripiri1:/root> yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.*.jar pi 16 10000
Number of Maps = 16
Samples per Map = 10000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15
Starting Job
15/07/13 17:27:05 INFO impl.TimelineClientImpl: Timeline service address: http://piripiri1.labs.teradata.com:8188/ws/v1/timeline/
15/07/13 17:27:05 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 140 for tdatuser on ha-hdfs:PIRIPIRI
15/07/13 17:27:05 INFO security.TokenCache: Got dt for hdfs://PIRIPIRI; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:PIRIPIRI, Ident: (HDFS_DELEGATION_TOKEN token 140 for tdatuser)
15/07/13 17:27:06 INFO input.FileInputFormat: Total input paths to process : 16
15/07/13 17:27:06 INFO mapreduce.JobSubmitter: number of splits:16
15/07/13 17:27:06 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1436822321287_0007
15/07/13 17:27:06 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:PIRIPIRI, Ident: (HDFS_DELEGATION_TOKEN token 140 for tdatuser)
15/07/13 17:27:06 INFO impl.YarnClientImpl: Submitted application application_1436822321287_0007
15/07/13 17:27:06 INFO mapreduce.Job: The url to track the job: http://piripiri2.labs.teradata.com:8088/proxy/application_1436822321287_0007/
15/07/13 17:27:06 INFO mapreduce.Job: Running job: job_1436822321287_0007
15/07/13 17:27:09 INFO mapreduce.Job: Job job_1436822321287_0007 running in uber mode : false
15/07/13 17:27:09 INFO mapreduce.Job: map 0% reduce 0%
15/07/13 17:27:09 INFO mapreduce.Job: Job job_1436822321287_0007 failed with state FAILED due to: Application application_1436822321287_0007 failed 2 times due to AM Container for appattempt_1436822321287_0007_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://piripiri2.labs.teradata.com:8088/cluster/app/application_1436822321287_0007Then, click on links to logs of each attempt.
Diagnostics: Application application_1436822321287_0007 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is tdatuser
main : requested yarn user is tdatuser
Can't create directory /data1/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data2/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data3/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data4/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data5/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data6/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data7/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data8/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data9/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data10/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data11/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data12/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Did not create any app directories
Failing this attempt. Failing the application.
15/07/13 17:27:09 INFO mapreduce.Job: Counters: 0
Job Finished in 4.748 seconds
java.io.FileNotFoundException: File does not exist: hdfs://PIRIPIRI/user/tdatuser/QuasiMonteCarlo_1436822823095_2120947622/out/reduce-out
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1752)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1776)
        at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
        at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}

As seen above, there are many "Can't create directory ... Permission denied" errors related to the local usercache directory for 'tdatuser'.

Prior to enabling Kerberos, the contents of a usercache directory were as follows:

{code}
piripiri4:~ # ls -l /data1/hadoop/yarn/local/usercache/
total 0
drwxr-xr-x 3 yarn hadoop 21 Jul 13 16:59 ambari-qa
drwxr-x--- 4 yarn hadoop 37 Jul 13 17:00 tdatuser
{code}

After enabling Kerberos the contents are:

{code}
piripiri4:~ # ls -l /data1/hadoop/yarn/local/usercache/
total 0
drwxr-s--- 4 ambari-qa hadoop 37 Jul 13 17:21 ambari-qa
drwxr-x--- 4 yarn hadoop 37 Jul 13 17:00 tdatuser
{code}

It appears that the ownership of the usercache directory for the 'ambari-qa' user was updated, but the 'tdatuser' directory was not. Is this expected behavior, and is there a recommended work-around for this issue?
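On the work-around question, one hedged sketch (an assumption on my part, not a verified or official fix): the errors are consistent with usercache/tdatuser directories created before Kerberos was enabled and still owned by 'yarn', which the secure container launch then cannot write into; removing the stale per-user cache on each NodeManager would let it be recreated with the correct owner on the next container launch. The paths below are illustrative stand-ins for the real /data1 ... /data12 local dirs, and the mkdir line only fabricates the stale layout so the sketch runs end to end; on a real node the NodeManager should be stopped first.

{code}
# Sketch of a possible workaround; LOCAL_DIRS and STALE_USER are
# illustrative stand-ins, not the cluster's real configuration.
LOCAL_DIRS="/tmp/demo-yarn-local"   # real nodes: /data1/hadoop/yarn/local ... /data12/hadoop/yarn/local
STALE_USER="tdatuser"

# Simulate the stale pre-Kerberos layout so this sketch is runnable.
mkdir -p "$LOCAL_DIRS/usercache/$STALE_USER/appcache"

# On a real node: stop the NodeManager, then remove the stale per-user
# cache so it is recreated with the correct ownership on the next job.
for d in $LOCAL_DIRS; do
  rm -rf "$d/usercache/$STALE_USER"
done
{code}

If this is the intended behavior, it would be good to know whether such a cleanup is the recommended procedure or whether the directories are supposed to be re-owned automatically.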
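The manual 'ls -l' comparison of the usercache directories shown above can also be automated across all of a node's local dirs. The loop below is only a sketch under stated assumptions: it uses a demo path in place of /data1 ... /data12/hadoop/yarn/local, fabricates two per-user entries so it has something to print, and relies on GNU coreutils 'stat -c %U'; on a real node, any entry whose owner does not match the directory name (e.g. still 'yarn') would be the suspicious one.

{code}
# Sketch of a per-node ownership check; LOCAL_DIRS is an illustrative
# stand-in for yarn.nodemanager.local-dirs.
LOCAL_DIRS="/tmp/demo-yarn-check"
mkdir -p "$LOCAL_DIRS/usercache/tdatuser" "$LOCAL_DIRS/usercache/ambari-qa"

# Print "<owner> <path>" for every per-user cache directory.
for u in "$LOCAL_DIRS"/usercache/*/; do
  printf '%s %s\n' "$(stat -c '%U' "$u")" "$u"
done
{code}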
> Permission denied errors for local usercache directories when attempting to
> run MapReduce job on Kerberos enabled cluster
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3921
>                 URL: https://issues.apache.org/jira/browse/YARN-3921
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>         Environment: sles11sp3
>            Reporter: Zack Marsh
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)