[ 
https://issues.apache.org/jira/browse/HDDS-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16646036#comment-16646036
 ] 

Soumitra Sulav commented on HDDS-584:
-------------------------------------

If we clear the YARN staging directory, i.e. /tmp/hadoop-yarn/staging, before 
running any job, the job runs on the first submit. The reason is that the 
ownership check is performed only if that directory already exists.

Also, in the above scenario, no two users can run jobs simultaneously.

Skipping the ownership check would require a code change on the 
MapReduce/YARN side.
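
To illustrate why only the first submit passes, here is a minimal sketch of 
the behaviour described above. It is a toy model, not the Hadoop source: the 
class StagingOwnershipSketch, the ToyFileSystem type, and the canSubmit helper 
are hypothetical names, and the check is reduced to a plain comparison against 
the submitter's name. The toy file system records an empty owner when asked 
not to track owners, which mimics the ownerless listing from OzoneFS.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of the staging-directory ownership check. */
final class StagingOwnershipSketch {

    /** Toy file system: maps a directory path to its recorded owner. */
    static final class ToyFileSystem {
        private final Map<String, String> owners = new HashMap<>();
        private final boolean recordsOwner;

        ToyFileSystem(boolean recordsOwner) {
            this.recordsOwner = recordsOwner;
        }

        boolean exists(String path) {
            return owners.containsKey(path);
        }

        String ownerOf(String path) {
            return owners.get(path);
        }

        /** Creates the directory; stores an empty owner if owners are not tracked. */
        void mkdirs(String path, String user) {
            owners.put(path, recordsOwner ? user : "");
        }

        void delete(String path) {
            owners.remove(path);
        }
    }

    /** Checks ownership only when the directory already exists, else creates it. */
    static String getStagingDir(ToyFileSystem fs, String path, String submitter)
            throws IOException {
        if (fs.exists(path)) {
            String owner = fs.ownerOf(path);
            if (!owner.equals(submitter)) {
                throw new IOException("The ownership on the staging directory " + path
                        + " is not as expected. It is owned by " + owner
                        + ". The directory must be owned by the submitter " + submitter);
            }
        } else {
            // No existing directory: nothing to validate, so the submit goes through.
            fs.mkdirs(path, submitter);
        }
        return path;
    }

    /** Returns true if the simulated submit passes the staging-directory step. */
    static boolean canSubmit(ToyFileSystem fs, String path, String submitter) {
        try {
            getStagingDir(fs, path, submitter);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String dir = "/tmp/hadoop-yarn/staging/hdfs/.staging";
        ToyFileSystem ownerless = new ToyFileSystem(false);

        System.out.println(canSubmit(ownerless, dir, "hdfs")); // true: directory absent
        System.out.println(canSubmit(ownerless, dir, "hdfs")); // false: owner is empty
        ownerless.delete(dir);                                 // clear the staging dir
        System.out.println(canSubmit(ownerless, dir, "hdfs")); // true again
    }
}
```

In this model, a file system that does record the owner passes on every 
submit, which is consistent with the failure coming from the missing owner 
rather than from the check itself.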

 

> OzoneFS with HDP failing to run YARN jobs
> -----------------------------------------
>
>                 Key: HDDS-584
>                 URL: https://issues.apache.org/jira/browse/HDDS-584
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Filesystem
>    Affects Versions: 0.3.0
>         Environment: OS - RHEL7.3
> Openstack based VMs : 3 Node HDP, 3 Node Ozone
>            Reporter: Soumitra Sulav
>            Priority: Major
>
> YARN jobs are failing on OzoneFS with the exception below:
> {code:java}
> java.io.IOException: The ownership on the staging directory 
> /tmp/hadoop-yarn/staging/hdfs/.staging is not as expected. It is owned by . 
> The directory must be owned by the submitter hdfs or hdfs
> at 
> org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:152)
> at 
> org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:113)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:151)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
> at org.apache.hadoop.examples.WordCount.main(WordCount.java:87)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
> at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
> at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
> {code}
> The example job was run with the command below, as both user root and user hdfs:
> {code:java}
> hadoop jar 
> /usr/hdp/3.0.0.0-1634/hadoop-mapreduce/hadoop-mapreduce-examples.jar 
> wordcount /hosts /tmp/hosts
> {code}
> The YARN/MR job checks the ownership of the user's staging 
> directory, and if it does not match the user who is submitting the job, 
> it throws the above exception.
> The ownership check happens in the file below:
> [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmissionFiles.java#L144]
> In HDDS/OzoneFS the staging area is created as expected, but with no owner:
> {code:java}
> [root@hcatest-4 ~]# hdfs dfs -ls -R /tmp
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.0.0.0-1634/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 18/10/08 10:11:11 INFO conf.Configuration: Removed undeclared tags:
> drwxrwxrwx - 0 2018-10-04 11:20 /tmp/entity-file-history
> drwxrwxrwx - 0 2018-10-04 11:20 /tmp/entity-file-history/active
> drwxrwxrwx - 0 2018-10-05 08:55 /tmp/hadoop-yarn
> drwxrwxrwx - 0 2018-10-05 08:55 /tmp/hadoop-yarn/staging
> drwxrwxrwx - 0 2018-10-05 11:56 /tmp/hadoop-yarn/staging/hdfs
> drwxrwxrwx - 0 2018-10-05 11:56 /tmp/hadoop-yarn/staging/hdfs/.staging
> drwxrwxrwx - 0 2018-10-05 11:56 
> /tmp/hadoop-yarn/staging/hdfs/.staging/job_1538654387547_0002
> -rw-rw-rw- 1 316239 2018-10-05 11:56 
> /tmp/hadoop-yarn/staging/hdfs/.staging/job_1538654387547_0002/job.jar
> -rw-rw-rw- 1 104 2018-10-05 11:56 
> /tmp/hadoop-yarn/staging/hdfs/.staging/job_1538654387547_0002/job.split
> -rw-rw-rw- 1 23 2018-10-05 11:56 
> /tmp/hadoop-yarn/staging/hdfs/.staging/job_1538654387547_0002/job.splitmetainfo
> -rw-rw-rw- 1 213088 2018-10-05 11:56 
> /tmp/hadoop-yarn/staging/hdfs/.staging/job_1538654387547_0002/job.xml
> drwxrwxrwx - 0 2018-10-05 08:55 /tmp/hadoop-yarn/staging/root
> drwxrwxrwx - 0 2018-10-05 08:55 /tmp/hadoop-yarn/staging/root/.staging
> drwxrwxrwx - 0 2018-10-05 08:55 
> /tmp/hadoop-yarn/staging/root/.staging/job_1538654387547_0001
> -rw-rw-rw- 1 316239 2018-10-05 08:55 
> /tmp/hadoop-yarn/staging/root/.staging/job_1538654387547_0001/job.jar
> -rw-rw-rw- 1 104 2018-10-05 08:55 
> /tmp/hadoop-yarn/staging/root/.staging/job_1538654387547_0001/job.split
> -rw-rw-rw- 1 23 2018-10-05 08:55 
> /tmp/hadoop-yarn/staging/root/.staging/job_1538654387547_0001/job.splitmetainfo
> -rw-rw-rw- 1 213679 2018-10-05 08:55 
> /tmp/hadoop-yarn/staging/root/.staging/job_1538654387547_0001/job.xml
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
