Sathishkumar Manimoorthy created HADOOP-14493:
-------------------------------------------------

             Summary: YARN distributed shell application fails, when RM failed over or Restarts
                 Key: HADOOP-14493
                 URL: https://issues.apache.org/jira/browse/HADOOP-14493
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 2.7.0
            Reporter: Sathishkumar Manimoorthy
            Priority: Minor


The YARN distributed shell application fails when an RM failover or RM restart occurs.

Exception trace:

17/05/30 11:57:38 DEBUG security.UserGroupInformation: PrivilegedAction as:mapr (auth:SIMPLE) from:org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.renameScriptFile(ApplicationMaster.java:1032)
17/05/30 11:57:38 DEBUG security.UserGroupInformation: PrivilegedActionException as:mapr (auth:SIMPLE) cause:java.io.IOException: Invalid source or target
17/05/30 11:57:38 ERROR distributedshell.ApplicationMaster: Not able to add suffix (.bat/.sh) to the shell script filename
java.io.IOException: Invalid source or target
        at com.mapr.fs.MapRFileSystem.rename(MapRFileSystem.java:1132)
        at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$2.run(ApplicationMaster.java:1036)
        at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$2.run(ApplicationMaster.java:1032)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
        at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.renameScriptFile(ApplicationMaster.java:1032)
        at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.access$1400(ApplicationMaster.java:167)
        at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.run(ApplicationMaster.java:953)
        at java.lang.Thread.run(Thread.java:748)

The DS application tries to launch an additional container and fails to 
rename the path Execscript.sh, because the previous containers already 
renamed it in the filesystem path.
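
One possible way to make the rename tolerant of this situation is to treat 
"target already exists and source is gone" as success rather than as an error. 
The sketch below is only an illustration of that idea, not the actual 
ApplicationMaster code or a proposed patch: it uses java.nio.file on a local 
temporary directory instead of the Hadoop FileSystem API so that it runs on 
its own, and the class name IdempotentRename, the method renameIfNeeded, and 
the file names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class IdempotentRename {

    /**
     * Renames src to dst unless an earlier attempt already did so.
     * Returns true if this call performed the rename, and false if dst
     * was already in place and src no longer exists.
     * (Hypothetical helper; local-filesystem stand-in for the rename
     * that fails in the trace above.)
     */
    static boolean renameIfNeeded(Path src, Path dst) throws IOException {
        if (Files.exists(dst) && !Files.exists(src)) {
            // A previous container launch already added the suffix.
            return false;
        }
        // Throws if src is missing or dst unexpectedly exists.
        Files.move(src, dst);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("ds-demo");
        Path src = dir.resolve("Execscript");      // assumed name before the suffix is added
        Path dst = dir.resolve("Execscript.sh");
        Files.createFile(src);

        System.out.println(renameIfNeeded(src, dst)); // prints "true": first launch renames
        System.out.println(renameIfNeeded(src, dst)); // prints "false": later launch skips
    }
}
```

Note that the exists-then-move check is not atomic. If several container 
launch threads can reach this point at the same time, the check would also 
need synchronization or handling of the lost-race exception.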

I will upload the logs and path details soon.



