Marcelo Vanzin created SPARK-19277:
--------------------------------------

             Summary: YARN topology script configuration needs to be localized by Spark
                 Key: SPARK-19277
                 URL: https://issues.apache.org/jira/browse/SPARK-19277
             Project: Spark
          Issue Type: Bug
          Components: YARN
    Affects Versions: 2.1.0
            Reporter: Marcelo Vanzin
            Priority: Minor


(This really affects multiple versions, not just 2.1.0.)

YARN has a configuration, {{net.topology.script.file.name}}, that names a 
script to be run to figure out the cluster topology (which hosts are on which 
racks, etc.). That configuration is generally a hardcoded path that is not 
parameterized, so when Spark runs the driver or AM on a host where the script 
is not installed, the path may not exist, and the log gets spammed with errors 
like this:

{noformat}
java.io.IOException: Cannot run program "/path/to/script" (in directory "/container/working/dir"): error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:548)
        at org.apache.hadoop.util.Shell.run(Shell.java:504)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786)
        at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.runResolveCommand(ScriptBasedMapping.java:251)
        at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.resolve(ScriptBasedMapping.java:188)
        at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:119)
        at org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:101)
        at org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:95)
        at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.resolveRacks(AMRMClientImpl.java:548)
        at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.addContainerRequest(AMRMClientImpl.java:410)
        at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$updateResourceRequests$4.apply(YarnAllocator.scala:283)
        at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$updateResourceRequests$4.apply(YarnAllocator.scala:281)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at org.apache.spark.deploy.yarn.YarnAllocator.updateResourceRequests(YarnAllocator.scala:281)
        at org.apache.spark.deploy.yarn.YarnAllocator.allocateResources(YarnAllocator.scala:220)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:368)
Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:186)
        at java.lang.ProcessImpl.start(ProcessImpl.java:130)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
{noformat}

Normally that error doesn't cause problems; at worst, task locality will be 
off because rack information is not available. But it's noisy, and if it 
happens often enough, it may slow down the YarnAllocator and cause other 
issues.
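
For illustration only, here is a minimal sketch of a defensive check, which is not the localization fix the summary asks for. It models the configuration as a plain {{Map}} rather than a Hadoop {{Configuration}}, and the class and method names ({{TopologyScriptCheck}}, {{dropMissingTopologyScript}}) are hypothetical. The assumption is that dropping the setting when the script is missing on the local host is acceptable, since, as far as I know, Hadoop's script-based mapping then reports the default rack instead of trying to run the script.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Spark or Hadoop.
public class TopologyScriptCheck {
    // Property name taken from the report above.
    static final String TOPOLOGY_SCRIPT_KEY = "net.topology.script.file.name";

    /**
     * Returns a copy of conf in which the topology script setting is removed
     * when the configured path is not an executable regular file on this host.
     * The input map is left unmodified.
     */
    static Map<String, String> dropMissingTopologyScript(Map<String, String> conf) {
        Map<String, String> result = new HashMap<>(conf);
        String script = result.get(TOPOLOGY_SCRIPT_KEY);
        if (script != null && !script.trim().isEmpty()) {
            Path path = Paths.get(script.trim());
            if (!Files.isRegularFile(path) || !Files.isExecutable(path)) {
                // Assumption: running without rack information is preferable
                // to failing on every rack lookup.
                result.remove(TOPOLOGY_SCRIPT_KEY);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TOPOLOGY_SCRIPT_KEY, "/path/to/script");
        Map<String, String> checked = dropMissingTopologyScript(conf);
        System.out.println(checked.containsKey(TOPOLOGY_SCRIPT_KEY)
            ? "topology script found, keeping setting"
            : "topology script missing on this host, setting dropped");
    }
}
```

A check like this would have to run on the host where the driver or AM starts, before any rack lookup happens; it trades rack awareness for a quiet log, which is why shipping the script with the application would still be the better fix.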


