Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/303#issuecomment-39998654
  
    The issue with yarn-cluster is the following: SparkPi.scala uses 
SparkContext.jarOfClass() to determine which jar to add to the SparkContext. 
This ends up adding the jar's path without the "local:" prefix, which means the 
jar is expected to be in the distributed cache (as per the comment in 
SparkContext: "In order for this to work in yarn-cluster mode the user must 
specify the --addjars option").
    
    If you add "--addJars 
/home/tgraves/test2/tgravescs-spark/examples/target/scala-2.10/spark-examples_2.10-assembly-1.0.0-SNAPSHOT.jar"
 to your command line it works (well, it works for me), but that sort of 
defeats the purpose of using local: URIs. I had a modified SparkPi in my tree 
that hardcoded a local: URI as the addJar() argument, and that worked fine 
without the extra argument and without incurring extra copying of the jar.
    
    I'm not sure there's an easy way to fix this (how could SparkPi know to 
add the jar with a local: URI without some kind of command line argument 
telling it to do so?), but since it's caused by the client code (in this case, 
SparkPi), I'm less concerned.
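    Such a command-line switch could look roughly like this (purely a 
hypothetical sketch; the flag name and paths are invented, and this is not a 
proposal for the actual SparkPi):

```scala
object JarArgDemo {
  // Hypothetical: build the URI to pass to addJar() based on a flag the
  // example could accept, so the same code runs with shipped jars or with
  // jars pre-installed on every node.
  def chooseJarUri(jarPath: String, useLocal: Boolean): String =
    if (useLocal) "local:" + jarPath else jarPath

  def main(args: Array[String]): Unit = {
    val useLocal = args.contains("--local-jar") // invented flag name
    println(chooseJarUri("/opt/spark/spark-examples.jar", useLocal))
  }
}
```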

