Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/601#discussion_r12173906
  
    --- Diff: docs/running-on-yarn.md ---
    @@ -12,12 +12,14 @@ was added to Spark in version 0.6.0, and improved in 
0.7.0 and 0.8.0.
     We need a consolidated Spark JAR (which bundles all the required 
dependencies) to run Spark jobs on a YARN cluster.
    --- End diff --
    
    This section is actually redundant with the "Building with Maven" section 
about Hadoop versions. Maybe it would make sense to have just a short 
introduction here explaining that (i) you need a version of Spark that is 
specially compiled with YARN support, and (ii) if you don't have one, the 
Maven build docs explain how to make one.
    
    I think right now, if users go to this, the first thing they'll think is 
that they have to go build Spark. But actually, in almost all cases they can 
just download the pre-built YARN binary and be done with it. I think the first 
draft of this document dates from before we packaged a binary with YARN.
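    For reference, the two paths such an introduction could point at might 
look roughly like this (the Maven profile and `hadoop.version` value below 
are illustrative assumptions, not taken from this thread; the exact flags 
for a given Hadoop distribution live in `docs/building-with-maven.md`):

    ```shell
    # Option 1 (most users): download a pre-built binary that already
    # bundles YARN support from the Spark downloads page -- no build needed.

    # Option 2: build from source with YARN support via Maven.
    # The -Pyarn profile and the hadoop.version shown here are assumptions
    # for illustration; match them to your cluster's Hadoop version.
    mvn -Pyarn -Dhadoop.version=2.2.0 -DskipTests clean package
    ```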

