dbtsai commented on a change in pull request #28788:
URL: https://github.com/apache/spark/pull/28788#discussion_r440618141



##########
File path: docs/running-on-yarn.md
##########
@@ -82,6 +82,19 @@ In `cluster` mode, the driver runs on a different machine than the client, so `S
 
 Running Spark on YARN requires a binary distribution of Spark which is built with YARN support.
 Binary distributions can be downloaded from the [downloads page](https://spark.apache.org/downloads.html) of the project website.
+There are two variants of Spark binary distributions you can download. One is pre-built with a certain
+version of Apache Hadoop; this Spark distribution contains built-in Hadoop runtime, so we call it `with-hadoop` Spark
+distribution. The other one is pre-built with user-provided Hadoop; since this Spark distribution
+doesn't contain a built-in Hadoop runtime, it's smaller, but users have to provide a Hadoop installation separately.
+We call this variant `no-hadoop` Spark distribution. For `with-hadoop` Spark distribution, since
+it contains a built-in Hadoop runtime already, by default, when a job is submitted to Hadoop Yarn cluster, to prevent jar conflict, it will not
+populate Yarn's classpath into Spark. To override this behavior, you can set <code>spark.yarn.populateHadoopClasspath=true</code>.
+For `no-hadoop` Spark distribution, Spark will populate Yarn's classpath by default in order to get Hadoop runtime. Note that some features such
+as Hive support are not available in `no-hadoop` Spark distribution. For `with-hadoop` Spark distribution,

Review comment:
       Maybe I'm wrong, but I got the impression from @dongjoon-hyun that the no-hadoop Spark distribution doesn't support Hive.
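
For reference, the override described in the hunk above could be passed at submit time. This is only a sketch: `spark_submit_args` and `app.jar` are hypothetical names used for illustration, and the only setting taken from the diff is `spark.yarn.populateHadoopClasspath`, supplied here through `spark-submit`'s standard `--conf` flag.

```python
from typing import List, Optional


def spark_submit_args(app_jar: str,
                      populate_hadoop_classpath: Optional[bool] = None) -> List[str]:
    """Build a spark-submit argument list for a YARN cluster-mode job.

    Hypothetical helper, not part of Spark. When populate_hadoop_classpath
    is None, no --conf is added and the distribution's default applies.
    """
    args = ["spark-submit", "--master", "yarn", "--deploy-mode", "cluster"]
    if populate_hadoop_classpath is not None:
        value = "true" if populate_hadoop_classpath else "false"
        # Setting name comes from the doc change under review.
        args += ["--conf", f"spark.yarn.populateHadoopClasspath={value}"]
    args.append(app_jar)  # app_jar is a placeholder path
    return args


if __name__ == "__main__":
    # Explicitly ask a with-hadoop build to populate YARN's classpath.
    print(" ".join(spark_submit_args("app.jar", populate_hadoop_classpath=True)))
```

Leaving the argument unset keeps whichever default the downloaded distribution ships with, which is the case the doc text describes for both variants.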




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


