[spark] branch master updated: [SPARK-35562][DOC] Fix docs about Kubernetes and Yarn

dongjoon Mon, 31 May 2021 02:45:11 -0700

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 8c69e9c  [SPARK-35562][DOC] Fix docs about Kubernetes and Yarn
8c69e9c is described below

commit 8c69e9cd944415aaf301163128306f678e5bbcca
Author: Shiqi Sun <[email protected]>
AuthorDate: Mon May 31 02:43:58 2021 -0700

    [SPARK-35562][DOC] Fix docs about Kubernetes and Yarn
    
    Fixed some places in cluster-overview that are obsolete (i.e. not mentioning Kubernetes), and also fixed the Yarn spark-submit sample command in submitting-applications.
    
    ### What changes were proposed in this pull request?
    
    This fixes the docs in "Cluster Overview" and "Submitting Applications" in places where Kubernetes is missing (mostly because obsolete docs were never updated) and where the sample spark-submit command for YARN is written incorrectly.
    
    ### Why are the changes needed?
    
    To help Spark users who use Kubernetes as their cluster manager get a correct picture when reading the "Cluster Overview" doc page. Also to make the sample spark-submit command for YARN in the "Submitting Applications" doc page actually runnable, by removing the invalid comment after the line continuation char `\`.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No
    
    ### How was this patch tested?
    
    No tests, as this is a doc fix.
    
    Closes #32701 from huskysun/doc-fix.
    
    Authored-by: Shiqi Sun <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 docs/cluster-overview.md        | 6 +++---
 docs/submitting-applications.md | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/cluster-overview.md b/docs/cluster-overview.md
index 69ae1b6..403f0bb 100644
--- a/docs/cluster-overview.md
+++ b/docs/cluster-overview.md
@@ -28,7 +28,7 @@ Spark applications run as independent sets of processes on a cluster, coordinate
 object in your main program (called the _driver program_).
 
 Specifically, to run on a cluster, the SparkContext can connect to several types of _cluster managers_
-(either Spark's own standalone cluster manager, Mesos or YARN), which allocate resources across
+(either Spark's own standalone cluster manager, Mesos, YARN or Kubernetes), which allocate resources across
 applications. Once connected, Spark acquires *executors* on nodes in the cluster, which are
 processes that run computations and store data for your application.
 Next, it sends your application code (defined by JAR or Python files passed to SparkContext) to
@@ -48,7 +48,7 @@ There are several useful things to note about this architecture:
    writing it to an external storage system.
 2. Spark is agnostic to the underlying cluster manager. As long as it can acquire executor
    processes, and these communicate with each other, it is relatively easy to run it even on a
-   cluster manager that also supports other applications (e.g. Mesos/YARN).
+   cluster manager that also supports other applications (e.g. Mesos/YARN/Kubernetes).
 3. The driver program must listen for and accept incoming connections from its executors throughout
    its lifetime (e.g., see [spark.driver.port in the network config
    section](configuration.html#networking)). As such, the driver program must be network
@@ -117,7 +117,7 @@ The following table summarizes terms you'll see used to refer to cluster concept
     </tr>
     <tr>
       <td>Cluster manager</td>
-      <td>An external service for acquiring resources on the cluster (e.g. standalone manager, Mesos, YARN)</td>
+      <td>An external service for acquiring resources on the cluster (e.g. standalone manager, Mesos, YARN, Kubernetes)</td>
     </tr>
     <tr>
       <td>Deploy mode</td>
diff --git a/docs/submitting-applications.md b/docs/submitting-applications.md
index 831b4f8..0319859 100644
--- a/docs/submitting-applications.md
+++ b/docs/submitting-applications.md
@@ -114,12 +114,12 @@ run it with `--help`. Here are a few examples of common options:
   /path/to/examples.jar \
   1000
 
-# Run on a YARN cluster
+# Run on a YARN cluster in cluster deploy mode
 export HADOOP_CONF_DIR=XXX
 ./bin/spark-submit \
   --class org.apache.spark.examples.SparkPi \
   --master yarn \
-  --deploy-mode cluster \  # can be client for client mode
+  --deploy-mode cluster \
   --executor-memory 20G \
   --num-executors 50 \
   /path/to/examples.jar \
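
Note on the YARN hunk above: the removed comment broke the command because a backslash only continues a line when it is the last character on that line. The sketch below illustrates this in plain POSIX shell. `count_args` is a hypothetical stand-in for `./bin/spark-submit` (it only prints how many arguments it received), and the option values are copied from the doc sample purely for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for spark-submit: prints the number of arguments received.
count_args() {
  echo "$#"
}

# Correct form: each backslash is immediately followed by the newline,
# so all options reach a single command.
good=$(count_args \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 20G)

# Broken form, kept in a string so this script itself stays runnable.
# Here "\ " escapes a space instead of the newline, and "#" starts a comment,
# so the command ends on that line with an extra " " argument. The shell then
# tries to run "--executor-memory" as a separate command.
broken_script='count_args --master yarn \
  --deploy-mode cluster \  # can be client for client mode
  --executor-memory 20G'

broken_out=$(eval "$broken_script" 2>/dev/null)
broken_status=$?

echo "good form, argument count:   $good"           # 6
echo "broken form, argument count: $broken_out"     # 5
echo "broken form, exit status:    $broken_status"  # 127 (command not found)
```

With the comment in place, the options after `--deploy-mode cluster` never reach the first command, which is why the fix moves the deploy-mode hint into the comment line above the command instead.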

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
