maropu commented on a change in pull request #31598:
URL: https://github.com/apache/spark/pull/31598#discussion_r584685902
##########
File path: docs/configuration.md
##########
@@ -114,12 +114,52 @@ in the `spark-defaults.conf` file. A few configuration keys have been renamed si
versions of Spark; in such cases, the older key names are still accepted, but take lower
precedence than any instance of the newer key.
-Spark properties mainly can be divided into two kinds: one is related to deploy, like
-"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when
-setting programmatically through `SparkConf` in runtime, or the behavior is depending on which
-cluster manager and deploy mode you choose, so it would be suggested to set through configuration
-file or `spark-submit` command line options; another is mainly related to Spark runtime control,
-like "spark.task.maxFailures", this kind of properties can be set in either way.
+Note that Spark properties have different effective timing and they can be divided into two kinds:
+<table class="table">
+<tr><th>Configuration Type</th><th>Meaning</th><th>Examples</th></tr>
+<tr>
+ <td><code>Launching Driver</code></td>
+ <td>
+    Configuration used to submit an application, such as <code>spark.driver.memory</code>, <code>spark.driver.extraClassPath</code>, these kind of properties only effect before driver's JVM is started, so it would be suggested to set through configuration file or <code>spark-submit</code> command line options.
+ </td>
+ <td>
+ The following is a list of such configurations:
+ <ul>
+ <li><code>spark.driver.memory</code></li>
+ <li><code>spark.driver.memoryOverhead</code></li>
+ <li><code>spark.driver.cores</code></li>
+ <li><code>spark.driver.userClassPathFirst</code></li>
+ <li><code>spark.driver.extraClassPath</code></li>
+ <li><code>spark.driver.defaultJavaOptions</code></li>
+ <li><code>spark.driver.extraJavaOptions</code></li>
+ <li><code>spark.driver.extraLibraryPath</code></li>
+ <li><code>spark.driver.resource.*</code></li>
+ <li><code>spark.pyspark.driver.python</code></li>
+ <li><code>spark.pyspark.python</code></li>
+ <li><code>spark.r.shell.command</code></li>
+ <li><code>spark.launcher.childProcLoggerName</code></li>
+ <li><code>spark.launcher.childConnectionTimeout</code></li>
+ <li><code>spark.yarn.driver.*</code></li>
+ </ul>
+ </td>
+</tr>
+<tr>
+ <td><code>Application Deployment</code></td>
Review comment:
`Application Deployment` -> `Deploying Application`, to match `Launching Driver`?
##########
File path: docs/configuration.md
##########
@@ -114,12 +114,52 @@ in the `spark-defaults.conf` file. A few configuration keys have been renamed si
versions of Spark; in such cases, the older key names are still accepted, but take lower
precedence than any instance of the newer key.
-Spark properties mainly can be divided into two kinds: one is related to deploy, like
-"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when
-setting programmatically through `SparkConf` in runtime, or the behavior is depending on which
-cluster manager and deploy mode you choose, so it would be suggested to set through configuration
-file or `spark-submit` command line options; another is mainly related to Spark runtime control,
-like "spark.task.maxFailures", this kind of properties can be set in either way.
+Note that Spark properties have different effective timing and they can be divided into two kinds:
+<table class="table">
+<tr><th>Configuration Type</th><th>Meaning</th><th>Examples</th></tr>
+<tr>
+ <td><code>Launching Driver</code></td>
+ <td>
+    Configuration used to submit an application, such as <code>spark.driver.memory</code>, <code>spark.driver.extraClassPath</code>, these kind of properties only effect before driver's JVM is started, so it would be suggested to set through configuration file or <code>spark-submit</code> command line options.
Review comment:
`Configuration` -> `Configurations`?
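
For context on the quoted cell, a minimal sketch (the object name and the `4g` value are illustrative, not from the patch) of why such properties cannot be set programmatically: by the time user code builds a `SparkConf`, the driver JVM already exists with a fixed heap, so a launch-time property set there cannot take effect.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DriverMemoryExample {
  def main(args: Array[String]): Unit = {
    // This code already runs inside the driver JVM, whose heap size was fixed
    // at launch, so setting spark.driver.memory here cannot change it.
    val conf = new SparkConf()
      .setAppName("DriverMemoryExample")
      .set("spark.driver.memory", "4g") // too late: the driver JVM is running

    val sc = new SparkContext(conf)
    // ... application logic ...
    sc.stop()
  }
}
```

The working alternatives are a `spark-defaults.conf` entry or `spark-submit --conf spark.driver.memory=4g`, both of which are read before the driver JVM is launched.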
##########
File path: docs/configuration.md
##########
@@ -114,12 +114,52 @@ in the `spark-defaults.conf` file. A few configuration keys have been renamed si
versions of Spark; in such cases, the older key names are still accepted, but take lower
precedence than any instance of the newer key.
-Spark properties mainly can be divided into two kinds: one is related to deploy, like
-"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when
-setting programmatically through `SparkConf` in runtime, or the behavior is depending on which
-cluster manager and deploy mode you choose, so it would be suggested to set through configuration
-file or `spark-submit` command line options; another is mainly related to Spark runtime control,
-like "spark.task.maxFailures", this kind of properties can be set in either way.
+Note that Spark properties have different effective timing and they can be divided into two kinds:
+<table class="table">
+<tr><th>Configuration Type</th><th>Meaning</th><th>Examples</th></tr>
+<tr>
+ <td><code>Launching Driver</code></td>
+ <td>
+    Configuration used to submit an application, such as <code>spark.driver.memory</code>, <code>spark.driver.extraClassPath</code>, these kind of properties only effect before driver's JVM is started, so it would be suggested to set through configuration file or <code>spark-submit</code> command line options.
+ </td>
+ <td>
+ The following is a list of such configurations:
+ <ul>
+ <li><code>spark.driver.memory</code></li>
+ <li><code>spark.driver.memoryOverhead</code></li>
+ <li><code>spark.driver.cores</code></li>
+ <li><code>spark.driver.userClassPathFirst</code></li>
+ <li><code>spark.driver.extraClassPath</code></li>
+ <li><code>spark.driver.defaultJavaOptions</code></li>
+ <li><code>spark.driver.extraJavaOptions</code></li>
+ <li><code>spark.driver.extraLibraryPath</code></li>
+ <li><code>spark.driver.resource.*</code></li>
+ <li><code>spark.pyspark.driver.python</code></li>
+ <li><code>spark.pyspark.python</code></li>
+ <li><code>spark.r.shell.command</code></li>
+ <li><code>spark.launcher.childProcLoggerName</code></li>
+ <li><code>spark.launcher.childConnectionTimeout</code></li>
+ <li><code>spark.yarn.driver.*</code></li>
+ </ul>
+ </td>
+</tr>
+<tr>
+ <td><code>Application Deployment</code></td>
+ <td>
+    Like <code>spark.master</code>, <code>spark.executor.instances</code>, this kind of properties may not be affected when setting programmatically through <code>SparkConf</code> in runtime after SparkContext has been started, or the behavior is depending on which cluster manager and deploy mode you choose, so it would be suggested to set through configuration file, <code>spark-submit</code> command line options, or setting programmatically through <code>SparkConf</code> in runtime before start SparkContext.
Review comment:
`through configuration file` -> `through a configuration file`?
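
For context on the quoted cell, a minimal sketch (the object name and the instance count are illustrative, not from the patch) of the working case it ends with: a deployment-related property set programmatically is honored as long as it is set before the `SparkContext` starts, on cluster managers that honor the property at all (e.g. YARN for `spark.executor.instances`).

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DeploymentConfExample {
  def main(args: Array[String]): Unit = {
    // Effective: no SparkContext has been started yet, so the cluster manager
    // sees this value when executors are first requested.
    val conf = new SparkConf()
      .setAppName("DeploymentConfExample")
      .set("spark.executor.instances", "4")

    val sc = new SparkContext(conf)
    // Setting the same property from here on would not change the executors
    // already requested from the cluster manager.
    sc.stop()
  }
}
```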
##########
File path: docs/configuration.md
##########
@@ -114,12 +114,52 @@ in the `spark-defaults.conf` file. A few configuration keys have been renamed si
versions of Spark; in such cases, the older key names are still accepted, but take lower
precedence than any instance of the newer key.
-Spark properties mainly can be divided into two kinds: one is related to deploy, like
-"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when
-setting programmatically through `SparkConf` in runtime, or the behavior is depending on which
-cluster manager and deploy mode you choose, so it would be suggested to set through configuration
-file or `spark-submit` command line options; another is mainly related to Spark runtime control,
-like "spark.task.maxFailures", this kind of properties can be set in either way.
+Note that Spark properties have different effective timing and they can be divided into two kinds:
+<table class="table">
+<tr><th>Configuration Type</th><th>Meaning</th><th>Examples</th></tr>
+<tr>
+ <td><code>Launching Driver</code></td>
+ <td>
+    Configuration used to submit an application, such as <code>spark.driver.memory</code>, <code>spark.driver.extraClassPath</code>, these kind of properties only effect before driver's JVM is started, so it would be suggested to set through configuration file or <code>spark-submit</code> command line options.
Review comment:
`configuration file` -> `a configuration file`?
##########
File path: docs/configuration.md
##########
@@ -114,12 +114,52 @@ in the `spark-defaults.conf` file. A few configuration keys have been renamed si
versions of Spark; in such cases, the older key names are still accepted, but take lower
precedence than any instance of the newer key.
-Spark properties mainly can be divided into two kinds: one is related to deploy, like
-"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when
-setting programmatically through `SparkConf` in runtime, or the behavior is depending on which
-cluster manager and deploy mode you choose, so it would be suggested to set through configuration
-file or `spark-submit` command line options; another is mainly related to Spark runtime control,
-like "spark.task.maxFailures", this kind of properties can be set in either way.
+Note that Spark properties have different effective timing and they can be divided into two kinds:
+<table class="table">
+<tr><th>Configuration Type</th><th>Meaning</th><th>Examples</th></tr>
+<tr>
+ <td><code>Launching Driver</code></td>
+ <td>
+    Configuration used to submit an application, such as <code>spark.driver.memory</code>, <code>spark.driver.extraClassPath</code>, these kind of properties only effect before driver's JVM is started, so it would be suggested to set through configuration file or <code>spark-submit</code> command line options.
Review comment:
`these kind` -> `these kinds`?
##########
File path: docs/configuration.md
##########
@@ -114,12 +114,52 @@ in the `spark-defaults.conf` file. A few configuration keys have been renamed si
versions of Spark; in such cases, the older key names are still accepted, but take lower
precedence than any instance of the newer key.
-Spark properties mainly can be divided into two kinds: one is related to deploy, like
-"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when
-setting programmatically through `SparkConf` in runtime, or the behavior is depending on which
-cluster manager and deploy mode you choose, so it would be suggested to set through configuration
-file or `spark-submit` command line options; another is mainly related to Spark runtime control,
-like "spark.task.maxFailures", this kind of properties can be set in either way.
+Note that Spark properties have different effective timing and they can be divided into two kinds:
+<table class="table">
+<tr><th>Configuration Type</th><th>Meaning</th><th>Examples</th></tr>
+<tr>
+ <td><code>Launching Driver</code></td>
+ <td>
+    Configuration used to submit an application, such as <code>spark.driver.memory</code>, <code>spark.driver.extraClassPath</code>, these kind of properties only effect before driver's JVM is started, so it would be suggested to set through configuration file or <code>spark-submit</code> command line options.
+ </td>
+ <td>
+ The following is a list of such configurations:
+ <ul>
+ <li><code>spark.driver.memory</code></li>
+ <li><code>spark.driver.memoryOverhead</code></li>
+ <li><code>spark.driver.cores</code></li>
+ <li><code>spark.driver.userClassPathFirst</code></li>
+ <li><code>spark.driver.extraClassPath</code></li>
+ <li><code>spark.driver.defaultJavaOptions</code></li>
+ <li><code>spark.driver.extraJavaOptions</code></li>
+ <li><code>spark.driver.extraLibraryPath</code></li>
+ <li><code>spark.driver.resource.*</code></li>
+ <li><code>spark.pyspark.driver.python</code></li>
+ <li><code>spark.pyspark.python</code></li>
+ <li><code>spark.r.shell.command</code></li>
+ <li><code>spark.launcher.childProcLoggerName</code></li>
+ <li><code>spark.launcher.childConnectionTimeout</code></li>
+ <li><code>spark.yarn.driver.*</code></li>
+ </ul>
+ </td>
+</tr>
+<tr>
+ <td><code>Application Deployment</code></td>
+ <td>
+    Like <code>spark.master</code>, <code>spark.executor.instances</code>, this kind of properties may not be affected when setting programmatically through <code>SparkConf</code> in runtime after SparkContext has been started, or the behavior is depending on which cluster manager and deploy mode you choose, so it would be suggested to set through configuration file, <code>spark-submit</code> command line options, or setting programmatically through <code>SparkConf</code> in runtime before start SparkContext.
Review comment:
`SparkContext` -> `<code>SparkContext</code>`?
##########
File path: docs/configuration.md
##########
@@ -114,12 +114,52 @@ in the `spark-defaults.conf` file. A few configuration keys have been renamed si
versions of Spark; in such cases, the older key names are still accepted, but take lower
precedence than any instance of the newer key.
-Spark properties mainly can be divided into two kinds: one is related to deploy, like
-"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when
-setting programmatically through `SparkConf` in runtime, or the behavior is depending on which
-cluster manager and deploy mode you choose, so it would be suggested to set through configuration
-file or `spark-submit` command line options; another is mainly related to Spark runtime control,
-like "spark.task.maxFailures", this kind of properties can be set in either way.
+Note that Spark properties have different effective timing and they can be divided into two kinds:
+<table class="table">
+<tr><th>Configuration Type</th><th>Meaning</th><th>Examples</th></tr>
Review comment:
Is this title "`Configuration Type`" suitable for `Launching Driver` and
`Application Deployment`?
##########
File path: docs/configuration.md
##########
@@ -114,12 +114,52 @@ in the `spark-defaults.conf` file. A few configuration keys have been renamed si
versions of Spark; in such cases, the older key names are still accepted, but take lower
precedence than any instance of the newer key.
-Spark properties mainly can be divided into two kinds: one is related to deploy, like
-"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when
-setting programmatically through `SparkConf` in runtime, or the behavior is depending on which
-cluster manager and deploy mode you choose, so it would be suggested to set through configuration
-file or `spark-submit` command line options; another is mainly related to Spark runtime control,
-like "spark.task.maxFailures", this kind of properties can be set in either way.
+Note that Spark properties have different effective timing and they can be divided into two kinds:
+<table class="table">
+<tr><th>Configuration Type</th><th>Meaning</th><th>Examples</th></tr>
+<tr>
+ <td><code>Launching Driver</code></td>
+ <td>
+    Configuration used to submit an application, such as <code>spark.driver.memory</code>, <code>spark.driver.extraClassPath</code>, these kind of properties only effect before driver's JVM is started, so it would be suggested to set through configuration file or <code>spark-submit</code> command line options.
Review comment:
`driver's JVM` -> `a driver JVM`?
##########
File path: docs/configuration.md
##########
@@ -114,12 +114,52 @@ in the `spark-defaults.conf` file. A few configuration keys have been renamed si
versions of Spark; in such cases, the older key names are still accepted, but take lower
precedence than any instance of the newer key.
-Spark properties mainly can be divided into two kinds: one is related to deploy, like
-"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when
-setting programmatically through `SparkConf` in runtime, or the behavior is depending on which
-cluster manager and deploy mode you choose, so it would be suggested to set through configuration
-file or `spark-submit` command line options; another is mainly related to Spark runtime control,
-like "spark.task.maxFailures", this kind of properties can be set in either way.
+Note that Spark properties have different effective timing and they can be divided into two kinds:
+<table class="table">
+<tr><th>Configuration Type</th><th>Meaning</th><th>Examples</th></tr>
+<tr>
+ <td><code>Launching Driver</code></td>
+ <td>
+    Configuration used to submit an application, such as <code>spark.driver.memory</code>, <code>spark.driver.extraClassPath</code>, these kind of properties only effect before driver's JVM is started, so it would be suggested to set through configuration file or <code>spark-submit</code> command line options.
+ </td>
+ <td>
+ The following is a list of such configurations:
+ <ul>
+ <li><code>spark.driver.memory</code></li>
+ <li><code>spark.driver.memoryOverhead</code></li>
+ <li><code>spark.driver.cores</code></li>
+ <li><code>spark.driver.userClassPathFirst</code></li>
+ <li><code>spark.driver.extraClassPath</code></li>
+ <li><code>spark.driver.defaultJavaOptions</code></li>
+ <li><code>spark.driver.extraJavaOptions</code></li>
+ <li><code>spark.driver.extraLibraryPath</code></li>
+ <li><code>spark.driver.resource.*</code></li>
+ <li><code>spark.pyspark.driver.python</code></li>
+ <li><code>spark.pyspark.python</code></li>
+ <li><code>spark.r.shell.command</code></li>
+ <li><code>spark.launcher.childProcLoggerName</code></li>
+ <li><code>spark.launcher.childConnectionTimeout</code></li>
+ <li><code>spark.yarn.driver.*</code></li>
+ </ul>
+ </td>
+</tr>
+<tr>
+ <td><code>Application Deployment</code></td>
+ <td>
+    Like <code>spark.master</code>, <code>spark.executor.instances</code>, this kind of properties may not be affected when setting programmatically through <code>SparkConf</code> in runtime after SparkContext has been started, or the behavior is depending on which cluster manager and deploy mode you choose, so it would be suggested to set through configuration file, <code>spark-submit</code> command line options, or setting programmatically through <code>SparkConf</code> in runtime before start SparkContext.
Review comment:
`before start SparkContext. ` -> `before starting SparkContext. `
##########
File path: docs/configuration.md
##########
@@ -114,12 +114,52 @@ in the `spark-defaults.conf` file. A few configuration keys have been renamed si
versions of Spark; in such cases, the older key names are still accepted, but take lower
precedence than any instance of the newer key.
-Spark properties mainly can be divided into two kinds: one is related to deploy, like
-"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when
-setting programmatically through `SparkConf` in runtime, or the behavior is depending on which
-cluster manager and deploy mode you choose, so it would be suggested to set through configuration
-file or `spark-submit` command line options; another is mainly related to Spark runtime control,
-like "spark.task.maxFailures", this kind of properties can be set in either way.
+Note that Spark properties have different effective timing and they can be divided into two kinds:
+<table class="table">
+<tr><th>Configuration Type</th><th>Meaning</th><th>Examples</th></tr>
+<tr>
+ <td><code>Launching Driver</code></td>
+ <td>
+    Configuration used to submit an application, such as <code>spark.driver.memory</code>, <code>spark.driver.extraClassPath</code>, these kind of properties only effect before driver's JVM is started, so it would be suggested to set through configuration file or <code>spark-submit</code> command line options.
+ </td>
+ <td>
+ The following is a list of such configurations:
+ <ul>
+ <li><code>spark.driver.memory</code></li>
+ <li><code>spark.driver.memoryOverhead</code></li>
+ <li><code>spark.driver.cores</code></li>
+ <li><code>spark.driver.userClassPathFirst</code></li>
+ <li><code>spark.driver.extraClassPath</code></li>
+ <li><code>spark.driver.defaultJavaOptions</code></li>
+ <li><code>spark.driver.extraJavaOptions</code></li>
+ <li><code>spark.driver.extraLibraryPath</code></li>
+ <li><code>spark.driver.resource.*</code></li>
+ <li><code>spark.pyspark.driver.python</code></li>
+ <li><code>spark.pyspark.python</code></li>
+ <li><code>spark.r.shell.command</code></li>
+ <li><code>spark.launcher.childProcLoggerName</code></li>
+ <li><code>spark.launcher.childConnectionTimeout</code></li>
+ <li><code>spark.yarn.driver.*</code></li>
+ </ul>
+ </td>
+</tr>
+<tr>
+ <td><code>Application Deployment</code></td>
+ <td>
+    Like <code>spark.master</code>, <code>spark.executor.instances</code>, this kind of properties may not be affected when setting programmatically through <code>SparkConf</code> in runtime after SparkContext has been started, or the behavior is depending on which cluster manager and deploy mode you choose, so it would be suggested to set through configuration file, <code>spark-submit</code> command line options, or setting programmatically through <code>SparkConf</code> in runtime before start SparkContext.
Review comment:
`and deploy mode` -> `and the deploy mode`?
##########
File path: docs/configuration.md
##########
@@ -114,12 +114,52 @@ in the `spark-defaults.conf` file. A few configuration keys have been renamed si
versions of Spark; in such cases, the older key names are still accepted, but take lower
precedence than any instance of the newer key.
-Spark properties mainly can be divided into two kinds: one is related to deploy, like
-"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when
-setting programmatically through `SparkConf` in runtime, or the behavior is depending on which
-cluster manager and deploy mode you choose, so it would be suggested to set through configuration
-file or `spark-submit` command line options; another is mainly related to Spark runtime control,
-like "spark.task.maxFailures", this kind of properties can be set in either way.
+Note that Spark properties have different effective timing and they can be divided into two kinds:
+<table class="table">
+<tr><th>Configuration Type</th><th>Meaning</th><th>Examples</th></tr>
+<tr>
+ <td><code>Launching Driver</code></td>
+ <td>
+    Configuration used to submit an application, such as <code>spark.driver.memory</code>, <code>spark.driver.extraClassPath</code>, these kind of properties only effect before driver's JVM is started, so it would be suggested to set through configuration file or <code>spark-submit</code> command line options.
+ </td>
+ <td>
+ The following is a list of such configurations:
+ <ul>
+ <li><code>spark.driver.memory</code></li>
+ <li><code>spark.driver.memoryOverhead</code></li>
+ <li><code>spark.driver.cores</code></li>
+ <li><code>spark.driver.userClassPathFirst</code></li>
+ <li><code>spark.driver.extraClassPath</code></li>
+ <li><code>spark.driver.defaultJavaOptions</code></li>
+ <li><code>spark.driver.extraJavaOptions</code></li>
+ <li><code>spark.driver.extraLibraryPath</code></li>
+ <li><code>spark.driver.resource.*</code></li>
+ <li><code>spark.pyspark.driver.python</code></li>
+ <li><code>spark.pyspark.python</code></li>
+ <li><code>spark.r.shell.command</code></li>
+ <li><code>spark.launcher.childProcLoggerName</code></li>
+ <li><code>spark.launcher.childConnectionTimeout</code></li>
+ <li><code>spark.yarn.driver.*</code></li>
+ </ul>
+ </td>
+</tr>
+<tr>
+ <td><code>Application Deployment</code></td>
+ <td>
+    Like <code>spark.master</code>, <code>spark.executor.instances</code>, this kind of properties may not be affected when setting programmatically through <code>SparkConf</code> in runtime after SparkContext has been started, or the behavior is depending on which cluster manager and deploy mode you choose, so it would be suggested to set through configuration file, <code>spark-submit</code> command line options, or setting programmatically through <code>SparkConf</code> in runtime before start SparkContext.
Review comment:
`is depending` -> `depends`?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]