[GitHub] [spark] Ngone51 commented on a change in pull request #33615: [SPARK-36374][SHUFFLE][DOC] Push-based shuffle high level user documentation

GitBox Wed, 11 Aug 2021 21:01:55 -0700


Ngone51 commented on a change in pull request #33615:
URL: https://github.com/apache/spark/pull/33615#discussion_r687326402




##########
File path: docs/configuration.md
##########
@@ -3152,3 +3152,109 @@ The stage level scheduling feature allows users to 
specify task and executor res
 This is only available for the RDD API in Scala, Java, and Python.  It is 
available on YARN and Kubernetes when dynamic allocation is enabled. See the 
[YARN](running-on-yarn.html#stage-level-scheduling-overview) page or 
[Kubernetes](running-on-kubernetes.html#stage-level-scheduling-overview) page 
for more implementation details.
 
 See the `RDD.withResources` and `ResourceProfileBuilder` API's for using this 
feature. The current implementation acquires new executors for each 
`ResourceProfile`  created and currently has to be an exact match. Spark does 
not try to fit tasks into an executor that require a different ResourceProfile 
than the executor was created with. Executors that are not in use will idle 
timeout with the dynamic allocation logic. The default configuration for this 
feature is to only allow one ResourceProfile per stage. If the user associates 
more then 1 ResourceProfile to an RDD, Spark will throw an exception by 
default. See config `spark.scheduler.resource.profileMergeConflicts` to control 
that behavior. The current merge strategy Spark implements when 
`spark.scheduler.resource.profileMergeConflicts` is enabled is a simple max of 
each resource within the conflicting ResourceProfiles. Spark will create a new 
ResourceProfile with the max of each of the resources.
+
+# Push-based shuffle overview
+
+Push based shuffle helps improve the reliability and performance of spark 
shuffle. It takes a best-effort approach to push the shuffle blocks generated 
by the map tasks to remote shuffle services to be merged per shuffle partition. 
Reduce tasks fetch a combination of merged shuffle partitions and original 
shuffle blocks as their input data, resulting in converting small random disk 
reads by shuffle services into large sequential reads. Possibility of better 
data locality for reduce tasks additionally helps minimize network IO.

Review comment:
       Could you make this phrase consistent? I see both "push based" and 
"push-based" in many places.

##########
File path: docs/configuration.md
##########
@@ -3152,3 +3152,109 @@ The stage level scheduling feature allows users to 
specify task and executor res
 This is only available for the RDD API in Scala, Java, and Python.  It is 
available on YARN and Kubernetes when dynamic allocation is enabled. See the 
[YARN](running-on-yarn.html#stage-level-scheduling-overview) page or 
[Kubernetes](running-on-kubernetes.html#stage-level-scheduling-overview) page 
for more implementation details.
 
 See the `RDD.withResources` and `ResourceProfileBuilder` API's for using this 
feature. The current implementation acquires new executors for each 
`ResourceProfile`  created and currently has to be an exact match. Spark does 
not try to fit tasks into an executor that require a different ResourceProfile 
than the executor was created with. Executors that are not in use will idle 
timeout with the dynamic allocation logic. The default configuration for this 
feature is to only allow one ResourceProfile per stage. If the user associates 
more then 1 ResourceProfile to an RDD, Spark will throw an exception by 
default. See config `spark.scheduler.resource.profileMergeConflicts` to control 
that behavior. The current merge strategy Spark implements when 
`spark.scheduler.resource.profileMergeConflicts` is enabled is a simple max of 
each resource within the conflicting ResourceProfiles. Spark will create a new 
ResourceProfile with the max of each of the resources.
+
+# Push-based shuffle overview
+
+Push based shuffle helps improve the reliability and performance of spark 
shuffle. It takes a best-effort approach to push the shuffle blocks generated 
by the map tasks to remote shuffle services to be merged per shuffle partition. 
Reduce tasks fetch a combination of merged shuffle partitions and original 
shuffle blocks as their input data, resulting in converting small random disk 
reads by shuffle services into large sequential reads. Possibility of better 
data locality for reduce tasks additionally helps minimize network IO.
+
+<p> Push-based shuffle improves performance for long running jobs/queries 
which involves large disk I/O during shuffle. Currently it is not well suited 
for jobs/queries which runs quickly dealing with lesser amount of shuffle data. 
This will be further improved in the future releases.</p>
+
+<p> <b> Currently push-based shuffle is only supported for Spark on YARN with 
external shuffle service. </b></p>
+
+### Shuffle server side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedShuffleFileManagerImpl</code></td>
+  <td>
+    <code>org.apache.spark.network.shuffle.<br 
/>NoOpMergedShuffleFileManager</code>
+  </td>
+  <td>
+    Class name of the implementation of MergedShuffleFileManager that manages 
push-based shuffle. This acts as a server side config to disable or enable 
push-based shuffle. By default, push-based shuffle is disabled at the server 
side. <p> To enable push-based shuffle on the server side, set this config to 
<code>org.apache.spark.network.shuffle.RemoteBlockPushResolver</code></p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  
<td><code>spark.shuffle.push.server.minChunkSizeInMergedShuffleFile</code></td>
+  <td><code>2m</code></td>
+  <td>
+    <p> The minimum size of a chunk when dividing a merged shuffle file into 
multiple chunks during push-based shuffle. A merged shuffle file consists of 
multiple small shuffle blocks. Fetching the complete merged shuffle file in a 
single disk I/O increases the memory requirements for both the clients and the 
external shuffle service. Instead, the shuffle service serves the merged file 
in <code>MB-sized chunks</code>.<br /> This configuration controls how big a 
chunk can get. A corresponding index file for each merged shuffle file will be 
generated indicating chunk boundaries. </p>
+    <p> Setting this too high would increase the memory requirements on both 
the clients and the external shuffle service. </p>
+    <p> Setting this too low would increase the overall number of RPC requests 
to external shuffle service unnecessarily.</p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedIndexCacheSize</code></td>
+  <td><code>100m</code></td>
+  <td>
+    The size of cache in memory which is used in push-based shuffle for 
storing merged index files. This cache is in addition to the one configured via 
<code>spark.shuffle.service.index.cache.size</code>.
+  </td>
+  <td>3.2.0</td>
+</tr>
+</table>
+
+### Client side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.enabled</code></td>
+  <td><code>false</code></td>
+  <td>
+    Set to 'true' to enable push-based shuffle on the client side and works in 
conjunction with the server side flag 
spark.shuffle.server.mergedShuffleFileManagerImpl.
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.merge.finalize.timeout</code></td>
+  <td><code>10s</code></td>
+  <td>
+    The amount of time driver waits in seconds, after all mappers have 
finished for a given shuffle map stage, before it sends merge finalize requests 
to remote shuffle services. This allows the shuffle services extra time to 
merge blocks.

Review comment:
       Also don't forget to update the corresponding `ConfigBuilder`.

##########
File path: docs/configuration.md
##########
@@ -3152,3 +3152,109 @@ The stage level scheduling feature allows users to 
specify task and executor res
 This is only available for the RDD API in Scala, Java, and Python.  It is 
available on YARN and Kubernetes when dynamic allocation is enabled. See the 
[YARN](running-on-yarn.html#stage-level-scheduling-overview) page or 
[Kubernetes](running-on-kubernetes.html#stage-level-scheduling-overview) page 
for more implementation details.
 
 See the `RDD.withResources` and `ResourceProfileBuilder` API's for using this 
feature. The current implementation acquires new executors for each 
`ResourceProfile`  created and currently has to be an exact match. Spark does 
not try to fit tasks into an executor that require a different ResourceProfile 
than the executor was created with. Executors that are not in use will idle 
timeout with the dynamic allocation logic. The default configuration for this 
feature is to only allow one ResourceProfile per stage. If the user associates 
more then 1 ResourceProfile to an RDD, Spark will throw an exception by 
default. See config `spark.scheduler.resource.profileMergeConflicts` to control 
that behavior. The current merge strategy Spark implements when 
`spark.scheduler.resource.profileMergeConflicts` is enabled is a simple max of 
each resource within the conflicting ResourceProfiles. Spark will create a new 
ResourceProfile with the max of each of the resources.
+
+# Push-based shuffle overview
+
+Push based shuffle helps improve the reliability and performance of spark 
shuffle. It takes a best-effort approach to push the shuffle blocks generated 
by the map tasks to remote shuffle services to be merged per shuffle partition. 
Reduce tasks fetch a combination of merged shuffle partitions and original 
shuffle blocks as their input data, resulting in converting small random disk 
reads by shuffle services into large sequential reads. Possibility of better 
data locality for reduce tasks additionally helps minimize network IO.
+
+<p> Push-based shuffle improves performance for long running jobs/queries 
which involves large disk I/O during shuffle. Currently it is not well suited 
for jobs/queries which runs quickly dealing with lesser amount of shuffle data. 
This will be further improved in the future releases.</p>
+
+<p> <b> Currently push-based shuffle is only supported for Spark on YARN with 
external shuffle service. </b></p>
+
+### Shuffle server side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedShuffleFileManagerImpl</code></td>
+  <td>
+    <code>org.apache.spark.network.shuffle.<br 
/>NoOpMergedShuffleFileManager</code>
+  </td>
+  <td>
+    Class name of the implementation of MergedShuffleFileManager that manages 
push-based shuffle. This acts as a server side config to disable or enable 
push-based shuffle. By default, push-based shuffle is disabled at the server 
side. <p> To enable push-based shuffle on the server side, set this config to 
<code>org.apache.spark.network.shuffle.RemoteBlockPushResolver</code></p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  
<td><code>spark.shuffle.push.server.minChunkSizeInMergedShuffleFile</code></td>
+  <td><code>2m</code></td>
+  <td>
+    <p> The minimum size of a chunk when dividing a merged shuffle file into 
multiple chunks during push-based shuffle. A merged shuffle file consists of 
multiple small shuffle blocks. Fetching the complete merged shuffle file in a 
single disk I/O increases the memory requirements for both the clients and the 
external shuffle service. Instead, the shuffle service serves the merged file 
in <code>MB-sized chunks</code>.<br /> This configuration controls how big a 
chunk can get. A corresponding index file for each merged shuffle file will be 
generated indicating chunk boundaries. </p>
+    <p> Setting this too high would increase the memory requirements on both 
the clients and the external shuffle service. </p>
+    <p> Setting this too low would increase the overall number of RPC requests 
to external shuffle service unnecessarily.</p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedIndexCacheSize</code></td>
+  <td><code>100m</code></td>
+  <td>
+    The size of cache in memory which is used in push-based shuffle for 
storing merged index files. This cache is in addition to the one configured via 
<code>spark.shuffle.service.index.cache.size</code>.
+  </td>
+  <td>3.2.0</td>
+</tr>
+</table>
+
+### Client side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.enabled</code></td>
+  <td><code>false</code></td>
+  <td>
+    Set to 'true' to enable push-based shuffle on the client side and works in 
conjunction with the server side flag 
spark.shuffle.server.mergedShuffleFileManagerImpl.
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.merge.finalize.timeout</code></td>
+  <td><code>10s</code></td>
+  <td>
+    The amount of time driver waits in seconds, after all mappers have 
finished for a given shuffle map stage, before it sends merge finalize requests 
to remote shuffle services. This allows the shuffle services extra time to 
merge blocks.
+  </td>
+ <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.maxRetainedMergerLocations</code></td>
+  <td><code>500</code></td>
+  <td>
+    Maximum number of merger locations cached for push based shuffle. 
Currently, merger locations are hosts of external shuffle services responsible 
for handling pushed blocks, merging them and serving merged blocks for later 
shuffle fetch.
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.mergersMinThresholdRatio</code></td>
+  <td><code>0.05</code></td>
+  <td>
+    Ratio used to compute the minimum number of shuffle merger locations 
required for a stage based on the number of partitions for the reducer stage. 
For example, a reduce stage which has 100 partitions and uses the default value 
0.05 requires at least 5 unique merger locations to enable push based shuffle.
+  </td>
+ <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.mergersMinStaticThreshold</code></td>
+  <td><code>5</code></td>
+  <td>
+    The static threshold for number of shuffle push merger locations should be 
available in order to enable push based shuffle for a stage. Note this config 
works in conjunction with 
<code>spark.shuffle.push.mergersMinThresholdRatio</code>. Maximum of 
<code>spark.shuffle.push.mergersMinStaticThreshold</code> and 
<code>spark.shuffle.push.mergersMinThresholdRatio</code> ratio number of 
mergers needed to enable push based shuffle for a stage. For eg: with 1000 
partitions for the child stage with 
spark.shuffle.push.mergersMinStaticThreshold as 5 and 
spark.shuffle.push.mergersMinThresholdRatio set to 0.05, we would need at least 
50 mergers to enable push based shuffle for that stage.
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.maxBlockSizeToPush</code></td>
+  <td><code>1m</code></td>
+  <td>
+    <p> The max size of an individual block to push to the remote shuffle 
services. Blocks larger than this threshold are not pushed to be merged 
remotely. These shuffle blocks will be fetched by the executors in the original 
manner. </p>

Review comment:
       "These shuffle blocks will be fetched by the executors in the original 
manner. " -> "These shuffle blocks will be fetched in the original manner. "
   
   (it can also be fetched by external shuffle service in the original manner 
when the service is enabled.)

##########
File path: docs/configuration.md
##########
@@ -3152,3 +3152,109 @@ The stage level scheduling feature allows users to 
specify task and executor res
 This is only available for the RDD API in Scala, Java, and Python.  It is 
available on YARN and Kubernetes when dynamic allocation is enabled. See the 
[YARN](running-on-yarn.html#stage-level-scheduling-overview) page or 
[Kubernetes](running-on-kubernetes.html#stage-level-scheduling-overview) page 
for more implementation details.
 
 See the `RDD.withResources` and `ResourceProfileBuilder` API's for using this 
feature. The current implementation acquires new executors for each 
`ResourceProfile`  created and currently has to be an exact match. Spark does 
not try to fit tasks into an executor that require a different ResourceProfile 
than the executor was created with. Executors that are not in use will idle 
timeout with the dynamic allocation logic. The default configuration for this 
feature is to only allow one ResourceProfile per stage. If the user associates 
more then 1 ResourceProfile to an RDD, Spark will throw an exception by 
default. See config `spark.scheduler.resource.profileMergeConflicts` to control 
that behavior. The current merge strategy Spark implements when 
`spark.scheduler.resource.profileMergeConflicts` is enabled is a simple max of 
each resource within the conflicting ResourceProfiles. Spark will create a new 
ResourceProfile with the max of each of the resources.
+
+# Push-based shuffle overview
+
+Push based shuffle helps improve the reliability and performance of spark 
shuffle. It takes a best-effort approach to push the shuffle blocks generated 
by the map tasks to remote shuffle services to be merged per shuffle partition. 
Reduce tasks fetch a combination of merged shuffle partitions and original 
shuffle blocks as their input data, resulting in converting small random disk 
reads by shuffle services into large sequential reads. Possibility of better 
data locality for reduce tasks additionally helps minimize network IO.
+
+<p> Push-based shuffle improves performance for long running jobs/queries 
which involves large disk I/O during shuffle. Currently it is not well suited 
for jobs/queries which runs quickly dealing with lesser amount of shuffle data. 
This will be further improved in the future releases.</p>
+
+<p> <b> Currently push-based shuffle is only supported for Spark on YARN with 
external shuffle service. </b></p>
+
+### Shuffle server side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedShuffleFileManagerImpl</code></td>
+  <td>
+    <code>org.apache.spark.network.shuffle.<br 
/>NoOpMergedShuffleFileManager</code>
+  </td>
+  <td>
+    Class name of the implementation of MergedShuffleFileManager that manages 
push-based shuffle. This acts as a server side config to disable or enable 
push-based shuffle. By default, push-based shuffle is disabled at the server 
side. <p> To enable push-based shuffle on the server side, set this config to 
<code>org.apache.spark.network.shuffle.RemoteBlockPushResolver</code></p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  
<td><code>spark.shuffle.push.server.minChunkSizeInMergedShuffleFile</code></td>
+  <td><code>2m</code></td>
+  <td>
+    <p> The minimum size of a chunk when dividing a merged shuffle file into 
multiple chunks during push-based shuffle. A merged shuffle file consists of 
multiple small shuffle blocks. Fetching the complete merged shuffle file in a 
single disk I/O increases the memory requirements for both the clients and the 
external shuffle service. Instead, the shuffle service serves the merged file 
in <code>MB-sized chunks</code>.<br /> This configuration controls how big a 
chunk can get. A corresponding index file for each merged shuffle file will be 
generated indicating chunk boundaries. </p>

Review comment:
       "...external shuffle service. Instead, the shuffle service..."
   
   Shall we make the "external shuffle service" consistent too?

##########
File path: docs/configuration.md
##########
@@ -3152,3 +3152,109 @@ The stage level scheduling feature allows users to 
specify task and executor res
 This is only available for the RDD API in Scala, Java, and Python.  It is 
available on YARN and Kubernetes when dynamic allocation is enabled. See the 
[YARN](running-on-yarn.html#stage-level-scheduling-overview) page or 
[Kubernetes](running-on-kubernetes.html#stage-level-scheduling-overview) page 
for more implementation details.
 
 See the `RDD.withResources` and `ResourceProfileBuilder` API's for using this 
feature. The current implementation acquires new executors for each 
`ResourceProfile`  created and currently has to be an exact match. Spark does 
not try to fit tasks into an executor that require a different ResourceProfile 
than the executor was created with. Executors that are not in use will idle 
timeout with the dynamic allocation logic. The default configuration for this 
feature is to only allow one ResourceProfile per stage. If the user associates 
more then 1 ResourceProfile to an RDD, Spark will throw an exception by 
default. See config `spark.scheduler.resource.profileMergeConflicts` to control 
that behavior. The current merge strategy Spark implements when 
`spark.scheduler.resource.profileMergeConflicts` is enabled is a simple max of 
each resource within the conflicting ResourceProfiles. Spark will create a new 
ResourceProfile with the max of each of the resources.
+
+# Push-based shuffle overview
+
+Push based shuffle helps improve the reliability and performance of spark 
shuffle. It takes a best-effort approach to push the shuffle blocks generated 
by the map tasks to remote shuffle services to be merged per shuffle partition. 
Reduce tasks fetch a combination of merged shuffle partitions and original 
shuffle blocks as their input data, resulting in converting small random disk 
reads by shuffle services into large sequential reads. Possibility of better 
data locality for reduce tasks additionally helps minimize network IO.
+
+<p> Push-based shuffle improves performance for long running jobs/queries 
which involves large disk I/O during shuffle. Currently it is not well suited 
for jobs/queries which runs quickly dealing with lesser amount of shuffle data. 
This will be further improved in the future releases.</p>
+
+<p> <b> Currently push-based shuffle is only supported for Spark on YARN with 
external shuffle service. </b></p>
+
+### Shuffle server side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedShuffleFileManagerImpl</code></td>
+  <td>
+    <code>org.apache.spark.network.shuffle.<br 
/>NoOpMergedShuffleFileManager</code>
+  </td>
+  <td>
+    Class name of the implementation of MergedShuffleFileManager that manages 
push-based shuffle. This acts as a server side config to disable or enable 
push-based shuffle. By default, push-based shuffle is disabled at the server 
side. <p> To enable push-based shuffle on the server side, set this config to 
<code>org.apache.spark.network.shuffle.RemoteBlockPushResolver</code></p>

Review comment:
       wrap `MergedShuffleFileManager` with code block?

##########
File path: docs/configuration.md
##########
@@ -3152,3 +3152,109 @@ The stage level scheduling feature allows users to 
specify task and executor res
 This is only available for the RDD API in Scala, Java, and Python.  It is 
available on YARN and Kubernetes when dynamic allocation is enabled. See the 
[YARN](running-on-yarn.html#stage-level-scheduling-overview) page or 
[Kubernetes](running-on-kubernetes.html#stage-level-scheduling-overview) page 
for more implementation details.
 
 See the `RDD.withResources` and `ResourceProfileBuilder` API's for using this 
feature. The current implementation acquires new executors for each 
`ResourceProfile`  created and currently has to be an exact match. Spark does 
not try to fit tasks into an executor that require a different ResourceProfile 
than the executor was created with. Executors that are not in use will idle 
timeout with the dynamic allocation logic. The default configuration for this 
feature is to only allow one ResourceProfile per stage. If the user associates 
more then 1 ResourceProfile to an RDD, Spark will throw an exception by 
default. See config `spark.scheduler.resource.profileMergeConflicts` to control 
that behavior. The current merge strategy Spark implements when 
`spark.scheduler.resource.profileMergeConflicts` is enabled is a simple max of 
each resource within the conflicting ResourceProfiles. Spark will create a new 
ResourceProfile with the max of each of the resources.
+
+# Push-based shuffle overview
+
+Push based shuffle helps improve the reliability and performance of spark 
shuffle. It takes a best-effort approach to push the shuffle blocks generated 
by the map tasks to remote shuffle services to be merged per shuffle partition. 
Reduce tasks fetch a combination of merged shuffle partitions and original 
shuffle blocks as their input data, resulting in converting small random disk 
reads by shuffle services into large sequential reads. Possibility of better 
data locality for reduce tasks additionally helps minimize network IO.
+
+<p> Push-based shuffle improves performance for long running jobs/queries 
which involves large disk I/O during shuffle. Currently it is not well suited 
for jobs/queries which runs quickly dealing with lesser amount of shuffle data. 
This will be further improved in the future releases.</p>
+
+<p> <b> Currently push-based shuffle is only supported for Spark on YARN with 
external shuffle service. </b></p>
+
+### Shuffle server side configuration options

Review comment:
       "Shuffle server" -> "Shuffle service(server)"?

##########
File path: docs/configuration.md
##########
@@ -3152,3 +3152,109 @@ The stage level scheduling feature allows users to 
specify task and executor res
 This is only available for the RDD API in Scala, Java, and Python.  It is 
available on YARN and Kubernetes when dynamic allocation is enabled. See the 
[YARN](running-on-yarn.html#stage-level-scheduling-overview) page or 
[Kubernetes](running-on-kubernetes.html#stage-level-scheduling-overview) page 
for more implementation details.
 
 See the `RDD.withResources` and `ResourceProfileBuilder` API's for using this 
feature. The current implementation acquires new executors for each 
`ResourceProfile`  created and currently has to be an exact match. Spark does 
not try to fit tasks into an executor that require a different ResourceProfile 
than the executor was created with. Executors that are not in use will idle 
timeout with the dynamic allocation logic. The default configuration for this 
feature is to only allow one ResourceProfile per stage. If the user associates 
more then 1 ResourceProfile to an RDD, Spark will throw an exception by 
default. See config `spark.scheduler.resource.profileMergeConflicts` to control 
that behavior. The current merge strategy Spark implements when 
`spark.scheduler.resource.profileMergeConflicts` is enabled is a simple max of 
each resource within the conflicting ResourceProfiles. Spark will create a new 
ResourceProfile with the max of each of the resources.
+
+# Push-based shuffle overview
+
+Push based shuffle helps improve the reliability and performance of spark 
shuffle. It takes a best-effort approach to push the shuffle blocks generated 
by the map tasks to remote shuffle services to be merged per shuffle partition. 
Reduce tasks fetch a combination of merged shuffle partitions and original 
shuffle blocks as their input data, resulting in converting small random disk 
reads by shuffle services into large sequential reads. Possibility of better 
data locality for reduce tasks additionally helps minimize network IO.
+
+<p> Push-based shuffle improves performance for long running jobs/queries 
which involves large disk I/O during shuffle. Currently it is not well suited 
for jobs/queries which runs quickly dealing with lesser amount of shuffle data. 
This will be further improved in the future releases.</p>
+
+<p> <b> Currently push-based shuffle is only supported for Spark on YARN with 
external shuffle service. </b></p>
+
+### Shuffle server side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedShuffleFileManagerImpl</code></td>
+  <td>
+    <code>org.apache.spark.network.shuffle.<br 
/>NoOpMergedShuffleFileManager</code>
+  </td>
+  <td>
+    Class name of the implementation of MergedShuffleFileManager that manages 
push-based shuffle. This acts as a server side config to disable or enable 
push-based shuffle. By default, push-based shuffle is disabled at the server 
side. <p> To enable push-based shuffle on the server side, set this config to 
<code>org.apache.spark.network.shuffle.RemoteBlockPushResolver</code></p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  
<td><code>spark.shuffle.push.server.minChunkSizeInMergedShuffleFile</code></td>
+  <td><code>2m</code></td>
+  <td>
+    <p> The minimum size of a chunk when dividing a merged shuffle file into 
multiple chunks during push-based shuffle. A merged shuffle file consists of 
multiple small shuffle blocks. Fetching the complete merged shuffle file in a 
single disk I/O increases the memory requirements for both the clients and the 
external shuffle service. Instead, the shuffle service serves the merged file 
in <code>MB-sized chunks</code>.<br /> This configuration controls how big a 
chunk can get. A corresponding index file for each merged shuffle file will be 
generated indicating chunk boundaries. </p>
+    <p> Setting this too high would increase the memory requirements on both 
the clients and the external shuffle service. </p>
+    <p> Setting this too low would increase the overall number of RPC requests 
to external shuffle service unnecessarily.</p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedIndexCacheSize</code></td>
+  <td><code>100m</code></td>
+  <td>
+    The size of cache in memory which is used in push-based shuffle for 
storing merged index files. This cache is in addition to the one configured via 
<code>spark.shuffle.service.index.cache.size</code>.
+  </td>
+  <td>3.2.0</td>
+</tr>
+</table>
+
+### Client side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.enabled</code></td>
+  <td><code>false</code></td>
+  <td>
+    Set to 'true' to enable push-based shuffle on the client side and works in 
conjunction with the server side flag 
spark.shuffle.server.mergedShuffleFileManagerImpl.
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.merge.finalize.timeout</code></td>

Review comment:
       Looks like only two configs (another is 
"spark.shuffle.push.merge.results.timeout") use "push.merge". Shall we use 
"push" directly?

##########
File path: docs/configuration.md
##########
@@ -3152,3 +3152,109 @@ The stage level scheduling feature allows users to 
specify task and executor res
 This is only available for the RDD API in Scala, Java, and Python.  It is 
available on YARN and Kubernetes when dynamic allocation is enabled. See the 
[YARN](running-on-yarn.html#stage-level-scheduling-overview) page or 
[Kubernetes](running-on-kubernetes.html#stage-level-scheduling-overview) page 
for more implementation details.
 
 See the `RDD.withResources` and `ResourceProfileBuilder` API's for using this 
feature. The current implementation acquires new executors for each 
`ResourceProfile`  created and currently has to be an exact match. Spark does 
not try to fit tasks into an executor that require a different ResourceProfile 
than the executor was created with. Executors that are not in use will idle 
timeout with the dynamic allocation logic. The default configuration for this 
feature is to only allow one ResourceProfile per stage. If the user associates 
more then 1 ResourceProfile to an RDD, Spark will throw an exception by 
default. See config `spark.scheduler.resource.profileMergeConflicts` to control 
that behavior. The current merge strategy Spark implements when 
`spark.scheduler.resource.profileMergeConflicts` is enabled is a simple max of 
each resource within the conflicting ResourceProfiles. Spark will create a new 
ResourceProfile with the max of each of the resources.
+
+# Push-based shuffle overview
+
+Push based shuffle helps improve the reliability and performance of spark 
shuffle. It takes a best-effort approach to push the shuffle blocks generated 
by the map tasks to remote shuffle services to be merged per shuffle partition. 
Reduce tasks fetch a combination of merged shuffle partitions and original 
shuffle blocks as their input data, resulting in converting small random disk 
reads by shuffle services into large sequential reads. Possibility of better 
data locality for reduce tasks additionally helps minimize network IO.
+
+<p> Push-based shuffle improves performance for long running jobs/queries 
which involves large disk I/O during shuffle. Currently it is not well suited 
for jobs/queries which runs quickly dealing with lesser amount of shuffle data. 
This will be further improved in the future releases.</p>
+
+<p> <b> Currently push-based shuffle is only supported for Spark on YARN with 
external shuffle service. </b></p>
+
+### Shuffle server side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedShuffleFileManagerImpl</code></td>
+  <td>
+    <code>org.apache.spark.network.shuffle.<br 
/>NoOpMergedShuffleFileManager</code>
+  </td>
+  <td>
+    Class name of the implementation of MergedShuffleFileManager that manages 
push-based shuffle. This acts as a server side config to disable or enable 
push-based shuffle. By default, push-based shuffle is disabled at the server 
side. <p> To enable push-based shuffle on the server side, set this config to 
<code>org.apache.spark.network.shuffle.RemoteBlockPushResolver</code></p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  
<td><code>spark.shuffle.push.server.minChunkSizeInMergedShuffleFile</code></td>
+  <td><code>2m</code></td>
+  <td>
+    <p> The minimum size of a chunk when dividing a merged shuffle file into 
multiple chunks during push-based shuffle. A merged shuffle file consists of 
multiple small shuffle blocks. Fetching the complete merged shuffle file in a 
single disk I/O increases the memory requirements for both the clients and the 
external shuffle service. Instead, the shuffle service serves the merged file 
in <code>MB-sized chunks</code>.<br /> This configuration controls how big a 
chunk can get. A corresponding index file for each merged shuffle file will be 
generated indicating chunk boundaries. </p>
+    <p> Setting this too high would increase the memory requirements on both 
the clients and the external shuffle service. </p>
+    <p> Setting this too low would increase the overall number of RPC requests 
to external shuffle service unnecessarily.</p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedIndexCacheSize</code></td>
+  <td><code>100m</code></td>
+  <td>
+    The size of cache in memory which is used in push-based shuffle for 
storing merged index files. This cache is in addition to the one configured via 
<code>spark.shuffle.service.index.cache.size</code>.

Review comment:
       "The size of cache in memory which is used in push-based shuffle...." -> 
"The maximum size of cache in memory which could be used in push-based 
shuffle..."

##########
File path: docs/configuration.md
##########
@@ -3152,3 +3152,109 @@ The stage level scheduling feature allows users to 
specify task and executor res
 This is only available for the RDD API in Scala, Java, and Python.  It is 
available on YARN and Kubernetes when dynamic allocation is enabled. See the 
[YARN](running-on-yarn.html#stage-level-scheduling-overview) page or 
[Kubernetes](running-on-kubernetes.html#stage-level-scheduling-overview) page 
for more implementation details.
 
 See the `RDD.withResources` and `ResourceProfileBuilder` API's for using this 
feature. The current implementation acquires new executors for each 
`ResourceProfile`  created and currently has to be an exact match. Spark does 
not try to fit tasks into an executor that require a different ResourceProfile 
than the executor was created with. Executors that are not in use will idle 
timeout with the dynamic allocation logic. The default configuration for this 
feature is to only allow one ResourceProfile per stage. If the user associates 
more then 1 ResourceProfile to an RDD, Spark will throw an exception by 
default. See config `spark.scheduler.resource.profileMergeConflicts` to control 
that behavior. The current merge strategy Spark implements when 
`spark.scheduler.resource.profileMergeConflicts` is enabled is a simple max of 
each resource within the conflicting ResourceProfiles. Spark will create a new 
ResourceProfile with the max of each of the resources.
+
+# Push-based shuffle overview
+
+Push based shuffle helps improve the reliability and performance of spark 
shuffle. It takes a best-effort approach to push the shuffle blocks generated 
by the map tasks to remote shuffle services to be merged per shuffle partition. 
Reduce tasks fetch a combination of merged shuffle partitions and original 
shuffle blocks as their input data, resulting in converting small random disk 
reads by shuffle services into large sequential reads. Possibility of better 
data locality for reduce tasks additionally helps minimize network IO.
+
+<p> Push-based shuffle improves performance for long running jobs/queries 
which involves large disk I/O during shuffle. Currently it is not well suited 
for jobs/queries which runs quickly dealing with lesser amount of shuffle data. 
This will be further improved in the future releases.</p>
+
+<p> <b> Currently push-based shuffle is only supported for Spark on YARN with 
external shuffle service. </b></p>
+
+### Shuffle server side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedShuffleFileManagerImpl</code></td>
+  <td>
+    <code>org.apache.spark.network.shuffle.<br 
/>NoOpMergedShuffleFileManager</code>
+  </td>
+  <td>
+    Class name of the implementation of MergedShuffleFileManager that manages 
push-based shuffle. This acts as a server side config to disable or enable 
push-based shuffle. By default, push-based shuffle is disabled at the server 
side. <p> To enable push-based shuffle on the server side, set this config to 
<code>org.apache.spark.network.shuffle.RemoteBlockPushResolver</code></p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  
<td><code>spark.shuffle.push.server.minChunkSizeInMergedShuffleFile</code></td>
+  <td><code>2m</code></td>
+  <td>
+    <p> The minimum size of a chunk when dividing a merged shuffle file into 
multiple chunks during push-based shuffle. A merged shuffle file consists of 
multiple small shuffle blocks. Fetching the complete merged shuffle file in a 
single disk I/O increases the memory requirements for both the clients and the 
external shuffle service. Instead, the shuffle service serves the merged file 
in <code>MB-sized chunks</code>.<br /> This configuration controls how big a 
chunk can get. A corresponding index file for each merged shuffle file will be 
generated indicating chunk boundaries. </p>
+    <p> Setting this too high would increase the memory requirements on both 
the clients and the external shuffle service. </p>
+    <p> Setting this too low would increase the overall number of RPC requests 
to external shuffle service unnecessarily.</p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedIndexCacheSize</code></td>
+  <td><code>100m</code></td>
+  <td>
+    The size of cache in memory which is used in push-based shuffle for 
storing merged index files. This cache is in addition to the one configured via 
<code>spark.shuffle.service.index.cache.size</code>.
+  </td>
+  <td>3.2.0</td>
+</tr>
+</table>
+
+### Client side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.enabled</code></td>
+  <td><code>false</code></td>
+  <td>
+    Set to 'true' to enable push-based shuffle on the client side and works in 
conjunction with the server side flag 
spark.shuffle.server.mergedShuffleFileManagerImpl.
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.merge.finalize.timeout</code></td>

Review comment:
       BTW, why "spark.shuffle.push.merge.results.timeout" isn't documented?

##########
File path: docs/configuration.md
##########
@@ -3152,3 +3152,109 @@ The stage level scheduling feature allows users to 
specify task and executor res
 This is only available for the RDD API in Scala, Java, and Python.  It is 
available on YARN and Kubernetes when dynamic allocation is enabled. See the 
[YARN](running-on-yarn.html#stage-level-scheduling-overview) page or 
[Kubernetes](running-on-kubernetes.html#stage-level-scheduling-overview) page 
for more implementation details.
 
 See the `RDD.withResources` and `ResourceProfileBuilder` API's for using this 
feature. The current implementation acquires new executors for each 
`ResourceProfile`  created and currently has to be an exact match. Spark does 
not try to fit tasks into an executor that require a different ResourceProfile 
than the executor was created with. Executors that are not in use will idle 
timeout with the dynamic allocation logic. The default configuration for this 
feature is to only allow one ResourceProfile per stage. If the user associates 
more then 1 ResourceProfile to an RDD, Spark will throw an exception by 
default. See config `spark.scheduler.resource.profileMergeConflicts` to control 
that behavior. The current merge strategy Spark implements when 
`spark.scheduler.resource.profileMergeConflicts` is enabled is a simple max of 
each resource within the conflicting ResourceProfiles. Spark will create a new 
ResourceProfile with the max of each of the resources.
+
+# Push-based shuffle overview
+
+Push based shuffle helps improve the reliability and performance of spark 
shuffle. It takes a best-effort approach to push the shuffle blocks generated 
by the map tasks to remote shuffle services to be merged per shuffle partition. 
Reduce tasks fetch a combination of merged shuffle partitions and original 
shuffle blocks as their input data, resulting in converting small random disk 
reads by shuffle services into large sequential reads. Possibility of better 
data locality for reduce tasks additionally helps minimize network IO.
+
+<p> Push-based shuffle improves performance for long running jobs/queries 
which involves large disk I/O during shuffle. Currently it is not well suited 
for jobs/queries which runs quickly dealing with lesser amount of shuffle data. 
This will be further improved in the future releases.</p>
+
+<p> <b> Currently push-based shuffle is only supported for Spark on YARN with 
external shuffle service. </b></p>
+
+### Shuffle server side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedShuffleFileManagerImpl</code></td>
+  <td>
+    <code>org.apache.spark.network.shuffle.<br 
/>NoOpMergedShuffleFileManager</code>
+  </td>
+  <td>
+    Class name of the implementation of MergedShuffleFileManager that manages 
push-based shuffle. This acts as a server side config to disable or enable 
push-based shuffle. By default, push-based shuffle is disabled at the server 
side. <p> To enable push-based shuffle on the server side, set this config to 
<code>org.apache.spark.network.shuffle.RemoteBlockPushResolver</code></p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  
<td><code>spark.shuffle.push.server.minChunkSizeInMergedShuffleFile</code></td>
+  <td><code>2m</code></td>
+  <td>
+    <p> The minimum size of a chunk when dividing a merged shuffle file into 
multiple chunks during push-based shuffle. A merged shuffle file consists of 
multiple small shuffle blocks. Fetching the complete merged shuffle file in a 
single disk I/O increases the memory requirements for both the clients and the 
external shuffle service. Instead, the shuffle service serves the merged file 
in <code>MB-sized chunks</code>.<br /> This configuration controls how big a 
chunk can get. A corresponding index file for each merged shuffle file will be 
generated indicating chunk boundaries. </p>
+    <p> Setting this too high would increase the memory requirements on both 
the clients and the external shuffle service. </p>
+    <p> Setting this too low would increase the overall number of RPC requests 
to external shuffle service unnecessarily.</p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedIndexCacheSize</code></td>
+  <td><code>100m</code></td>
+  <td>
+    The size of cache in memory which is used in push-based shuffle for 
storing merged index files. This cache is in addition to the one configured via 
<code>spark.shuffle.service.index.cache.size</code>.
+  </td>
+  <td>3.2.0</td>
+</tr>
+</table>
+
+### Client side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.enabled</code></td>
+  <td><code>false</code></td>
+  <td>
+    Set to 'true' to enable push-based shuffle on the client side and works in 
conjunction with the server side flag 
spark.shuffle.server.mergedShuffleFileManagerImpl.
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.merge.finalize.timeout</code></td>
+  <td><code>10s</code></td>
+  <td>
+    The amount of time driver waits in seconds, after all mappers have 
finished for a given shuffle map stage, before it sends merge finalize requests 
to remote shuffle services. This allows the shuffle services extra time to 
merge blocks.
+  </td>
+ <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.maxRetainedMergerLocations</code></td>
+  <td><code>500</code></td>
+  <td>
+    Maximum number of merger locations cached for push based shuffle. 
Currently, merger locations are hosts of external shuffle services responsible 
for handling pushed blocks, merging them and serving merged blocks for later 
shuffle fetch.
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.mergersMinThresholdRatio</code></td>
+  <td><code>0.05</code></td>
+  <td>
+    Ratio used to compute the minimum number of shuffle merger locations 
required for a stage based on the number of partitions for the reducer stage. 
For example, a reduce stage which has 100 partitions and uses the default value 
0.05 requires at least 5 unique merger locations to enable push based shuffle.
+  </td>
+ <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.mergersMinStaticThreshold</code></td>
+  <td><code>5</code></td>
+  <td>
+    The static threshold for number of shuffle push merger locations should be 
available in order to enable push based shuffle for a stage. Note this config 
works in conjunction with 
<code>spark.shuffle.push.mergersMinThresholdRatio</code>. Maximum of 
<code>spark.shuffle.push.mergersMinStaticThreshold</code> and 
<code>spark.shuffle.push.mergersMinThresholdRatio</code> ratio number of 
mergers needed to enable push based shuffle for a stage. For eg: with 1000 
partitions for the child stage with 
spark.shuffle.push.mergersMinStaticThreshold as 5 and 
spark.shuffle.push.mergersMinThresholdRatio set to 0.05, we would need at least 
50 mergers to enable push based shuffle for that stage.

Review comment:
       "For eg" -> "For example"

##########
File path: docs/configuration.md
##########
@@ -3152,3 +3152,109 @@ The stage level scheduling feature allows users to 
specify task and executor res
 This is only available for the RDD API in Scala, Java, and Python.  It is 
available on YARN and Kubernetes when dynamic allocation is enabled. See the 
[YARN](running-on-yarn.html#stage-level-scheduling-overview) page or 
[Kubernetes](running-on-kubernetes.html#stage-level-scheduling-overview) page 
for more implementation details.
 
 See the `RDD.withResources` and `ResourceProfileBuilder` API's for using this 
feature. The current implementation acquires new executors for each 
`ResourceProfile`  created and currently has to be an exact match. Spark does 
not try to fit tasks into an executor that require a different ResourceProfile 
than the executor was created with. Executors that are not in use will idle 
timeout with the dynamic allocation logic. The default configuration for this 
feature is to only allow one ResourceProfile per stage. If the user associates 
more then 1 ResourceProfile to an RDD, Spark will throw an exception by 
default. See config `spark.scheduler.resource.profileMergeConflicts` to control 
that behavior. The current merge strategy Spark implements when 
`spark.scheduler.resource.profileMergeConflicts` is enabled is a simple max of 
each resource within the conflicting ResourceProfiles. Spark will create a new 
ResourceProfile with the max of each of the resources.
+
+# Push-based shuffle overview
+
+Push based shuffle helps improve the reliability and performance of spark 
shuffle. It takes a best-effort approach to push the shuffle blocks generated 
by the map tasks to remote shuffle services to be merged per shuffle partition. 
Reduce tasks fetch a combination of merged shuffle partitions and original 
shuffle blocks as their input data, resulting in converting small random disk 
reads by shuffle services into large sequential reads. Possibility of better 
data locality for reduce tasks additionally helps minimize network IO.
+
+<p> Push-based shuffle improves performance for long running jobs/queries 
which involves large disk I/O during shuffle. Currently it is not well suited 
for jobs/queries which runs quickly dealing with lesser amount of shuffle data. 
This will be further improved in the future releases.</p>
+
+<p> <b> Currently push-based shuffle is only supported for Spark on YARN with 
external shuffle service. </b></p>
+
+### Shuffle server side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedShuffleFileManagerImpl</code></td>
+  <td>
+    <code>org.apache.spark.network.shuffle.<br 
/>NoOpMergedShuffleFileManager</code>
+  </td>
+  <td>
+    Class name of the implementation of MergedShuffleFileManager that manages 
push-based shuffle. This acts as a server side config to disable or enable 
push-based shuffle. By default, push-based shuffle is disabled at the server 
side. <p> To enable push-based shuffle on the server side, set this config to 
<code>org.apache.spark.network.shuffle.RemoteBlockPushResolver</code></p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  
<td><code>spark.shuffle.push.server.minChunkSizeInMergedShuffleFile</code></td>
+  <td><code>2m</code></td>
+  <td>
+    <p> The minimum size of a chunk when dividing a merged shuffle file into 
multiple chunks during push-based shuffle. A merged shuffle file consists of 
multiple small shuffle blocks. Fetching the complete merged shuffle file in a 
single disk I/O increases the memory requirements for both the clients and the 
external shuffle service. Instead, the shuffle service serves the merged file 
in <code>MB-sized chunks</code>.<br /> This configuration controls how big a 
chunk can get. A corresponding index file for each merged shuffle file will be 
generated indicating chunk boundaries. </p>
+    <p> Setting this too high would increase the memory requirements on both 
the clients and the external shuffle service. </p>
+    <p> Setting this too low would increase the overall number of RPC requests 
to external shuffle service unnecessarily.</p>
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.server.mergedIndexCacheSize</code></td>
+  <td><code>100m</code></td>
+  <td>
+    The size of cache in memory which is used in push-based shuffle for 
storing merged index files. This cache is in addition to the one configured via 
<code>spark.shuffle.service.index.cache.size</code>.
+  </td>
+  <td>3.2.0</td>
+</tr>
+</table>
+
+### Client side configuration options
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since 
Version</th></tr>
+<tr>
+  <td><code>spark.shuffle.push.enabled</code></td>
+  <td><code>false</code></td>
+  <td>
+    Set to 'true' to enable push-based shuffle on the client side and works in 
conjunction with the server side flag 
spark.shuffle.server.mergedShuffleFileManagerImpl.
+  </td>
+  <td>3.2.0</td>
+</tr>
+<tr>
+  <td><code>spark.shuffle.push.merge.finalize.timeout</code></td>
+  <td><code>10s</code></td>
+  <td>
+    The amount of time driver waits in seconds, after all mappers have 
finished for a given shuffle map stage, before it sends merge finalize requests 
to remote shuffle services. This allows the shuffle services extra time to 
merge blocks.

Review comment:
       "This allows the shuffle services extra time to merge blocks." -> "This 
gives the shuffle services extra time to merge blocks. Setting this too long 
could potentially lead to performance regression"




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [spark] Ngone51 commented on a change in pull request #33615: [SPARK-36374][SHUFFLE][DOC] Push-based shuffle high level user documentation

Reply via email to