This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 4db378fae30 [SPARK-44745][DOCS][K8S] Document shuffle data recovery from the remounted K8s PVCs
4db378fae30 is described below

commit 4db378fae30733cbd2be41e95a3cd8ad2184e06f
Author: Dongjoon Hyun <dongj...@apache.org>
AuthorDate: Wed Aug 9 15:25:33 2023 -0700

    [SPARK-44745][DOCS][K8S] Document shuffle data recovery from the remounted K8s PVCs
    
    ### What changes were proposed in this pull request?
    
    This PR aims to document an example of shuffle data recovery configuration from the remounted K8s PVCs.
    
    ### Why are the changes needed?
    
    This will help the users use this feature more easily.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    Manual review because this is a doc-only change.
    
    ![Screenshot 2023-08-09 at 1 39 48 PM](https://github.com/apache/spark/assets/9700541/8cc7240b-570d-4c2e-b90a-54795c18df0a)
    
    ```
    $ kubectl logs -f xxx-exec-16 | grep Kube
    ...
    23/08/09 21:09:21 INFO KubernetesLocalDiskShuffleExecutorComponents: Try to recover shuffle data.
    23/08/09 21:09:21 INFO KubernetesLocalDiskShuffleExecutorComponents: Found 192 files
    23/08/09 21:09:21 INFO KubernetesLocalDiskShuffleExecutorComponents: Try to recover /data/spark-x/executor-x/blockmgr-41a810ea-9503-447b-afc7-1cb104cd03cf/11/shuffle_0_11160_0.data
    23/08/09 21:09:21 INFO KubernetesLocalDiskShuffleExecutorComponents: Try to recover /data/spark-x/executor-x/blockmgr-41a810ea-9503-447b-afc7-1cb104cd03cf/0e/shuffle_0_10063_0.data
    23/08/09 21:09:21 INFO KubernetesLocalDiskShuffleExecutorComponents: Try to recover /data/spark-x/executor-x/blockmgr-41a810ea-9503-447b-afc7-1cb104cd03cf/0e/shuffle_0_10283_0.data
    23/08/09 21:09:21 INFO KubernetesLocalDiskShuffleExecutorComponents: Ignore a non-shuffle block file.
    ```
    
    Closes #42417 from dongjoon-hyun/SPARK-44745.
    
    Authored-by: Dongjoon Hyun <dongj...@apache.org>
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
---
 docs/running-on-kubernetes.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md
index d3953592c4e..707a76196f3 100644
--- a/docs/running-on-kubernetes.md
+++ b/docs/running-on-kubernetes.md
@@ -394,6 +394,13 @@ spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.
 spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.readOnly=false
 ```
 
+To enable shuffle data recovery feature via the built-in `KubernetesLocalDiskShuffleDataIO` plugin, we need to have the followings. You may want to enable `spark.kubernetes.driver.waitToReusePersistentVolumeClaim` additionally.
+
+```
+spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.path=/data/spark-x/executor-x
+spark.shuffle.sort.io.plugin.class=org.apache.spark.shuffle.KubernetesLocalDiskShuffleDataIO
+```
+
 If no volume is set as local storage, Spark uses temporary scratch space to spill data to disk during shuffles and other operations. When using Kubernetes as the resource manager the pods will be created with an [emptyDir](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir) volume mounted for each directory listed in `spark.local.dir` or the environment variable `SPARK_LOCAL_DIRS` .  If no directories are explicitly specified then a default directory is created and configured [...]
 
 `emptyDir` volumes use the ephemeral storage feature of Kubernetes and do not persist beyond the life of the pod.
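
As an illustration of the two settings added by this patch, the following is a minimal Python sketch that assembles them into `--conf` arguments for `spark-submit`. The helper names `shuffle_recovery_conf` and `to_submit_args` are hypothetical and not part of Spark; the configuration keys, the volume name `spark-local-dir-1`, and the mount path `/data/spark-x/executor-x` are taken from the diff above. The optional `spark.kubernetes.driver.waitToReusePersistentVolumeClaim` flag is included only when requested, and the other PVC settings a real job needs (claim name, storage class, size limit) are deliberately left out.

```python
# Prefix shared by the executor PVC volume settings shown in the diff.
PVC_PREFIX = "spark.kubernetes.executor.volumes.persistentVolumeClaim"

# Plugin class named in the documentation added by this commit.
RECOVERY_PLUGIN = "org.apache.spark.shuffle.KubernetesLocalDiskShuffleDataIO"


def shuffle_recovery_conf(volume_name="spark-local-dir-1",
                          mount_path="/data/spark-x/executor-x",
                          wait_to_reuse=False):
    """Return the settings from the patch as a dict (hypothetical helper)."""
    conf = {
        f"{PVC_PREFIX}.{volume_name}.mount.path": mount_path,
        "spark.shuffle.sort.io.plugin.class": RECOVERY_PLUGIN,
    }
    if wait_to_reuse:
        # Optional setting that the patch suggests enabling additionally.
        conf["spark.kubernetes.driver.waitToReusePersistentVolumeClaim"] = "true"
    return conf


def to_submit_args(conf):
    """Flatten a conf dict into a list of `--conf key=value` arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the arguments one pair per line for copy-and-paste use.
    args = to_submit_args(shuffle_recovery_conf(wait_to_reuse=True))
    for flag, setting in zip(args[::2], args[1::2]):
        print(flag, setting)
```

Keeping the settings in a dict makes it easy to merge them with the rest of a job's configuration before rendering the command line.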


