[ https://issues.apache.org/jira/browse/SPARK-37537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Petri resolved SPARK-37537.
---------------------------
Resolution: Not A Problem
> Spark 3.2.0 driver pod does not mount checkpoint filesystem from Kubernetes
> PVC
> -------------------------------------------------------------------------------
>
> Key: SPARK-37537
> URL: https://issues.apache.org/jira/browse/SPARK-37537
> Project: Spark
> Issue Type: Bug
> Components: Spark Submit
> Affects Versions: 3.2.0
> Reporter: Petri
> Priority: Major
>
> I have a Spark 3.2.0 driver running in a Kubernetes pod in client mode, and
> the following configs have been defined in spark-submit:
> {code:java}
> --deploy-mode client
> --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.glustervol.mount.path=/mnt/distributedDisk
> --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.glustervol.readOnly=false
> --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.glustervol.options.claimName=lolastreamingapp-conf
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.glustervol.mount.path=/mnt/distributedDisk
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.glustervol.readOnly=false
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.glustervol.options.claimName=lolastreamingapp
> {code}
> When the driver pod starts, it cannot access the filesystem mounted from the
> GlusterFS PVC. Describing the pod shows that the driver pod has not mounted
> the PVC, and describing the PVC also shows that it is not mounted.
> This worked with Spark 2.4.x but does not work with Spark 3.2.0.
> The only notable change between our Spark 2.4.x and 3.2.0 setups is that
> with 2.4.x we used deploy-mode cluster and with 3.2.0 we use deploy-mode
> client.
>
> Because the filesystem used for checkpointing is not mounted properly, we get
> the following kind of error in our application:
> {code:java}
> java.io.FileNotFoundException: File /mnt/distributedDisk/SE/LolaStreamingApp/1.0.0/1468589949 does not exist
> at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:779) ~[hadoop-client-api-3.3.1.jar:?]
> at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:1100) ~[hadoop-client-api-3.3.1.jar:?]
> at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:769) ~[hadoop-client-api-3.3.1.jar:?]
> at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:462) ~[hadoop-client-api-3.3.1.jar:?]
> at org.apache.spark.streaming.StreamingContext.checkpoint(StreamingContext.scala:240) ~[spark-streaming_2.12-3.2.0.jar:3.2.0]
> at org.apache.spark.streaming.api.java.JavaStreamingContext.checkpoint(JavaStreamingContext.scala:509) ~[spark-streaming_2.12-3.2.0.jar:3.2.0]
> {code}
>
>
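This message does not say why the issue was resolved as Not A Problem. One plausible reading, offered here only as an assumption, is that in client mode spark-submit does not create the driver pod, so the spark.kubernetes.driver.volumes.* settings have no pod to apply to and the PVC would have to be declared in the spec of whatever pod runs the driver. Whatever the cause, the stack trace shows the checkpoint path being resolved through RawLocalFileSystem, so a plain local-filesystem check before calling checkpoint() can surface a missing mount earlier and with a clearer message. The sketch below is a minimal illustration of that idea; the class name CheckpointDirCheck and the method problemWith are hypothetical, and the default path is simply the mount path from the report.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of Spark: verifies that the directory meant to
// hold streaming checkpoints is present and writable on the local filesystem.
public class CheckpointDirCheck {

    // Returns an empty Optional when the directory looks usable,
    // otherwise a short description of what is wrong with it.
    static Optional<String> problemWith(Path dir) {
        if (!Files.exists(dir)) {
            return Optional.of("path does not exist (volume not mounted?): " + dir);
        }
        if (!Files.isDirectory(dir)) {
            return Optional.of("path is not a directory: " + dir);
        }
        if (!Files.isWritable(dir)) {
            return Optional.of("path is not writable: " + dir);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Default taken from the mount.path value in the report; override via args[0].
        Path dir = Paths.get(args.length > 0 ? args[0] : "/mnt/distributedDisk");
        Optional<String> problem = problemWith(dir);
        if (problem.isPresent()) {
            System.err.println("Checkpoint volume check failed: " + problem.get());
            System.exit(1);
        }
        System.out.println("Checkpoint volume looks usable: " + dir);
    }
}
```

Running such a check at the start of the driver, before JavaStreamingContext.checkpoint(...) is called, would fail fast on the mount root rather than on a nested checkpoint subdirectory, which makes a missing volume easier to tell apart from a missing application directory.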
--
This message was sent by Atlassian Jira
(v8.20.1#820001)