[ https://issues.apache.org/jira/browse/SPARK-38223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17493620#comment-17493620 ]

Dongjoon Hyun commented on SPARK-38223:
---------------------------------------

That question is orthogonal to the underlying storage and resource managers. 
Have you tried running Spark Thrift Server in distributed mode before? If not, 
please check the Spark Thrift Server configurations for the distributed modes 
first. In addition, you will need to set up a Hive Metastore Server to use it 
practically.
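
For reference, a minimal sketch of pointing a Thrift Server at an external Hive 
Metastore Server. The class name is the standard Thrift Server entry point; the 
metastore service address is a hypothetical placeholder, not a value from this 
issue:

{code:bash}
# Assumes a Hive Metastore Service is already reachable inside the cluster
# at the hypothetical address thrift://hive-metastore:9083.
spark-submit \
  --master k8s://$KUBERNETES_SERVICE_HOST \
  --deploy-mode cluster \
  --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 \
  --conf spark.sql.catalogImplementation=hive \
  --conf spark.hadoop.hive.metastore.uris=thrift://hive-metastore:9083 \
  local:///$JAR
{code}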

> PersistentVolumeClaim does not work in clusters with multiple nodes
> -------------------------------------------------------------------
>
>                 Key: SPARK-38223
>                 URL: https://issues.apache.org/jira/browse/SPARK-38223
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.2.1
>         Environment: 
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html#how-it-works]
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-kubernetes-volumes]
> [https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes]
>  
>            Reporter: Zimo Li
>            Priority: Minor
>
> We are using {{spark-submit}} to establish a ThriftServer warehouse on Google 
> Kubernetes Engine. The Spark documentation on running on Kubernetes suggests 
> that we can use 
> [persistentVolumeClaim|https://kubernetes.io/docs/concepts/storage/volumes/#persistentvolumeclaim]
> for Spark applications.
> {code:bash}
> spark-submit \
>   --master k8s://$KUBERNETES_SERVICE_HOST \
>   --deploy-mode cluster \
>   --class $THRIFTSERVER \
>   --conf spark.sql.catalogImplementation=hive \
>   --conf spark.sql.hive.metastore.sharedPrefixes=org.postgresql \
>   --conf spark.hadoop.hive.metastore.schema.verification=false \
>   --conf spark.hadoop.datanucleus.schema.autoCreateTables=true \
>   --conf spark.hadoop.datanucleus.autoCreateSchema=false \
>   --conf spark.sql.parquet.int96RebaseModeInWrite=CORRECTED \
>   --conf spark.hadoop.javax.jdo.option.ConnectionDriverName=org.postgresql.Driver \
>   --conf spark.hadoop.javax.jdo.option.ConnectionUserName=spark \
>   --conf spark.hadoop.javax.jdo.option.ConnectionPassword=Password1! \
>   --conf spark.sql.warehouse.dir=$MOUNT_PATH \
>   --conf spark.kubernetes.driver.pod.name=spark-hive-thriftserver-driver \
>   --conf spark.kubernetes.driver.label.app.kubernetes.io/name=thriftserver \
>   --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.$VOLUME_NAME.options.claimName=$CLAIM_NAME \
>   --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.$VOLUME_NAME.mount.path=$MOUNT_PATH \
>   --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.$VOLUME_NAME.mount.readOnly=false \
>   --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.$VOLUME_NAME.options.claimName=$CLAIM_NAME \
>   --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.$VOLUME_NAME.mount.path=$MOUNT_PATH \
>   --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.$VOLUME_NAME.mount.readOnly=false \
>   --conf spark.kubernetes.executor.deleteOnTermination=true \
>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-kube \
>   --conf spark.kubernetes.container.image=$IMAGE \
>   --conf spark.kubernetes.container.image.pullPolicy=Always \
>   --conf spark.executor.memory=2g \
>   --conf spark.driver.memory=2g \
>   local:///$JAR
> {code}
> When it ran, it created one driver and two executors, each of which wanted to 
> mount the same PVC. Unfortunately, at least one of those pods was scheduled on 
> a different node from the rest. Because GKE attaches a PV to a single node in 
> order to honor the PVCs of the pods running there, the odd pod out was unable 
> to attach the PV:
> {code}
> FailedMount
> Unable to attach or mount volumes: unmounted volumes=[spark-warehouse],
> unattached volumes=[kube-api-access-grfld spark-conf-volume-exec
> spark-warehouse spark-local-dir-1]: timed out waiting for the condition
> {code}
> This happens because GKE, like many cloud providers, does not support the 
> {{ReadWriteMany}} access mode for PVCs/PVs.
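> A quick way to confirm the access mode, sketched with standard {{kubectl}} 
> commands (the claim name comes from the submission above; the executor pod 
> name is a placeholder):
> {code:bash}
> # Show the access modes the claim was actually bound with; on GKE's
> # default storage class this is ["ReadWriteOnce"], not ["ReadWriteMany"].
> kubectl get pvc $CLAIM_NAME -o jsonpath='{.status.accessModes}'
>
> # The FailedMount event above appears on the pod that lost the race:
> kubectl describe pod <executor-pod-name> | grep -A2 FailedMount
> {code}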
> ----
> I suggest changing the documentation so that it does not recommend using PVCs 
> for Thrift Servers.
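> If the docs keep a PVC example, one option for executor scratch space is the 
> on-demand, per-pod claims Spark 3.1+ can create itself, so no two pods ever 
> share a volume; a shared warehouse directory would still need shared storage 
> such as an object store. A sketch, assuming a hypothetical storage class named 
> {{standard}} and mount path {{/data}}:
> {code:bash}
> # With claimName=OnDemand, Spark creates a fresh PVC for each executor.
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=OnDemand \
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.storageClass=standard \
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.sizeLimit=10Gi \
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path=/data \
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.readOnly=false
> {code}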


