[ https://issues.apache.org/jira/browse/SPARK-43585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

roland updated SPARK-43585:
---------------------------
    Description: 
I created a Spark Connect server in a pod using the following YAML.
{code:yaml}
apiVersion: v1
kind: Service
metadata:
  name: spark-connect-svc
  namespace: MY_NAMESPACE
spec:
  clusterIP: None
  selector:
    app: spark-connect-pod
    podType: spark-connect-driver
---
apiVersion: v1
kind: Pod
metadata:
  name: spark-connect-pod
  namespace: MY_NAMESPACE
  labels:
    app: spark-connect-pod
    podType: spark-connect-driver
spec:
  restartPolicy: Never
  containers:
  - name: spark-connect-pod
    image: MY_ECR_REPO/spark-py:3.4-prd
    command:
    - sh
    - -c
    - >-
      /opt/spark/sbin/start-connect-server.sh
      --master k8s://https://MY_API_SERVER:443
      --packages org.apache.spark:spark-connect_2.12:3.4.0
      --conf spark.kubernetes.executor.limit.cores=1.0
      --conf spark.kubernetes.executor.request.cores=1.0
      --conf spark.executor.cores=1
      --conf spark.executor.memory=6G
      --conf spark.kubernetes.container.image=MY_ECR_REPO/spark:3.4-prd
      --conf spark.kubernetes.executor.podNamePrefix=spark-connect
      --num-executors=10
      --conf spark.kubernetes.driver.pod.name=spark-connect-pod
      --conf spark.kubernetes.namespace=MY_NAMESPACE
      && tail -100f /opt/spark/logs/spark--org.apache.spark.sql.connect.service.SparkConnectServer-1-spark-connect-pod.out
{code}
The Spark Connect server launches successfully, and I can connect to it with pyspark.
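For reference, a connection can be made from a stock PySpark 3.4 client; the host name below assumes the headless Service defined in the YAML above resolves inside the cluster:

```shell
# Requires a PySpark 3.4+ client with the Connect extras installed,
# e.g. pip install "pyspark[connect]==3.4.0".
# "spark-connect-svc" is the headless Service from the pod spec above.
pyspark --remote "sc://spark-connect-svc"
```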

 

But when I add a Hive metastore config on the client, it does not take effect: {{show databases}} only returns the {{default}} database.

 
{code:python}
>>> spark = SparkSession.builder \
...     .remote("sc://spark-connect-svc") \
...     .config("spark.hive.metastore.uris", "thrift://hive-metastore:9083") \
...     .getOrCreate()
>>> spark.sql("show databases").show()
+---------+
|namespace|
+---------+
|  default|
+---------+
{code}
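Seeing only {{default}} suggests the server is using the in-memory catalog rather than Hive. A likely workaround (a sketch, not verified against this cluster) is to pass the Hive settings when the Connect server itself starts, since these are server-side settings that a remote client's {{config()}} call may not be able to change:

```shell
# Sketch: add the Hive settings to the server launch command in the pod
# spec, keeping the other --conf flags unchanged. The metastore URI below
# mirrors the one attempted from the client.
/opt/spark/sbin/start-connect-server.sh \
  --conf spark.sql.catalogImplementation=hive \
  --conf spark.hive.metastore.uris=thrift://hive-metastore:9083
```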
 

 

> Spark Connect client cannot read from Hive metastore
> ----------------------------------------------------
>
>                 Key: SPARK-43585
>                 URL: https://issues.apache.org/jira/browse/SPARK-43585
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.4.0
>            Reporter: roland
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
