moelhoussein opened a new issue, #6647:
URL: https://github.com/apache/kyuubi/issues/6647

   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   
   
   ### Search before asking
   
   - [X] I have searched in the 
[issues](https://github.com/apache/kyuubi/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Describe the bug
   
   # Current setup
   I am running Kyuubi 1.9.1 on AKS. Clients submit `batch` jobs using the 
Kyuubi API. I have set the `KUBECONFIG` environment variable to the path of a 
context file containing the contexts for my Spark worker clusters (I have 
dedicated K8s clusters for Spark jobs).
   
   ## Server config
   The Kyuubi server configuration is as follows:
   ```
       kyuubi.kubernetes.isost10.spark.authenticate.oauthTokenFile=/etc/isost10/token
       kyuubi.kubernetes.isost9.spark.authenticate.oauthTokenFile=/etc/isost9/token
   ```
   
   ## Sample client request
   Users are able to post the following:
   ```JSON
   {
       "resource": "local:///opt/spark/examples/jars/spark-examples_2.12-3.4.1.jar",
       "name": "sample-job",
       "batchType": "SPARK",
       "className": "org.apache.spark.examples.SparkPi",
       "conf": {
           "spark.kubernetes.context": "isost9",
           "spark.master": "k8s://cluster:443",
           "spark.kubernetes.container.image": "acr.io/spark:3.4.1-5250722",
           "spark.kubernetes.namespace": "spark",
           "spark.kubernetes.serviceAccountName": "spark",
           "spark.kubernetes.driver.node.selector.label": "nodepool1",
           "spark.kubernetes.executor.node.selector.label": "nodepool1",
           "spark.executor.memory": "4G",
           "spark.executor.cores": "2",
           "spark.driver.memory": "4G",
           "spark.driver.cores": "2"
       }
   }
   ```
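
For reference, the sample request above could be built by a client roughly as follows. This is only a sketch: the server URL `http://kyuubi.example:10099` and the helper name `build_batch_request` are placeholders, the `/api/v1/batches` path is assumed to be the batch endpoint of the Kyuubi REST API, and authentication and most of the `conf` entries are omitted. The request is built but not sent.

```python
import json
import urllib.request


def build_batch_request(server_url: str, context: str, master: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits a Spark batch job.

    `server_url` is a placeholder for the Kyuubi REST endpoint; the
    `/api/v1/batches` path is an assumption about the deployed REST API.
    """
    payload = {
        "resource": "local:///opt/spark/examples/jars/spark-examples_2.12-3.4.1.jar",
        "name": "sample-job",
        "batchType": "SPARK",
        "className": "org.apache.spark.examples.SparkPi",
        "conf": {
            # The client currently has to know both the context name
            # and the API server address of the target cluster.
            "spark.kubernetes.context": context,
            "spark.master": master,
            "spark.kubernetes.namespace": "spark",
        },
    }
    return urllib.request.Request(
        url=f"{server_url.rstrip('/')}/api/v1/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_batch_request("http://kyuubi.example:10099", "isost9", "k8s://cluster:443")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually submit the job.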
   
   # The problem
   The issue occurs when we set the master on the Kyuubi side, per context:
   ```
   kyuubi.kubernetes.<context>.master.address=k8s://cluster:443
   kyuubi.kubernetes.<context>.<namespace>.authenticate.oauthTokenFile
   ```
   
   The `batch` job gets scheduled in the remote cluster, but Kyuubi is unable 
to retrieve its status. The logs contain:
   ```
   ERROR OkHttp http://k8s/... io.fabric8.kubernetes.client.informers.impl.cache.Reflector: listSyncAndWatch failed for v1/namespaces/spark/pods, will stop
   java.util.concurrent.CompletionException: java.net.UnknownHostException: k8s
           at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
           at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
           at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:957)
           at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940)
           at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
           at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
           at io.fabric8.kubernetes.client.okhttp.OkHttpClientImpl$1.onFailure(OkHttpClientImpl.java:330)
           at okhttp3.RealCall$AsyncCall.execute(RealCall.java:211)
           at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:750)
   Caused by: java.net.UnknownHostException: k8s
   ```
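
One possible reading of the `UnknownHostException: k8s` above (unconfirmed, and not taken from the Kyuubi or fabric8 source) is that the `k8s://` prefix of the configured master address is not stripped before an HTTP scheme is prepended. The sketch below only illustrates how a generic URL parser would then see `k8s` as the host; the helper `host_seen_by_http_client` and the `http://` default are hypothetical.

```python
from urllib.parse import urlsplit


def host_seen_by_http_client(master: str, strip_prefix: bool) -> str:
    """Return the host a URL parser extracts from a Spark-style master address.

    Hypothetical helper for illustration only; it does not reproduce
    Kyuubi's or fabric8's actual address handling.
    """
    address = master
    if strip_prefix and address.startswith("k8s://"):
        address = address[len("k8s://"):]
    # Assumption: a scheme is prepended when none is recognised.
    if not address.startswith(("http://", "https://")):
        address = "http://" + address
    return urlsplit(address).hostname


# Without stripping, "http://k8s://cluster:443" parses with host "k8s".
print(host_seen_by_http_client("k8s://cluster:443", strip_prefix=False))
# With stripping, the host is the intended API server name.
print(host_seen_by_http_client("k8s://cluster:443", strip_prefix=True))
```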
   
   Additionally, clients are *REQUIRED* to set the Spark master.
   
   Is it supported to set the Spark master per context, so that clients don't 
have to know the API server IP of the backend cluster?
   
   
   ### Affects Version(s)
   
   1.9.1
   
   ### Kyuubi Server Log Output
   
   _No response_
   
   ### Kyuubi Engine Log Output
   
   _No response_
   
   ### Kyuubi Server Configurations
   
   _No response_
   
   ### Kyuubi Engine Configurations
   
   _No response_
   
   ### Additional context
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes. I would be willing to submit a PR with guidance from the Kyuubi 
community to fix.
   - [ ] No. I cannot submit a PR at this time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

