wangchao732 opened a new issue, #15359:
URL: https://github.com/apache/dolphinscheduler/issues/15359

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### What happened
   
   When trying to run a Spark on K8s task in DolphinScheduler, the task fails with: Error: Master must start with yarn, spark, mesos, or local
   
   ### What you expected to happen
   
   export KUBECONFIG=/tmp/dolphinscheduler/exec/process/default/11930573660864/11985906373952_1/9/9/config
   ${SPARK_HOME}/bin/spark-submit --master k8s://https://192.168.11.107:6443/ --deploy-mode cluster --class uml.tech.spark.SparkApp --conf spark.driver.cores=1 --conf spark.driver.memory=512M --conf spark.executor.instances=2 --conf spark.executor.cores=2 --conf spark.executor.memory=2G --name TE --conf spark.kubernetes.driver.label.dolphinscheduler-label=9_9 --conf spark.kubernetes.namespace=spark-job file:/dolphinscheduler/default/resources/spark-job/uml-ne-gblogs-to-tdengine-1.0.0/lib/uml-ne-gblogs-to-tdengine-1.0.0.jar 20220311 200
   [INFO] 2023-12-21 11:00:11.546 +0800 - Executing shell command : sudo -u dolphinscheduler -i /tmp/dolphinscheduler/exec/process/default/11930573660864/11985906373952_1/9/9/9_9.sh
   [INFO] 2023-12-21 11:00:11.565 +0800 - process start, process id is: 14145
   [INFO] 2023-12-21 11:00:12.565 +0800 -  -> 
        WARNING: User-defined SPARK_HOME (/opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/lib/spark) overrides detected (/opt/cloudera/parcels/CDH/lib/spark).
        WARNING: Running spark-class from user-defined location.
   [INFO] 2023-12-21 11:00:14.567 +0800 -  -> 
        Error: Master must start with yarn, spark, mesos, or local
        Run with --help for usage help or --verbose for debug output
   [ERROR] 2023-12-21 11:00:27.571 +0800 - Handle pod log error
   java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: The driver pod does not exist.
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.run(AbstractCommandExecutor.java:211)
        at org.apache.dolphinscheduler.plugin.task.api.AbstractYarnTask.handle(AbstractYarnTask.java:52)
        at org.apache.dolphinscheduler.server.worker.runner.DefaultWorkerDelayTaskExecuteRunnable.executeTask(DefaultWorkerDelayTaskExecuteRunnable.java:57)
        at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecuteRunnable.run(WorkerTaskExecuteRunnable.java:175)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
   Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The driver pod does not exist.
        at org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.lambda$collectPodLogIfNeeded$0(AbstractCommandExecutor.java:287)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        ... 3 common frames omitted
   Caused by: java.lang.RuntimeException: The driver pod does not exist.
        at org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.lambda$collectPodLogIfNeeded$0(AbstractCommandExecutor.java:277)
        ... 7 common frames omitted
   [INFO] 2023-12-21 11:00:27.572 +0800 - process has exited. execute path:/tmp/dolphinscheduler/exec/process/default/11930573660864/11985906373952_1/9/9, processId:14145 ,exitStatusCode:1 ,processWaitForStatus:true ,processExitValue:1
   [INFO] 2023-12-21 11:00:27.572 +0800 - Start finding appId in /opt/dolphinscheduler/worker-server/logs/20231221/11985906373952/1/9/9.log, fetch way: log
   [INFO] 2023-12-21 11:00:27.575 +0800 - ***********************************************************************************************
   [INFO] 2023-12-21 11:00:27.575 +0800 - *********************************  Finalize task instance  ************************************
   [INFO] 2023-12-21 11:00:27.575 +0800 - ***********************************************************************************************
   [INFO] 2023-12-21 11:00:27.578 +0800 - Upload output files: [] successfully
   [INFO] 2023-12-21 11:00:27.579 +0800 - Send task execute status: FAILURE to master : 192.168.12.26:1234
   [INFO] 2023-12-21 11:00:27.579 +0800 - Remove the current task execute context from worker cache
   [INFO] 2023-12-21 11:00:27.579 +0800 - The current execute mode isn't develop mode, will clear the task execute file: /tmp/dolphinscheduler/exec/process/default/11930573660864/11985906373952_1/9/9
   [INFO] 2023-12-21 11:00:27.786 +0800 - Success clear the task execute file: /tmp/dolphinscheduler/exec/process/default/11930573660864/11985906373952_1/9/9
   [INFO] 2023-12-21 11:00:27.787 +0800 - FINALIZE_SESSION
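
A possible lead, not confirmed: the log shows `SPARK_HOME` resolving to the Spark bundled with CDH 5.9.0, and the error lists only `yarn, spark, mesos, or local` as accepted masters. Native Kubernetes support (`k8s://` master URLs) was added in Spark 2.3.0, so an older `spark-submit` would reject this command before any driver pod is created. The sketch below is only a way to check that assumption on the worker. The function name `spark_supports_k8s` is hypothetical and is not part of Spark or DolphinScheduler.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Spark or DolphinScheduler command):
# prints "yes" if the given Spark version is 2.3 or newer, i.e. a release
# that understands k8s:// master URLs, and "no" otherwise.
spark_supports_k8s() {
  local version="$1"
  local major minor
  major="${version%%.*}"   # text before the first dot, e.g. "1" from "1.6.0"
  minor="${version#*.}"    # drop the major part, e.g. "6.0"
  minor="${minor%%.*}"     # keep only the minor number, e.g. "6"
  if [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 3 ]; }; then
    echo "yes"
  else
    echo "no"
  fi
}

# Example (assumes the version banner printed by `spark-submit --version`):
#   v=$("${SPARK_HOME}/bin/spark-submit" --version 2>&1 \
#         | grep -oE 'version [0-9]+\.[0-9]+\.[0-9]+' | head -n1 | cut -d' ' -f2)
#   spark_supports_k8s "$v"
```

If this prints `no` for the worker's Spark, pointing `SPARK_HOME` in the task environment at a Spark 2.3+ installation would be the next thing to try.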
   
   ### How to reproduce
   
   Add the K8s cluster in the DolphinScheduler console, then create a new workflow:
   
![image](https://github.com/apache/dolphinscheduler/assets/19440083/e54e6cdb-72b1-4a25-8f93-a63114c2088b)
   
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   3.2.x
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   

