sunpe commented on PR #33154:
URL: https://github.com/apache/spark/pull/33154#issuecomment-1251836832
> Hello @sunpe, thank you for your very fast answer.
>
> Please let me give you some more context: I am running Spark v3.3.0 on K8s using the [Spark on K8s operator](https://github.com/GoogleCloudPlatform/spark-on-k8s-operator). My Spark driver is a Spring Boot application, very similar to the situation you described above, but instead of starting a web server, I subscribe to a RabbitMQ queue. Basically, I would like to have the `SparkSession` available as a `@Bean` in my services.
>
> I tried the following:
>
> 1. Activate `--verbose`
> 2. Rebuild Spark from source (v3.3.0), adding a simple `logWarning` before `SparkContext.getActive.foreach(_.stop())`
>
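As an aside, the instrumentation in step 2 can be mimicked with a small self-contained sketch showing where the extra `logWarning` sits relative to `SparkContext.getActive.foreach(_.stop())` (`FakeSparkContext` is a stand-in for the real `SparkContext`; this is not Spark source):

```java
import java.util.Optional;

// Self-contained sketch of the instrumentation in step 2: print a warning
// immediately before the active context is stopped, mirroring the shape of
// `SparkContext.getActive.foreach(_.stop())`. FakeSparkContext is a stand-in
// for the real SparkContext; this is not Spark source.
public class ShutdownLogSketch {
    static class FakeSparkContext {
        boolean stopped = false;
        void stop() { stopped = true; }
    }

    static void stopActiveContext(Optional<FakeSparkContext> active) {
        active.ifPresent(ctx -> {
            // The logWarning added in step 2 goes right here:
            System.err.println("Warning: SparkContext is going to be stopped!");
            ctx.stop();
        });
    }
}
```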
> Here are my findings:
>
> Spark Operator pod submits the application using this command:
>
> ```
> submission.go:65] spark-submit arguments: [/opt/spark/bin/spark-submit
> --class org.test.myapplication.MyApplication
> --master k8s://https://10.32.0.1:443
> --deploy-mode cluster
> --conf spark.kubernetes.namespace=my-app-namespace
> --conf spark.app.name=my-application
> --conf spark.kubernetes.driver.pod.name=my-application-driver
> --conf spark.kubernetes.container.image=repo/my-application:latest
> --conf spark.kubernetes.container.image.pullPolicy=Always
> --conf spark.kubernetes.submission.waitAppCompletion=false
> --conf spark.kubernetes.driver.label.sparkoperator.k8s.io/app-name=my-application
> --conf spark.kubernetes.driver.label.sparkoperator.k8s.io/launched-by-spark-operator=true
> --conf spark.kubernetes.driver.label.sparkoperator.k8s.io/submission-id=2b225916-cc12-47cc-a898-8549301fdce4
> --conf spark.driver.cores=1
> --conf spark.kubernetes.driver.request.cores=200m
> --conf spark.driver.memory=512m
> --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-operator-spark
> --conf spark.kubernetes.executor.label.sparkoperator.k8s.io/app-name=my-application
> --conf spark.kubernetes.executor.label.sparkoperator.k8s.io/launched-by-spark-operator=true
> --conf spark.kubernetes.executor.label.sparkoperator.k8s.io/submission-id=2b225916-cc12-47cc-a898-8549301fdce4
> --conf spark.executor.instances=1
> --conf spark.executor.cores=1
> --conf spark.executor.memory=512m
> --conf spark.kubernetes.executor.label.app.kubernetes.io/instance=my-app-namespace-my-application
> local:///opt/spark/work-dir/my-application-0.0.1-SNAPSHOT-all.jar]
> ```
>
> When the driver pod starts, I see the following logs:
>
> ```
> (spark.driver.memory,512m)
> (spark.driver.port,7078)
> (spark.executor.cores,1)
> (spark.executor.instances,1)
> (spark.executor.memory,512m)
> […]
> (spark.kubernetes.resource.type,java)
> (spark.kubernetes.submission.waitAppCompletion,false)
> (spark.kubernetes.submitInDriver,true)
> (spark.master,k8s://https://10.32.0.1:443)
> ```
>
> As you can see, `args.master` starts with `k8s`. Once the application has started and the `main()` thread is released, my custom log is printed, the `SparkContext` is stopped, and the executor shuts down. As I understand from the source code, this happens because my primary resource is not `spark-shell`, and the main class is neither `org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver` nor `org.apache.spark.sql.hive.thriftserver.HiveThriftServer2`. This produces the following logs:
>
> ```
> 2022-09-19 13:05:18.621 INFO 30 --- [ main] o.s.a.r.c.CachingConnectionFactory : Created new connection: rabbitConnectionFactory#1734b1a:0/SimpleConnection@66859ea9 [delegate=amqp://user@100.64.2.141:5672/, localPort= 49038]
> 2022-09-19 13:05:18.850 INFO 30 --- [ main] MyApplication : Started MyApplication in 18.787 seconds (JVM running for 23.051)
> Warning: SparkContext is going to be stopped!
> 2022-09-19 13:05:18.903 INFO 30 --- [ main] o.s.j.s.AbstractConnector : Stopped Spark@45d389f2{HTTP/1.1, (http/1.1)}{0.0.0.0:4040}
> 2022-09-19 13:05:18.914 INFO 30 --- [ main] o.a.s.u.SparkUI : Stopped Spark web UI at http://my-app-ee12f78355d97dc2-driver-svc.my-app-namespace.svc:4040
> 2022-09-19 13:05:18.927 INFO 30 --- [ main] .s.c.k.KubernetesClusterSchedulerBackend : Shutting down all executors
> 2022-09-19 13:05:18.928 INFO 30 --- [rainedScheduler] chedulerBackend$KubernetesDriverEndpoint : Asking each executor to shut down
> 2022-09-19 13:05:18.938 DEBUG 30 --- [ rpc-server-4-1] o.a.s.n.u.NettyLogger : [id: 0x07e91ece, L:/100.64.15.15:7078 - R:/100.64.15.215:40996] WRITE: MessageWithHeader [headerLength: 13, bodyLength: 198]
> 2022-09-19 13:05:18.939 DEBUG 30 --- [ rpc-server-4-1] o.a.s.n.u.NettyLogger : [id: 0x07e91ece, L:/100.64.15.15:7078 - R:/100.64.15.215:40996]
> ```
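If I read the quoted check correctly, the decision to stop the context after `main()` returns boils down to a predicate on the master URL, the primary resource, and the main class. A minimal self-contained sketch of that logic (class and method names here are illustrative, not the actual Spark identifiers):

```java
// Sketch of the shutdown guard described above: spark-submit stops the active
// SparkContext after the user's main() returns only when running against a
// k8s master and the application is neither the shell, the SQL CLI, nor the
// Thrift server. This models the quoted condition; it is not Spark source.
public class K8sStopGuard {
    private static final String SQL_CLI_CLASS =
        "org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver";
    private static final String THRIFT_SERVER_CLASS =
        "org.apache.spark.sql.hive.thriftserver.HiveThriftServer2";

    // True when the post-main shutdown path would stop the SparkContext.
    public static boolean willStopContext(String master,
                                          String primaryResource,
                                          String mainClass) {
        return master.startsWith("k8s")
                && !"spark-shell".equals(primaryResource)
                && !SQL_CLI_CLASS.equals(mainClass)
                && !THRIFT_SERVER_CLASS.equals(mainClass);
    }
}
```

With the values from the submission above (`spark.master=k8s://https://10.32.0.1:443`, a `local://` jar as primary resource, and `org.test.myapplication.MyApplication` as main class), the predicate is true, which matches the behaviour observed in the logs.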