Oliver Koeth created SPARK-39546:
------------------------------------

             Summary: Respect port definitions in K8S pod templates for both 
driver and executor
                 Key: SPARK-39546
                 URL: https://issues.apache.org/jira/browse/SPARK-39546
             Project: Spark
          Issue Type: Improvement
          Components: Kubernetes
    Affects Versions: 3.3.0
            Reporter: Oliver Koeth


*Description:*

Spark on K8S allows opening additional ports for custom purposes on the driver 
pod via the pod template, but ignores the port specification in the executor 
pod template. Port specifications from the pod template should be preserved 
(and extended with Spark's required ports) for both drivers and executors.

*Scenario:*

I want to run functionality in the executor that exposes data on an additional 
port. In my case, this is monitoring data exposed by Spark's JMX metrics sink 
via the JMX Prometheus exporter Java agent 
(https://github.com/prometheus/jmx_exporter). The Java agent opens an extra 
port inside the container, but for Prometheus to detect and scrape the port, it 
must be exposed in the K8S pod resource.
(More background if desired: this seems to be the "classic" Spark 2 way to 
expose Prometheus metrics. Spark 3 introduced a native equivalent servlet for 
the driver, but for the executor only a rather limited set of metrics is 
forwarded via the driver, and that set follows a completely different naming 
scheme. So the JMX + exporter approach still turns out to be more useful for 
me, even in Spark 3.)

*Expected behavior:*

I add the following to my pod template to expose the extra port opened by the 
JMX exporter Java agent:

spec:
  containers:
  - ...
    ports:
    - containerPort: 8090
      name: jmx-prometheus
      protocol: TCP

*Observed behavior:*

The port is exposed for driver pods but not for executor pods.


*Corresponding code:*

Driver pod creation just adds ports:
[https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala]
 (currently line 115)

val driverContainer = new ContainerBuilder(pod.container)
...
  .addNewPort()
...
  .addNewPort()

while executor pod creation replaces the ports:
[https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala]
 (currently line 211)

val executorContainer = new ContainerBuilder(pod.container)
...
  .withPorts(requiredPorts.asJava)


The current handling is inconsistent and unnecessarily limiting. Executor pod 
creation could just as well preserve the ports from the template and add the 
extra required ports, as driver pod creation already does.


*Workaround:*

It is possible to work around this limitation by adding a full sidecar 
container to the executor pod spec which declares the port; sidecar containers 
are left unchanged by pod template handling.
Since all containers in a pod share the same network namespace, it does not 
matter which container actually declares the port.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
