[
https://issues.apache.org/jira/browse/SPARK-25162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
James Carter updated SPARK-25162:
---------------------------------
Description:
When creating the Kubernetes scheduler 'in-cluster' using client mode, the value
for _spark.driver.host_ can be derived from the IP address of the driver pod.
I observed that the value of _spark.driver.host_ defaulted to the value of
_spark.kubernetes.driver.pod.name_, which is not a valid hostname. This caused
the executors to fail to establish a connection back to the driver.
As a workaround, in my configuration I pass the driver's pod name _and_ the
driver's IP address to ensure that executors can establish a connection with
the driver:
_spark.kubernetes.driver.pod.name_ := env.valueFrom.fieldRef.fieldPath: metadata.name
_spark.driver.host_ := env.valueFrom.fieldRef.fieldPath: status.podIP
e.g.
Deployment:
{noformat}
env:
  - name: DRIVER_POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: DRIVER_POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
{noformat}
Application Properties:
{noformat}
config[spark.kubernetes.driver.pod.name]: ${DRIVER_POD_NAME}
config[spark.driver.host]: ${DRIVER_POD_IP}
{noformat}
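For reference, a minimal sketch (not from the original report) of how those two
environment variables can be fed into the SparkConf when the application builds
its context in client mode. The env var names match the Deployment example
above; the master URL is an assumption for an in-cluster setup:
{code:java}
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Sketch only: read the Downward API values injected by the Deployment above
// and set both properties before the context is created.
val conf = new SparkConf()
  .setMaster("k8s://https://kubernetes.default.svc") // assumed in-cluster API server address
  .set("spark.kubernetes.driver.pod.name", sys.env("DRIVER_POD_NAME"))
  .set("spark.driver.host", sys.env("DRIVER_POD_IP")) // workaround: pod IP, not pod name

val spark = SparkSession.builder().config(conf).getOrCreate()
{code}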
BasicExecutorFeatureStep.scala:
{code:java}
private val driverUrl = RpcEndpointAddress(
  kubernetesConf.get("spark.driver.host"),
  kubernetesConf.sparkConf.getInt("spark.driver.port", DEFAULT_DRIVER_PORT),
  CoarseGrainedSchedulerBackend.ENDPOINT_NAME).toString
{code}
Ideally only _spark.kubernetes.driver.pod.name_ would need to be provided in this
deployment scenario.
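A hypothetical sketch of that ideal behaviour: when running in-cluster in client
mode, Spark could look up the driver pod by name and default _spark.driver.host_
to the pod's IP (using the fabric8 Kubernetes client that Spark already depends
on). The surrounding control flow is illustrative, not the actual Spark code:
{code:java}
import io.fabric8.kubernetes.client.DefaultKubernetesClient

// Illustrative only: if spark.driver.host was not set explicitly, resolve it
// from the driver pod's status.podIP instead of falling back to the pod name.
if (!sparkConf.contains("spark.driver.host")) {
  val client = new DefaultKubernetesClient()
  val podName = sparkConf.get("spark.kubernetes.driver.pod.name")
  val podIp = client.pods().withName(podName).get().getStatus.getPodIP
  sparkConf.set("spark.driver.host", podIp)
}
{code}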
was:
When creating the Kubernetes scheduler 'in-cluster' using client mode, the value
for _spark.driver.host_ can be derived from the IP address of the driver pod.
I observed that the value of _spark.driver.host_ defaulted to the value of
_spark.kubernetes.driver.pod.name_, which is not a valid hostname. This caused
the executors to fail to establish a connection back to the driver.
As a workaround, in my configuration I pass the driver's pod name _and_ the
driver's IP address to ensure that executors can establish a connection with
the driver:
_spark.kubernetes.driver.pod.name_ := env.valueFrom.fieldRef.fieldPath: metadata.name
_spark.driver.host_ := env.valueFrom.fieldRef.fieldPath: status.podIP
Ideally only _spark.kubernetes.driver.pod.name_ would need to be provided in this
deployment scenario.
> Kubernetes 'in-cluster' client mode and value spark.driver.host
> ---------------------------------------------------------------
>
> Key: SPARK-25162
> URL: https://issues.apache.org/jira/browse/SPARK-25162
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.4.0
> Environment: A Java program, deployed to Kubernetes, that establishes
> a SparkContext in client mode.
> Not using spark-submit.
> Kubernetes 1.10
> AWS EKS
>
>
> Reporter: James Carter
> Priority: Minor
>
> When creating the Kubernetes scheduler 'in-cluster' using client mode, the value
> for _spark.driver.host_ can be derived from the IP address of the driver pod.
> I observed that the value of _spark.driver.host_ defaulted to the value of
> _spark.kubernetes.driver.pod.name_, which is not a valid hostname. This
> caused the executors to fail to establish a connection back to the driver.
> As a workaround, in my configuration I pass the driver's pod name _and_ the
> driver's IP address to ensure that executors can establish a connection with
> the driver:
> _spark.kubernetes.driver.pod.name_ := env.valueFrom.fieldRef.fieldPath: metadata.name
> _spark.driver.host_ := env.valueFrom.fieldRef.fieldPath: status.podIP
> e.g.
> Deployment:
> {noformat}
> env:
>   - name: DRIVER_POD_NAME
>     valueFrom:
>       fieldRef:
>         fieldPath: metadata.name
>   - name: DRIVER_POD_IP
>     valueFrom:
>       fieldRef:
>         fieldPath: status.podIP
> {noformat}
>
> Application Properties:
> {noformat}
> config[spark.kubernetes.driver.pod.name]: ${DRIVER_POD_NAME}
> config[spark.driver.host]: ${DRIVER_POD_IP}
> {noformat}
>
> BasicExecutorFeatureStep.scala:
> {code:java}
> private val driverUrl = RpcEndpointAddress(
>   kubernetesConf.get("spark.driver.host"),
>   kubernetesConf.sparkConf.getInt("spark.driver.port", DEFAULT_DRIVER_PORT),
>   CoarseGrainedSchedulerBackend.ENDPOINT_NAME).toString
> {code}
>
> Ideally only _spark.kubernetes.driver.pod.name_ would need to be provided in
> this deployment scenario.
>
>
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]