dcoliversun commented on code in PR #35886:
URL: https://github.com/apache/spark/pull/35886#discussion_r945429105
##########
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/DriverCommandFeatureStep.scala:
##########
@@ -84,25 +84,18 @@ private[spark] class DriverCommandFeatureStep(conf:
KubernetesDriverConf)
"variables instead.")
}
- val pythonEnvs =
- Seq(
- conf.get(PYSPARK_PYTHON)
- .orElse(environmentVariables.get(ENV_PYSPARK_PYTHON)).map { value =>
- new EnvVarBuilder()
- .withName(ENV_PYSPARK_PYTHON)
- .withValue(value)
- .build()
- },
- conf.get(PYSPARK_DRIVER_PYTHON)
- .orElse(conf.get(PYSPARK_PYTHON))
- .orElse(environmentVariables.get(ENV_PYSPARK_DRIVER_PYTHON))
- .orElse(environmentVariables.get(ENV_PYSPARK_PYTHON)).map { value =>
- new EnvVarBuilder()
- .withName(ENV_PYSPARK_DRIVER_PYTHON)
- .withValue(value)
- .build()
- }
- ).flatten
+ val pythonEnvs = {
+ KubernetesUtils.buildEnvVars(
+ Map(
+ ENV_PYSPARK_PYTHON -> conf.get(PYSPARK_PYTHON)
+ .orElse(environmentVariables.get(ENV_PYSPARK_PYTHON))
+ .orNull,
+ ENV_PYSPARK_DRIVER_PYTHON -> conf.get(PYSPARK_DRIVER_PYTHON)
+ .orElse(conf.get(PYSPARK_PYTHON))
+ .orElse(environmentVariables.get(ENV_PYSPARK_DRIVER_PYTHON))
+ .orElse(environmentVariables.get(ENV_PYSPARK_PYTHON))
+ .orNull))
+ }
Review Comment:
I believe the new logic is identical to the previous one.
### Old logic
https://github.com/apache/spark/blob/36dd531a93af55ce5c2bfd8d275814ccb2846962/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/DriverCommandFeatureStep.scala#L87-L105
In the old logic, when neither `PYSPARK_PYTHON` nor `ENV_PYSPARK_PYTHON` is
defined, the Spark pod gets empty environment variables: the keys
`ENV_PYSPARK_PYTHON` and `ENV_PYSPARK_DRIVER_PYTHON` are present, but their values are `NULL`.
The corresponding pod YAML is as follows:
```yaml
...
spec:
...
env:
- name: PYSPARK_PYTHON
- name: PYSPARK_DRIVER_PYTHON
```
Executing the `echo` command gives the following result:
```shell
$ echo $PYSPARK_PYTHON
<BLANK LINE>
$ echo $PYSPARK_DRIVER_PYTHON
<BLANK LINE>
```
### New logic
When neither `PYSPARK_PYTHON` nor `ENV_PYSPARK_PYTHON` is defined, no
envVars are set in the pod at all. This behaves the same as an environment
variable with a `NULL` value.
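The key point is that the new helper drops `null`-valued entries instead of emitting keys with empty values. A minimal sketch of that filtering behavior (this is my own illustration, not the actual `KubernetesUtils.buildEnvVars` implementation, and `EnvVar` here is a stand-in case class rather than the fabric8 model class):

```scala
// Hypothetical sketch of null-filtering when building env vars:
// entries whose value is null produce no env entry at all,
// instead of a key with an empty value.
case class EnvVar(name: String, value: String)

def buildEnvVarsSketch(env: Map[String, String]): Seq[EnvVar] =
  env.toSeq
    .filter { case (_, v) => v != null } // skip unset values entirely
    .map { case (k, v) => EnvVar(k, v) }

// With PYSPARK_PYTHON resolved to null (nothing configured),
// only the driver entry survives:
val envs = buildEnvVarsSketch(Map(
  "PYSPARK_PYTHON" -> null,
  "PYSPARK_DRIVER_PYTHON" -> "/usr/bin/python3"))
```

Either way, `echo $PYSPARK_PYTHON` in the container prints a blank line, so the observable behavior matches the old logic.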
Existing code also illustrates this: the unit test for `auth secret
propagation` expects that `SPARK_CLASSPATH` does not exist in the executor
environment variables.
https://github.com/apache/spark/blob/36dd531a93af55ce5c2bfd8d275814ccb2846962/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStepSuite.scala#L266-L279
https://github.com/apache/spark/blob/36dd531a93af55ce5c2bfd8d275814ccb2846962/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStepSuite.scala#L500-L530
https://github.com/apache/spark/blob/36dd531a93af55ce5c2bfd8d275814ccb2846962/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala#L163-L170
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]