Github user mccheah commented on a diff in the pull request:
https://github.com/apache/spark/pull/20669#discussion_r175237767
--- Diff: resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh ---
@@ -53,14 +53,10 @@ fi
case "$SPARK_K8S_CMD" in
driver)
CMD=(
- ${JAVA_HOME}/bin/java
- "${SPARK_JAVA_OPTS[@]}"
- -cp "$SPARK_CLASSPATH"
- -Xms$SPARK_DRIVER_MEMORY
- -Xmx$SPARK_DRIVER_MEMORY
- -Dspark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS
- $SPARK_DRIVER_CLASS
- $SPARK_DRIVER_ARGS
+ "$SPARK_HOME/bin/spark-submit"
+ --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
+ --deploy-mode client
+ "$@"
--- End diff --
There's an argument to be made for restricting which arguments can be
passed here. For example, we could instead use the following command:
```
"$SPARK_HOME/bin/spark-submit"
--conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
--deploy-mode client
--properties-file $SPARK_DRIVER_PROPERTIES_FILE
--class $SPARK_DRIVER_MAIN_CLASS
spark-internal
```
Then, instead of passing args to the Kubernetes container command, we
would pass everything as environment variables. I think this is clearer and
makes the contract more obvious in terms of what the driver container expects
as input (a rough sketch of what that could look like is below). Thoughts
@vanzin @ifilonenko?
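
For concreteness, a minimal sketch of how the driver branch of entrypoint.sh
might look under that approach. The case/array structure mirrors the existing
script; the final exec line is simplified for illustration, and
`SPARK_DRIVER_PROPERTIES_FILE` / `SPARK_DRIVER_MAIN_CLASS` would be injected as
environment variables on the driver pod spec rather than passed as container
args:
```
case "$SPARK_K8S_CMD" in
  driver)
    # Everything the driver needs arrives via environment variables set on the
    # pod spec, so the container's input contract is explicit and enforced here.
    CMD=(
      "$SPARK_HOME/bin/spark-submit"
      --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
      --deploy-mode client
      --properties-file "$SPARK_DRIVER_PROPERTIES_FILE"
      --class "$SPARK_DRIVER_MAIN_CLASS"
      spark-internal
    )
    ;;
esac

# Simplified; the real script hands CMD off to its existing exec/init wrapper.
exec "${CMD[@]}"
```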
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]