GitHub user erikerlandson commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20669#discussion_r175532438
  
    --- Diff: resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh ---
    @@ -53,14 +53,10 @@ fi
     case "$SPARK_K8S_CMD" in
       driver)
         CMD=(
    -      ${JAVA_HOME}/bin/java
    -      "${SPARK_JAVA_OPTS[@]}"
    -      -cp "$SPARK_CLASSPATH"
    -      -Xms$SPARK_DRIVER_MEMORY
    -      -Xmx$SPARK_DRIVER_MEMORY
    -      -Dspark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS
    -      $SPARK_DRIVER_CLASS
    -      $SPARK_DRIVER_ARGS
    +      "$SPARK_HOME/bin/spark-submit"
    +      --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
    +      --deploy-mode client
    +      "$@"
    --- End diff --
    
    As @mccheah mentioned, I included some logic in the current entrypoint.sh
to allow Spark to work in cases such as an anonymous uid (another way to look
at it is that it works around Spark's long-standing quirk of failing when it
can't find a passwd entry).  Putting it in entrypoint.sh was a good way to make
sure this happens regardless of how the actual CMD evolves.  That is a kind of
defensive future-proofing, which is important for reference Dockerfiles.  The
entrypoint also runs the command under `tini` as "pid 1", which is considered
good standard practice.
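
    For illustration, the anonymous-uid handling described above could be
sketched roughly as follows.  This is not a copy of the script in the PR: the
function names `passwd_line_for` and `ensure_passwd_entry`, the optional file
argument, and the exact fields of the passwd line are assumptions made for the
example.

```shell
#!/bin/bash
# Sketch only: function names and the optional file argument are illustrative.

# Print a passwd line for a UID that has no entry yet.
passwd_line_for() {
  local uid="$1" gid="$2" home="$3"
  printf '%s:x:%s:%s:anonymous uid:%s:/bin/false\n' "$uid" "$uid" "$gid" "$home"
}

# Append an entry for the current UID if none exists and the file is writable.
ensure_passwd_entry() {
  local passwd_file="${1:-/etc/passwd}"
  local uid gid
  uid="$(id -u)"
  gid="$(id -g)"
  # Succeeds only if some line's third field equals the current UID.
  if awk -F: -v uid="$uid" '$3 == uid { found = 1 } END { exit !found }' "$passwd_file"; then
    return 0
  fi
  if [ -w "$passwd_file" ]; then
    passwd_line_for "$uid" "$gid" "${SPARK_HOME:-/}" >> "$passwd_file"
  else
    echo "could not add passwd entry for anonymous UID $uid" >&2
    return 1
  fi
}
```

    An entrypoint would call `ensure_passwd_entry` before assembling CMD and
then hand off with something like `exec tini -- "${CMD[@]}"`, assuming `tini`
is on the PATH in the image.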
    
    All of this is done in part with the expectation that _most_ users will
just want to customize their images by building the reference Dockerfiles and
using those as base images for their own, without modifying the CMD or
entrypoint.sh.
    
    That said, I think that in terms of formally documenting a container API, 
entrypoint.sh may be a red herring. In theory, a user _should_ be able to build 
their own custom container from the ground up, up to and including a different 
entrypoint, or default entrypoint, etc.
    
    Part of the reason we went with an externalized CMD (instead of creating
one in the backend code) was to allow maximum flexibility in how images are
constructed.  The backend provides certain information to the pod.  The "API"
is a catalogue of this information, combined with any behaviors that the user's
container _must_ implement.  The API doc shouldn't assume the existence of
entrypoint.sh.
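
    To make that concrete, a user-built image could consume the same inputs
without the reference script at all.  The fragment below is a hypothetical
custom entrypoint, not part of the PR: `build_cmd` is an illustrative name, and
the only inputs it assumes are the ones visible in the diff above
(`SPARK_K8S_CMD`, `SPARK_HOME`, `SPARK_DRIVER_BIND_ADDRESS`, and the
pass-through arguments).

```shell
#!/bin/bash
# Hypothetical custom entrypoint fragment; build_cmd is an illustrative name.
# It relies only on values the diff above shows being supplied to the container.

build_cmd() {
  case "$SPARK_K8S_CMD" in
    driver)
      CMD=(
        "$SPARK_HOME/bin/spark-submit"
        --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
        --deploy-mode client
        "$@"
      )
      ;;
    *)
      # Other roles are omitted from this sketch.
      echo "unsupported SPARK_K8S_CMD: ${SPARK_K8S_CMD:-<unset>}" >&2
      return 1
      ;;
  esac
}
```

    A script built this way would finish with `build_cmd "$@" && exec
"${CMD[@]}"`; whether it also wraps the command in an init process is then the
image author's choice rather than something the API doc has to mandate.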

