rvesse opened a new pull request #23611: [SPARK-26685][K8S] Correct placement of ARG declaration
URL: https://github.com/apache/spark/pull/23611
 
 
   Recent Docker releases are stricter in their enforcement of build
   argument scope.  The location of the ARG spark_uid declaration in the
   Python and R Dockerfiles means the variable is out of scope by the time
   it is used in a USER declaration, resulting in a container running as
   root rather than the default/configured UID.
   
   Also, as a result of refactoring of the script since my PR that
   introduced the configurable UID, the `-u <uid>` argument is not being
   properly passed to the Python and R image builds when those are opted
   into.
   
   ## What changes were proposed in this pull request?
   
   This commit moves the ARG declaration to just before the argument is
   used such that it is in scope.
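   
   A minimal sketch of the resulting layout, assuming the original
   declaration sat above `FROM` alongside `base_img`. The elided
   installation steps and the default value of 185 are illustrative
   assumptions, not a copy of the actual Dockerfiles. An `ARG` declared
   before `FROM` is only visible to `FROM` instructions, so it has to be
   declared inside the build stage to be usable by `USER`:
   
   ```dockerfile
   # Declared before FROM: only visible to FROM instructions.
   ARG base_img

   FROM $base_img

   # Reset to root to run installation tasks (steps elided).
   USER 0

   # Declared inside the build stage, immediately before its use, so
   # that ${spark_uid} is in scope for the USER instruction below.
   ARG spark_uid=185
   USER ${spark_uid}
   ```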
   
   ## How was this patch tested?
   
   Prior to the patch, the Python and R images that are produced ignore
   the default/configured UID:
   
   ```
   > docker run -it --entrypoint /bin/bash rvesse/spark-py:uid456
   bash-4.4# whoami
   root
   bash-4.4# id -u
   0
   bash-4.4# exit
   > docker run -it --entrypoint /bin/bash rvesse/spark:uid456
   bash-4.4$ id -u
   456
   bash-4.4$ exit
   ```
   
   Note that the Python image is still running as `root`, having ignored
   the configured UID of 456, while the base image has the correct UID
   because the relevant `ARG` declaration is correctly in scope.
   
   After the patch the correct UID is observed:
   
   ```
   > docker run -it --entrypoint /bin/bash rvesse/spark-r:uid456
   bash-4.4$ id -u
   456
   bash-4.4$ exit
   exit
   > docker run -it --entrypoint /bin/bash rvesse/spark-py:uid456
   bash-4.4$ id -u
   456
   bash-4.4$ exit
   exit
   > docker run -it --entrypoint /bin/bash rvesse/spark:uid456
   bash-4.4$ id -u
   456
   bash-4.4$ exit
   ```
