zhengruifeng commented on PR #42842:
URL: https://github.com/apache/spark/pull/42842#issuecomment-1709349788

   > LGTM, but I think you should know (or maybe already know):
   > 
   > > Always combine RUN apt-get update with apt-get install in the same RUN 
statement.
   > 
   > This best practice helps reduce redundant layers between `update` 
and `install`, but for Spark infra there is a tradeoff between `image size` 
download cost and `image rebuild` cost.
   > 
   > Moving frequently modified dependencies to the front means that once the 
dependencies at the head of the Dockerfile are modified, everything after them 
will also be rebuilt (because the infra cache is invalidated). As a result, for 
PRs that `change Dockerfile head lines` or `haven't rebased in time`, CI will 
take more time (about 1h).
   > 
   > TL;DR: move stable dependencies to the top, and keep frequently modified 
dependencies at the end.
   
   Thanks for the detailed explanation. Since we rarely modify the apt-install 
step (we frequently change the pip-install step), I think it would be fine.
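
For illustration, the cache tradeoff discussed above can be sketched with a small Python model. It is a simplification and not how the Spark infra image is actually built: it assumes a build step is reused only when it and every step before it match the cached build, and it ignores base-image changes and `COPY`/`ADD` file checksums. The function name `rebuilt_steps`, the package names, and the version numbers are all hypothetical placeholders.

```python
def rebuilt_steps(cached_steps, new_steps):
    """Return the steps in new_steps that must be rebuilt.

    Simplified model: a step is served from cache only if it and every
    step before it are identical to the previously cached build.
    """
    for i, step in enumerate(new_steps):
        if i >= len(cached_steps) or cached_steps[i] != step:
            # The first mismatch invalidates this step and all later ones.
            return new_steps[i:]
    return []


# Hypothetical steps; package names and versions are placeholders.
APT = "RUN apt-get update && apt-get install -y build-essential"
PIP_OLD = "RUN pip install numpy==1.25.0"
PIP_NEW = "RUN pip install numpy==1.25.2"

# Stable apt step first, frequently changed pip step last.
stable_first = rebuilt_steps([APT, PIP_OLD], [APT, PIP_NEW])

# Frequently changed pip step first, stable apt step last.
volatile_first = rebuilt_steps([PIP_OLD, APT], [PIP_NEW, APT])

print(len(stable_first), "step(s) rebuilt when the stable apt step comes first")
print(len(volatile_first), "step(s) rebuilt when the volatile pip step comes first")
```

Under this model, changing the pip step reruns only that step when the apt step comes first, whereas putting the pip step first forces the apt step to be rebuilt as well.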


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

