tvalentyn edited a comment on pull request #13399: URL: https://github.com/apache/beam/pull/13399#issuecomment-736125575
Thanks. It seems that caching may improve the startup time and be useful for users who frequently launch the same pipeline. However, I think caching may also change behavior. Questions:

1. Could caching produce a stale image that users find undesirable, with behavior that is difficult for users or support folks to debug? For example, suppose a user pipeline depends on the latest version of a dependency X on PyPI, perhaps a dependency they control. Their pipeline has a setup.py with an open install_requires bound, dep>=1.0.0,<2. They run the pipeline, then push a new version of the dependency to PyPI and run the pipeline again, expecting a change in behavior. Kaniko will not rebuild the image in this case, right? What are your thoughts on that?
2. At runtime, with the prebuilding workflow enabled, how visible is it to the user that cached layers are reused rather than rebuilt?
3. I think we should document the prebuilding feature in the Beam docs, including the caching behavior and the associated TTLs. What is the plan for that?
4. Would customizing the TTL or adding a no-cache option make sense? We are using the default 2-week TTL, right? See: https://cloud.google.com/cloud-build/docs/kaniko-cache#configuring_the_cache_expiration_time.
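
To make the concern in question 1 concrete, here is a minimal sketch of the scenario. It assumes a deliberately simplified cache model in which the key is a hash of the build inputs only (here, the text of setup.py); this is an illustration, not Kaniko's actual key computation. The names `SETUP_PY`, `resolve_latest`, and `cache_key` are hypothetical, and the version parsing handles plain dotted integers only.

```python
import hashlib

# Hypothetical setup.py fragment with an open bound, as in the example above.
SETUP_PY = 'install_requires=["dep>=1.0.0,<2"]'


def parse(version):
    """Turn '1.2.0' into (1, 2, 0). Plain dotted integers only."""
    return tuple(int(part) for part in version.split("."))


def resolve_latest(available, lower="1.0.0", upper="2"):
    """Pick the highest available version v with lower <= v < upper."""
    lo, hi = parse(lower), parse(upper)
    candidates = [v for v in available if lo <= parse(v) < hi]
    return max(candidates, key=parse)


def cache_key(build_inputs):
    """Simplified cache key: a hash of the build inputs only (an assumption)."""
    return hashlib.sha256(build_inputs.encode()).hexdigest()


# First run: only 1.0.0 satisfies the bound.
index_before = ["0.9.0", "1.0.0"]
# Second run: the user has pushed 1.1.0 (and 2.0.0, which the bound excludes).
index_after = ["0.9.0", "1.0.0", "1.1.0", "2.0.0"]

fresh_before = resolve_latest(index_before)
fresh_after = resolve_latest(index_after)

# setup.py is unchanged between runs, so the simplified key is identical
# even though a fresh install would now pick a different version.
key_before = cache_key(SETUP_PY)
key_after = cache_key(SETUP_PY)
```

Under this model the second run hits the cache and keeps the image built with the older dependency, even though an uncached build would resolve a newer one. Whether Kaniko behaves this way for the pip install layer is the open question.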
