[
https://issues.apache.org/jira/browse/BEAM-11312?focusedWorklogId=518212&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-518212
]
ASF GitHub Bot logged work on BEAM-11312:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 01/Dec/20 01:22
Start Date: 01/Dec/20 01:22
Worklog Time Spent: 10m
Work Description: y1chi edited a comment on pull request #13399:
URL: https://github.com/apache/beam/pull/13399#issuecomment-736154645
> Thanks. It seems that caching may improve the startup time, and be useful
for users who frequently launch the same pipeline. However I think caching may
result in a difference in behavior. Questions:
>
> 1. Is it possible that caching will result in a stale image that users
will perceive as undesirable, with behavior that is difficult for users or
support folks to debug? For example, suppose a user pipeline depends on the
latest version of a dependency X on PyPI, perhaps a dependency they control.
They have a pipeline with a setup.py that has an open install_requires bound,
dep>=1.0.0,<2. They run the pipeline, then push a new version of the
dependency to PyPI and run the pipeline again, expecting a change in behavior.
Kaniko will not rebuild the image in this case, right? What are your thoughts
on that?
I think the kaniko cache works the same way as the Docker layer cache. That is
to say, if the locally downloaded artifacts change (or requirements.txt or
setup.py changes), the input to the COPY step in the prebuilding workflow
changes. No cached layer is valid from the artifact COPY step onward, so a new
image is built. (I also verified this through my own experiment with changing
requirements.txt.)
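
The invalidation behavior described above can be illustrated with a toy model of a layer cache. This is a simplified sketch, not kaniko's or Docker's actual implementation: the function `layer_cache_keys`, the two example build steps, and the placeholder digests are all hypothetical. It assumes each layer's key is derived from the previous layer's key, the command text, and the contents of any copied files.

```python
import hashlib


def layer_cache_keys(steps, file_digests):
    """Return one cache key per build step (toy model, not kaniko's code).

    steps: list of (command, [paths the command copies]) tuples.
    file_digests: dict mapping path -> content digest of that file.
    """
    keys = []
    parent = ""  # key of the previous layer; empty before the first step
    for command, copied_paths in steps:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(command.encode())
        for path in sorted(copied_paths):
            # Copied file contents feed the key, so editing a file changes it.
            h.update(path.encode())
            h.update(file_digests[path].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


# Hypothetical build steps; the real prebuild Dockerfile may differ.
steps = [
    ("COPY requirements.txt /tmp/", ["requirements.txt"]),
    ("RUN pip install -r /tmp/requirements.txt", []),
]
before = layer_cache_keys(steps, {"requirements.txt": "digest-a"})
after = layer_cache_keys(steps, {"requirements.txt": "digest-b"})
changed = [b != a for b, a in zip(before, after)]
print(changed)  # [True, True]
```

In this model a key changes only when the command text or the copied file contents change, and a change at the COPY step propagates to every later step.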
> 2. During runtime with prebuilding workflow enabled, how visible is it to
the user that the cached layers are reused and not rebuilt?
The Cloud Build log will contain entries such as "No cached layer found for
cmd ..." when a layer is not served from the cache.
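
As a rough illustration, a user could scan saved build log text for that message. The sketch below assumes the log has already been downloaded as a string; the helper name `cache_miss_lines` and the sample log lines are invented for the example, and only the marker text comes from the comment above.

```python
# Marker text taken from the comment above; everything else is illustrative.
CACHE_MISS_MARKER = "No cached layer found for cmd"


def cache_miss_lines(log_text):
    """Return the log lines that report a cache miss."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if CACHE_MISS_MARKER in line
    ]


# Synthetic log text, not real Cloud Build output.
sample_log = "\n".join([
    "example line without the marker",
    "No cached layer found for cmd RUN example-command",
    "another unrelated line",
])

misses = cache_miss_lines(sample_log)
print(len(misses))  # 1
```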
> 3. I think we should document the prebuilding feature in Beam docs, and
reflect the caching behavior and associated TTLs. What is a plan for that?
I believe Emily will be working on documenting this as part of the custom
container effort next quarter, and I can also help.
> 4. Would customizing the TTL or adding a no-cache option make sense? We
are using default 2 weeks TTL, right? See:
https://cloud.google.com/cloud-build/docs/kaniko-cache#configuring_the_cache_expiration_time.
I think the default value makes sense. I didn't want to expose too many knobs
to users, since they can become confusing or go unused, but we can always
add flags that let more advanced users control it.
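
If such flags were added later, they might map onto the kaniko executor arguments roughly as follows. This is only a sketch: the function `kaniko_executor_args` and its parameters `use_cache` and `cache_ttl_hours` are hypothetical and do not exist in Beam. The `--destination`, `--cache`, and `--cache-ttl` arguments are the kaniko executor flags described on the Cloud Build page linked above, and 336 hours corresponds to the two-week default.

```python
def kaniko_executor_args(image, use_cache=True, cache_ttl_hours=336):
    """Build the argument list for a kaniko executor build step.

    Hypothetical helper: use_cache=False models a "no-cache" option,
    and cache_ttl_hours models a user-configurable TTL (336h = 2 weeks).
    """
    args = ["--destination=" + image]
    if use_cache:
        args.append("--cache=true")
        args.append("--cache-ttl=%dh" % cache_ttl_hours)
    return args


default_args = kaniko_executor_args("gcr.io/example-project/example-image")
no_cache_args = kaniko_executor_args(
    "gcr.io/example-project/example-image", use_cache=False)
print(default_args)
print(no_cache_args)
```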
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 518212)
Time Spent: 2h 40m (was: 2.5h)
> When prebuilding workflow on CloudBuild fails, we should tell the user where
> to look for build logs, ideally a link, or a pointer where the build logs are.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: BEAM-11312
> URL: https://issues.apache.org/jira/browse/BEAM-11312
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Yichi Zhang
> Priority: P2
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> Current error does not have context:
> ...
> /apache_beam/runners/portability/sdk_container_builder.py", line 242, in
> invoke_docker_build_and_push
> result = operation.result()
> File
> "/usr/local/google/home/valentyn/.pyenv/versions/py37prebuild/lib/python3.7/site-packages/google/api_core/future/polling.py",
> line 134, in result
> raise self._exception
> google.api_core.exceptions.Unknown: None Build failed;
> check build logs for details
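
One possible shape for the improvement requested above is to catch the failure while waiting on the build and re-raise it with a pointer to the logs. The sketch below is not the change made in the pull request: `BuildFailedError`, `wait_for_build`, and the example URL are hypothetical, and it assumes the caller already knows the build's log URL.

```python
class BuildFailedError(RuntimeError):
    """Raised when the image build fails; the message includes the log URL."""


def wait_for_build(wait_fn, log_url):
    """Call wait_fn() and add the build log location to any failure.

    wait_fn stands in for a blocking call such as operation.result();
    log_url is assumed to be known to the caller.
    """
    try:
        return wait_fn()
    except Exception as e:
        raise BuildFailedError(
            "Container image build failed: %s. Build logs: %s" % (e, log_url)
        ) from e


def failing_wait():
    # Simulates the failure shown in the traceback above.
    raise RuntimeError("Build failed; check build logs for details")


try:
    wait_for_build(failing_wait, "https://example.invalid/build-logs")
except BuildFailedError as err:
    message = str(err)
    print(message)
```

Chaining with `from e` keeps the original exception available in the traceback while the new message tells the user where to look.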