Yes, using the `latest` tag is problematic and can lead to unexpected
behavior.
A date/time tag or a tag like 2.17.0.dev-$USER would be better. The
validates container shell script uses a datetime tag
<https://github.com/apache/beam/blob/6551d0937ee31a8e310b63b222dbc750ec9331f8/sdks/python/container/run_validatescontainer.sh#L87>,
which gives a unique name as long as no two tests run in the same second.
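
As a rough sketch of that tagging idea (not the code from the script linked
above): `make_dev_tag` is a hypothetical helper, the version prefix is only
the example from this message, and the image name is the one discussed in
this thread. It assumes $USER contains only characters Docker accepts in a
tag.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a tag from a version prefix, the current user,
# and a UTC timestamp. Two builds in the same second still get the same tag.
make_dev_tag() {
  local version="${1:-2.17.0.dev}"   # example prefix, an assumption
  local user="${USER:-unknown}"      # fall back if USER is not set
  local stamp
  stamp="$(date -u +%Y%m%d-%H%M%S)"  # second resolution, e.g. 20191002-170501
  printf '%s-%s-%s\n' "$version" "$user" "$stamp"
}

TAG="$(make_dev_tag)"
echo "apachebeam/java_sdk:${TAG}"
```

Because the resulting name is never published, a later `docker pull` cannot
move it onto a different image the way it can with `latest`.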

On Wed, Oct 2, 2019 at 10:05 AM Thomas Weise <t...@apache.org> wrote:

> Want to bump this thread.
>
> If the current behavior is to replace locally built image with the last
> published, then this is not only unexpected for developers but also
> problematic for the CI, where tests should run against what was built from
> source. Or am I missing something?
>
> Thanks,
> Thomas
>
>
> On Tue, Sep 24, 2019 at 7:08 PM Thomas Weise <t...@apache.org> wrote:
>
>> Hi Hannah,
>>
>> I believe this is unexpected from the developer perspective. When
>> building something locally, we do expect that to be used. We may need to
>> change to not pull when the image is available locally, at least when it is
>> a snapshot/master branch. Release images should be immutable anyways.
>>
>> Thomas
>>
>>
>> On Tue, Sep 24, 2019 at 4:13 PM Hannah Jiang <hannahji...@google.com>
>> wrote:
>>
>>> A minor update: with a custom container, the pipeline would not fail; it
>>> logs a warning and moves on to the `docker run` command.
>>>
>>> On Tue, Sep 24, 2019 at 4:05 PM Hannah Jiang <hannahji...@google.com>
>>> wrote:
>>>
>>>> Hi Brian
>>>>
>>>> When we pull a Docker image, it is always downloaded from the remote
>>>> repository, which is expected behavior.
>>>> If we want to run a local image and pull it only when it is not
>>>> available locally, we can use the `docker run` command directly, without
>>>> pulling in advance. [1]
>>>> If we want to pull images only when they are not available locally, we
>>>> can use `docker images -q` to check whether they exist locally before
>>>> pulling.
>>>> Another option is to re-tag your local image and pass it to the pipeline
>>>> to override the default one, but the code still tries to pull, so if your
>>>> image is not pushed to the remote repository, it would fail.
>>>>
>>>> 1. https://github.com/docker/cli/pull/1498
>>>>
>>>> Hannah
>>>>
>>>> On Tue, Sep 24, 2019 at 11:56 AM Brian Hulette <bhule...@google.com>
>>>> wrote:
>>>>
>>>>> I'm working on a demo cross-language pipeline on a local flink cluster
>>>>> that relies on my python row coder PR [1]. The PR includes some changes to
>>>>> the Java worker code, so I need to build a Java SDK container locally and
>>>>> use that in the pipeline.
>>>>>
>>>>> Unfortunately, whenever I run the pipeline,
>>>>> the apachebeam/java_sdk:latest tag is moved off of my locally built image
>>>>> to a newly downloaded image with a creation date 2 weeks ago, and that
>>>>> image is used instead. It looks like the reason is we run `docker pull`
>>>>> before running the container [2]. As the comment says this should be a
>>>>> no-op if the image already exists, but that doesn't seem to be the case.
>>>>> If I just run `docker pull apachebeam/java_sdk:latest` on my local
>>>>> machine, it downloads the 2-week-old image and happily informs me:
>>>>>
>>>>> Status: Downloaded newer image for apachebeam/java_sdk:latest
>>>>>
>>>>> Does anyone know how I can prevent `docker pull` from doing this? I
>>>>> can unblock myself for now just by commenting out the docker pull command,
>>>>> but I'd like to understand what is going on here.
>>>>>
>>>>> Thanks,
>>>>> Brian
>>>>>
>>>>> [1] https://github.com/apache/beam/pull/9188
>>>>> [2]
>>>>> https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/DockerCommand.java#L80
>>>>>
>>>>

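The `docker images -q` check suggested in the quoted message could look
roughly like the sketch below. `pull_if_missing` and the `DOCKER` override
variable are hypothetical names used for illustration; this is not what
DockerCommand.java does.

```shell
#!/usr/bin/env bash
# DOCKER can be overridden (for example with a stub in tests); it defaults
# to the real CLI. The override variable is an assumption of this sketch.
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: pull IMAGE only when no local image has that name:tag.
pull_if_missing() {
  local image="$1"
  # `docker images -q IMAGE` prints the image ID when the image exists
  # locally and prints nothing otherwise.
  if [ -n "$("$DOCKER" images -q "$image" 2>/dev/null)" ]; then
    echo "using local image: $image"
  else
    "$DOCKER" pull "$image"
  fi
}
```

Calling `pull_if_missing apachebeam/java_sdk:latest` before `docker run`
would leave a locally built image in place and only contact the remote
repository when nothing with that tag exists locally.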