Hi all,
I think it might be good to update the description of the beam docker
images and add some descriptive tags, because searching for "apache beam"
in docker hub does not turn up anything:
https://hub.docker.com/search?q=apache%20beam&type=image.

I clicked through 10 pages worth and couldn't find it.  Maybe I missed
something, but it clearly shouldn't be this hard.  I did eventually manage
to find it through the docs.

Googling "apache beam python docker" does not yield anything useful
either.  In fact, it turns up an *unofficial* apache beam docker hub image.

One thing I noticed is that images designated as "Official Images" get
listed first, so it would be good to get that done as well.

thanks!
chad




On Fri, Sep 6, 2019 at 1:21 PM Hannah Jiang <hannahji...@google.com> wrote:

> Hi team
>
> I haven't received any objections, so I will proceed with the settings
> mentioned in a previous email.
>
> A reminder to PMC members, please let me know your docker hub id if you
> want to be an admin.
>
> Thanks,
> Hannah
>
> On Thu, Sep 5, 2019 at 5:02 PM Ankur Goenka <goe...@google.com> wrote:
>
>> Please ignore the previous email. I was looking at the older document in
>> the mail thread.
>>
>> On Thu, Sep 5, 2019 at 4:58 PM Ankur Goenka <goe...@google.com> wrote:
>>
>>> I think sdk in the name is redundant, as the images are all under the
>>> sdks namespace.
>>>
>>> On Thu, Sep 5, 2019 at 3:26 PM Hannah Jiang <hannahji...@google.com>
>>> wrote:
>>>
>>>> Hi Team
>>>>
>>>> Thanks for all the comments about beam containers.
>>>> After considering various opinions and investigating gcr and docker
>>>> hub, we decided to push images to docker hub.
>>>>
>>>> Each image will have two tags, {version}_rc and {version}. The {version}
>>>> tag will be added after the release candidate image is verified.
>>>> Meanwhile, we will have a *latest* tag for each repository, which always
>>>> points to the most recent verified release image, so users can pull it by
>>>> default.
>>>>
>>>> Docker hub doesn't support nested repositories, which means we should
>>>> follow the *repository:tag* format.
>>>> It's too general if we use {language_version} as the repository for SDK
>>>> images. (The version is added when we support multiple versions.)
>>>> So I would like to include *sdk* in the repository name. Images built
>>>> locally will also have the same name.
>>>> Here are some examples:
>>>>
>>>>    - python2.7_sdk:2.15.0
>>>>    - java_sdk:2.15.0_rc
>>>>    - go_sdk:latest
>>>>
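
To make the naming scheme proposed above concrete, here is a minimal Python
sketch. The helper name image_ref and its parameters are hypothetical and
exist only for illustration; the only things taken from the message are the
repository:tag layout, the _sdk suffix and the _rc tag suffix.

```python
def image_ref(language, version, release_candidate=False):
    """Build a repository:tag reference in the proposed format.

    Hypothetical helper for illustration; not part of Beam's tooling.
    """
    repository = f"{language}_sdk"
    tag = f"{version}_rc" if release_candidate else version
    return f"{repository}:{tag}"


# The three examples from the message above.
print(image_ref("python2.7", "2.15.0"))
print(image_ref("java", "2.15.0", release_candidate=True))
print(image_ref("go", "latest"))
```
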
>>>> I will proceed with this format if there is no strong opposition by
>>>> tomorrow noon (PST).
>>>>
>>>> *To PMC members*:
>>>> Permission control will follow the pypi model. All interested PMC
>>>> members will be added as admins and release managers will be granted push
>>>> permission.
>>>> Please let me know your *docker id* if you want to be added as an
>>>> admin.
>>>>
>>>> Thanks,
>>>> Hannah
>>>>
>>>> On Wed, Sep 4, 2019 at 3:47 PM Thomas Weise <t...@apache.org> wrote:
>>>>
>>>>> This will greatly simplify trying out portable runners:
>>>>> https://beam.apache.org/documentation/runners/flink/#executing-a-beam-pipeline-on-a-flink-cluster
>>>>>
>>>>> Can't wait for the following to disappear from the instructions page:
>>>>> ./gradlew :sdks:python:container:docker
>>>>>
>>>>> On Wed, Sep 4, 2019 at 3:35 PM Thomas Weise <t...@apache.org> wrote:
>>>>>
>>>>>> Awesome, thank you!
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 4, 2019 at 3:22 PM Hannah Jiang <hannahji...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Thomas
>>>>>>>
>>>>>>> I created snapshot images from head as of around 2PM today.
>>>>>>> You can pull images from
>>>>>>> gcr.io/apache-beam-testing/beam/sdks/snapshot.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Hannah
>>>>>>>
>>>>>>> On Wed, Sep 4, 2019 at 1:41 PM Thomas Weise <t...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi Hannah,
>>>>>>>>
>>>>>>>> Thank you, I know how to build the containers locally, but not how
>>>>>>>> to publish them!
>>>>>>>>
>>>>>>>> The cwiki says "Publishing images to gcr.io/beam requires
>>>>>>>> permissions in apache-beam-testing project."
>>>>>>>>
>>>>>>>> Can I get access to the testing project (at least temporarily), and
>>>>>>>> what would I need to set up to run the publish target that is shown
>>>>>>>> on cwiki?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Thomas
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Sep 4, 2019 at 11:06 AM Hannah Jiang <
>>>>>>>> hannahji...@google.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Thomas
>>>>>>>>>
>>>>>>>>> I haven't uploaded any snapshot images yet. Here is how you can
>>>>>>>>> create one from head.
>>>>>>>>> > cd [...]/beam/
>>>>>>>>> # For Python, where {version} is one of 2, 35, 36, 37
>>>>>>>>> > ./gradlew :sdks:python:container:py{version}:docker
>>>>>>>>> # For Java
>>>>>>>>> > ./gradlew -p sdks/java/container docker
>>>>>>>>> # For Go
>>>>>>>>> > ./gradlew -p sdks/go/container docker
>>>>>>>>>
>>>>>>>>> The 2.15 one is just for testing, not a real 2.15.0, nor a
>>>>>>>>> snapshot from head.
>>>>>>>>>
>>>>>>>>> Please let me know if you have any questions.
>>>>>>>>> Hannah
>>>>>>>>>
>>>>>>>>> On Wed, Sep 4, 2019 at 10:57 AM Thomas Weise <t...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I actually found something in [1], but it is 2.15 unfortunately.
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://console.cloud.google.com/gcr/images/apache-beam-testing/GLOBAL/beam/sdks/release/python2.7?gcrImageListsize=30
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 4, 2019 at 10:35 AM Thomas Weise <t...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks for working on this. Do you happen to have publicly
>>>>>>>>>>> accessible snapshots published for your testing currently (even
>>>>>>>>>>> when the final location isn't sorted out)?
>>>>>>>>>>>
>>>>>>>>>>> I would like to use a 2.16 based Python SDK image for working on
>>>>>>>>>>> my downstream project, but could not find anything in
>>>>>>>>>>> gcr.io/apache-beam-testing/beam/sdks/rc/snapshot
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Thomas
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Aug 30, 2019 at 10:56 AM Valentyn Tymofieiev <
>>>>>>>>>>> valen...@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 27, 2019 at 3:35 PM Hannah Jiang <
>>>>>>>>>>>> hannahji...@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi team
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am working on improving docker container support for Beam.
>>>>>>>>>>>>> We would like to publish prebuilt containers for each release
>>>>>>>>>>>>> version and daily snapshot. Current work focuses on release
>>>>>>>>>>>>> images only and it would be part of the release process.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The release images will be pushed to GCR, which is publicly
>>>>>>>>>>>>> accessible (pullable). We will use the following locations.
>>>>>>>>>>>>> *Repository*: gcr.io/beam
>>>>>>>>>>>>> *Project*: apache-beam-testing
>>>>>>>>>>>>> More details, including the naming and tagging scheme, can be
>>>>>>>>>>>>> found at the wiki
>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/BEAM/%5BWIP%5D+SDKHarness+Container+Image+Release+Process>,
>>>>>>>>>>>>> which is written by several contributors.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would like to discuss these two questions.
>>>>>>>>>>>>> *1. How many tests do we need to run before pushing images to
>>>>>>>>>>>>> gcr?*
>>>>>>>>>>>>> Publishing artifacts is the last step of the release process,
>>>>>>>>>>>>> so at that point we have already verified the whole codebase. In
>>>>>>>>>>>>> addition, many Jenkins tests use containers, so the containers
>>>>>>>>>>>>> are already verified several times. Do we need to run the tests
>>>>>>>>>>>>> again?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In a docker repository, one container image can have multiple
>>>>>>>>>>>> tags. One possibility is that on the last step of the release
>>>>>>>>>>>> process, after sufficient testing, we place a production tag on
>>>>>>>>>>>> an image that was already pushed with a dev tag.
>>>>>>>>>>>>
>>>>>>>>>>>> For example a dev tag may look like:
>>>>>>>>>>>> gcr.io/apache-beam/python37:2.16.0-RC4, and production tag may
>>>>>>>>>>>> look like:
>>>>>>>>>>>> gcr.io/apache-beam/python37:2.16.0 and both will refer to the
>>>>>>>>>>>> same image at the end.
>>>>>>>>>>>>
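
The retagging idea described above can be sketched as follows. This is only
an illustration under stated assumptions: the helper promotion_commands is
hypothetical, it only builds the command lines and runs nothing, and the
-RC<n> tag suffix is taken from the example in the message. The docker
subcommands used (pull, tag, push) are the standard CLI ones.

```python
def promotion_commands(repository, version, rc_number):
    """Return the docker CLI invocations that retag a verified RC image.

    Hypothetical helper; it only builds argument lists and executes nothing.
    """
    rc_ref = f"{repository}:{version}-RC{rc_number}"
    release_ref = f"{repository}:{version}"
    return [
        ["docker", "pull", rc_ref],              # fetch the already-pushed RC image
        ["docker", "tag", rc_ref, release_ref],  # add the production tag to the same image
        ["docker", "push", release_ref],         # publish the production tag
    ]


for command in promotion_commands("gcr.io/apache-beam/python37", "2.16.0", 4):
    print(" ".join(command))
```

Because the second tag is placed on the same image, no rebuild happens
between testing and release.
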
>>>>>>>>>>>> We should also plan what the process of updating the container
>>>>>>>>>>>> image will look like, if we need to release the image with
>>>>>>>>>>>> additional changes, and how we will test these changes before the
>>>>>>>>>>>> final push (or placing the production tag).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> *2. How many tests do we need to run to validate pushed
>>>>>>>>>>>>> images?*
>>>>>>>>>>>>> When we push the images, we assume the images would work and
>>>>>>>>>>>>> pass all the tests. After pushing, we should confirm the images
>>>>>>>>>>>>> are pullable and useable. I suggest we run several tests on
>>>>>>>>>>>>> dataflow with each pushed image. What do you think?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I think it makes sense to do. Beam runners that use SDK
>>>>>>>>>>>> container images should have some continuously running tests, which
>>>>>>>>>>>> periodically check that all supported images are pullable and
>>>>>>>>>>>> still compatible with the runner.
>>>>>>>>>>>>
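
A rough sketch of the pullability check suggested above, assuming the docker
CLI is on the PATH. The function names are hypothetical, and the sketch only
covers "is the image pullable"; it does not test compatibility with a runner.

```python
import subprocess


def docker_pull_succeeds(image_ref):
    """Return True if `docker pull` exits with status 0 for image_ref."""
    result = subprocess.run(["docker", "pull", image_ref], capture_output=True)
    return result.returncode == 0


def unpullable_images(image_refs, pull=docker_pull_succeeds):
    """Return the references that could not be pulled.

    `pull` is injectable so the check can be exercised without docker.
    """
    return [ref for ref in image_refs if not pull(ref)]


# Dry run with a stand-in pull function; no docker daemon is contacted here.
failed = unpullable_images(
    ["python2.7_sdk:2.15.0", "go_sdk:latest"],
    pull=lambda ref: ref.startswith("python"),
)
print(failed)
```

A periodic job could call unpullable_images with the default pull function
and fail the build when the returned list is not empty.
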
>>>>>>>>>>>>> This work can be refined later as we explore more during our
>>>>>>>>>>>>> release process.
>>>>>>>>>>>>> Please comment or edit the wiki page or reply to this email
>>>>>>>>>>>>> with your opinions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Hannah
>>>>>>>>>>>>>
>>>>>>>>>>>>
