Please ignore the previous email. I was looking at the older document in the mail thread.
On Thu, Sep 5, 2019 at 4:58 PM Ankur Goenka <goe...@google.com> wrote:

> I think sdk in the name is obsolete as they are all under sdks name space.
>
> On Thu, Sep 5, 2019 at 3:26 PM Hannah Jiang <hannahji...@google.com> wrote:
>
>> Hi Team
>>
>> Thanks for all the comments about beam containers.
>> After considering various opinions and investigating gcr and docker hub,
>> we decided to push images to docker hub.
>>
>> Each image will have two tags, {version}_rc and {version}. {version} tag
>> will be added after the release candidate image is verified.
>> Meanwhile, we will have *latest* tag for each repository, which always
>> points to the most recent verified release image, so users can pull it by
>> default.
>>
>> Docker hub doesn't support leveled repository, which means we should
>> follow *repository:tag* format.
>> it's too general if we use {language_version} as repository for SDK
>> images. (version is added when we support multiple versions.)
>> So I would like to include *sdk* to repository. Images generated at
>> local will also have the same name.
>> Here are some examples:
>>
>> - python2.7_sdk:2.15.0
>> - java_sdk:2.15.0_rc
>> - go_sdk:latest
>>
>> I will proceed with this format if there is no strong opposition by
>> tomorrow noon(PST).
>>
>> *To PMC members*:
>> Permission control will follow the pypi model. All interested PMC members
>> will be added as admins and release managers will be granted push
>> permission.
>> Please let me know your *docker id* if you want to be added as an admin.
>>
>> Thanks,
>> Hannah
>>
>> On Wed, Sep 4, 2019 at 3:47 PM Thomas Weise <t...@apache.org> wrote:
>>
>>> This will greatly simplify trying out portable runners:
>>> https://beam.apache.org/documentation/runners/flink/#executing-a-beam-pipeline-on-a-flink-cluster
>>>
>>> Can't wait for following to disappear from the instructions page:
>>> ./gradlew :sdks:python:container:docker
>>>
>>> On Wed, Sep 4, 2019 at 3:35 PM Thomas Weise <t...@apache.org> wrote:
>>>
>>>> Awesome, thank you!
>>>>
>>>> On Wed, Sep 4, 2019 at 3:22 PM Hannah Jiang <hannahji...@google.com> wrote:
>>>>
>>>>> Hi Thomas
>>>>>
>>>>> I created snapshot images from head as of around 2PM today.
>>>>> You can pull images from gcr.io/apache-beam-testing/beam/sdks/snapshot .
>>>>>
>>>>> Thanks,
>>>>> Hannah
>>>>>
>>>>> On Wed, Sep 4, 2019 at 1:41 PM Thomas Weise <t...@apache.org> wrote:
>>>>>
>>>>>> Hi Hannah,
>>>>>>
>>>>>> Thank you, I know how to build the containers locally, but not how to
>>>>>> publish them!
>>>>>>
>>>>>> The cwiki says "Publishing images to gcr.io/beam requires
>>>>>> permissions in apache-beam-testing project."
>>>>>>
>>>>>> Can I get access to the testing project (at least temporarily) and
>>>>>> what would I need to setup to run the publish target that is shown on
>>>>>> cwiki?
>>>>>>
>>>>>> Thanks,
>>>>>> Thomas
>>>>>>
>>>>>> On Wed, Sep 4, 2019 at 11:06 AM Hannah Jiang <hannahji...@google.com> wrote:
>>>>>>
>>>>>>> Hi Thomas
>>>>>>>
>>>>>>> I haven't uploaded any snapshot images yet. Here is how you can
>>>>>>> create one from head.
>>>>>>> > cd [...]/beam/
>>>>>>> # For Python
>>>>>>> > ./gradlew :sdks:python:container:py{version}:docker
>>>>>>> *where version is {2,35,36,37}*
>>>>>>> # For Java
>>>>>>> > ./gradlew -p sdks/java/container docker
>>>>>>> # For Go
>>>>>>> > ./gradlew -p sdks/go/container docker
>>>>>>>
>>>>>>> The 2.15 one is just for testing, not a real 2.15.0, nor a snapshot
>>>>>>> from head.
>>>>>>>
>>>>>>> Please let me know if you have any questions.
>>>>>>> Hannah
>>>>>>>
>>>>>>> On Wed, Sep 4, 2019 at 10:57 AM Thomas Weise <t...@apache.org> wrote:
>>>>>>>
>>>>>>>> I actually found something in [1], but it is 2.15 unfortunately.
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://console.cloud.google.com/gcr/images/apache-beam-testing/GLOBAL/beam/sdks/release/python2.7?gcrImageListsize=30
>>>>>>>>
>>>>>>>> On Wed, Sep 4, 2019 at 10:35 AM Thomas Weise <t...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Thanks for working on this. Do you happen to have publicly
>>>>>>>>> accessible snapshots published for your testing currently (even
>>>>>>>>> when the final location isn't sorted out)?
>>>>>>>>>
>>>>>>>>> I would like to use a 2.16 based Python SDK image for working on
>>>>>>>>> my downstream project, but could not find anything in
>>>>>>>>> gcr.io/apache-beam-testing/beam/sdks/rc/snapshot
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Thomas
>>>>>>>>>
>>>>>>>>> On Fri, Aug 30, 2019 at 10:56 AM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Tue, Aug 27, 2019 at 3:35 PM Hannah Jiang <hannahji...@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi team
>>>>>>>>>>>
>>>>>>>>>>> I am working on improving docker container support for Beam. We
>>>>>>>>>>> would like to publish prebuilt containers for each release
>>>>>>>>>>> version and daily snapshot. Current work focuses on release
>>>>>>>>>>> images only and it would be part of the release process.
>>>>>>>>>>>
>>>>>>>>>>> The release images will be pushed to GCR which is publicly
>>>>>>>>>>> accessible(pullable). We will use the following locations.
>>>>>>>>>>> *Repository*: gcr.io/beam
>>>>>>>>>>> *Project*: apache-beam-testing
>>>>>>>>>>> More details, including naming and tagging scheme, can be found
>>>>>>>>>>> at wiki
>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/BEAM/%5BWIP%5D+SDKHarness+Container+Image+Release+Process>
>>>>>>>>>>> which is written by several contributors.
>>>>>>>>>>>
>>>>>>>>>>> I would like to discuss these two questions.
>>>>>>>>>>> *1. How many tests do we need to run before pushing images to gcr*?
>>>>>>>>>>> Publishing artifacts is the last step of the release process, so
>>>>>>>>>>> at this moment, we already verified all codebase. In addition,
>>>>>>>>>>> many Jenkins tests use containers, so it is already verified
>>>>>>>>>>> several times. Do we need to run it again?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> In a docker repository, one container image can have multiple
>>>>>>>>>> tags. One possibility is that on the last step of the release
>>>>>>>>>> process, after sufficient testing, we place a production tag on
>>>>>>>>>> an image that was already pushed with a dev tag.
>>>>>>>>>>
>>>>>>>>>> For example a dev tag may look like:
>>>>>>>>>> gcr.io/apache-beam/python37:2.16.0-RC4, and production tag may
>>>>>>>>>> look like:
>>>>>>>>>> gcr.io/apache-beam/python37:2.16.0 and both will refer to the
>>>>>>>>>> same image at the end.
>>>>>>>>>>
>>>>>>>>>> We should also plan what the process of updating the container
>>>>>>>>>> image will look like, if we need to release the image with
>>>>>>>>>> additional changes, and how we will test these changes before
>>>>>>>>>> the final push (or placing production tag).
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *2. How many tests do we need to run to validate pushed images?*
>>>>>>>>>>> When we push the images, we assume the images would work and
>>>>>>>>>>> pass all the tests.
>>>>>>>>>>> After pushing, we should confirm the images are pullable and
>>>>>>>>>>> useable. I suggest we run several tests on dataflow with each
>>>>>>>>>>> pushed image. What do you think?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I think it makes sense to do - Beam runners that use SDK
>>>>>>>>>> container images should have some continuously running tests, which
>>>>>>>>>> periodically check that all supported images are pullable and still
>>>>>>>>>> compatible with the runner.
>>>>>>>>>>
>>>>>>>>>>> This work can be refined later as we explore more during our
>>>>>>>>>>> release process.
>>>>>>>>>>> Please comment or edit the wiki page or reply to this email with
>>>>>>>>>>> your opinions.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Hannah
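
For reference, the repository:tag naming proposed in the Sep 5 message could be sketched roughly as below. This is only an illustration under stated assumptions: `image_refs` is a hypothetical helper, not part of Beam or its release tooling, and the `_sdk` suffix was still under discussion in this thread (see Ankur's reply), so the final names may differ. It also assumes the version being verified is the most recent release, which is when *latest* would move.

```python
def image_refs(language, version, language_version="", verified=False):
    """Return the repository:tag references proposed in the thread.

    Hypothetical helper for illustration only; not a Beam API.
    """
    # Docker Hub has no nested repositories, so the language, optional
    # language version, and the "sdk" marker are packed into one name.
    repository = f"{language}{language_version}_sdk"

    # Every image is first pushed with the release-candidate tag.
    tags = [f"{version}_rc"]

    # After verification the same image also gets the plain version tag.
    # Assumption: this release is the newest one, so "latest" moves too.
    if verified:
        tags += [version, "latest"]

    return [f"{repository}:{tag}" for tag in tags]


if __name__ == "__main__":
    for ref in image_refs("python", "2.15.0", language_version="2.7", verified=True):
        print(ref)
```

An unverified release candidate yields only the `_rc` reference, which matches the idea in the thread that the production tag is placed on an already-pushed image once testing is done.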