Agreed that it makes sense to publish containers built at HEAD - I filed BEAM-10593 [1] to track that work.
[1] https://issues.apache.org/jira/browse/BEAM-10593

On Wed, Jul 15, 2020 at 12:31 PM Kenneth Knowles wrote:

> It makes sense to me that the snapshot should be everything needed for a
> release. Definitely containers fit that.
>
> Kenn
>
> On Wed, Jul 15, 2020 at 11:37 AM Chamikara Jayalath wrote:
>
>> On Wed, Jul 15, 2020 at 11:17 AM Kyle Weaver wrote:
>>
>>> Thanks everyone for the details. Seems like Java 11 support is farther
>>> along than I had imagined :)
>>>
>>> > Is there any progress into getting
>>> > back, any ticket people can follow if interested?
>>>
>>> https://issues.apache.org/jira/browse/BEAM-10049
>>>
>>> > I understand that a user can publish their own versions of HEAD
>>> > containers but this does not work well when developing automated tests
>>> > for distributed runners.
>>>
>>> Why not?
>>
>> I would say the benefits of having regularly published HEAD containers
>> will be similar to the benefits of having daily Beam SNAPSHOT jars
>> published.
>> For example,
>> (1) This will give a common container that all Beam Jenkins tests can
>> refer to when running jobs for distributed runners, for example when
>> running Dataflow jobs
>> (2) This will allow users to easily check fixes to HEAD
>> (3) This will allow users to easily run additional automated tests on
>> Beam HEAD (for example, Google internal tests)
>>
>> For example, we recently started using published Java containers for
>> Dataflow cross-language pipelines. But running the same tests on HEAD
>> requires additional setup.
>>
>> Thanks,
>> Cham
>>
>>> On Wed, Jul 15, 2020 at 9:25 AM Chamikara Jayalath wrote:
>>>
>>>> Can we consider regularly publishing HEAD containers as well (for
>>>> example, we publish SNAPSHOT jars daily)?
>>>> I understand that a user can publish their own versions of HEAD
>>>> containers but this does not work well when developing automated tests
>>>> for distributed runners. Apologies if this was discussed before.
>>>>
>>>> Thanks,
>>>> Cham
>>>>
>>>> On Wed, Jul 15, 2020 at 12:43 AM Ismaël Mejía wrote:
>>>>
>>>>> Thanks Robert for the explanation. Is there any progress into getting
>>>>> back, any ticket people can follow if interested?
>>>>>
>>>>> On Wed, Jul 15, 2020 at 12:13 AM Robert Burke wrote:
>>>>> >
>>>>> > Disallowing the go containers was largely due to not having a simple
>>>>> > check on the go boot code's licenses which is required for containers
>>>>> > hosted under the main Apache namespace.
>>>>> >
>>>>> > A manual verification reveals it's only Go's standard library BSD
>>>>> > license and gRPC's Apache v2 license. Not impossible but not yet done
>>>>> > by us. The JIRA issue has a link to the appropriate license finder
>>>>> > for go packages.
>>>>> >
>>>>> > The amusing bit is that very similar Go boot code is included in the
>>>>> > Java and Python containers too, so we're only accidentally in
>>>>> > compliance with that there, if at all.
>>>>> >
>>>>> > On Tue, Jul 14, 2020, 2:22 PM Ismaël Mejía wrote:
>>>>> >>
>>>>> >> +1 for naming as python containers, and quick release so users can
>>>>> >> try it.
>>>>> >>
>>>>> >> Not related to this thread but I am also curious about the reasons
>>>>> >> to remove the go docker images, was this discussed/voted in the ML
>>>>> >> (maybe I missed it)?
>>>>> >>
>>>>> >> I don't think Beam has been historically a conservative project
>>>>> >> about releasing early in-progress versions and I have learnt to
>>>>> >> appreciate this because it helps for early user testing and bug
>>>>> >> reports which will be definitely a must for Java 11.
>>>>> >>
>>>>> >> We should read the ticket Kyle mentions with a grain of salt. Most
>>>>> >> of the sub-tasks in that ticket are NOT about allowing users to run
>>>>> >> pipelines with Java 11 but about being able to fully build and run
>>>>> >> the tests and the source code of Beam with Java 11 which is a
>>>>> >> different goal (important but probably less for end users) and a
>>>>> >> task with lots of extra issues because of plugins / dependent
>>>>> >> systems etc.
>>>>> >>
>>>>> >> For the Java 11 harness what we need to guarantee is that users
>>>>> >> can run their code without issues with Java 11 and we can do this
>>>>> >> now for example by checking that portable runners that support Java
>>>>> >> 11 pass ValidatesRunner with the Java 11 harness. Since some classic
>>>>> >> runners [1] already pass these tests, it should be relatively 'easy'
>>>>> >> to do so for portable runners.
>>>>> >>
>>>>> >> [1] https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/
>>>>> >>
>>>>> >> On Sat, Jul 11, 2020 at 12:43 AM Ahmet Altay wrote:
>>>>> >> >
>>>>> >> > Related to the naming question, +1 and this will be similar to
>>>>> >> > the python container naming (e.g. beam_python3.7_sdk).
>>>>> >> >
>>>>> >> > On Fri, Jul 10, 2020 at 1:46 PM Pablo Estrada wrote:
>>>>> >> >>
>>>>> >> >> I agree with Kenn. Dataflow already has some publishing of
>>>>> >> >> non-portable Java 11 containers, so I think it'll be great to
>>>>> >> >> formalize the process for portable containers, and let users play
>>>>> >> >> with it, and know of its availability.
>>>>> >> >> Best
>>>>> >> >> -P.
>>>>> >> >>
>>>>> >> >> On Fri, Jul 10, 2020 at 9:42 AM Kenneth Knowles wrote:
>>>>> >> >>>
>>>>> >> >>> To the initial question: I'm +1 on the rename.
>>>>> >> >>> The container is primarily something that the SDK should insert
>>>>> >> >>> into the pipeline proto during construction, and only user-facing
>>>>> >> >>> in more specialized situations. Given the state of Java and
>>>>> >> >>> portability, it is a good time to get things named properly and
>>>>> >> >>> unambiguously. I think a brief announcement to dev@ and user@ when
>>>>> >> >>> it happens is nice-to-have, but no need to give advance warning.
>>>>> >> >>>
>>>>> >> >>> Kenn
>>>>> >> >>>
>>>>> >> >>> On Fri, Jul 10, 2020 at 7:58 AM Kenneth Knowles wrote:
>>>>> >> >>>>
>>>>> >> >>>> I believe Beam already has quite a few users that have forged
>>>>> >> >>>> ahead and used Java 11 with various runners, pre-portability.
>>>>> >> >>>> Mostly I believe the Java 11 limitations are with particular
>>>>> >> >>>> features (Schema codegen) and extensions/IOs/transitive deps.
>>>>> >> >>>>
>>>>> >> >>>> When it comes to the container, I'd be interested in looking
>>>>> >> >>>> at test coverage. The Flink & Spark portable ValidatesRunner
>>>>> >> >>>> suites use EMBEDDED environment, so they don't exercise the
>>>>> >> >>>> container. The first testing of the Java SDK harness container
>>>>> >> >>>> against the Python-based Universal Local Runner is in pull
>>>>> >> >>>> request now [1]. Are there other test suites to highlight? How
>>>>> >> >>>> hard would it be to run Flink & Spark against the container(s)
>>>>> >> >>>> too?
>>>>> >> >>>>
>>>>> >> >>>> Kenn
>>>>> >> >>>>
>>>>> >> >>>> [1] https://github.com/apache/beam/pull/11792 (despite the
>>>>> >> >>>> name ValidatesRunner, in this case it is validating both the
>>>>> >> >>>> runner and harness, since we don't have a compliance test suite
>>>>> >> >>>> for SDK harnesses)
>>>>> >> >>>>
>>>>> >> >>>> On Fri, Jul 10, 2020 at 7:54 AM Tyson Hamilton wrote:
>>>>> >> >>>>>
>>>>> >> >>>>> What do we consider 'ready'?
>>>>> >> >>>>>
>>>>> >> >>>>> Maybe the only required outstanding bugs are supporting the
>>>>> >> >>>>> direct runner (BEAM-10085), core tests (BEAM-10081), IO tests
>>>>> >> >>>>> (BEAM-10084) to start with?
>>>>> >> >>>>> Notably this would exclude failing tests like those for GCP
>>>>> >> >>>>> core, GCPIOs, Dataflow runner, Spark runner, Flink runner, Samza.
>>>>> >> >>>>>
>>>>> >> >>>>> On Thu, Jul 9, 2020 at 4:44 PM Kyle Weaver wrote:
>>>>> >> >>>>>>
>>>>> >> >>>>>> My main question is, are we confident the Java 11 container
>>>>> >> >>>>>> is ready to release? AFAIK there are still a number of issues
>>>>> >> >>>>>> blocking full Java 11 support (cf [1]; not sure how many of
>>>>> >> >>>>>> these, if any, affect the SDK harness specifically though.)
>>>>> >> >>>>>>
>>>>> >> >>>>>> For comparison, we recently decided to stop publishing Go
>>>>> >> >>>>>> SDK containers until the Go SDK is considered mature [2]. In the
>>>>> >> >>>>>> meantime, those who want to use the Go SDK can build their own
>>>>> >> >>>>>> container images from source.
>>>>> >> >>>>>>
>>>>> >> >>>>>> Do we already have a Gradle task to build Java 11
>>>>> >> >>>>>> containers? If not, this would be a good intermediate step,
>>>>> >> >>>>>> letting users opt-in to Java 11 without us overpromising support.
>>>>> >> >>>>>
>>>>> >> >>>>> We do not. From what I can tell, the build.gradle [1] for
>>>>> >> >>>>> the Java container is only for the one version. There is a docker
>>>>> >> >>>>> file used for Jenkins tests.
>>>>> >> >>>>>
>>>>> >> >>>>> [1] https://github.com/apache/beam/blob/master/sdks/java/container/build.gradle
>>>>> >> >>>>>
>>>>> >> >>>>>> When we eventually do the renaming, we can add a note to
>>>>> >> >>>>>> CHANGES.md [3].
>>>>> >> >>>>>>
>>>>> >> >>>>>> [1] https://issues.apache.org/jira/browse/BEAM-10090
>>>>> >> >>>>>> [2] https://issues.apache.org/jira/browse/BEAM-9685
>>>>> >> >>>>>> [3] https://github.com/apache/beam/blob/master/CHANGES.md
>>>>> >> >>>>>>
>>>>> >> >>>>>> On Thu, Jul 9, 2020 at 3:44 PM Emily Ye wrote:
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> Hi all,
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> I'm getting ramped up on contributing and was looking into
>>>>> >> >>>>>>> adding the Java 11 harness container to releases
>>>>> >> >>>>>>> (https://issues.apache.org/jira/browse/BEAM-8106) - should I
>>>>> >> >>>>>>> rename the current java container so we have two new images
>>>>> >> >>>>>>> `beam_java8_sdk` and `beam_java11_sdk` or hold off on renaming?
>>>>> >> >>>>>>> If we do rename it, what steps should I take to
>>>>> >> >>>>>>> announce/document the change?
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> Thanks,
>>>>> >> >>>>>>> Emily
