+1 (non-binding)

Validated wordcount with Python 3.7.8 and Flink 1.10.0 (both loopback and using the Docker image). Also validated Python 3.7.8 loopback with an embedded Spark cluster.
On Thu, Sep 10, 2020 at 2:32 PM Daniel Oliveira <[email protected]> wrote:

> By the way, most of the validation so far has covered Direct runner and
> Dataflow, but Flink and Spark still have little validation, so if anyone
> can help with those it will help speed up the release.
>
> On Thu, Sep 10, 2020 at 2:12 PM Daniel Oliveira <[email protected]>
> wrote:
>
>> So I tracked the --temp_location issue down to
>> https://github.com/apache/beam/pull/12203 and asked @Pablo Estrada
>> <[email protected]> and @Chamikara Jayalath <[email protected]> about
>> it. It's not exactly a bug, but an intended change in requirements for
>> WriteToBigQuery, so the only fix I'll need to do is update the test script
>> with the appropriate flag, which should be easy. It also won't require
>> building a new release candidate.
>>
>> There is a possibility that user pipelines will break if they're using
>> BigQuery with the Python Direct Runner, so I'll add a note to the changelog
>> about it, but I don't think the change is significant enough to need
>> anything beyond that.
>>
>> On Thu, Sep 10, 2020 at 1:47 PM Chamikara Jayalath <[email protected]>
>> wrote:
>>
>>> +1 (non-binding)
>>>
>>> Thanks,
>>> Cham
>>>
>>> On Thu, Sep 10, 2020 at 11:26 AM Ahmet Altay <[email protected]> wrote:
>>>
>>>> +1 - validated py3 quickstarts. The problem I mentioned earlier is
>>>> resolved.
>>>>
>>>> On Wed, Sep 9, 2020 at 7:46 PM Daniel Oliveira <[email protected]>
>>>> wrote:
>>>>
>>>>> Good news: According to
>>>>> https://ci-beam.apache.org/job/beam_PostRelease_Python_Candidate/188/consoleFull
>>>>> the Streaming Wordcount quickstart works for Dataflow with Python 2.7.
>>>>> So it looks like the container issue might be fixed.
>>>>>
>>>>> Bad news: That same Jenkins job failed on "Running HourlyTeamScore
>>>>> example with DirectRunner" because it's missing a --temp_location flag,
>>>>> despite using the DirectRunner. This looks like a bug, but I'm still
>>>>> investigating whether it'll need another cherry-pick and RC to fix or if
>>>>> the validation script just needs to be updated. I'll update the thread
>>>>> if I find anything.
>>>>>
>>>>
>>>> Probably it does not require a cherry-pick. We have not validated that
>>>> workflow in the past few releases.
>>>>
>>>>
>>>>>
>>>>> On Wed, Sep 9, 2020 at 4:58 PM Daniel Oliveira <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> The Dataflow Python Batch worker issue should be fixed now. I tried
>>>>>> verifying it myself via the rc validation script, but I've been having
>>>>>> some trouble with the GCP authentication so if someone else can
>>>>>> validate it, that would be a big help.
>>>>>>
>>>>>> On Tue, Sep 8, 2020 at 5:51 PM Robert Bradshaw <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> I verified the signatures and all the artifacts are correct, and
>>>>>>> tested a wheel in a fresh virtual environment. It'd be good to see the
>>>>>>> Dataflow issue confirmed as fixed though.
>>>>>>>
>>>>>>> On Tue, Sep 8, 2020 at 5:17 PM Valentyn Tymofieiev <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> This error comes from the Dataflow Python Batch worker.
>>>>>>>>
>>>>>>>> Streaming workflows use sdk worker, which is provided by
>>>>>>>> apache-beam library, so the versions will match.
>>>>>>>>
>>>>>>>> The error should be fixed by setting the correct Dataflow worker
>>>>>>>> version in Dataflow containers, and does not affect Beam RC.
>>>>>>>>
>>>>>>>> On Tue, Sep 8, 2020 at 4:52 PM Ahmet Altay <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> -1 - I validated py3 quickstarts on dataflow and direct runner. I
>>>>>>>>> ran into 1 issue with batch workflows on dataflow:
>>>>>>>>>
>>>>>>>>> "RuntimeError: Beam SDK base version 2.24.0 does not match
>>>>>>>>> Dataflow Python worker version 2.24.0.dev. Please check Dataflow
>>>>>>>>> worker startup logs and make sure that correct version of Beam SDK is
>>>>>>>>> installed."
>>>>>>>>>
>>>>>>>>> It seems like the batch worker needs to be rebuilt. Not sure why
>>>>>>>>> the streaming worker did not fail (does it have the correct version?
>>>>>>>>> or does it not have the same check?)
>>>>>>>>>
>>>>>>>>> Ahmet
>>>>>>>>>
>>>>>>>>> On Fri, Sep 4, 2020 at 1:33 PM Valentyn Tymofieiev <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Dataflow containers are also available now.
>>>>>>>>>>
>>>>>>>>>> On Thu, Sep 3, 2020 at 11:47 PM Daniel Oliveira <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> This should fix the BigQueryIO regression that Pablo caught.
>>>>>>>>>>>
>>>>>>>>>>> As before, Dataflow containers are not yet ready. I or someone
>>>>>>>>>>> else will chime in on the thread once it's ready.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Sep 3, 2020 at 11:39 PM Daniel Oliveira <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>> Please review and vote on the release candidate #3 for the
>>>>>>>>>>>> version 2.24.0, as follows:
>>>>>>>>>>>> [ ] +1, Approve the release
>>>>>>>>>>>> [ ] -1, Do not approve the release (please provide specific
>>>>>>>>>>>> comments)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The complete staging area is available for your review, which
>>>>>>>>>>>> includes:
>>>>>>>>>>>> * JIRA release notes [1],
>>>>>>>>>>>> * the official Apache source release to be deployed to
>>>>>>>>>>>> dist.apache.org [2], which is signed with the key with
>>>>>>>>>>>> fingerprint D0E7B69D911ADA3C0482BAA1C4E6B2F8C71D742F [3],
>>>>>>>>>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>>>>>>>>>> * source code tag "v2.24.0-RC3" [5],
>>>>>>>>>>>> * website pull request listing the release [6], publishing the
>>>>>>>>>>>> API reference manual [7], and the blog post [8].
>>>>>>>>>>>> * Java artifacts were built with Maven 3.6.3 and OpenJDK 1.8.0.
>>>>>>>>>>>> * Python artifacts are deployed along with the source release
>>>>>>>>>>>> to the dist.apache.org [2].
>>>>>>>>>>>> * Validation sheet with a tab for 2.24.0 release to help with
>>>>>>>>>>>> validation [9].
>>>>>>>>>>>> * Docker images published to Docker Hub [10].
>>>>>>>>>>>>
>>>>>>>>>>>> The vote will be open for at least 72 hours. It is adopted by
>>>>>>>>>>>> majority approval, with at least 3 PMC affirmative votes.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Release Manager
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12347146
>>>>>>>>>>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.24.0/
>>>>>>>>>>>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>>>>>>>>>>> [4] https://repository.apache.org/content/repositories/orgapachebeam-1110/
>>>>>>>>>>>> [5] https://github.com/apache/beam/tree/v2.24.0-RC3
>>>>>>>>>>>> [6] https://github.com/apache/beam/pull/12743
>>>>>>>>>>>> [7] https://github.com/apache/beam-site/pull/607
>>>>>>>>>>>> [8] https://github.com/apache/beam/pull/12745
>>>>>>>>>>>> [9] https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1432428331
>>>>>>>>>>>> [10] https://hub.docker.com/search?q=apache%2Fbeam&type=image
>>>>>>>>>>>>
