Re: [VOTE] Release 2.26.0, release candidate #1

2020-12-11 Thread Jean-Baptiste Onofre
+1 (binding)

Sorry for the delay.

Regards
JB

> On Dec 10, 2020, at 17:40, Tyson Hamilton wrote:
> 
> +1 from me. I validated Nexmark performance tests.
> 
> On Tue, Dec 8, 2020 at 7:53 PM Robert Burke wrote:
> I'm +1 on RC1 based on the 7 tests I know I can check successfully. I'll be 
> trying more tomorrow, but remember that release validation requires the 
> community to validate that it meets our standards, and I can't do it alone.
> 
> Remember you can participate in the release validation by reviewing parts of 
> the documentation being published as well, not just by running the Python and 
> Java artifacts.
> 
> If you have contributed new Python or Java docs to this release, they'll 
> appear in the to-be-published docs.
> 
> Cheers,
> Robert Burke
> 2.26.0 release manager
> 
> On Mon, Dec 7, 2020, 6:25 PM Robert Burke wrote:
> It turns out that no changes affecting the Dataflow artifacts were required 
> this time around, so Dataflow is cleared for testing.
> 
> Cheers.
> Robert Burke
> 2.26.0 Release Manager
> 
> On Mon, Dec 7, 2020, 6:03 PM Robert Burke wrote:
> 
> Robert Burke <r...@google.com>
> Thu, Dec 3, 8:01 PM (4 days ago)
> to dev
> Hi everyone,
> Please review and vote on the release candidate #1 for the version 2.26.0, as 
> follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
> 
> 
> Reviewers are encouraged to test their own use cases with the release 
> candidate, and vote +1 if no issues are found.
> 
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release to be deployed to dist.apache.org [2], 
> which is signed with the key with fingerprint 
> A52F5C83BAE26160120EC25F3D56ACFBFB2975E1 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "v2.26.0-RC1" [5],
> * website pull request listing the release [6], publishing the API reference 
> manual [7], and the blog post [8].
> * Java artifacts were built with Maven 3.6.0 and OpenJDK 1.8.0_275.
> * Python artifacts are deployed along with the source release to 
> dist.apache.org [2].
> * Validation sheet with a tab for 2.26.0 release to help with validation [9].
> * Docker images published to Docker Hub [10].
> 
> The vote will be open for at least 72 hours (10th ~6pm PST). It is adopted by 
> majority approval, with at least 3 PMC affirmative votes.
> 
> Thanks,
> Robert Burke
> 2.26.0 Release Manager
> 
> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12348833
> [2] https://dist.apache.org/repos/dist/dev/beam/2.26.0/
> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> [4] https://repository.apache.org/content/repositories/orgapachebeam-1144/
> [5] https://github.com/apache/beam/tree/v2.26.0-RC1
> [6] https://github.com/apache/beam/pull/13481
> [7] https://github.com/apache/beam-site/pull/609
> [8] https://github.com/apache/beam/pull/13482
> [9] https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=475997301
> [10] https://hub.docker.com/search?q=apache%2Fbeam&type=image
> 
> 
> PS. New Dataflow artifacts likely need to be built and published but this 
> doesn't block vetting the remainder of the RC at this time. Thank you for 
> your patience.
> 



[ANNOUNCE] Beam 2.26.0 Released

2020-12-11 Thread Robert Burke
The Apache Beam team is pleased to announce the release of version 2.26.0.

Apache Beam is an open source unified programming model to define and
execute data processing pipelines, including ETL, batch and stream
(continuous) processing.
See https://beam.apache.org

You can download the release here:
https://beam.apache.org/get-started/downloads/

This release includes bug fixes, features, and improvements detailed on the
Beam blog: https://beam.apache.org/blog/beam-2.26.0/

Thank you to everyone who contributed to this release, and we hope you
enjoy using Beam 2.26.0.
Robert Burke
2.26.0 Release Manager


Re: Usability regression using SDF Unbounded Source wrapper + DirectRunner

2020-12-11 Thread Boyuan Zhang
> From what I've seen, the direct runner initiates a checkpoint after every
> element output.

It seems like the 1 second limit kicks in before the output reaches 100
elements.

I think the original purpose of having DirectRunner use a small limit for
issuing checkpoint requests was to exercise SDF better on a small data
set. But it adds overhead on a larger data set because of too many
checkpoints. Ideally this limit would be configurable from the pipeline, but
the easiest approach is to pick a number that works for the most common
cases. Do you think raising the limit to 1000 elements or every 5 seconds
would help?
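
To get a rough feel for how the two limits interact, here is a
back-of-the-envelope sketch in Python. It is not DirectRunner code:
checkpoints_per_second is a hypothetical helper, and it assumes a steady
output rate and ignores the cost of the checkpoint itself. The 100,000
elements/sec figure is the subscription rate Steve reported earlier in the
thread.

```python
def checkpoints_per_second(rate_per_sec, max_outputs, max_seconds):
    """Estimate how often a runner checkpoints when it ends a bundle after
    max_outputs records or max_seconds of wall time, whichever comes first.

    Assumption: constant output rate and zero checkpoint overhead.
    """
    if rate_per_sec <= 0:
        # With no output, only the time limit can trigger a checkpoint.
        return 1.0 / max_seconds
    seconds_to_hit_output_limit = max_outputs / rate_per_sec
    bundle_seconds = min(seconds_to_hit_output_limit, max_seconds)
    return 1.0 / bundle_seconds


# Limits described in this thread vs. the proposed ones.
current = checkpoints_per_second(100_000, max_outputs=100, max_seconds=1)
proposed = checkpoints_per_second(100_000, max_outputs=1000, max_seconds=5)
print(f"current limits:  ~{current:.0f} checkpoints/sec")
print(f"proposed limits: ~{proposed:.0f} checkpoints/sec")
```

Under these assumptions the output limit, not the time limit, dominates at
high input rates, so raising only the time limit would change little for a
busy source.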


Re: Usability regression using SDF Unbounded Source wrapper + DirectRunner

2020-12-11 Thread Steve Niemitz
From what I've seen, the direct runner initiates a checkpoint after every
element output.


Re: Usability regression using SDF Unbounded Source wrapper + DirectRunner

2020-12-11 Thread Boyuan Zhang
Hi Antonio,

Thanks for the details! Which version of the Beam SDK are you using? And are
you using --experiments=beam_fn_api with DirectRunner to launch your
pipeline?

ReadFromKafkaDoFn.processElement() takes a Kafka topic+partition as its
input element. A KafkaConsumer is assigned to this topic+partition and then
polls records continuously. The Kafka consumer stops reading and returns
from the process fn, to be resumed later, when

   - There are no available records currently (this is the SDF feature
   known as a self-initiated checkpoint)
   - The OutputAndTimeBoundedSplittableProcessElementInvoker issues a
   checkpoint request to ReadFromKafkaDoFn to get partial results. The
   checkpoint frequency for DirectRunner is every 100 output records or every
   1 second.

It seems like either the self-initiated checkpoint or the DirectRunner-issued
checkpoint causes the performance regression, since there is overhead
when rescheduling residuals. In your case, it's more likely that the
checkpoint behavior of OutputAndTimeBoundedSplittableProcessElementInvoker
gives you 200 elements a batch. I'd like to understand what kind of
performance regression you are noticing. Is it slower to output the same
number of records?
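
The two return conditions above can be sketched as a small, self-contained
Python simulation. This is not Beam or KafkaIO code: process_partition, poll,
emit and clock are hypothetical names, and checking the limits only between
polls is an assumption made to keep the loop short.

```python
import time
from collections import deque


def process_partition(poll, emit, max_outputs=100, max_seconds=1.0,
                      clock=time.monotonic):
    """Read from one partition until a checkpoint condition is met.

    Returns the reason the call returned; a runner would then reschedule
    the remaining work (the residual).
    """
    start = clock()
    outputs = 0
    while True:
        records = poll()
        if not records:
            # Nothing available right now: the DoFn checkpoints itself.
            return "self-initiated checkpoint"
        for record in records:
            emit(record)
            outputs += 1
        if outputs >= max_outputs or clock() - start >= max_seconds:
            # The invoker asks for a checkpoint to get partial results.
            return "runner-issued checkpoint"


# Usage: a source that runs dry, then a source that keeps producing.
quiet = deque([[1, 2, 3], []])
busy = deque([list(range(60)) for _ in range(3)])
sink = []
print(process_partition(quiet.popleft, sink.append))
print(process_partition(busy.popleft, sink.append))
```

Because the limits are only checked after a whole poll batch has been
emitted, a bundle in this sketch can overshoot max_outputs by up to one
batch.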



Re: Usability regression using SDF Unbounded Source wrapper + DirectRunner

2020-12-11 Thread Antonio Si
Hi Boyuan,

This is Antonio. I reported the KafkaIO.read() performance issue on the Slack 
channel a few days ago.

I am not sure if this is helpful, but I have been doing some debugging on the 
SDK KafkaIO performance issue for our pipeline and I would like to provide some 
observations.

It looks like in my case ReadFromKafkaDoFn.processElement() was invoked 
within the same thread, and every time kafkaconsumer.poll() is called, it 
returns some records, from 1 up to 200. It then proceeds to run the pipeline 
steps. Each kafkaconsumer.poll() takes about 0.8ms. So, in this case, the 
polling and the running of the pipeline are executed sequentially within a 
single thread: after processing a batch of records, it needs to wait 0.8ms 
before it can process the next batch.
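
As a rough illustration of why the batch size matters when polling and
processing alternate on one thread, here is a small Python calculation. The
0.8ms poll time and the 1 to 200 batch sizes come from the observation above;
the per-record processing time of 0.01ms is a made-up value, and both
function names are hypothetical.

```python
def cycle_ms(batch_size, poll_ms, process_ms_per_record):
    """Time for one poll followed by processing the whole batch."""
    return poll_ms + batch_size * process_ms_per_record


def poll_share(batch_size, poll_ms, process_ms_per_record):
    """Fraction of each poll-then-process cycle spent inside poll()."""
    return poll_ms / cycle_ms(batch_size, poll_ms, process_ms_per_record)


def records_per_second(batch_size, poll_ms, process_ms_per_record):
    """Throughput when polling and processing never overlap."""
    return batch_size / cycle_ms(batch_size, poll_ms, process_ms_per_record) * 1000.0


POLL_MS = 0.8          # from the observation above
PROCESS_MS = 0.01      # assumption, for illustration only

for batch in (1, 20, 200):
    share = poll_share(batch, POLL_MS, PROCESS_MS)
    rate = records_per_second(batch, POLL_MS, PROCESS_MS)
    print(f"batch={batch:3d}  poll share={share:.0%}  ~{rate:,.0f} records/sec")
```

With small batches almost all of the wall time goes to waiting on poll(), so
the fixed 0.8ms cost dominates; larger batches amortize it.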

Any suggestions would be appreciated.

Hope that helps.

Thanks and regards,

Antonio.

On 2020/12/04 19:17:46, Boyuan Zhang  wrote: 
> Opened https://issues.apache.org/jira/browse/BEAM-11403 for tracking.
> 
> On Fri, Dec 4, 2020 at 10:52 AM Boyuan Zhang  wrote:
> 
> > Thanks for the pointer, Steve! I'll check it out. The execution paths for
> > UnboundedSource and SDF wrapper are different. It's highly possible that
> > the regression either comes from the invocation path for SDF wrapper, or
> > the implementation of SDF wrapper itself.
> >
> > On Fri, Dec 4, 2020 at 6:33 AM Steve Niemitz  wrote:
> >
> >> Coincidentally, someone else in the ASF Slack mentioned [1] yesterday
> >> that they were seeing significantly reduced performance using KafkaIO.Read
> >> w/ the SDF wrapper vs the unbounded source.  They mentioned they were using
> >> Flink 1.9.
> >>
> >> [1] https://the-asf.slack.com/archives/C9H0YNP3P/p1607057900393900
> >>
> >> On Thu, Dec 3, 2020 at 1:56 PM Boyuan Zhang  wrote:
> >>
> >>> Hi Steve,
> >>>
> >>> I think the major performance regression comes from
> >>> OutputAndTimeBoundedSplittableProcessElementInvoker[1], which will
> >>> checkpoint the DoFn based on time/output limit and use timers/state to
> >>> reschedule work.
> >>>
> >>> [1]
> >>> https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/OutputAndTimeBoundedSplittableProcessElementInvoker.java
> >>>
> >>> On Thu, Dec 3, 2020 at 9:40 AM Steve Niemitz 
> >>> wrote:
> >>>
> >>>> I have a pipeline that reads from pubsub, does some aggregation, and
> >>>> writes to various places.  Previously, in older versions of beam, when
> >>>> running this in the DirectRunner, messages would go through the pipeline
> >>>> almost instantly, making it very easy to debug locally, etc.
> >>>>
> >>>> However, after upgrading to beam 2.25, I noticed that it could take on
> >>>> the order of 5-10 minutes for messages to get from the pubsub read step
> >>>> to the next step in the pipeline (deserializing them, etc).  The
> >>>> subscription being read from has on the order of 100,000 elements/sec
> >>>> arriving in it.
> >>>>
> >>>> Setting --experiments=use_deprecated_read fixes it, and makes the
> >>>> pipeline behave as it did before.
> >>>>
> >>>> It seems like the SDF implementation in the DirectRunner here is
> >>>> causing some kind of issue, either buffering a very large amount of data
> >>>> before emitting it in a bundle, or something else.  Has anyone else run
> >>>> into this?
> >>>>
> >>>
> 


[GitHub] [beam-site] lostluck merged pull request #609: Publish 2.26.0 release

2020-12-11 Thread GitBox


lostluck merged pull request #609:
URL: https://github.com/apache/beam-site/pull/609


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[RESULT] [VOTE] Release 2.26.0, release candidate #1

2020-12-11 Thread Robert Burke
I'm happy to announce that we have unanimously approved this release.

There are 7 approving votes, 3 of which are binding:
* Robert Bradshaw
* Pablo Estrada
* Ahmet Altay

There are no disapproving votes.
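
As a rough illustration of the rule stated in the vote call (majority
approval, with at least 3 PMC affirmative votes), here is a minimal Python
sketch. release_vote_passes is a hypothetical helper, and counting the
majority over all votes cast is an assumption; the two tallies are the ones
reported in this thread.

```python
def release_vote_passes(votes, min_binding_plus_ones=3):
    """votes is a list of (value, is_binding) pairs, value being +1 or -1.

    Assumption: "majority approval" is counted over all votes cast.
    """
    plus = sum(1 for value, _ in votes if value > 0)
    minus = sum(1 for value, _ in votes if value < 0)
    binding_plus = sum(1 for value, binding in votes if value > 0 and binding)
    return plus > minus and binding_plus >= min_binding_plus_ones


# Final tally: 7 approving votes, 3 of them binding, none disapproving.
final_tally = [(+1, True)] * 3 + [(+1, False)] * 4
# Tally when the minimum vote time elapsed: six +1 votes, two binding.
dec_10_tally = [(+1, True)] * 2 + [(+1, False)] * 4

print("final:", release_vote_passes(final_tally))
print("Dec 10:", release_vote_passes(dec_10_tally))
```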

Thanks everyone! I'll be working on finalizing the release this afternoon.

Robert Burke
2.26.0 Release Manager


[GitHub] [beam-site] pabloem commented on pull request #609: Publish 2.26.0 release

2020-12-11 Thread GitBox


pabloem commented on pull request #609:
URL: https://github.com/apache/beam-site/pull/609#issuecomment-743380879


   LGTM! Feel free to merge whenever is appropriate.







Re: [VOTE] Release 2.26.0, release candidate #1

2020-12-11 Thread Ahmet Altay
+1 (binding)

On Thu, Dec 10, 2020 at 6:05 PM Robert Burke  wrote:

> Hello All!
> The minimum vote time has elapsed with six +1 votes, two of which are
> binding, which is not enough for me to start the finalization process.
>
> I'm happy to extend the period until either the 2.27.0 branch is cut and
> its first RC is built (sometime on/after the 16th according to the
> schedule) or we get an additional binding vote for this RC, whichever
> comes sooner.
>
> Binding votes may only be cast by Apache Beam PMC members, but all
> validation is incredibly valuable in ensuring the release meets the
> community's standards.
>
> Thank you all for your patience and participation so far!
> Robert Burke
> 2.26.0 Release Manager
>
> On Thu, 10 Dec 2020 at 14:59, Pablo Estrada  wrote:
>
>> +1 (binding)
>> I've built and run basic tests for existing Dataflow templates.
>> Best
>> -P.
>>
>> On Thu, Dec 10, 2020 at 11:31 AM Robert Bradshaw 
>> wrote:
>>
>>> +1 (binding). I've verified the release artifacts and signatures, and
>>> validated some simple pipelines with a freshly installed wheel.
>>>
>>> On Thu, Dec 10, 2020 at 10:00 AM Brian Hulette 
>>> wrote:
>>>
>>>> +1 (non-binding). Ran a Python pipeline using DataframeTransform on
>>>> DirectRunner and DataflowRunner (woohoo!)
>>>>
>>>> On Thu, Dec 10, 2020 at 8:48 AM Chamikara Jayalath <
>>>> chamik...@google.com> wrote:
>>>>
>>>>> +1 (non-binding). Validated multi-language pipelines for Kafka/SQL.
>>>>>
>>>>> Thanks,
>>>>> Cham
>>>>>