Re: FOSDEM 2023 is back as in person event

2022-11-01 Thread Maximilian Michels
Great news :)

> I believe it is also free to attend but don't quote me on this.

Indeed, FOSDEM is free. It's spread across the university campus in
Brussels.

On Fri, Oct 21, 2022 at 8:14 PM Ismaël Mejía  wrote:

> Hi Aizhamal,
>
> You might be interested in this thread where the ASF people are also
> discussing FOSDEM participation.
> https://lists.apache.org/thread/kv4fhldmc9mo6v5lwtkwqtwg97l64lx1
>
> It seems the call for devrooms is closed, so maybe it is too late for
> Beam, but we have had talks in the past about Beam as part of the Big
> Data track, so it may be worth participating there.
>
> Best,
> Ismaël
>
> On Mon, Oct 17, 2022 at 9:06 PM Aizhamal Nurmamat kyzy
>  wrote:
> >
> > Hi Beam community!
> >
> > FOSDEM 2023 is back as an in-person event! I
> have heard only great things about the event, where thousands of developers
> get together to talk all about open source!
> >
> > Is anyone from the Beam community planning to attend? The event takes
> place in Brussels on February 4 & 5, 2023. I believe it is also free to
> attend but don't quote me on this.
> >
> > As an open source project we can also have
> > - a stand for free https://fosdem.org/2023/news/2022-09-26-stands-cfp/
> > - a Devroom https://fosdem.org/2023/news/2022-09-29-call_for_devrooms/
> >
> > Anyone interested?
> >
>


Re: Compatibility between Beam v2.23 and Beam v2.26

2021-01-07 Thread Maximilian Michels

Thanks for mentioning me here @Boyuan.

In Beam there is no guarantee that checkpoints work across Beam
releases. Checkpoint compatibility can break for many reasons
(primarily DAG changes and serializer changes). Even though in this case
the serialVersionUID might have guaranteed compatibility, we make
internal changes to Beam all the time. There is currently no process
that we follow to ensure compatibility.


I do want to note that Flink has a serializer migration strategy which 
we currently do not leverage: 
https://github.com/apache/beam/blob/d8966d640549932d7551461ff59fa1085730f768/runners/flink/1.8/src/main/java/org/apache/beam/runners/flink/translation/types/CoderTypeSerializer.java#L182


However, this requires that in addition to the new serializer, the old 
serializer is kept around. Flink will then migrate the state by reading 
first with the old serializer and then subsequently writing with the new 
one.
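For reference, a minimal sketch of what leveraging that migration strategy could look like, assuming a simplified snapshot class (all names are hypothetical; this is not the actual CoderTypeSerializer code):

```java
import org.apache.flink.api.common.typeutils.TypeSerializer;
import org.apache.flink.api.common.typeutils.TypeSerializerSchemaCompatibility;
import org.apache.flink.api.common.typeutils.TypeSerializerSnapshot;
import org.apache.flink.core.memory.DataInputView;
import org.apache.flink.core.memory.DataOutputView;
import org.apache.flink.util.InstantiationUtil;

import java.io.IOException;

// Hypothetical snapshot for a coder-backed serializer. On restore, Flink
// calls resolveSchemaCompatibility(); returning compatibleAfterMigration()
// makes Flink read existing state with the old serializer returned by
// restoreSerializer() and re-write it with the new serializer.
class LegacyAwareSnapshot implements TypeSerializerSnapshot<Object> {

  private TypeSerializer<Object> oldSerializer;

  @Override
  public int getCurrentVersion() {
    return 2;
  }

  @Override
  public void writeSnapshot(DataOutputView out) throws IOException {
    // Persist whatever is needed to reconstruct the old serializer, here by
    // Java-serializing it (simplified; real code should version this).
    byte[] bytes = InstantiationUtil.serializeObject(oldSerializer);
    out.writeInt(bytes.length);
    out.write(bytes);
  }

  @Override
  public void readSnapshot(int version, DataInputView in, ClassLoader loader)
      throws IOException {
    byte[] bytes = new byte[in.readInt()];
    in.readFully(bytes);
    try {
      oldSerializer = InstantiationUtil.deserializeObject(bytes, loader);
    } catch (ClassNotFoundException e) {
      throw new IOException(e);
    }
  }

  @Override
  public TypeSerializer<Object> restoreSerializer() {
    return oldSerializer; // The serializer kept around for reading old state.
  }

  @Override
  public TypeSerializerSchemaCompatibility<Object> resolveSchemaCompatibility(
      TypeSerializer<Object> newSerializer) {
    // Trigger Flink's read-with-old/write-with-new state migration path.
    return TypeSerializerSchemaCompatibility.compatibleAfterMigration();
  }
}
```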


-Max

On 07.01.21 09:43, Jan Lukavský wrote:

Hi Antonio,

can you please create one?

Thanks,

  Jan

On 1/6/21 10:31 PM, Antonio Si wrote:
Thanks for the information. Do we have a jira to track this issue or 
do you want me to create a jira for this?


Thanks.

Antonio.

On 2021/01/06 17:59:47, Kenneth Knowles  wrote:

Agree with Boyuan & Kyle. That PR is the problem, and we probably do not
have adequate testing. We have a cultural understanding of not breaking
encoded data forms, but this is the encoded form of the TypeSerializer,
and actually there are two problems.

1. When you have a serialized object that does not have the
serialVersionUid explicitly set, the UID is generated based on many
details that are irrelevant for binary compatibility. Any Java-serialized
object that is intended for anything other than transient transmission
*must* have a serialVersionUid set and an explicit serialized form. Else
it is completely normal for it to break due to irrelevant changes. The
serialVersionUid has no mechanism for upgrade/downgrade, so you *must*
keep it the same forever, and any versioning or compat scheme exists
within the single serialVersionUid.
2. In this case there was an actual change to the fields of the object
stored, so you need to explicitly add the serialized form and also the
ability to read from prior serialized forms.

I believe explicitly setting the serialVersionUid to the original (and
keeping it that way forever) and adding the ability to decode prior forms
will regain the ability to read the snapshot. But also this seems like
something that would be part of Flink best practice documentation, since
naive use of Java serialization often hits this problem.

Kenn
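
A minimal sketch of the pattern described above (pinning the serialVersionUID and tolerating prior serialized forms), with illustrative names rather than the actual CoderTypeSerializer fields:

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.Serializable;

// Illustrative sketch only: pin the UID to the originally released value and
// tolerate streams written before a field was added. Field names are made up.
class VersionedSerializer implements Serializable {
  // Pinned forever; never let the JVM derive it from class details.
  private static final long serialVersionUID = 5241803328188007316L;

  private Object coder;        // present in all released forms
  private boolean fasterCopy;  // added in a later release

  private void readObject(ObjectInputStream in)
      throws IOException, ClassNotFoundException {
    ObjectInputStream.GetField fields = in.readFields();
    coder = fields.get("coder", null);
    // Snapshots from older releases predate this field: fall back to the
    // old behavior instead of failing.
    fasterCopy = fields.defaulted("fasterCopy") ? false : fields.get("fasterCopy", false);
  }
}
```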

On Tue, Jan 5, 2021 at 4:30 PM Kyle Weaver  wrote:


This raises a few related questions from me:

1. Do we claim to support resuming Flink checkpoints made with previous
Beam versions?
2. Does 1. require full binary compatibility between different versions
of runner internals like CoderTypeSerializer?
3. Do we have tests for 1.?



On Tue, Jan 5, 2021 at 4:05 PM Boyuan Zhang  wrote:


https://github.com/apache/beam/pull/13240 seems suspicious to me.

  +Maximilian Michels  Any insights here?

On Tue, Jan 5, 2021 at 8:48 AM Antonio Si wrote:



Hi,

I would like to followup with this question to see if there is a
solution/workaround for this issue.

Thanks.

Antonio.

On 2020/12/19 18:33:48, Antonio Si  wrote:

Hi,

We were using Beam v2.23 and recently, we are testing an upgrade to Beam
v2.26. For Beam v2.26, we are passing --experiments=use_deprecated_read
and --fasterCopy=true.

We run into this exception when we resume our pipeline:

Caused by: java.io.InvalidClassException:
org.apache.beam.runners.flink.translation.types.CoderTypeSerializer;
local class incompatible: stream classdesc serialVersionUID =
5241803328188007316, local class serialVersionUID = 7247319138941746449
   at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
   at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1942)
   at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1808)
   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2099)
   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1625)
   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:465)
   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:423)
   at org.apache.flink.api.common.typeutils.TypeSerializerSerializationUtil$TypeSerializerSerializationProxy.read(TypeSerializerSerializationUtil.java:301)
   at org.apache.flink.api.common.typeutils.TypeSerializerSerializationUtil.tryReadSerializer(TypeSerializerSerializationUtil.java:116)
   at org.apache.flink.api.common.typeutils.TypeSerializerConfigSnapshot.readSnapshot(TypeSerializerConfigSnapshot.java:113)
   at org.apache.flink.api.common.typeutils.TypeSerializerSnapshot.readVersionedSnap

Re: beam flink-runner distribution implementation

2020-11-20 Thread Maximilian Michels

Hi Richard,

The rationale was to preserve Beam's DistributionResult through the use
of Flink Gauges. Whoever implemented this wasn't fully aware that Flink
Histograms would be a better fit.


Feel free to open a PR. You can mention us here for a review.

Thanks,
Max
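
For illustration, a rough sketch of that direction (hypothetical class, not an actual Beam change): feed each reported distribution value into Flink's Histogram interface, here via the flink-metrics-dropwizard wrapper, so reporters can expose quantiles instead of a single gauge value:

```java
import com.codahale.metrics.Histogram;
import com.codahale.metrics.SlidingWindowReservoir;
import org.apache.flink.dropwizard.metrics.DropwizardHistogramWrapper;
import org.apache.flink.metrics.MetricGroup;

// Hypothetical sketch: register a Flink histogram for a Beam distribution
// metric and feed each reported value into it.
class DistributionAsHistogram {
  private final org.apache.flink.metrics.Histogram histogram;

  DistributionAsHistogram(MetricGroup group, String name) {
    // A sliding window of the last 500 samples; the reservoir choice is
    // an assumption, not something Beam prescribes.
    this.histogram = group.histogram(
        name, new DropwizardHistogramWrapper(new Histogram(new SlidingWindowReservoir(500))));
  }

  // Called for each value the pipeline reports to the Beam distribution.
  void update(long value) {
    histogram.update(value);
  }
}
```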

On 20.11.20 03:06, Alex Amato wrote:
Are you referring to a "Flink Gauge" or a "Beam Gauge"? Are you
suggesting packaging it as a "Flink Histogram"? (i.e. a Flink-runner-specific
concept of Histograms) If so, that seems fine and I have no comment
here.


FWIW,
I proposed a "Beam Histogram" metric (bucket counts).
https://s.apache.org/beam-histogram-metrics 



(No runner implements this, and most likely I will not be pursuing this
further, due to a change of priority/interest around the metric I was
interested in using this for.)
I was intending to use it for a specific set of metrics (no plans
to provide a user-defined Histogram Metric API)
https://s.apache.org/beam-gcp-debuggability 



I don't think we should pursue any plans to package "Beam Distributions"
as "Beam Histograms", as a "Beam Histogram" is essentially several
counters (one for each bucket). Changing all usage of beam.distribution
to a "Beam Histogram" would have performance implications, and is not
advised. If at some point "Beam Histograms" are implemented, migrating
the usage of Metrics.distribution to histogram should be done on an
individual basis.






On Thu, Nov 19, 2020 at 5:47 PM Robert Bradshaw wrote:


Gauge certainly seems wrong for DistributionResult. Yes, using a
Histogram would be a welcome PR.

On Thu, Nov 19, 2020 at 12:58 PM Kyle Weaver <kcwea...@google.com> wrote:
 >
 > What are the advantages of using a Histogram instead of a Gauge?
 >
 > Also, check out this design doc for adding histogram metrics to
Beam if you haven't already: http://s.apache.org/beam-metrics-api
(Not sure what the current status is.)
 >
 > On Wed, Nov 18, 2020 at 1:37 PM Richard Moorhead
<richard.moorh...@gmail.com> wrote:
 >>
 >> Beam's DistributionResult is implemented as a Gauge within the
Flink runner. Can someone explain the rationale behind this? Would a
PR to utilize a Histogram be acceptable?



Re: Possible 80% reduction in overhead for flink runner, input needed

2020-10-29 Thread Maximilian Michels
Ok then we are on the same page, but I disagree with your conclusion. The reason Flink has to do the deep copy is that it doesn't state that the inputs are immutable and must not be changed. In Beam, the user is not supposed to modify the input collection, and if they do, it's undefined behavior. This is the reason the DirectRunner checks for this, to make sure users are not relying on it.


It's not written anywhere that the input cannot be mutated. A 
DirectRunner test is not a proof. Any runner could add a test which 
proves the opposite. In fact we may have one that checks copying for Flink.


I prefer safety and correctness over performance because I've seen too 
many cases where users shoot themselves in the foot. We should make sure 
that, by default, the user cannot modify the input element. An option to 
disable that is fine.


-Max


Re: Possible 80% reduction in overhead for flink runner, input needed

2020-10-28 Thread Maximilian Michels
You are right that Flink serializers do not care to copy for immutable 
Java types, e.g. Long, Integer, String. However, Pojos or other custom 
types can be mutated and Flink does a deep copy in this case.


If you look at the PojoSerializer in Flink, you will see that it does a 
deep copy: 
https://github.com/apache/flink/blob/d13f66be552eac89f45469c199ae036087baa38d/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/PojoSerializer.java#L228


Also Flink uses Java serialization if the generic Kryo serializer fails: 
https://github.com/apache/flink/blob/d13f66be552eac89f45469c199ae036087baa38d/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/kryo/KryoSerializer.java#L251


In Beam we are just wrapping Beam coders, so we do not know whether a
type is mutable or not. This is why we always deep-copy. I'm not sure
that the change to always return the input would be safe. However, we
could add some exceptions for Beam types we are sure cannot be mutated.
Also, a pipeline option is ok if it is opt-in.


-Max
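
To illustrate, a simplified sketch of the two strategies (not the actual CoderTypeSerializer code; the wrapped Beam coder and the set of exempted types are assumptions):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import org.apache.beam.sdk.coders.Coder;
import org.apache.beam.sdk.util.CoderUtils;

// Simplified sketch of how a coder-backed copy() works today, plus the kind
// of exception list for known-immutable types discussed above.
class CopyStrategies<T> {
  private static final Set<Class<?>> KNOWN_IMMUTABLE = new HashSet<>(
      Arrays.asList(String.class, Long.class, Integer.class, Boolean.class, Double.class));

  private final Coder<T> coder;

  CopyStrategies(Coder<T> coder) {
    this.coder = coder;
  }

  T copy(T value) {
    // Cheap path: types that cannot be mutated may be returned as-is.
    if (value == null || KNOWN_IMMUTABLE.contains(value.getClass())) {
      return value;
    }
    // General path: deep copy via an encode/decode roundtrip, since a Beam
    // coder gives us no other way to clone an arbitrary value.
    try {
      return CoderUtils.clone(coder, value);
    } catch (Exception e) {
      throw new RuntimeException("Could not clone value", e);
    }
  }
}
```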

On 28.10.20 11:59, Teodor Spæren wrote:

Hey Max!

Just to make sure we are on the same page:

When you say "Flink's default behavior" do you mean Apache Flink the
project or Beam's Flink Runner? I'm assuming the Flink Runner, since
Apache Flink leaves it up to the TypeSerializer to decide how to copy,
and none of the ones I've seen so far choose to do it through a
serialization-then-deserialization roundtrip.


Is the bug hard to detect? Using the direct runner will warn of this
misuse by default, without any need to change the pipeline itself, as
far as I know? Please correct me if I'm wrong, I don't have much
experience with Beam!


PCollections being immutable does mean that the input element should not
be modified; rather, if a modification is needed, it is up to the user to
copy the input before changing it [1] (3.2.3 Immutability). I think this
is what you are saying, but I just wanted to make sure :)


Also, I think naming the flag anything with object reuse would be 
confusing to users, as flink already has the concept of object reuse 
[2](enableObjectReuse), and there is an option on the runner mentioning 
this already[3](objectReuse). I'm thinking something more along the 
lines of "fasterCopy" or "disableFailsafeCopying".


Best regards,
Teodor Spæren

[1]: 
https://beam.apache.org/documentation/programming-guide/#pcollection-characteristics 

[2]: 
https://ci.apache.org/projects/flink/flink-docs-stable/dev/execution_configuration.html 

[3]: 
https://beam.apache.org/documentation/runners/flink/#pipeline-options-for-the-flink-runner 



On Wed, Oct 28, 2020 at 10:31:51AM +0100, Maximilian Michels wrote:

Very good observation @Teodor!

Flink's default behavior is to copy elements by going through a
serialization/deserialization roundtrip. This occurs regardless of
whether operators are "chained" (directly pass on data without going
through the network) or not.


This default was chosen for a reason because users tend to forget that 
you must not modify the input. It can cause very hard to detect "bugs".


PCollections are immutable, but that does not mean transforms are not
allowed to mutate their inputs. Rather, it means that the original element
will not change, because a copy of that element is passed on.


+1 for a flag to enable object reuse
-1 for changing the default for the above reason

Cheers,
Max

On 27.10.20 21:54, Kenneth Knowles wrote:

It seems that many correct things are said on this thread.

1. Elements of a PCollection are immutable. They should be like 
mathematical values.
2. For performance reasons, the author of a DoFn is responsible to 
not mutate input elements and also to not mutate outputs once they 
have been output.

3. The direct runner does extra work to check if your DoFn* is wrong.
4. On a production runner it is expected that serialization only 
occurs when needed for shipping data**


If the FlinkRunner is serializing things that don't have to be 
shipped that seems like a great easy win.


Kenn

*notably CombineFn has an API that is broken; only the first 
accumulator is allowed to be mutated and a runner is responsible for 
cloning it as necessary; it is expected that combining many elements 
will execute by mutating one unaliased accumulator many times
**small caveat that when doing in-memory groupings you need to use 
Coder#structuralValue and group by that, which may serialize but 
hopefully does something smarter


On Tue, Oct 27, 2020 at 8:52 AM Reuven Lax <re...@google.com> wrote:


   Actually I believe that the Beam model does say that input elements
   should be immutable. If I remember correctly, the DirectRunner even
   validates this in unit tests, failing tests if the input elements
   have been mutated.

   On Tue, Oct 27, 2020 at 3:49 AM David Morávek <d...@apache.org> wrote:

   Hi Teodor,

   Thanks for bringing this

Re: Possible 80% reduction in overhead for flink runner, input needed

2020-10-28 Thread Maximilian Michels

Very good observation @Teodor!

Flink's default behavior is to copy elements by going through a
serialization/deserialization roundtrip. This occurs regardless of
whether operators are "chained" (directly pass on data without going
through the network) or not.


This default was chosen for a reason because users tend to forget that 
you must not modify the input. It can cause very hard to detect "bugs".


PCollections are immutable, but that does not mean transforms are not
allowed to mutate their inputs. Rather, it means that the original element will
not change, because a copy of that element is passed on.


+1 for a flag to enable object reuse
-1 for changing the default for the above reason

Cheers,
Max

On 27.10.20 21:54, Kenneth Knowles wrote:

It seems that many correct things are said on this thread.

1. Elements of a PCollection are immutable. They should be like 
mathematical values.
2. For performance reasons, the author of a DoFn is responsible to not 
mutate input elements and also to not mutate outputs once they have been 
output.

3. The direct runner does extra work to check if your DoFn* is wrong.
4. On a production runner it is expected that serialization only occurs 
when needed for shipping data**


If the FlinkRunner is serializing things that don't have to be shipped 
that seems like a great easy win.


Kenn

*notably CombineFn has an API that is broken; only the first accumulator 
is allowed to be mutated and a runner is responsible for cloning it as 
necessary; it is expected that combining many elements will execute by 
mutating one unaliased accumulator many times
**small caveat that when doing in-memory groupings you need to use 
Coder#structuralValue and group by that, which may serialize but 
hopefully does something smarter


On Tue, Oct 27, 2020 at 8:52 AM Reuven Lax <re...@google.com> wrote:


Actually I believe that the Beam model does say that input elements
should be immutable. If I remember correctly, the DirectRunner even
validates this in unit tests, failing tests if the input elements
have been mutated.

On Tue, Oct 27, 2020 at 3:49 AM David Morávek <d...@apache.org> wrote:

Hi Teodor,

Thanks for bringing this up. This is a known, long-standing
"issue". Unfortunately there are a few things we need to consider:

- As you correctly noted, the *Beam model doesn't enforce
immutability* of input / output elements, so this is the price.
- We *can not break* existing pipelines.
- Flink Runner needs to provide the *same guarantees as the Beam
model*.

There are definitely some things we can do here, to make things
faster:

- We can try the similar approach as HadoopIO
(HadoopInputFormatReader#isKnownImmutable), to check for known
immutable types (KV, primitives, protobuf, other known internal
immutable structures).
- *If the type is immutable, we can safely reuse it.* This should
cover most of the performance costs without breaking the
guarantees Beam model provides.
- We can enable registration of custom "immutable" types via
pipeline options? (this may be an unnecessary knob, so this
needs a further discussion)

WDYT?

D.


On Mon, Oct 26, 2020 at 6:37 PM Teodor Spæren
<teodor_spae...@riseup.net> wrote:

Hey!

I'm a student at the University of Oslo, and I'm writing a
master thesis
about the possibility of using Beam to benchmark stream
processing
systems. An important factor in this is the overhead
associated with
using Beam over writing code for the runner directly. [1]
found that
there was a large overhead associated with using Beam, but
did not
investigate where this overhead came from. I've done
benchmarks and
confirmed the findings there, where for simple chains of
identity
operators, Beam is 43x slower than the Flink equivalent.

These are very simple pipelines, with custom sources that
just output a
series of integers. By profiling I've found that most of the
overhead
comes from serializing and deserializing. Specifically, the way
TypeSerializers [2] are implemented in [3], where each object is
serialized and then deserialized between every operator.
Looking into
Looking into
the semantics of Beam, no operator should change the input,
so we don't
need to do a copy here. The function in [3] could
potentially be changed
to a single `return` statement.

Doing this removes 80% of the overhead in my tests. This is
a very
synthetic example, but it's a low hanging fruit and 

Re: Throttling stream outputs per trigger?

2020-10-16 Thread Maximilian Michels
the downstream consumer has these requirements. 


Blocking should normally be avoided at all cost, but if the downstream 
operator has the requirement to only emit a fixed number of messages per 
second, it should enforce this, i.e. block once the maximum number of 
messages for a time period have been reached. This will automatically 
lead to backpressure in Runners like Flink or Dataflow.


-Max
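
For instance, a minimal sketch (hypothetical, not a built-in transform) of a DoFn that enforces such a limit by blocking; note it limits each DoFn instance, so the effective rate is the limit times the parallelism:

```java
import org.apache.beam.sdk.transforms.DoFn;

// Hypothetical rate-limiting DoFn: allows at most `permitsPerSecond` elements
// through per second by sleeping once the budget for the current second is
// spent. Blocking here propagates as backpressure in runners like Flink.
class RateLimitFn<T> extends DoFn<T, T> {
  private final long permitsPerSecond;
  private transient long windowStartMillis;
  private transient long emittedInWindow;

  RateLimitFn(long permitsPerSecond) {
    this.permitsPerSecond = permitsPerSecond;
  }

  @ProcessElement
  public void process(ProcessContext ctx) throws InterruptedException {
    long now = System.currentTimeMillis();
    if (now - windowStartMillis >= 1000) {
      // A new one-second window has started; reset the budget.
      windowStartMillis = now;
      emittedInWindow = 0;
    }
    if (emittedInWindow >= permitsPerSecond) {
      // Budget exhausted: block until the next one-second window opens.
      Thread.sleep(1000 - (now - windowStartMillis));
      windowStartMillis = System.currentTimeMillis();
      emittedInWindow = 0;
    }
    emittedInWindow++;
    ctx.output(ctx.element());
  }
}
```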

On 07.10.20 18:30, Luke Cwik wrote:
SplittableDoFns apply to both batch and streaming pipelines. They are 
allowed to produce an unbounded amount of data and can either self 
checkpoint saying they want to resume later or the runner will ask them 
to checkpoint via a split call.


There hasn't been anything concrete on backpressure. There has been work
done on exposing signals[1] related to IO that a runner can then use
intelligently, but throttling isn't one of them yet.


1: 
https://lists.apache.org/thread.html/r7c1bf68bd126f3421019e238363415604505f82aeb28ccaf8b834d0d%40%3Cdev.beam.apache.org%3E 



On Tue, Oct 6, 2020 at 3:51 PM Vincent Marquez
<vincent.marq...@gmail.com> wrote:


Thanks for the response.  Is my understanding correct that
SplittableDoFns are only applicable to Batch pipelines?  I'm
wondering if there's any proposals to address backpressure needs?
/~Vincent/


On Tue, Oct 6, 2020 at 1:37 PM Luke Cwik <lc...@google.com> wrote:

There is no general back pressure mechanism within Apache Beam
(runners should be intelligent about this, but there is currently
no way to say "I'm being throttled", so runners don't know that
throwing more CPUs at a problem won't make it go faster).

You can control how quickly you ingest data for runners that
support splittable DoFns with SDK-initiated checkpoints with
resume delays. A splittable DoFn is able to return
resume().withResumeDelay(Duration.standardSeconds(10)) from
the @ProcessElement method. See Watch[1] for an example.
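
A minimal sketch of such a splittable DoFn (hypothetical names; it emits a fixed range of longs and defers between chunks):

```java
import org.apache.beam.sdk.io.range.OffsetRange;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;
import org.joda.time.Duration;

// Hypothetical splittable DoFn: emits a range of longs, but yields control
// after every 500 outputs and asks the runner to resume later.
@DoFn.UnboundedPerElement
class ThrottledEmitFn extends DoFn<byte[], Long> {

  @GetInitialRestriction
  public OffsetRange getInitialRestriction(@Element byte[] unused) {
    return new OffsetRange(0, 1_000_000);
  }

  @ProcessElement
  public ProcessContinuation process(
      ProcessContext ctx, RestrictionTracker<OffsetRange, Long> tracker) {
    long position = tracker.currentRestriction().getFrom();
    for (int emitted = 0; emitted < 500; emitted++, position++) {
      if (!tracker.tryClaim(position)) {
        return ProcessContinuation.stop(); // Range done or runner-initiated split.
      }
      ctx.output(position);
    }
    // Defer further processing; the runner resumes after the delay.
    return ProcessContinuation.resume().withResumeDelay(Duration.standardSeconds(1));
  }
}
```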

The 2.25.0 release enables more splittable DoFn features on more
runners. I'm working on a blog (initial draft[2], still mostly
empty) to update the old blog from 2017.

1:

https://github.com/apache/beam/blob/9c239ac93b40e911f03bec5da3c58a07fdceb245/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Watch.java#L908


2:

https://docs.google.com/document/d/1kpn0RxqZaoacUPVSMYhhnfmlo8fGT-p50fEblaFr2HE/edit#




On Tue, Oct 6, 2020 at 10:39 AM Vincent Marquez
<vincent.marq...@gmail.com>
wrote:

Hmm, I'm not sure how that will help, I understand how to
batch up the data, but it is the triggering part that I
don't see how to do.  For example, in Spark Structured
Streaming, you can set a time trigger which happens at a
fixed interval all the way up to the source, so the source
can throttle how much data to read even.

Here is my use case more thoroughly explained:

I have a Kafka topic (with multiple partitions) that I'm
reading from, and I need to aggregate batches of up to 500
before sending a single batch off in an RPC call.  However,
the vendor specified a rate limit, so if there are more than
500 unread messages in the topic, I must wait 1 second
before issuing another RPC call. When searching on Stack
Overflow I found this answer:
https://stackoverflow.com/a/57275557/25658 that makes it
seem challenging, but I wasn't sure if things had changed
since then or you had better ideas.

/~Vincent/


On Thu, Oct 1, 2020 at 2:57 PM Luke Cwik <lc...@google.com> wrote:

Look at the GroupIntoBatches[1] transform. It will
buffer "batches" of size X for you.

1:

https://beam.apache.org/documentation/transforms/java/aggregation/groupintobatches/


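A short usage sketch, assuming a keyed input (key and value types are placeholders):

```java
import org.apache.beam.sdk.transforms.GroupIntoBatches;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

class BatchExample {
  // Emits per-key batches of up to 500 elements; a downstream DoFn can then
  // make one RPC per batch (and apply any rate limiting there).
  static PCollection<KV<String, Iterable<String>>> batch(
      PCollection<KV<String, String>> input) {
    return input.apply(GroupIntoBatches.<String, String>ofSize(500));
  }
}
```
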

On Thu, Oct 1, 2020 at 2:51 PM Vincent Marquez
<vincent.marq...@gmail.com> wrote:

the downstream consumer has these requirements.

/~Vincent/


On Thu, Oct 1, 2020 at 2:29 PM Luke Cwik
<lc...@google.com> wrote:

Why do you want to only emit X? 

Re: Self-checkpoint Support on Portable Flink

2020-10-14 Thread Maximilian Michels
Duplicates cannot happen because the state of all operators will be 
rolled back to the latest checkpoint, in case of failures.


On 14.10.20 06:31, Reuven Lax wrote:
Does this mean that we have to deal with duplicate messages over the 
back edge? Or will that not happen, since duplicates mean that we rolled 
back a checkpoint.


On Tue, Oct 13, 2020 at 2:59 AM Maximilian Michels <m...@apache.org> wrote:


There would be ways around the lack of checkpointing in cycles, e.g.
buffer and backloop only after checkpointing is complete, similarly how
we implement @RequiresStableInput in the Flink Runner.

-Max

On 07.10.20 04:05, Reuven Lax wrote:
 > It appears that there's a proposal
 > (https://cwiki.apache.org/confluence/display/FLINK/FLIP-16%3A+Loop+Fault+Tolerance)
 > and an abandoned PR to fix this, but AFAICT this remains a limitation of
 > Flink. If Flink can't guarantee processing of records on back edges, I
 > don't think we can use cycles, as we might otherwise lose the residuals.
 >
 > On Tue, Oct 6, 2020 at 6:16 PM Reuven Lax <re...@google.com> wrote:
 >
 >     This is what I was thinking of
 >
 >     "Flink currently only provides processing guarantees for jobs
 >     without iterations. Enabling checkpointing on an iterative job
 >     causes an exception. In order to force checkpointing on an iterative
 >     program the user needs to set a special flag when enabling
 >     checkpointing: env.enableCheckpointing(interval,
 >     CheckpointingMode.EXACTLY_ONCE, force = true).
 >
 >     Please note that records in flight in the loop edges (and the state
 >     changes associated with them) will be lost during failure."
 >
 >     On Tue, Oct 6, 2020 at 5:44 PM Boyuan Zhang <boyu...@google.com> wrote:
 >
 >         Hi Reuven,
 >
 >         As Luke mentioned, at least there are some limitations around
 >         tracking watermark with flink cycles. I'm going to use State +
 >         Timer without flink cycle to support self-checkpoint. For
 >         dynamic split, we can either explore flink cycle approach or
 >         limit depth approach.
 >
 >         On Tue, Oct 6, 2020 at 5:33 PM Reuven Lax <re...@google.com> wrote:
 >
 >             Aren't there some limitations associated with flink cycles?
 >             I seem to remember various features that could not be used.
 >             I'm assuming that watermarks are not supported across
 >             cycles, but is there anything else?
 >
 >             On Tue, Oct 6, 2020 at 7:12 AM Maximilian Michels
 >             <m...@apache.org> wrote:
 >
 >                 Thanks for starting the conversation. The two approaches
 >                 both look good to me. Probably we want to start with
 >                 approach #1 for all Runners to be able to support
 >                 delaying bundles. Flink supports cycles and thus
 >                 approach #2 would also be applicable and could be used
 >                 to implement dynamic splitting.
 >
 >                 -Max
 >
 >                 On 05.10.20 23:13, Luke Cwik wrote:
 >                  > Thanks Boyuan, I left a few comments.
 >                  >
 >                  > On Mon, Oct 5, 2020 at 11:12 AM Boyuan Zhang
 >                  > <boyu...@google.com> wrote:
 >                  >
 >                  >     Hi team,
 >                  >
 >                  >     I'm looking at 

Re: Self-checkpoint Support on Portable Flink

2020-10-13 Thread Maximilian Michels
There would be ways around the lack of checkpointing in cycles, e.g. 
buffer and backloop only after checkpointing is complete, similarly how 
we implement @RequiresStableInput in the Flink Runner.


-Max
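
A simplified sketch of that buffering idea, using Flink's CheckpointListener (the actual @RequiresStableInput implementation is more involved and must also checkpoint the buffer itself):

```java
import java.util.ArrayDeque;
import java.util.Queue;
import org.apache.flink.runtime.state.CheckpointListener;

// Simplified sketch of checkpoint-aligned buffering: elements are held back
// and only emitted once notifyCheckpointComplete() confirms they are part of
// a durable checkpoint. A real implementation must also put the buffer into
// checkpointed state so it survives failures.
abstract class BufferingEmitter<T> implements CheckpointListener {
  private final Queue<T> buffer = new ArrayDeque<>();

  void bufferElement(T element) {
    buffer.add(element);
  }

  @Override
  public void notifyCheckpointComplete(long checkpointId) {
    // Safe to release: the buffered elements were included in the checkpoint.
    T element;
    while ((element = buffer.poll()) != null) {
      emit(element);
    }
  }

  abstract void emit(T element);
}
```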

On 07.10.20 04:05, Reuven Lax wrote:
It appears that there's a proposal
(https://cwiki.apache.org/confluence/display/FLINK/FLIP-16%3A+Loop+Fault+Tolerance)
and an abandoned PR to fix this, but AFAICT this remains a limitation of
Flink. If Flink can't guarantee processing of records on back edges, I
don't think we can use cycles, as we might otherwise lose the residuals.


On Tue, Oct 6, 2020 at 6:16 PM Reuven Lax <re...@google.com> wrote:


This is what I was thinking of

"Flink currently only provides processing guarantees for jobs
without iterations. Enabling checkpointing on an iterative job
causes an exception. In order to force checkpointing on an iterative
program the user needs to set a special flag when enabling
checkpointing: env.enableCheckpointing(interval,
CheckpointingMode.EXACTLY_ONCE, force = true).

Please note that records in flight in the loop edges (and the state
changes associated with them) will be lost during failure."






On Tue, Oct 6, 2020 at 5:44 PM Boyuan Zhang <boyu...@google.com> wrote:

Hi Reuven,

As Luke mentioned, at least there are some limitations around
tracking watermark with flink cycles. I'm going to use State +
Timer without flink cycle to support self-checkpoint. For
dynamic split, we can either explore flink cycle approach or
limit depth approach.

On Tue, Oct 6, 2020 at 5:33 PM Reuven Lax <re...@google.com> wrote:

Aren't there some limitations associated with flink cycles?
I seem to remember various features that could not be used.
I'm assuming that watermarks are not supported across
cycles, but is there anything else?

    On Tue, Oct 6, 2020 at 7:12 AM Maximilian Michels
<m...@apache.org> wrote:

Thanks for starting the conversation. The two approaches
both look good
to me. Probably we want to start with approach #1 for
all Runners to be
able to support delaying bundles. Flink supports cycles
and thus
approach #2 would also be applicable and could be used
to implement
dynamic splitting.

-Max

On 05.10.20 23:13, Luke Cwik wrote:

 > Thanks Boyuan, I left a few comments.
 >
 > On Mon, Oct 5, 2020 at 11:12 AM Boyuan Zhang
 > <boyu...@google.com> wrote:
 >
 >     Hi team,
 >
 >     I'm looking at adding self-checkpoint support to portable Flink
 >     runner (BEAM-10940
 >     <https://issues.apache.org/jira/browse/BEAM-10940>) for both batch
 >     and streaming. I summarized the problem that we want to solve and
 >     proposed 2 potential approaches in this doc
 >     <https://docs.google.com/document/d/1372B7HYxtcUYjZOnOM7OBTfSJ4CyFg_gaPD_NUxWClo/edit?usp=sharing>.
 >
 >     I want to collect feedback on which approach is preferred and
 >     anything that I have not taken into consideration yet but I should.
 >     Many thanks to all your help!
 >
 >     Boyuan
 >



Re: Self-checkpoint Support on Portable Flink

2020-10-06 Thread Maximilian Michels
Thanks for starting the conversation. The two approaches both look good 
to me. Probably we want to start with approach #1 for all Runners to be 
able to support delaying bundles. Flink supports cycles and thus 
approach #2 would also be applicable and could be used to implement 
dynamic splitting.


-Max

On 05.10.20 23:13, Luke Cwik wrote:

Thanks Boyuan, I left a few comments.

On Mon, Oct 5, 2020 at 11:12 AM Boyuan Zhang <boyu...@google.com> wrote:


Hi team,

I'm looking at adding self-checkpoint support to portable Flink
runner (BEAM-10940 <https://issues.apache.org/jira/browse/BEAM-10940>) for both batch
and streaming. I summarized the problem that we want to solve and
proposed 2 potential approaches in this doc
<https://docs.google.com/document/d/1372B7HYxtcUYjZOnOM7OBTfSJ4CyFg_gaPD_NUxWClo/edit?usp=sharing>.

I want to collect feedback on which approach is preferred and
anything that I have not taken into consideration yet but I should.
Many thanks to all your help!

Boyuan



Re: Design rational behind copying via serializing in flink runner

2020-09-07 Thread Maximilian Michels

Hey Teodor,

Copying is the default behavior. This is tunable via the pipeline option 
'objectReuse', i.e. 'objectReuse=true'.


The option is disabled by default because users may not be aware of
object reuse and may recycle objects in their process functions, which will
have unexpected side effects.


Now, for primitive types, the copying is not even necessary and this
could be optimized, similar to what is done in the Flink serializers. Since we
wrap Beam coders and expose them as Flink serializers for the Flink
Runner, we would have to re-add this logic to Beam coders or the Flink
Runner's CoderTypeSerializer.


-Max
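
For example, a minimal sketch of enabling the option programmatically (assuming the standard FlinkPipelineOptions):

```java
import org.apache.beam.runners.flink.FlinkPipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class ObjectReuseExample {
  public static void main(String[] args) {
    // Enable object reuse for the Flink Runner: elements are passed between
    // chained operators without the defensive serialize/deserialize copy.
    // Only safe when no user function mutates inputs or emitted outputs.
    FlinkPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).as(FlinkPipelineOptions.class);
    options.setObjectReuse(true);
    // ... build and run the pipeline with these options ...
  }
}
```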

On 06.09.20 11:01, Teodor Spæren wrote:

Hey Brian!

Sorry for the late reply, this one kind of got lost in my mail client. 
Still trying to figure this mailing list thing out, hehe.


I would like to try to see if a simple return there will speed things
up. I've never built Beam by hand though; is a full build, as
described in [1], required to test this, or can I do a more selective
build of only the Java portion?


Another question is that Flink has object reuse, and it can be turned
on through options to the Flink runner. If everything passed between
operators is immutable anyway, why isn't this option enabled by default?
In this case it would not give a speed up I think, as the method is just
aliased. I haven't fully understood all the aspects of this option in
Flink, so I might just be missing something.


Thanks for input so far :D

[1]: https://beam.apache.org/contribute/
[2]: 
https://github.com/apache/beam/blob/6fdde4f4eab72b49b10a8bb1cb3be263c5c416b5/runners/flink/1.8/src/main/java/org/apache/beam/runners/flink/translation/types/CoderTypeSerializer.java#L93 



On Mon, Aug 31, 2020 at 05:36:56PM -0700, Brian Hulette wrote:

Hi Teodor,
I actually forwarded your message to dev@ before, but I foolishly
removed user@ from the distro, so I think you weren't able to see it. Sorry
about that!

+Lukasz Cwik  replied there [1]. I'll copy it here 
and we

can keep discussing on this thread:

The idea in Beam has always been that objects passed between transforms
are not mutable, and users are required to either pass through their
object or output a new object.

Minimizing the places where Flink performs copying would be best but
without further investigation wouldn't be able to give concrete 
suggestions.



Brian

[1]
https://lists.apache.org/thread.html/r8c22c8b089f9caaac8efef90e62117a1db49af6471ff6bd7cbc5b882%40%3Cdev.beam.apache.org%3E 



On Mon, Aug 31, 2020 at 11:14 AM Teodor Spæren 


wrote:


Hey!

First time posting to a mailing list, hope I did it correctly :)

I'm writing a master thesis at the University of Oslo and right now I'm
looking at the performance overhead of using Beam with the Flink runnner
versus plain Flink.

I've written a simple program: a custom source outputting 0, 1, 2, 3, up
to N, going into a single identity operator and then into a filter which
only matches N and prints that out. This is just to compare performance.

I've been doing some profiling of simple programs and one observation is
the performance difference in the serialization. The hotspot is [1],
which is used multiple places, but one place is [2], which is called
from [3]. As far as I can tell, [1] seems to be implementing copying by
first serializing and then deserializing and there are no way for the
actual types to change this. In flink, you have control over the copy()
method, like in [4] and so for certain types you can just do a simple
return as you do here.

My question is if I've understood the flow correctly so far, and if so,
what the reason for doing it this way is. Is it to avoid demanding that the
type implement some type of cloning? And would it be possible to push
this downward in the stack and allow the coders to define the copy
semantics? I'm willing to do the work here, just want to know if it
would work on an architectural level.

If there is any known overheads of using beam that you would like to
point out, I would love to hear about it.

Best regards,
Teodor Spæren

[1]:
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/CoderUtils.java#L140 


[2]:
https://github.com/apache/beam/blob/6fdde4f4eab72b49b10a8bb1cb3be263c5c416b5/runners/flink/1.8/src/main/java/org/apache/beam/runners/flink/translation/types/CoderTypeSerializer.java#L85 


[3]:
https://github.com/apache/beam/blob/6fdde4f4eab72b49b10a8bb1cb3be263c5c416b5/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/types/CoderTypeInformation.java#L85 


[4]:
https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/typeutils/base/LongSerializer.java#L53 





Re: [DISCUSS][BEAM-10670] Migrating BoundedSource/UnboundedSource to execute as a Splittable DoFn for non-portable Java runners

2020-08-28 Thread Maximilian Michels

Thanks Luke! I've had a pass.

-Max

On 28.08.20 01:22, Luke Cwik wrote:

As an update.

Direct and Twister2 are done.
Samza: is ready for review[1].
Flink: is almost ready for review. [2] lays all the groundwork for the 
migration and [3] finishes the migration (there is a timeout happening 
in FlinkSubmissionTest that I'm trying to figure out).

No further updates on Spark[4] or Jet[5].

@Maximilian Michels <m...@apache.org> or @Thomas Weise
<thomas.we...@gmail.com>, can either of you take a look at the
Flink PRs?
@ke.wu...@icloud.com, since Xinyu delegated
to you, can you take another look at the Samza PR?


1: https://github.com/apache/beam/pull/12617
2: https://github.com/apache/beam/pull/12706
3: https://github.com/apache/beam/pull/12708
4: https://github.com/apache/beam/pull/12603
5: https://github.com/apache/beam/pull/12616

On Tue, Aug 18, 2020 at 11:42 AM Pulasthi Supun Wickramasinghe
<pulasthi...@gmail.com> wrote:


Hi Luke

Will take a look at this as soon as possible and get back to you.

Best Regards,
Pulasthi

On Tue, Aug 18, 2020 at 2:30 PM Luke Cwik <lc...@google.com> wrote:

I have made some good progress here and have gotten to the
following state for non-portable runners:

DirectRunner[1]: Merged. Supports Read.Bounded and Read.Unbounded.
Twister2[2]: Ready for review. Supports Read.Bounded, the
current runner doesn't support unbounded pipelines.
Spark[3]: WIP. Supports Read.Bounded, Nexmark suite passes. Not
certain about level of unbounded pipeline support coverage since
Spark uses its own tiny suite of tests to get unbounded pipeline
coverage instead of the validates runner set.
Jet[4]: WIP. Supports Read.Bounded. Read.Unbounded definitely
needs additional work.
Samza[5]: WIP. Supports Read.Bounded. Not certain about level of
unbounded pipeline support coverage since Samza uses its own
tiny suite of tests to get unbounded pipeline coverage instead
of the validates runner set.
Flink: Unstarted.

@Pulasthi Supun Wickramasinghe <pulasthi...@gmail.com>,
can you help me with the Twister2 PR[2]?
@Ismaël Mejía <ieme...@gmail.com>, is PR[3] the expected
level of support for unbounded pipelines and hence ready for review?
@Jozsef Bartok <jo...@hazelcast.com>, can you help me out
to get support for unbounded splittable DoFn's into Jet[4]?
@Xinyu Liu <xinyuliu...@gmail.com>, is PR[5] the expected
level of support for unbounded pipelines and hence ready for review?

1: https://github.com/apache/beam/pull/12519
2: https://github.com/apache/beam/pull/12594
3: https://github.com/apache/beam/pull/12603
4: https://github.com/apache/beam/pull/12616
5: https://github.com/apache/beam/pull/12617

On Tue, Aug 11, 2020 at 10:55 AM Luke Cwik <lc...@google.com> wrote:

There shouldn't be any changes required since the wrapper
will smoothly transition the execution to be run as an SDF.
New IOs should strongly prefer to use SDF since it should be
simpler to write and will be more flexible but they can use
the "*Source"-based APIs. Eventually we'll deprecate the
APIs but we will never stop supporting them. Eventually they
should all be migrated to use SDF and if there is another
major Beam version, we'll finally be able to remove them.

On Tue, Aug 11, 2020 at 8:40 AM Alexey Romanenko
<aromanenko@gmail.com>
wrote:

Hi Luke,

Great to hear about such progress on this!

Talking about opt-out for all runners in the future,
will it require any code changes for current
“*Source”-based IOs or the wrappers should completely
smooth this transition?
Do we need to require to create new IOs only based on
SDF or again, the wrappers should help to avoid this?


On 10 Aug 2020, at 22:59, Luke Cwik <lc...@google.com> wrote:

In the past couple of months wrappers[1, 2] have been
added to the Beam Java SDK which can execute
BoundedSource and UnboundedSource as Splittable DoFns.
These have been opt-out for portable pipelines (e.g.
Dataflow runner v2, XLang pipelines on Flink/Spark)
and opt-in using an experiment for all other pipelines.

I would like to start making the non-portable
pipelines starting with the DirectRunner[3] to be
opt-out with the p

Re: [External] Re: Memory Issue When Running Beam On Flink

2020-08-27 Thread Maximilian Michels
If the user chooses to create a window of 10 years, I'd say it is
expected behavior that the state will be kept for that entire duration.


GlobalWindows are different because they represent the default case
where the user does not even use windowing. I think that warrants treating
it differently, especially because cleanup simply cannot be ensured
by the watermark.


It would be possible to combine both approaches, but I'd rather not skip 
the cleanup timer for non-global windows because that could easily 
become the source of another leak. The more pressing issue here is the 
global window, not specific windowing.


-Max

On 26.08.20 10:15, Jan Lukavský wrote:
Window triggering is afaik operation that is specific to GBK. Stateful 
DoFns can have (as shown in the case of deduplication) timers set for 
the GC only, triggering has no effect there. And yes, if we have other 
timers than GC (any user timers), then we have to have GC timer (because 
timers are a form of state).


Imagine a (admittedly artificial) example of deduplication in fixed 
window of 10 years. It would exhibit exactly the same state growth as 
global window (and 10 years is "almost infinite", right? :)).


Jan

On 8/26/20 10:01 AM, Maximilian Michels wrote:
The inefficiency described happens if and only if the following two 
conditions are met:


 a) there are many timers per single window (as otherwise they will 
be negligible)


 b) there are many keys which actually contain no state (as otherwise 
the timer would be negligible wrt the state size) 


Each window has to have a timer set, it is unavoidable for the window 
computation to be triggered accordingly. This happens regardless of 
whether we have state associated with the key/window or not. The 
additional cleanup timer is just a side effect and not a concern in my 
opinion. Since window computation is per-key, there is no way around 
this. I don't think skipping the cleanup timer for non global windows 
without state is a good idea, just to save one cleanup timer, when 
there are already timers created for the window computation.


Now, the global window is different in that respect because we can't 
assume it is going to be triggered for unbounded streams. Thus, it 
makes sense to me to handle it differently by not using triggers but 
cleaning up once a watermark > MAX_TIMESTAMP has been processed.


-Max

On 26.08.20 09:20, Jan Lukavský wrote:

On 8/25/20 9:27 PM, Maximilian Michels wrote:

I agree that this probably solves the described issue in the most 
straightforward way, but special handling for global window feels 
weird, as there is really nothing special about global window wrt 
state cleanup. 


Why is special handling for the global window weird? After all, it 
is a special case because the global window normally will only be 
cleaned up when the application terminates.


The inefficiency described happens if and only if the following two 
conditions are met:


  a) there are many timers per single window (as otherwise they will 
be negligible)


  b) there are many keys which actually contain no state (as 
otherwise the timer would be negligible wrt the state size)


It only happens to be the case that global window is the (by far, 
might be 98% cases) most common case that satisfies these two 
conditions, but there are other cases as well (e.g. long lasting 
fixed window). Discussed options 2) and 3) are systematic in the 
sense that option 2) cancels property a) and option 3) property b). 
Making use of correlation of global window with these two conditions 
to solve the issue is of course possible, but a little unsystematic 
and that's what feels 'weird'. :)




It doesn't change anything wrt migration. The timers that were 
already set remain and keep on contributing to the state size.


That's ok, regular timers for non-global windows need to remain set 
and should be persisted. They will be redistributed when scaling up 
and down.


I'm not sure that's a "problem", rather an inefficiency. But we 
could address it by deleting the timers where they are currently 
set, as mentioned previously.


I had imagined that we don't even set these timers for the global 
window. Thus, there is no need to clean them up.


-Max

On 25.08.20 09:43, Jan Lukavský wrote:
I agree that this probably solves the described issue in the most 
straightforward way, but special handling for global window feels 
weird, as there is really nothing special about global window wrt 
state cleanup. A solution that handles all windows equally would be 
semantically 'cleaner'. If I try to sum up:


  - option 3) seems best, provided that isEmpty() lookup is cheap 
for every state backend (e.g. that we do not hit disk multiple 
times), this option is the best for state size wrt timers in all 
windows


  - option 2) works well for key-aligned windows, also reduces 
state size in all windows


  - option "watermark timer" - solves issue, easily 

Re: [External] Re: Memory Issue When Running Beam On Flink

2020-08-26 Thread Maximilian Michels

The inefficiency described happens if and only if the following two conditions 
are met:

 a) there are many timers per single window (as otherwise they will be 
negligible)

 b) there are many keys which actually contain no state (as otherwise the timer would be negligible wrt the state size) 


Each window has to have a timer set, it is unavoidable for the window 
computation to be triggered accordingly. This happens regardless of 
whether we have state associated with the key/window or not. The 
additional cleanup timer is just a side effect and not a concern in my 
opinion. Since window computation is per-key, there is no way around 
this. I don't think skipping the cleanup timer for non global windows 
without state is a good idea, just to save one cleanup timer, when there 
are already timers created for the window computation.


Now, the global window is different in that respect because we can't 
assume it is going to be triggered for unbounded streams. Thus, it makes 
sense to me to handle it differently by not using triggers but cleaning 
up once a watermark > MAX_TIMESTAMP has been processed.


-Max
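
A simplified sketch of that watermark-driven cleanup on the Flink side (namespace and state descriptor are hypothetical stand-ins for what Beam actually uses):

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeutils.base.StringSerializer;
import org.apache.flink.runtime.state.KeyedStateBackend;

// Simplified sketch: once the watermark passes the global window's maximum
// timestamp, no further firings are possible, so state for every key can be
// dropped without ever registering per-key cleanup timers.
class GlobalWindowCleanup {
  static <K> void onWatermark(
      long watermarkMillis,
      long globalWindowMaxMillis,
      KeyedStateBackend<K> backend,
      ValueStateDescriptor<byte[]> descriptor) throws Exception {
    if (watermarkMillis > globalWindowMaxMillis) {
      backend.applyToAllKeys(
          "global", // hypothetical namespace; Beam encodes the window here
          StringSerializer.INSTANCE,
          descriptor,
          (key, state) -> state.clear());
    }
  }
}
```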

On 26.08.20 09:20, Jan Lukavský wrote:

On 8/25/20 9:27 PM, Maximilian Michels wrote:

I agree that this probably solves the described issue in the most 
straightforward way, but special handling for global window feels 
weird, as there is really nothing special about global window wrt 
state cleanup. 


Why is special handling for the global window weird? After all, it is 
a special case because the global window normally will only be cleaned 
up when the application terminates.


The inefficiency described happens if and only if the following two 
conditions are met:


  a) there are many timers per single window (as otherwise they will be 
negligible)


  b) there are many keys which actually contain no state (as otherwise 
the timer would be negligible wrt the state size)


It only happens to be the case that global window is the (by far, might 
be 98% cases) most common case that satisfies these two conditions, but 
there are other cases as well (e.g. long lasting fixed window). 
Discussed options 2) and 3) are systematic in the sense that option 2) 
cancels property a) and option 3) property b). Making use of correlation 
of global window with these two conditions to solve the issue is of 
course possible, but a little unsystematic and that's what feels 
'weird'. :)




It doesn't change anything wrt migration. The timers that were 
already set remain and keep on contributing to the state size.


That's ok, regular timers for non-global windows need to remain set 
and should be persisted. They will be redistributed when scaling up 
and down.


I'm not sure that's a "problem", rather an inefficiency. But we could 
address it by deleting the timers where they are currently set, as 
mentioned previously.


I had imagined that we don't even set these timers for the global 
window. Thus, there is no need to clean them up.


-Max

On 25.08.20 09:43, Jan Lukavský wrote:
I agree that this probably solves the described issue in the most 
straightforward way, but special handling for global window feels 
weird, as there is really nothing special about global window wrt 
state cleanup. A solution that handles all windows equally would be 
semantically 'cleaner'. If I try to sum up:


  - option 3) seems best, provided that isEmpty() lookup is cheap for 
every state backend (e.g. that we do not hit disk multiple times), 
this option is the best for state size wrt timers in all windows


  - option 2) works well for key-aligned windows, also reduces state 
size in all windows


  - option "watermark timer" - solves issue, easily implemented, but 
doesn't improve situation for non-global windows


My conclusion would be - use watermark timer as hotfix, if we can 
prove that isEmpty() would be cheap, then use option 3) as final 
solution, otherwise use 2).


WDYT?

On 8/25/20 5:48 AM, Thomas Weise wrote:



On Mon, Aug 24, 2020 at 1:50 PM Maximilian Michels <m...@apache.org> wrote:


    I'd suggest a modified option (2) which does not use a timer to
    perform
    the cleanup (as mentioned, this will cause problems with migrating
    state).


That's a great idea. It's essentially a mix of 1) and 2) for the 
global window only.


It doesn't change anything wrt migration. The timers that 
were already set remain and keep on contributing to the state size.


I'm not sure that's a "problem", rather an inefficiency. But we 
could address it by deleting the timers where they are currently 
set, as mentioned previously.



    Instead, whenever we receive a watermark which closes the global
    window,
    we enumerate all keys and cleanup the associated state.

    This is the cleanest and simplest option.

    -Max

    On 24.08.20 20:47, Thomas Weise wrote:
    >
    > On Mon, Aug 24, 2020 at 11:35 AM Jan Lukavský mailto:je...@seznam.cz>
    > &l

Re: [External] Re: Memory Issue When Running Beam On Flink

2020-08-25 Thread Maximilian Michels
I agree that this probably solves the described issue in the most straightforward way, but special handling for global window feels weird, as there is really nothing special about global window wrt state cleanup. 


Why is special handling for the global window weird? After all, it is a 
special case because the global window normally will only be cleaned up 
when the application terminates.



It doesn't change anything wrt migration. The timers that were already set 
remain and keep on contributing to the state size.


That's ok, regular timers for non-global windows need to remain set and 
should be persisted. They will be redistributed when scaling up and down.



I'm not sure that's a "problem", rather an inefficiency. But we could address 
it by deleting the timers where they are currently set, as mentioned previously.


I had imagined that we don't even set these timers for the global 
window. Thus, there is no need to clean them up.


-Max

On 25.08.20 09:43, Jan Lukavský wrote:
I agree that this probably solves the described issue in the most 
straightforward way, but special handling for global window feels weird, 
as there is really nothing special about global window wrt state 
cleanup. A solution that handles all windows equally would be 
semantically 'cleaner'. If I try to sum up:


  - option 3) seems best, provided that isEmpty() lookup is cheap for 
every state backend (e.g. that we do not hit disk multiple times), this 
option is the best for state size wrt timers in all windows


  - option 2) works well for key-aligned windows, also reduces state 
size in all windows


  - option "watermark timer" - solves issue, easily implemented, but 
doesn't improve situation for non-global windows


My conclusion would be - use watermark timer as hotfix, if we can prove 
that isEmpty() would be cheap, then use option 3) as final solution, 
otherwise use 2).


WDYT?

On 8/25/20 5:48 AM, Thomas Weise wrote:



On Mon, Aug 24, 2020 at 1:50 PM Maximilian Michels <m...@apache.org> wrote:


I'd suggest a modified option (2) which does not use a timer to
perform
the cleanup (as mentioned, this will cause problems with migrating
state).


That's a great idea. It's essentially a mix of 1) and 2) for the 
global window only.


It doesn't change anything wrt migration. The timers that 
were already set remain and keep on contributing to the state size.


I'm not sure that's a "problem", rather an inefficiency. But we could 
address it by deleting the timers where they are currently set, as 
mentioned previously.



Instead, whenever we receive a watermark which closes the global
window,
we enumerate all keys and cleanup the associated state.

This is the cleanest and simplest option.

-Max

On 24.08.20 20:47, Thomas Weise wrote:
>
> On Mon, Aug 24, 2020 at 11:35 AM Jan Lukavský <je...@seznam.cz> wrote:
>
>      > The most general solution would be 3), given it can be
agnostic
>     to window types and does not assume extra runner capabilities.
>
>     Agree, 2) is optimization to that. It might be questionable
if this
>     is premature optimization, but generally querying multiple
states
>     for each clear opeartion to any state might be prohibitive,
mostly
>     when the state would be stored in external database (in case of
>     Flink that would be RocksDB).
>
> For the use case I'm looking at, we are using the heap state
backend. I
> have not checked the RocksDB, but would assume that incremental
cost of
> isEmpty() for other states under the same key is negligible?
>
>      > 3) wouldn't require any state migration.
>
>     Actually, it would, as we would (ideally) like to migrate users'
>     pipelines that already contain timers for the end of global
window,
>     which might not expire ever.
>
> Good catch. This could potentially be addressed by upgrading the
timer
> in the per record path.
>
>     On 8/24/20 7:44 PM, Thomas Weise wrote:
>>
>>     On Fri, Aug 21, 2020 at 12:32 AM Jan Lukavský
>>     <je...@seznam.cz> wrote:
>>
>>         If there are runners, that are unable to efficiently
enumerate
>>         keys in state, then there probably isn't a runner agnostic
>>         solution to this. If we focus on Flink, we can provide
>>         specific implementation of CleanupTimer, which might
then do
>>         anything from the mentioned options. I'd be +1 for
option 2)
>>         for key-aligned windows (all current

Re: [External] Re: Memory Issue When Running Beam On Flink

2020-08-24 Thread Maximilian Michels
We're allocating 11G for taskmanager JVM heap, but it eventually gets
filled up (after couple days) and the cluster ends up in a bad state.
Here's a screenshot of the heap size over the past 24h:
Screen Shot 2020-08-15 at 8.41.48 AM.png

Could it be that the timers never got cleared out, or maybe the pipeline
is creating more timer instances than expected?

On Sat, Aug 15, 2020 at 4:07 AM Maximilian Michels <m...@apache.org> wrote:

Awesome! Thanks a lot for the memory profile. Couple remarks:

a) I can see that there are about 378k keys and each of them sets a timer.
b) Based on the settings for DeduplicatePerKey you posted, you will keep
track of all keys of the last 30 minutes.

Unless you have

Re: [External] Re: Memory Issue When Running Beam On Flink

2020-08-15 Thread Maximilian Michels

Awesome! Thanks a lot for the memory profile. Couple remarks:

a) I can see that there are about 378k keys and each of them sets a timer.
b) Based on the settings for DeduplicatePerKey you posted, you will keep 
track of all keys of the last 30 minutes.


Unless you have much fewer keys, the behavior is to be expected. The 
memory sizes for the timer maps do not look particularly high (~12Mb).
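
For reference, here is a rough Java sketch of the state-and-timers pattern
that DeduplicatePerKey implements (your pipeline uses the Python version;
names here are assumed): one "seen" state cell plus one expiry timer per
key, which is why ~378k live keys translate into ~378k registered timers
for the 30-minute retention window.

import org.apache.beam.sdk.coders.BooleanCoder;
import org.apache.beam.sdk.state.StateSpec;
import org.apache.beam.sdk.state.StateSpecs;
import org.apache.beam.sdk.state.TimeDomain;
import org.apache.beam.sdk.state.Timer;
import org.apache.beam.sdk.state.TimerSpec;
import org.apache.beam.sdk.state.TimerSpecs;
import org.apache.beam.sdk.state.ValueState;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;
import org.joda.time.Duration;

class DeduplicateFn<K, V> extends DoFn<KV<K, V>, KV<K, V>> {

  @StateId("seen")
  private final StateSpec<ValueState<Boolean>> seenSpec =
      StateSpecs.value(BooleanCoder.of());

  @TimerId("expiry")
  private final TimerSpec expirySpec =
      TimerSpecs.timer(TimeDomain.PROCESSING_TIME);

  @ProcessElement
  public void processElement(
      ProcessContext context,
      @StateId("seen") ValueState<Boolean> seen,
      @TimerId("expiry") Timer expiry) {
    if (seen.read() == null) {
      seen.write(true);
      // Both the state cell and this timer live until the timer fires,
      // 30 minutes after the key was first observed.
      expiry.offset(Duration.standardMinutes(30)).setRelative();
      context.output(context.element());
    }
  }

  @OnTimer("expiry")
  public void onExpiry(@StateId("seen") ValueState<Boolean> seen) {
    seen.clear();
  }
}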


How much memory did you reserve for the task managers?*

-Max

*The image links give me a "504 error".

On 14.08.20 23:29, Catlyn Kong wrote:

Hi!

We're indeed using the rocksdb state backend, so that might be part of 
the reason. Due to some security concerns, we might not be able to 
provide the full heap dump since we have some custom code path. But 
here's a screenshot from JProfiler:

Screen Shot 2020-08-14 at 9.10.07 AM.png
Looks like TimerHeapInternalTimer (initiated in InternalTimerServiceImpl 
<https://github.com/apache/flink/blob/5125b1123dfcfff73b5070401dfccb162959080c/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/InternalTimerServiceImpl.java#L46>) 
isn't getting garbage collected? As David has mentioned the pipeline 
uses DeduplicatePerKey 
<https://beam.apache.org/releases/pydoc/2.22.0/_modules/apache_beam/transforms/deduplicate.html#DeduplicatePerKey> in 
Beam 2.22, ProcessConnectionEventFn is a simple stateless DoFn that just 
does some logging and emits the events. Is there any possibility that 
the timer logic or the way it's used in the dedupe Pardo can cause this 
leak?


Thanks,
Catlyn

On Tue, Aug 11, 2020 at 7:58 AM Maximilian Michels <m...@apache.org> wrote:


Hi!

Looks like a potential leak, caused by your code or by Beam itself.
Would you be able to supply a heap dump from one of the task managers?
That would greatly help debugging this issue.

-Max

On 07.08.20 00:19, David Gogokhiya wrote:
 > Hi,
 >
 > We recently started using Apache Beam version 2.20.0 running on
Flink
 > version 1.9 deployed on kubernetes to process unbounded streams
of data.
 > However, we noticed that the memory consumed by stateful Beam is
 > steadily increasing over time with no drops no matter what the
current
 > bandwidth is. We were wondering if this is expected and if not what
 > would be the best way to resolve it.
 >
 >
 >       More Context
 >
 > We have the following pipeline that consumes messages from the
unbounded
 > stream of data. Later we deduplicate the messages based on unique
 > message id using the deduplicate function
 >

<https://beam.apache.org/releases/pydoc/2.22.0/_modules/apache_beam/transforms/deduplicate.html#DeduplicatePerKey>.

 > Since we are using Beam version 2.20.0, we copied the source code
of the
 > deduplicate function
 >

<https://beam.apache.org/releases/pydoc/2.22.0/_modules/apache_beam/transforms/deduplicate.html#DeduplicatePerKey>from

 > version 2.22.0. After that we unmap the tuple, retrieve the
necessary
 > data from message payload and dump the corresponding data into
the log.
 >
 >
 > Pipeline:
 >
 >
 > Flink configuration:
 >
 >
 > As we mentioned before, we noticed that the memory usage of the
 > jobmanager and taskmanager pod are steadily increasing with no
drops no
 > matter what the current bandwidth is. We tried allocating more
memory
 > but it seems like no matter how much memory we allocate it
eventually
 > reaches its limit and then it tries to restart itself.
 >
 >
 > Sincerely, David
 >
 >



Re: Output timestamp for Python event timers

2020-08-12 Thread Maximilian Michels

Thanks for your suggestions!

It makes sense to complete the work on this feature by exposing it in 
the Python API. We can do this as a next step. (There might be questions 
on how to do that exactly)


For now, I'm concerned with getting the semantics right and unblocking 
users from stalling pipelines.


I wasn't aware that processing timers used the input timestamp as the 
timer output timestamp. I've updated the PR accordingly. Please take a 
look: https://github.com/apache/beam/pull/12531
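
To illustrate the semantics with a Java sketch (the Java SDK already
exposes the feature; this DoFn fragment is hypothetical): the timer
output timestamp, not the fire timestamp, is what holds the output
watermark.

import org.apache.beam.sdk.state.TimeDomain;
import org.apache.beam.sdk.state.Timer;
import org.apache.beam.sdk.state.TimerSpec;
import org.apache.beam.sdk.state.TimerSpecs;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;
import org.joda.time.Duration;
import org.joda.time.Instant;

class DelayedEmitFn extends DoFn<KV<String, String>, KV<String, String>> {

  @TimerId("emit")
  private final TimerSpec emitSpec = TimerSpecs.timer(TimeDomain.EVENT_TIME);

  @ProcessElement
  public void processElement(ProcessContext context, @TimerId("emit") Timer emit) {
    Instant fireAt = context.timestamp().plus(Duration.standardHours(1));
    // Defaulting the output timestamp to the fire timestamp (instead of
    // the element's input timestamp) only holds the watermark to fireAt,
    // so downstream transforms keep making progress.
    emit.withOutputTimestamp(fireAt).set(fireAt);
  }

  @OnTimer("emit")
  public void onEmit(OnTimerContext context) {
    // Fires (and can output) at fireAt.
  }
}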


-Max

On 12.08.20 05:03, Luke Cwik wrote:
+1 on what Boyuan said. It is important that the defaults for processing 
time domain differ from the defaults for the event time domain.


On Tue, Aug 11, 2020 at 12:36 PM Yichi Zhang <zyi...@google.com> wrote:


+1 to expose set_output_timestamp and enrich python set timer api.

On Tue, Aug 11, 2020 at 12:01 PM Boyuan Zhang <boyu...@google.com> wrote:

Hi Maximilian,

It makes sense to set hold_timestamp as fire_timestamp when the
fire_timestamp is in the event time domain. Otherwise, the
system may advance the watermark incorrectly.
I think we can do something similar to Java FnApiRunner[1]:

  * Expose set_output_timestamp API to python timer as well
  * If set_output_timestamp is not specified and timer is in
event domain, we can use fire_timestamp as hold_timestamp
  * Otherwise, use input_timestamp as hold_timestamp.

What do you think?

[1]

https://github.com/apache/beam/blob/edb42952f6b0aa99477f5c7baca6d6a0d93deb4f/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java#L1433-L1493




On Tue, Aug 11, 2020 at 9:00 AM Maximilian Michels <m...@apache.org> wrote:

We ran into problems setting event time timers per-element
in the Python
SDK. Pipeline progress would stall.

Turns out, although the Python SDK does not expose the timer
output
timestamp feature to the user, it sets the timer output
timestamp to the
current input timestamp of an element.

This will lead to holding back the watermark until the timer
fires (the
Flink Runner respects the timer output timestamp when
advancing the
output watermark). We had set the fire timestamp to a
timestamp so far
in the future, that pipeline progress would completely stall
for
downstream transforms, due to the held back watermark.

Considering that this feature is not even exposed to the
user in the
Python SDK, I think we should set the default output
timestamp to the
fire timestamp, and not to the input timestamp. This is also
how timer
work in the Java SDK.

Let me know what you think.

-Max

PR: https://github.com/apache/beam/pull/12531



Output timestamp for Python event timers

2020-08-11 Thread Maximilian Michels
We ran into problems setting event time timers per-element in the Python 
SDK. Pipeline progress would stall.


Turns out, although the Python SDK does not expose the timer output 
timestamp feature to the user, it sets the timer output timestamp to the 
current input timestamp of an element.


This will lead to holding back the watermark until the timer fires (the 
Flink Runner respects the timer output timestamp when advancing the 
output watermark). We had set the fire timestamp to a timestamp so far 
in the future, that pipeline progress would completely stall for 
downstream transforms, due to the held back watermark.


Considering that this feature is not even exposed to the user in the 
Python SDK, I think we should set the default output timestamp to the 
fire timestamp, and not to the input timestamp. This is also how timers
work in the Java SDK.


Let me know what you think.

-Max

PR: https://github.com/apache/beam/pull/12531


Re: Memory Issue When Running Beam On Flink

2020-08-11 Thread Maximilian Michels

Hi!

Looks like a potential leak, caused by your code or by Beam itself. 
Would you be able to supply a heap dump from one of the task managers? 
That would greatly help debugging this issue.


-Max

On 07.08.20 00:19, David Gogokhiya wrote:

Hi,

We recently started using Apache Beam version 2.20.0 running on Flink 
version 1.9 deployed on kubernetes to process unbounded streams of data. 
However, we noticed that the memory consumed by stateful Beam is 
steadily increasing over time with no drops no matter what the current 
bandwidth is. We were wondering if this is expected and if not what 
would be the best way to resolve it.



  More Context

We have the following pipeline that consumes messages from the unbounded 
stream of data. Later we deduplicate the messages based on a unique 
message id using the deduplicate function. 
Since we are using Beam version 2.20.0, we copied the source code of the 
deduplicate function from 
version 2.22.0. After that we unmap the tuple, retrieve the necessary 
data from the message payload and dump the corresponding data into the log.



Pipeline:


Flink configuration:


As we mentioned before, we noticed that the memory usage of the 
jobmanager and taskmanager pod are steadily increasing with no drops no 
matter what the current bandwidth is. We tried allocating more memory 
but it seems like no matter how much memory we allocate it eventually 
reaches its limit and then it tries to restart itself.



Sincerely, David




Re: Failing Python builds & AppEngine application

2020-08-07 Thread Maximilian Michels

Thanks for letting us know, Tyson!

I think in terms of data, everything is being reported to the new 
InfluxDb instance now. Historic data is missing though, which would be 
nice to have. Also, the graphs are still not 100% on par with the old 
application.


Cheers,
Max

On 06.08.20 16:34, Tyson Hamilton wrote:
It was me! I disabled the App on the premise that it only hosted the old 
perf graphs that were replaced with Grafana. Thanks for fixing the issue.


Is there anything else on the app, or is there more migration to Grafana 
required, or cleanup unfinished?



On Thu, Aug 6, 2020, 4:30 AM Damian Gadomski <damian.gadom...@polidea.com> wrote:


Hey,

A strange thing happened a few hours ago. All python builds (e.g.
[1]) started failing because of:

google.api_core.exceptions.NotFound: 404 The project
apache-beam-testing does not exist or it does not contain an active
Cloud Datastore or Cloud Firestore database. Please visit
http://console.cloud.google.com to create a project or
https://console.cloud.google.com/datastore/setup?project=apache-beam-testing
to add a Cloud Datastore or Cloud Firestore database. Note that
Cloud Datastore or Cloud Firestore always have an associated App
Engine app and this app must not be disabled.

I've checked that manually and the same error appeared while
accessing [2]. Seems that we are using Cloud Datastore and indeed
there was a default AppEngine application [3] that was disabled and
therefore Datastore was inactive. I've just enabled back this app
and the Datastore became active again. Hopefully, that will fix the
builds. Based on the app statistics it seems that someone disabled
it around Aug 5, 21:00 UTC.

I saw the discussion on the devlist recently about the performance
monitoring. The app [3] is also serving the metrics on [4].
    CC +Maximilian Michels <m...@apache.org>, +Kamil Wasilewski
<kamil.wasilew...@polidea.com> - as you were involved in the
discussion there regarding [4]. Perhaps you know something more
about this app or at least who may know? :)

[1] https://ci-beam.apache.org/job/beam_PostCommit_Python37/2681/console
[2]
https://console.cloud.google.com/datastore?project=apache-beam-testing
[3]

https://console.cloud.google.com/appengine?project=apache-beam-testing=default
[4] https://apache-beam-testing.appspot.com

Regards,
Damian



Re: Monitoring performance for releases

2020-08-06 Thread Maximilian Michels
Robert, this is not too far off what I'm proposing. We can always create 
JIRA issues for performance regressions and mark them with a Fix 
Version. Especially, the time of the release is a good time to 
re-evaluate whether some gross performance regressions can be detected. 
Of course, if it's a more gradual and less noticeable one, we might 
miss it.


I agree that good performance is a continuous effort, but many times 
there are actual problems which can be dealt with at the time of the 
release. Those might be very hard to fix a couple of releases down the 
line because they are layered by new problems and much harder to find.


-Max

On 03.08.20 22:17, Robert Bradshaw wrote:
I have to admit I still have some qualms about tying detecting and 
fixing performance regressions to the release process (which is 
onerous enough as it is). Instead, I think we'd be better off with a 
separate process to detect and triage performance issues, which, when 
they occur, may merit filing a blocker which will require fixing before 
the release just like any other blocker would. Hopefully this would 
result in issues being detected (and resolved) sooner.


That being said, if a release is known to have performance regressions, 
that should be called out when the RCs are cut, and if not resolved, 
probably as part of the release notes as well.


On Mon, Aug 3, 2020 at 9:40 AM Maximilian Michels <m...@apache.org> wrote:


Here a first version of the updated release guide:
https://github.com/apache/beam/pull/12455

Feel free to comment.

-Max

On 29.07.20 17:27, Maximilian Michels wrote:
 > Thanks! I'm following up with this PR to display the Flink Pardo
 > streaming data: https://github.com/apache/beam/pull/12408
 >
 > Streaming data appears to be missing for Dataflow. We can revise the
 > Jenkins jobs to add those.
 >
 > -Max
 >
 > On 29.07.20 17:01, Tyson Hamilton wrote:
 >> Max,
 >>
 >> The runner dimension are present when hovering over a particular
 >> graph. For some more info, the load test configurations can be
found
 >> here [1]. I didn't get a chance to look into them but there are
tests
 >> for all the runners there, possibly not for every loadtest.
 >>
 >> [1]: https://github.com/apache/beam/tree/master/.test-infra/jenkins
 >>
 >> -Tyson
 >>
 >> On Wed, Jul 29, 2020 at 3:46 AM Maximilian Michels <m...@apache.org> wrote:
 >>
 >>     Looks like the permissions won't be necessary because backup
data
 >> gets
     >>     loaded into the local InfluxDb instance which makes writing
queries
 >>     locally possible.
 >>
 >>     On 29.07.20 12:21, Maximilian Michels wrote:
 >>  > Thanks Michał!
 >>  >
 >>  > It is a bit tricky to verify the exported query works if
I don't
 >>     have
 >>  > access to the data stored in InfluxDb.
 >>  >
 >>  > ==> Could somebody give me permissions to max.mich...@gmail.com for
 >>  > apache-beam-testing such that I can setup a ssh
port-forwarding
 >>     from the
 >>  > InfluxDb pod to my machine? I do have access to see the
pods but
 >>     that is
 >>  > not enough.
 >>  >
 >>  >> I think that the only test data is from Python streaming
tests,
 >>     which
 >>  >> are not implemented right now (check out
 >>  >>
 >>
 >>

http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python

 >>
 >>
 >>  >> )
 >>  >
 >>  > Additionally, there is an entire dimension missing:
Runners. I'm
 >>  > assuming this data is for Dataflow?
 >>  >
 >>  > -Max
 >>  >
 >>  > On 29.07.20 11:55, Michał Walenia wrote:
 >>  >> Hi there,
 >>  >>
 >>  >>  > Indeed the Python load test data appears to be missing:
 >>  >>  >
 >>  >>
 >>
 >>

http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python

 >>
 >>
 >>  >>
 >>  >>

Re: Use Coder message for cross-lang ExternalConfigurationPayload?

2020-08-05 Thread Maximilian Michels

+1

The format to store coders is not set in stone, it was a first version 
to make external configuration work. Using the Coder message would be 
better.
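
To make the difference concrete, a small Java sketch (helper name and
schema variable are assumptions on my part) of a row coder expressed as
a full Coder message, which a bare URN list cannot represent because the
schema travels in the FunctionSpec payload:

import org.apache.beam.model.pipeline.v1.RunnerApi;
import org.apache.beam.model.pipeline.v1.SchemaApi;

class RowCoderProtoExample {
  // "schema" is the SchemaApi.Schema proto describing the row's fields,
  // assumed to be built elsewhere.
  static RunnerApi.Coder buildRowCoderProto(SchemaApi.Schema schema) {
    return RunnerApi.Coder.newBuilder()
        .setSpec(
            RunnerApi.FunctionSpec.newBuilder()
                .setUrn("beam:coder:row:v1")
                .setPayload(schema.toByteString()))
        .build();
  }
}

A plain list of coder URNs has nowhere to put that payload.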


As for using Schema to store the configuration, could somebody fill me 
in how that would work?


-Max

On 04.08.20 02:01, Brian Hulette wrote:
I've opened BEAM-10571 [1] for this, and I'm most of the way to an 
implementation now. Aiming to have it done before the 2.24.0 cut since 
it will be the last release with python 2 support.


[1] https://issues.apache.org/jira/browse/BEAM-10571

On Wed, Jul 15, 2020 at 9:03 AM Chamikara Jayalath wrote:




On Fri, Jul 10, 2020 at 4:47 PM Robert Bradshaw <rober...@google.com> wrote:

On Fri, Jul 10, 2020 at 4:36 PM Brian Hulette <bhule...@google.com> wrote:
 >
 > Ah yes I'm +1 for that approach too - it would let us
leverage all the schema-inference already in the Java SDK for
translating configuration objects which would be great.
 > Things on the Python side would be trickier as schemas don't
formally support all the types you can use in the PayloadBuilder
implementations [1] yet, just NamedTuple. For now we could just
make the PayloadBuilder implementations generate Rows without
making that translation available for use in PCollections.


This will be a good opportunity to add some sort of a minimal Python
type to Beam schema mapping :)


Yes, though eventually it might be nice to support all of these
various types as schema'd PCollection elements as well.

 > Do we need to worry about update compatibility for
ExternalConfigurationPayload?

Technically, each URN defines its own payload, and the fact that we've
settled on ExternalConfigurationPayload is a convention. On a
practical note, we haven't declared these protos stable yet. (I
would
like to do so before we drop support for Python 2, as external
transforms are a possible escape hatch and the first strong
motivation
to have external transforms that span Beam versions).


+1


 > [1]

https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/external.py
 >
 > On Fri, Jul 10, 2020 at 4:23 PM Robert Bradshaw
mailto:rober...@google.com>> wrote:
 >>
 >> I would be in favor of just using a schema to store the entire
 >> configuration. The reason we went with what we have to day
is that we
 >> didn't have cross language schemas yet.
 >>
 >> On Fri, Jul 10, 2020 at 12:24 PM Brian Hulette
mailto:bhule...@google.com>> wrote:
 >> >
 >> > Hi everyone,
 >> > I noticed that currently the ExternalConfigurationPayload
uses a list of coder URNs to represent the coder that was used
to serialize each configuration field [1]. This seems acceptable
at first blush, but there's one notable issue: it has no place
to store a payload for the coder. Most standard coders don't use
a payload so it's not a problem, but row coder does use a
payload to store its schema, which means it can't be used in an
ExternalConfigurationPayload today.
 >> >
 >> > Is there a reason not to just use the Coder message [2] in
ExternalConfigurationPayload instead of a list of coder URNs?
That would work with row coder, and it would also make it easier
to re-use logic for translating Pipeline protos.
 >> >
 >> > I'd be happy to make this change, but I wanted to ask on
dev@ in case there's something I'm missing here.
 >> >
 >> > Brian
 >> >
 >> > [1]

https://github.com/apache/beam/blob/c54a0b7f49f2eb4a15df115205e2fa455116ccbe/model/pipeline/src/main/proto/external_transforms.proto#L34-L35
 >> > [2]

https://github.com/apache/beam/blob/c54a0b7f49f2eb4a15df115205e2fa455116ccbe/model/pipeline/src/main/proto/beam_runner_api.proto#L542-L555



Re: Monitoring performance for releases

2020-08-03 Thread Maximilian Michels
Here a first version of the updated release guide: 
https://github.com/apache/beam/pull/12455


Feel free to comment.

-Max

On 29.07.20 17:27, Maximilian Michels wrote:
Thanks! I'm following up with this PR to display the Flink Pardo 
streaming data: https://github.com/apache/beam/pull/12408


Streaming data appears to be missing for Dataflow. We can revise the 
Jenkins jobs to add those.


-Max

On 29.07.20 17:01, Tyson Hamilton wrote:

Max,

The runner dimension is present when hovering over a particular 
graph. For some more info, the load test configurations can be found 
here [1]. I didn't get a chance to look into them but there are tests 
for all the runners there, possibly not for every loadtest.


[1]: https://github.com/apache/beam/tree/master/.test-infra/jenkins

-Tyson

On Wed, Jul 29, 2020 at 3:46 AM Maximilian Michels <m...@apache.org> wrote:


    Looks like the permissions won't be necessary because backup data 
gets

    loaded into the local InfluxDb instance which makes writing queries
    locally possible.

    On 29.07.20 12:21, Maximilian Michels wrote:
 > Thanks Michał!
 >
 > It is a bit tricky to verify the exported query works if I don't
    have
 > access to the data stored in InfluxDb.
 >
 > ==> Could somebody give me permissions to max.mich...@gmail.com
    <mailto:max.mich...@gmail.com> for
 > apache-beam-testing such that I can setup a ssh port-forwarding
    from the
 > InfluxDb pod to my machine? I do have access to see the pods but
    that is
 > not enough.
 >
 >> I think that the only test data is from Python streaming tests,
    which
 >> are not implemented right now (check out
 >>

http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python 



 >> )
 >
 > Additionally, there is an entire dimension missing: Runners. I'm
 > assuming this data is for Dataflow?
 >
 > -Max
 >
 > On 29.07.20 11:55, Michał Walenia wrote:
 >> Hi there,
 >>
 >>  > Indeed the Python load test data appears to be missing:
 >>  >
 >>

http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python 



 >>
 >>
 >> I think that the only test data is from Python streaming tests,
    which
 >> are not implemented right now (check out
 >>

http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python 



 >> )
 >>
 >> As for updating the dashboards, the manual for doing this is 
here:

 >>

https://cwiki.apache.org/confluence/display/BEAM/Community+Metrics#CommunityMetrics-UpdatingDashboards 



 >>
 >>
 >> I hope this helps,
 >>
 >> Michal
 >>
 >> On Mon, Jul 27, 2020 at 4:31 PM Maximilian Michels
    mailto:m...@apache.org>
 >> <mailto:m...@apache.org <mailto:m...@apache.org>>> wrote:
 >>
 >>     Indeed the Python load test data appears to be missing:
 >>
 >>

http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python 



 >>
 >>
 >>     How do we typically modify the dashboards?
 >>
 >>     It looks like we need to edit this json file:
 >>
 >>

https://github.com/apache/beam/blob/8d460db620d2ff1257b0e092218294df15b409a1/.test-infra/metrics/grafana/dashboards/perftests_metrics/ParDo_Load_Tests.json#L81 



 >>
 >>
 >>     I found some documentation on the deployment:
 >>
 >>

https://cwiki.apache.org/confluence/display/BEAM/Test+Results+Monitoring

 >>
 >>
 >>     +1 for alerting or weekly emails including performance
    numbers for
 >>     fixed
 >>     intervals (1d, 1w, 1m, previous release).
 >>
 >>     +1 for linking the dashboards in the release guide to allow
    for a
 >>     comparison as part of the release process.
 >>
 >>     As a first step, consolidating all the data seems like the 
most

 >>     pressing
 >>     problem to solve.
 >>
 >>     @Kamil I could need some advice regarding how to proceed
    updating the
 >>     dashboards.
 >>
 >>     -Max
 >>
 >>     On 22.07.20 20:20, Robert Bradshaw wrote:
 >>  > On Tue, Jul 21, 2020 at 9:58 AM Thomas Weise
    mailto:t...@apache.org>
 >>     <mailto:t...@apache.org <mailto:t...@apache.org>>
 >>  > <mailto:t...@apache.org <mailto:t...@apache.org>
    <mailto:t...@apache.org <mailto:t

Re: Making reporting bugs/feature request easier

2020-07-30 Thread Maximilian Michels
+1 for making it easier to report bugs. Very often we have users report a 
bug and then they have to rely on a developer to report the issue 
because they don't want to bother creating a JIRA account (which is 
understandable). If you have a JIRA account it is easy because you don't 
need any additional permissions to create a ticket.


We would need to tag such externally created issues and triage them from 
time to time. That requires a regular review of some sort. We just have 
to put something in place that will remind us, e.g. a weekly email.


Generally, I'm against using more than one bug tracker. There is nothing 
inherently wrong with JIRA and we have our processes mapped there. 
However, if Github Issues would lower the barrier significantly, I 
suppose we could use it as an Inbox, but JIRA should remain the source 
of truth.


-Max

On 30.07.20 04:12, Kenneth Knowles wrote:



On Wed, Jul 29, 2020 at 11:08 AM Robert Bradshaw wrote:


+1 to a simple link that fills in most of the fields in JIRA, though
this doesn't solve the issue of having to sign up just to submit a
bug report. Just using the users list isn't a bad idea either--we
could easily create a script that ensures all threads that have a
message like "we should file a JIRA for this" are followed up with a
message like "JIRA filed at ...". (That doesn't mean it won't
languish on the tracker.)

I think it's worth seriously considering just using Github's issue
tracker, since that's where our users are. Is there anything
we actually use in JIRA that'd be missing?



Pretty big question. Just noting to start that Apache projects certainly 
can and do use GitHub issues. Here is a quick inventory of things that 
are used in a meaningful way:


  - Priorities (with GitHub Issues I think you roll your own with labels)
  - Issue types (with GitHub Issues I think you roll your own with labels)
  - Distinct "Triage Needed" state (also labels; anything lacking the 
"triaged" label)
  - Distinguishing "Open" and "In Progress" (also labels? can use 
Projects/Milestones - I forget exactly which - to keep a kanban-ish status)
  - Our new automation: "stale-assigned" and subsequent unassign; 
"stale-P2" and subsequent downgrade

  - Fix Version for associating fixes with releases
  - Affect Version, while not used much, is still helpful to have
  - Components, since our repo is really a mini mono repo. Again, labels.
  - Kanban boards (milestones/projects maybe kinda)
  - Reports (not really same level, but maybe OK?)

Fairly recently I worked on a project that tried to use GitHub Issues 
and Projects and Milestones and whatnot and it was OK but not great. 
Jira's complexity is largely essential / not really complex but just 
visually busy. The two are not really even comparable offerings. There 
may be third party integrations that add some of what you'd want.


Kenn


On Wed, Jul 29, 2020 at 10:30 AM Kenneth Knowles <k...@apache.org> wrote:

Very good points. We want the barrier to filing a bug report to
be as low as possible. Jira adds complexity of a sign up process
and a complex interface, and also users don't know what the
fields mean (issue type, priority, component, tag, fix version,
etc). So it currently really doesn't make sense to just point
users at Jira and expect them to figure it out.

As for using user@beam for bug reports:

  - In practice, we have to edit most of the fields and improve
the bug description anyhow, so it might be no extra work for an
experienced contributor to file the bug based on the user's email.
  - Also in practice, we already do this. So it is just a matter
of pointing users that way.

One downside is that there is not really tracking of resolution
of an email thread, so unless it gets filed as a Jira it may sit
unresolved.

Another option we could consider: I think we could have a
"report a bug / feature request" link that fills in most fields
and gives the user a simplified view (like GitHub issue view
where it is just title & body). It could end up that these Jiras
get ignored just as easily as a user@beam thread.

You can always have a link like that and it could point to
whatever the current choice is, like
"mailto:u...@beam.apache.org "
though I think mailto links are out of fashion these days.

Kenn

On Fri, Jul 24, 2020 at 2:50 PM Griselda Cuevas <g...@apache.org> wrote:

Hi folks,

I recently made a few Jira boards [1][2] to help triage and
organize Beam's backlog.


Something that came to my mind is that reporting bugs and
feature requests using Jira might be imposing a high barrier
for users who don't have an 

Re: Program and registration for Beam Digital Summit

2020-07-29 Thread Maximilian Michels
Thanks Pedro! Great to see the program! This is going to be an exciting 
event.


Forwarding to the dev mailing list, in case people didn't see this here.

-Max

On 29.07.20 20:25, Pedro Galvan wrote:

Hello!

Just a quick message to let everybody know that we have published the 
program for the Beam Digital Summit. It is available at 
https://2020.beamsummit.org/program


With more than 30 talks and workshops covering all the scope from 
introductory sessions to advanced scenarios and use cases, we hope that 
everybody will find useful content at Beam Digital Summit.


Beam Digital Summit will broadcast through the Crowdcast platform. It is 
a free event but you need to register. Please visit 
https://www.crowdcast.io/e/beamsummit 
 to register.



--
*Pedro Galvan*
Beam Digital Summit Team


Re: Monitoring performance for releases

2020-07-29 Thread Maximilian Michels
Thanks! I'm following up with this PR to display the Flink Pardo 
streaming data: https://github.com/apache/beam/pull/12408


Streaming data appears to be missing for Dataflow. We can revise the 
Jenkins jobs to add those.


-Max

On 29.07.20 17:01, Tyson Hamilton wrote:

Max,

The runner dimension is present when hovering over a particular graph. 
For some more info, the load test configurations can be found here [1]. 
I didn't get a chance to look into them but there are tests for all the 
runners there, possibly not for every loadtest.


[1]: https://github.com/apache/beam/tree/master/.test-infra/jenkins

-Tyson

On Wed, Jul 29, 2020 at 3:46 AM Maximilian Michels <m...@apache.org> wrote:


Looks like the permissions won't be necessary because backup data gets
loaded into the local InfluxDb instance which makes writing queries
locally possible.

On 29.07.20 12:21, Maximilian Michels wrote:
 > Thanks Michał!
 >
 > It is a bit tricky to verify the exported query works if I don't
have
 > access to the data stored in InfluxDb.
 >
 > ==> Could somebody give me permissions to max.mich...@gmail.com for
 > apache-beam-testing such that I can setup a ssh port-forwarding
from the
 > InfluxDb pod to my machine? I do have access to see the pods but
that is
 > not enough.
 >
 >> I think that the only test data is from Python streaming tests,
which
 >> are not implemented right now (check out
 >>

http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python

 >> )
 >
 > Additionally, there is an entire dimension missing: Runners. I'm
 > assuming this data is for Dataflow?
 >
 > -Max
 >
 > On 29.07.20 11:55, Michał Walenia wrote:
 >> Hi there,
 >>
 >>  > Indeed the Python load test data appears to be missing:
 >>  >
 >>

http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python

 >>
 >>
 >> I think that the only test data is from Python streaming tests,
which
 >> are not implemented right now (check out
 >>

http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python

 >> )
 >>
 >> As for updating the dashboards, the manual for doing this is here:
 >>

https://cwiki.apache.org/confluence/display/BEAM/Community+Metrics#CommunityMetrics-UpdatingDashboards

 >>
 >>
 >> I hope this helps,
 >>
 >> Michal
 >>
 >> On Mon, Jul 27, 2020 at 4:31 PM Maximilian Michels
mailto:m...@apache.org>
 >> <mailto:m...@apache.org <mailto:m...@apache.org>>> wrote:
 >>
 >>     Indeed the Python load test data appears to be missing:
 >>
 >>

http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python

 >>
 >>
 >>     How do we typically modify the dashboards?
 >>
 >>     It looks like we need to edit this json file:
 >>
 >>

https://github.com/apache/beam/blob/8d460db620d2ff1257b0e092218294df15b409a1/.test-infra/metrics/grafana/dashboards/perftests_metrics/ParDo_Load_Tests.json#L81

 >>
 >>
 >>     I found some documentation on the deployment:
 >>
 >>
https://cwiki.apache.org/confluence/display/BEAM/Test+Results+Monitoring
 >>
 >>
 >>     +1 for alerting or weekly emails including performance
numbers for
 >>     fixed
 >>     intervals (1d, 1w, 1m, previous release).
 >>
 >>     +1 for linking the dashboards in the release guide to allow
for a
 >>     comparison as part of the release process.
 >>
 >>     As a first step, consolidating all the data seems like the most
 >>     pressing
 >>     problem to solve.
 >>
 >>     @Kamil I could need some advice regarding how to proceed
updating the
 >>     dashboards.
 >>
 >>     -Max
 >>
 >>     On 22.07.20 20:20, Robert Bradshaw wrote:
 >>  > On Tue, Jul 21, 2020 at 9:58 AM Thomas Weise
mailto:t...@apache.org>
 >>     <mailto:t...@apache.org <mailto:t...@apache.org>>
 >>  > <mailto:t...@apache.org <mailto:t...@apache.org>
<mailto:t...@apache.org <mailto:t...@apache.org>>>> wrote:
 >>  >
 >>  >     It appears that there is coverage missing in the Grafana
 >>     dashboards
 >>    

Re: Monitoring performance for releases

2020-07-29 Thread Maximilian Michels
Looks like the permissions won't be necessary because backup data gets 
loaded into the local InfluxDb instance which makes writing queries 
locally possible.


On 29.07.20 12:21, Maximilian Michels wrote:

Thanks Michał!

It is a bit tricky to verify the exported query works if I don't have 
access to the data stored in InfluxDb.


==> Could somebody give me permissions to max.mich...@gmail.com for 
apache-beam-testing such that I can setup a ssh port-forwarding from the 
InfluxDb pod to my machine? I do have access to see the pods but that is 
not enough.


I think that the only test data is from Python streaming tests, which 
are not implemented right now (check out 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python 
)


Additionally, there is an entire dimension missing: Runners. I'm 
assuming this data is for Dataflow?


-Max

On 29.07.20 11:55, Michał Walenia wrote:

Hi there,

 > Indeed the Python load test data appears to be missing:
 > 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python 



I think that the only test data is from Python streaming tests, which 
are not implemented right now (check out 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python 
)


As for updating the dashboards, the manual for doing this is here: 
https://cwiki.apache.org/confluence/display/BEAM/Community+Metrics#CommunityMetrics-UpdatingDashboards 



I hope this helps,

Michal

On Mon, Jul 27, 2020 at 4:31 PM Maximilian Michels <m...@apache.org> wrote:


    Indeed the Python load test data appears to be missing:

http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python 



    How do we typically modify the dashboards?

    It looks like we need to edit this json file:

https://github.com/apache/beam/blob/8d460db620d2ff1257b0e092218294df15b409a1/.test-infra/metrics/grafana/dashboards/perftests_metrics/ParDo_Load_Tests.json#L81 



    I found some documentation on the deployment:

https://cwiki.apache.org/confluence/display/BEAM/Test+Results+Monitoring



    +1 for alerting or weekly emails including performance numbers for
    fixed
    intervals (1d, 1w, 1m, previous release).

    +1 for linking the dashboards in the release guide to allow for a
    comparison as part of the release process.

    As a first step, consolidating all the data seems like the most
    pressing
    problem to solve.

    @Kamil I could need some advice regarding how to proceed updating the
    dashboards.

    -Max

    On 22.07.20 20:20, Robert Bradshaw wrote:
 > On Tue, Jul 21, 2020 at 9:58 AM Thomas Weise mailto:t...@apache.org>
 > <mailto:t...@apache.org <mailto:t...@apache.org>>> wrote:
 >
 >     It appears that there is coverage missing in the Grafana
    dashboards
 >     (it could also be that I just don't find it).
 >
 >     For example:
 >

https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056 


 >
 >     The GBK and ParDo tests have a selection for {batch,
    streaming} and
 >     SDK. No coverage for streaming and python? There is also no
    runner
 >     option currently.
 >
 >     We have seen repeated regressions with streaming, Python,
    Flink. The
 >     test has been contributed. It would be great if the results
    can be
 >     covered as part of release verification.
 >
 >
 > Even better would be if we can use these dashboards (plus
    alerting or
 > similar?) to find issues before release verification. It's much
    easier
 > to fix things earlier.
 >
 >
 >     Thomas
 >
 >
 >
 >     On Tue, Jul 21, 2020 at 7:55 AM Kamil Wasilewski
 >     mailto:kamil.wasilew...@polidea.com>
    <mailto:kamil.wasilew...@polidea.com
    <mailto:kamil.wasilew...@polidea.com>>>
 >     wrote:
 >
 >             The prerequisite is that we have all the stats in one
    place.
 >             They seem
 >             to be scattered across 
http://metrics.beam.apache.org and

 > https://apache-beam-testing.appspot.com.
 >
 >             Would it be possible to consolidate the two, i.e. 
use the

 >             Grafana-based
 >             dashboard to load the legacy stats?
 >
 >
 >         I'm pretty sure that all dashboards have been moved to
 > http://metrics.beam.apache.org. Let me know if I missed
 >         something during the migration.
 >
 >         I think we should turn off
 > https://apache-beam-testing.appspot.com in the near future. New
 >         Grafana-based dashboards have been working seamlessly for
    some
 >         time now and there's no point in ma

Re: Monitoring performance for releases

2020-07-29 Thread Maximilian Michels

Thanks Michał!

It is a bit tricky to verify the exported query works if I don't have 
access to the data stored in InfluxDb.


==> Could somebody give me permissions to max.mich...@gmail.com for 
apache-beam-testing such that I can set up an ssh port-forwarding from the 
InfluxDb pod to my machine? I do have access to see the pods but that is 
not enough.



I think that the only test data is from Python streaming tests, which are not 
implemented right now (check out 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python
 )


Additionally, there is an entire dimension missing: Runners. I'm 
assuming this data is for Dataflow?


-Max

On 29.07.20 11:55, Michał Walenia wrote:

Hi there,

 > Indeed the Python load test data appears to be missing:
 > 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python


I think that the only test data is from Python streaming tests, which 
are not implemented right now (check out 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python 
)


As for updating the dashboards, the manual for doing this is here: 
https://cwiki.apache.org/confluence/display/BEAM/Community+Metrics#CommunityMetrics-UpdatingDashboards


I hope this helps,

Michal

On Mon, Jul 27, 2020 at 4:31 PM Maximilian Michels <m...@apache.org> wrote:


Indeed the Python load test data appears to be missing:

http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python

How do we typically modify the dashboards?

It looks like we need to edit this json file:

https://github.com/apache/beam/blob/8d460db620d2ff1257b0e092218294df15b409a1/.test-infra/metrics/grafana/dashboards/perftests_metrics/ParDo_Load_Tests.json#L81

I found some documentation on the deployment:
https://cwiki.apache.org/confluence/display/BEAM/Test+Results+Monitoring


+1 for alerting or weekly emails including performance numbers for
fixed
intervals (1d, 1w, 1m, previous release).

+1 for linking the dashboards in the release guide to allow for a
comparison as part of the release process.

As a first step, consolidating all the data seems like the most
pressing
problem to solve.

@Kamil I could need some advice regarding how to proceed updating the
dashboards.

-Max

On 22.07.20 20:20, Robert Bradshaw wrote:
 > On Tue, Jul 21, 2020 at 9:58 AM Thomas Weise mailto:t...@apache.org>
 > <mailto:t...@apache.org <mailto:t...@apache.org>>> wrote:
 >
 >     It appears that there is coverage missing in the Grafana
dashboards
 >     (it could also be that I just don't find it).
 >
 >     For example:
 >
https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
 >
 >     The GBK and ParDo tests have a selection for {batch,
streaming} and
 >     SDK. No coverage for streaming and python? There is also no
runner
 >     option currently.
 >
 >     We have seen repeated regressions with streaming, Python,
Flink. The
 >     test has been contributed. It would be great if the results
can be
 >     covered as part of release verification.
 >
 >
 > Even better would be if we can use these dashboards (plus
alerting or
 > similar?) to find issues before release verification. It's much
easier
 > to fix things earlier.
 >
 >
 >     Thomas
 >
 >
 >
 >     On Tue, Jul 21, 2020 at 7:55 AM Kamil Wasilewski
 >     mailto:kamil.wasilew...@polidea.com>
<mailto:kamil.wasilew...@polidea.com
<mailto:kamil.wasilew...@polidea.com>>>
 >     wrote:
 >
 >             The prerequisite is that we have all the stats in one
place.
 >             They seem
 >             to be scattered across http://metrics.beam.apache.org and
 > https://apache-beam-testing.appspot.com.
 >
 >             Would it be possible to consolidate the two, i.e. use the
 >             Grafana-based
 >             dashboard to load the legacy stats?
 >
 >
 >         I'm pretty sure that all dashboards have been moved to
 > http://metrics.beam.apache.org. Let me know if I missed
 >         something during the migration.
 >
 >         I think we should turn off
 > https://apache-beam-testing.appspot.com in the near future. New
 >         Grafana-based dashboards have been working seamlessly for
some
 >         time now and there's no point in maintaining the older
solution.
 >         We'd also avoid ambiguity in where the stats should be
looked for.
 >
 >         Kamil
 >
 >         On Tue, Jul 21, 

Re: Use concrete instances of ExternalTransformBuilder in ExternalTransformRegistrar?

2020-07-28 Thread Maximilian Michels

Replacing

  Class<? extends ExternalTransformBuilder>

with

  ExternalTransformBuilder

sounds reasonable to me. Looks like an oversight that we introduced the 
unnecessary class indirection.
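
A backwards-compatible sketch of the change (the method name
knownBuilderInstances and the default implementation are my assumption,
not settled API):

import java.util.HashMap;
import java.util.Map;

public interface ExternalTransformRegistrar {

  /** The existing method, kept for compatibility. */
  Map<String, Class<? extends ExternalTransformBuilder<?, ?, ?>>> knownBuilders();

  /** New method returning concrete builder instances; the default
   * instantiates the classes from the old method via reflection. */
  default Map<String, ExternalTransformBuilder<?, ?, ?>> knownBuilderInstances() {
    Map<String, ExternalTransformBuilder<?, ?, ?>> instances = new HashMap<>();
    knownBuilders()
        .forEach(
            (urn, clazz) -> {
              try {
                instances.put(urn, clazz.getDeclaredConstructor().newInstance());
              } catch (ReflectiveOperationException e) {
                throw new RuntimeException(e);
              }
            });
    return instances;
  }
}

SchemaIOBuilder could then wrap a SchemaIOProvider instance and be
registered once per provider discovered via ServiceLoader.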


-Max

On 27.07.20 20:45, Chamikara Jayalath wrote:
Brian's suggestion makes sense to me. I don't know of a specific reason 
why we chose the Class type in the registrar instead of 
instance types. +Maximilian Michels <m...@apache.org> +Robert 
Bradshaw <rober...@google.com> may have more context.


Thanks,
Cham

On Mon, Jul 27, 2020 at 10:48 AM Kenneth Knowles <k...@apache.org> wrote:




On Mon, Jul 27, 2020 at 10:47 AM Kenneth Knowles <k...@apache.org> wrote:

On Sun, Jul 26, 2020 at 8:50 PM Kenneth Knowles <k...@apache.org> wrote:

Rawtypes are a legacy compatibility feature that breaks type
checking (and further analyses)


Noting for the benefit of the thread that this is not
hypothetical. Fixing the rawtypes in this API surfaced
nullability issues according to spotbugs.


Additionally notable that Spotbugs operates on post-compile
bytecode, not source.

Kenn


Kenn



and harms readability. They should be forbidden in new code.
Class literals for generic types are quite inconvenient for
this, especially when placed in a heterogeneous map using
wildcard parameters [1].

So making either the change Brian proposes or something
similar is desirable, to avoid forcing inconvenience on
users of the API, and to just simplify and clarify it.

Kenn

[1]

https://github.com/apache/beam/pull/12376/files#diff-2fa38a7f8d24217f1f7bde0f5c7dbb40R495

Kenn

On Fri, Jul 24, 2020 at 11:04 AM Brian Hulette
mailto:bhule...@google.com>> wrote:

Hi all,
I've been working with +Scott Lukas
<slu...@google.com> on using the new schema io
interfaces [1] in cross-language. This means adding a
general-purpose ExternalTransformRegistrar [2,3] that
will register all SchemaIOProvider implementations via
ServiceLoader.

We've run into an issue though -
ExternalTransformRegistrar is supposed to return a
`Map<String, Class<? extends ExternalTransformBuilder>>`. This makes it very
challenging (impossible?) for us to create a
general-purpose ExternalTransformBuilder that defers to
SchemaIOProvider. Ideally we would instead return a
Map<String, ExternalTransformBuilder> (i.e. a concrete
instance rather than a class object), so that we could
just register different instances of a class like:

class SchemaIOBuilder extends ExternalTransformBuilder<ConfigT, InputT, OutputT> {
   private SchemaIOProvider provider;
   PTransform<InputT, OutputT> buildExternal(ConfigT configuration) {
     // Use provider to create PTransform
   }
}

I think it would be possible to change the
ExternalTransformRegistrar interface so it has a single
method, Map<String, ExternalTransformBuilder>
knownBuilders(). It could even be done in a
backwards-compatible way if we keep the old method and
provide a default implementation of the new method that
builds instances.

However, I'm curious if there's some strong reason for
using Class as the
return type for knownBuilders that I'm missing. Does
anyone know why we chose that?

Thanks,
Brian

[1] https://s.apache.org/beam-schema-io
[2]

https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ExternalTransformRegistrar.java
[3]

https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ExternalTransformBuilder.java



Re: Monitoring performance for releases

2020-07-27 Thread Maximilian Michels
Indeed the Python load test data appears to be missing: 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python


How do we typically modify the dashboards?

It looks like we need to edit this json file: 
https://github.com/apache/beam/blob/8d460db620d2ff1257b0e092218294df15b409a1/.test-infra/metrics/grafana/dashboards/perftests_metrics/ParDo_Load_Tests.json#L81


I found some documentation on the deployment: 
https://cwiki.apache.org/confluence/display/BEAM/Test+Results+Monitoring



+1 for alerting or weekly emails including performance numbers for fixed 
intervals (1d, 1w, 1m, previous release).


+1 for linking the dashboards in the release guide to allow for a 
comparison as part of the release process.


As a first step, consolidating all the data seems like the most pressing 
problem to solve.


@Kamil I could need some advice regarding how to proceed updating the 
dashboards.


-Max

On 22.07.20 20:20, Robert Bradshaw wrote:
On Tue, Jul 21, 2020 at 9:58 AM Thomas Weise <t...@apache.org> wrote:


It appears that there is coverage missing in the Grafana dashboards
(it could also be that I just don't find it).

For example:
https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056

The GBK and ParDo tests have a selection for {batch, streaming} and
SDK. No coverage for streaming and python? There is also no runner
option currently.

We have seen repeated regressions with streaming, Python, Flink. The
test has been contributed. It would be great if the results can be
covered as part of release verification.


Even better would be if we can use these dashboards (plus alerting or 
similar?) to find issues before release verification. It's much easier 
to fix things earlier.



Thomas



On Tue, Jul 21, 2020 at 7:55 AM Kamil Wasilewski
mailto:kamil.wasilew...@polidea.com>>
wrote:

The prerequisite is that we have all the stats in one place.
They seem
to be scattered across http://metrics.beam.apache.org and
https://apache-beam-testing.appspot.com.

Would it be possible to consolidate the two, i.e. use the
Grafana-based
dashboard to load the legacy stats?


I'm pretty sure that all dashboards have been moved to
http://metrics.beam.apache.org. Let me know if I missed
something during the migration.

I think we should turn off
https://apache-beam-testing.appspot.com in the near future. New
Grafana-based dashboards have been working seamlessly for some
time now and there's no point in maintaining the older solution.
We'd also avoid ambiguity in where the stats should be looked for.

Kamil

On Tue, Jul 21, 2020 at 4:17 PM Maximilian Michels <m...@apache.org> wrote:

 > It doesn't support https. I had to add an exception to
the HTTPS Everywhere extension for "metrics.beam.apache.org
<http://metrics.beam.apache.org>".

*facepalm* Thanks Udi! It would always hang on me because I
use HTTPS
Everywhere.

 > To be explicit, I am supporting the idea of reviewing the
release guide but not changing the release process for the
already in-progress release.

I consider the release guide immutable for the process of a
release.
Thus, a change to the release guide can only affect new
upcoming
releases, not an in-process release.

 > +1 and I think we can also evaluate whether flaky tests
should be reviewed as release blockers or not. Some flaky
tests would be hiding real issues our users could face.

Flaky tests are also worth to take into account when
releasing, but a
little harder to find because may just happen to pass during
building
the release. It is possible though if we strictly capture
flaky tests
via JIRA and mark them with the Fix Version for the release.

 > We keep accumulating dashboards and
 > tests that few people care about, so it is probably worth
that we use
 > them or get a way to alert us of regressions during the
release cycle
 > to catch this even before the RCs.

+1 The release guide should be explicit about which
performance test
results to evaluate.

The prerequisite is that we have all the stats in one place.
They seem
to be scattered across http://metrics.beam.apache.org and
https://apache-beam-testing.appspot.com.

Would it be possible to consolidate the two, i.e. use the
Gr

Re: [VOTE] Make Apache Beam 2.24.0 the final release supporting Python 2.

2020-07-24 Thread Maximilian Michels

+1

On 24.07.20 18:54, Pablo Estrada wrote:

+1 - thank you Valentyn!
-P.

On Thu, Jul 23, 2020 at 1:29 PM Chamikara Jayalath wrote:


+1

On Thu, Jul 23, 2020 at 1:15 PM Brian Hulette <bhule...@google.com> wrote:

+1

On Thu, Jul 23, 2020 at 1:05 PM Robert Bradshaw <rober...@google.com> wrote:

[X] +1: Remove Python 2 support in Apache Beam 2.25.0.

According to our six-week release cadence, 2.24.0 (the last
release to support Python 2) will be cut mid-August, and the
first release not supporting Python 2 would be expected to
land sometime in October. This seems a reasonable timeline
to me.


On Thu, Jul 23, 2020 at 12:53 PM Valentyn Tymofieiev <valen...@google.com> wrote:

Hi everyone,

Please vote whether to make Apache Beam 2.24.0 the final
release supporting Python 2 as follows.

[ ] +1: Remove Python 2 support in Apache Beam 2.25.0.
[ ] -1: Continue to support Python 2 in Apache Beam, and
reconsider at a later date.

The Beam community has pledged to sunset Python 2
support at some point in 2020[1,2]. A recent
discussion[3] on dev@  proposes to outline a specific
version after which Beam developers no longer have to
maintain Py2 support, which is a motivation for this vote.

If this vote is approved we will announce Apache Beam
2.24.0 as our final release to support Python 2 and
discontinue Python 2 support starting from 2.25.0
(inclusive).

This is a procedural vote [4] that will follow the
majority approval rules and will be open for at least 72
hours.

Thanks,
Valentyn

[1]

https://lists.apache.org/thread.html/634f7346b607e779622d0437ed0eca783f474dea8976adf41556845b%40%3Cdev.beam.apache.org%3E
[2] https://python3statement.org/
[3]

https://lists.apache.org/thread.html/r0d5c309a7e3107854f4892ccfeb1a17c0cec25dfce188678ab8df072%40%3Cdev.beam.apache.org%3E
[4] https://www.apache.org/foundation/voting.html




Re: [BROKEN] Please add "Fix Version" when resolving or closing Jiras

2020-07-23 Thread Maximilian Michels

I can close/resolve but it will show "Resolution: Unresolved" (screenshot omitted).

On 23.07.20 02:30, Brian Hulette wrote:
Is setting the Resolution broken as well? I realized I've been closing 
jiras with Resolution "Unresolved" and I can't actually change it to 
"Fixed".


On Tue, Jul 21, 2020 at 7:19 AM Maximilian Michels <m...@apache.org> wrote:


Also, a friendly reminder to always close the JIRA issue after
merging a
fix. It's easy to forget.

On 20.07.20 21:04, Kenneth Knowles wrote:
> Hi all,
>
> In working on our Jira automation, I've messed up our Jira
workflow. It
> will no longer prompt you to fill in "Fix Version" when you
resolve or
> close an issue. I will be working with infra to restore this. In
the
> meantime, please try to remember to add a Fix Version to each
issue that
> you close, so that we get automated detailed release notes.
>
> Kenn



Re: Errorprone plugin fails for release branches <2.22.0

2020-07-22 Thread Maximilian Michels
On the SparkRunner page, we advise users to download Beam sources and build JobService. So I think it would be better just to add a note there about this issue with old branches. 


Why is that? Don't we publish the Spark job server jar?

-Max

On 21.07.20 18:20, Alexey Romanenko wrote:
On the SparkRunner page, we advise users to download Beam sources and 
build JobService. So I think it would be better just to add a note there 
about this issue with old branches.


On 20 Jul 2020, at 22:29, Kenneth Knowles <k...@apache.org> wrote:


I think it is fine to fix it in branches. I do not see too much value 
in fixing it except in branches you know you are going to use.


The "Downloads" page is for users and only mentioned the voted source 
releases, maven central, and pypi. There is nothing to do with GitHub 
or ongoing branches there. I don't think un-published cherrypicks to 
branches matter to users. Did you mean some other place?


Kenn

On Mon, Jul 20, 2020 at 9:44 AM Alexey Romanenko <aromanenko@gmail.com> wrote:


Then, would it be ok to fix it in branches (the question is how
many branches we should fix?) with additional commit and mention
that on “Downloads" page?


On 8 Jul 2020, at 21:24, Kenneth Knowles <k...@apache.org> wrote:



On Wed, Jul 8, 2020 at 12:07 PM Kyle Weaver <kcwea...@google.com> wrote:

> To fix on previous release branches, we would need to make
a new release, is it not? Since hashes would change..

Would it be alright to patch the release branches on Github
and leave the released source as-is? Github release branches
themselves aren't release artifacts, so I think it should be
okay to patch them without making a new release.


Yea. There are tags for the exact hashes that RCs were built
from. The release branch is fine to get new commits, and then if
anyone wants to build a patch release they will get those commits.

Kenn

On Wed, Jul 8, 2020 at 11:59 AM Pablo Estrada <pabl...@google.com> wrote:

Ah that's annoying that a dependency would be removed
from maven. I thought that was not meant to happen? This
must be an issue happening for many other projects...
Why is errorprone a dependency anyway?

To fix on previous release branches, we would need to
make a new release, is it not? Since hashes would change..

On Wed, Jul 8, 2020 at 10:21 AM Alexey Romanenko <aromanenko@gmail.com> wrote:

Hi Max,

I’m +1 for back porting as well but that seems quite
complicated since we distribute release source code
from https://archive.apache.org/
Perhaps, we should just warn users about this issue
and how to workaround it.

Any other ideas?

> On 8 Jul 2020, at 11:46, Maximilian Michels <m...@apache.org> wrote:
>
> Hi Alexey,
>
> I also came across this issue when building a
custom Beam version. I applied the same fix
(https://github.com/apache/beam/pull/11527) which you
have mentioned.
>
> It appears that the Maven dependencies changed or
are no longer available which causes the missing
class files.
>
> +1 for backporting the fix to the release branches.
>
> Cheers,
> Max
>
> On 08.07.20 11:36, Alexey Romanenko wrote:
>> Hello,
>> Some days ago I noticed that I can’t build the
project from old release branches . For example, I
wanted to build and run Spark Job Server from
“release-2.20.0” branch and it failed:
>> ./gradlew :runners:spark:job-server:runShadow
—stacktrace
>> * Exception is:
>> org.gradle.api.tasks.TaskExecutionException:
Execution failed for task ':model:pipeline:compileJava’.
>> …
>> Caused by: org.gradle.internal.UncheckedException:
java.lang.ClassNotFoundException:
com.google.errorprone.ErrorProneCompiler$Builder
>> …
>> I experienced the same issue for “release-2.19.0”
and  “release-2.21.0” branches, I didn’t check older
branches but seems it’s a global issue for
“net.ltgt.gradle:gradle-errorprone-plugin:0.0.13".
>> 

Re: Monitoring performance for releases

2020-07-21 Thread Maximilian Michels

It doesn't support https. I had to add an exception to the HTTPS Everywhere extension for 
"metrics.beam.apache.org".


*facepalm* Thanks Udi! It would always hang on me because I use HTTPS 
Everywhere.



To be explicit, I am supporting the idea of reviewing the release guide but not 
changing the release process for the already in-progress release.


I consider the release guide immutable for the process of a release. 
Thus, a change to the release guide can only affect new upcoming 
releases, not an in-process release.



+1 and I think we can also evaluate whether flaky tests should be reviewed as 
release blockers or not. Some flaky tests would be hiding real issues our users 
could face.


Flaky tests are also worth taking into account when releasing, but a
little harder to find because they may just happen to pass while building
the release. It is possible though if we strictly capture flaky tests
via JIRA and mark them with the Fix Version for the release.



We keep accumulating dashboards and
tests that few people care about, so it is probably worthwhile that we use
them or get a way to alert us of regressions during the release cycle
to catch these even before the RCs.


+1 The release guide should be explicit about which performance test 
results to evaluate.


The prerequisite is that we have all the stats in one place. They seem 
to be scattered across http://metrics.beam.apache.org and 
https://apache-beam-testing.appspot.com.


Would it be possible to consolidate the two, i.e. use the Grafana-based 
dashboard to load the legacy stats?


For the evaluation during the release process, I suggest using a 
standardized set of performance tests for all runners, e.g.:


- Nexmark
- ParDo (Classic/Portable)
- GroupByKey
- IO


-Max

On 21.07.20 01:23, Ahmet Altay wrote:


On Mon, Jul 20, 2020 at 3:07 PM Ismaël Mejía <mailto:ieme...@gmail.com>> wrote:


+1

This is not in the release guide and we should probably re evaluate if
this should be a release blocking reason.
Of course exceptionally a performance regression could be motivated by
a correctness fix or a worth refactor, so we should consider this.


+1 and I think we can also evaluate whether flaky tests should be 
reviewed as release blockers or not. Some flaky tests would be hiding 
real issues our users could face.


To be explicit, I am supporting the idea of reviewing the release guide 
but not changing the release process for the already in-progress release.



We have been tracking and fixing performance regressions multiple
times, found simply by checking the Nexmark tests, including on the
ongoing 2.23.0 release, so the value is there. Nexmark does not yet cover
Python and portable runners, so we are probably still missing many
issues, and it is worth working on this. In any case we should probably
decide what validations matter. We keep accumulating dashboards and
tests that few people care about, so it is probably worthwhile that we use
them or get a way to alert us of regressions during the release cycle
to catch these even before the RCs.


I agree. And if we cannot use dashboards/tests in a meaningful way, IMO 
we can remove them. There is not much value in maintaining them if they do 
not provide important signals.



On Fri, Jul 10, 2020 at 9:30 PM Udi Meiri mailto:eh...@google.com>> wrote:
 >
 > On Thu, Jul 9, 2020 at 12:48 PM Maximilian Michels
mailto:m...@apache.org>> wrote:
 >>
 >> Not yet, I just learned about the migration to a new frontend,
including
 >> a new backend (InfluxDB instead of BigQuery).
 >>
 >> >  - Are the metrics available on metrics.beam.apache.org
<http://metrics.beam.apache.org>?
 >>
 >> Is http://metrics.beam.apache.org online? I was never able to
access it.
 >
 >
 > It doesn't support https. I had to add an exception to the HTTPS
Everywhere extension for "metrics.beam.apache.org
<http://metrics.beam.apache.org>".
 >
 >>
 >>
 >> >  - What is the feature delta between usinig
metrics.beam.apache.org <http://metrics.beam.apache.org> (much
better UI) and using apache-beam-testing.appspot.com
<http://apache-beam-testing.appspot.com>?
 >>
 >> AFAIK it is an ongoing migration and the delta appears to be high.
 >>
 >> >  - Can we notice regressions faster than release cadence?
 >>
 >> Absolutely! A report with the latest numbers including
statistics about
 >> the growth of metrics would be useful.
 >>
 >> >  - Can we get automated alerts?
 >>
 >> I think we could set up a Jenkins job to do this.
 >>
 >> -Max
 >>
 >> On 09.07.20 20:26, Kenneth Kno

Re: [BROKEN] Please add "Fix Version" when resolving or closing Jiras

2020-07-21 Thread Maximilian Michels
Also, a friendly reminder to always close the JIRA issue after merging a 
fix. It's easy to forget.


On 20.07.20 21:04, Kenneth Knowles wrote:

Hi all,

In working on our Jira automation, I've messed up our Jira workflow. It 
will no longer prompt you to fill in "Fix Version" when you resolve or 
close an issue. I will be working with infra to restore this. In the 
meantime, please try to remember to add a Fix Version to each issue that 
you close, so that we get automated detailed release notes.


Kenn


Re: Errorprone plugin fails for release branches <2.22.0

2020-07-21 Thread Maximilian Michels

We can't change releases, as they are voted and signed.

+1 for updating the branches. That will give people the option to build 
recent Beam versions.


Note that it is not uncommon for source files to stop compiling. It's 
often the case when build tools or the underlying platform change. 
Although unfortunate, it is not in our interest to maintain old Beam 
versions, but since fixing the mentioned issue is easy, it makes sense 
in this particular instance.


-Max

On 20.07.20 22:29, Kenneth Knowles wrote:
I think it is fine to fix it in branches. I do not see too much value in 
fixing it except in branches you know you are going to use.


The "Downloads" page is for users and only mentioned the voted source 
releases, maven central, and pypi. There is nothing to do with GitHub or 
ongoing branches there. I don't think un-published cherrypicks to 
branches matter to users. Did you mean some other place?


Kenn

On Mon, Jul 20, 2020 at 9:44 AM Alexey Romanenko 
mailto:aromanenko@gmail.com>> wrote:


Then, would it be ok to fix it in branches (the question is how many
branches we should fix?) with an additional commit and mention that on
the “Downloads" page?


On 8 Jul 2020, at 21:24, Kenneth Knowles mailto:k...@apache.org>> wrote:



On Wed, Jul 8, 2020 at 12:07 PM Kyle Weaver mailto:kcwea...@google.com>> wrote:

> To fix on previous release branches, we would need to make a
new release, is it not? Since hashes would change..

Would it be alright to patch the release branches on Github
and leave the released source as-is? Github release branches
themselves aren't release artifacts, so I think it should be
okay to patch them without making a new release.


Yea. There are tags for the exact hashes that RCs were built from.
The release branch is fine to get new commits, and then if anyone
wants to build a patch release they will get those commits.

Kenn

On Wed, Jul 8, 2020 at 11:59 AM Pablo Estrada
mailto:pabl...@google.com>> wrote:

Ah that's annoying that a dependency would be removed from
maven. I thought that was not meant to happen? This must
be an issue happening for many other projects...
Why is errorprone a dependency anyway?

To fix on previous release branches, we would need to make
a new release, is it not? Since hashes would change..

On Wed, Jul 8, 2020 at 10:21 AM Alexey Romanenko
mailto:aromanenko@gmail.com>> wrote:

Hi Max,

I’m +1 for backporting as well but that seems quite
complicated since we distribute release source code
from https://archive.apache.org/
Perhaps, we should just warn users about this issue
and how to work around it.

Any other ideas?

        > On 8 Jul 2020, at 11:46, Maximilian Michels
mailto:m...@apache.org>> wrote:
>
> Hi Alexey,
>
> I also came across this issue when building a custom
Beam version. I applied the same fix
(https://github.com/apache/beam/pull/11527) which you
have mentioned.
>
> It appears that the Maven dependencies changed or
are no longer available which causes the missing class
files.
>
> +1 for backporting the fix to the release branches.
>
> Cheers,
> Max
>
> On 08.07.20 11:36, Alexey Romanenko wrote:
>> Hello,
>> Some days ago I noticed that I can’t build the
project from old release branches . For example, I
wanted to build and run Spark Job Server from
“release-2.20.0” branch and it failed:
>> ./gradlew :runners:spark:job-server:runShadow
—stacktrace
>> * Exception is:
>> org.gradle.api.tasks.TaskExecutionException:
Execution failed for task ':model:pipeline:compileJava’.
>> …
>> Caused by: org.gradle.internal.UncheckedException:
java.lang.ClassNotFoundException:
com.google.errorprone.ErrorProneCompiler$Builder
>> …
>> I experienced the same issue for “release-2.19.0”
and  “release-2.21.0” branches, I didn’t check older
branches but seems it’s a global issue for
“net.ltgt.gradle:gradle-errorprone-plugin:0.0.13".
>> This is already known issu

Re: No space left on device - beam-jenkins 1 and 7

2020-07-20 Thread Maximilian Michels
+1 for scheduling it via a cron job if it won't lead to test failures 
while running. Not a Jenkins expert but maybe there is the notion of 
running exclusively while no other tasks are running?


-Max

On 17.07.20 21:49, Tyson Hamilton wrote:

FYI there was a job introduced to do this in Jenkins: beam_Clean_tmp_directory

Currently it needs to be run manually. I'm seeing some out of disk related 
errors in precommit tests currently, perhaps we should schedule this job with 
cron?


On 2020/03/11 19:31:13, Heejong Lee  wrote:

Still seeing no space left on device errors on jenkins-7 (for example:
https://builds.apache.org/job/beam_PreCommit_PythonLint_Commit/2754/)


On Fri, Mar 6, 2020 at 7:11 PM Alan Myrvold  wrote:


Did a one time cleanup of tmp files owned by jenkins older than 3 days.
Agree that we need a longer term solution.

Passing recent tests on all executors except jenkins-12, which has not
scheduled recent builds for the past 13 days. Not scheduling:
https://builds.apache.org/computer/apache-beam-jenkins-12/builds

Recent passing builds:
https://builds.apache.org/computer/apache-beam-jenkins-1/builds

https://builds.apache.org/computer/apache-beam-jenkins-2/builds

https://builds.apache.org/computer/apache-beam-jenkins-3/builds

https://builds.apache.org/computer/apache-beam-jenkins-4/builds

https://builds.apache.org/computer/apache-beam-jenkins-5/builds

https://builds.apache.org/computer/apache-beam-jenkins-6/builds

https://builds.apache.org/computer/apache-beam-jenkins-7/builds

https://builds.apache.org/computer/apache-beam-jenkins-8/builds

https://builds.apache.org/computer/apache-beam-jenkins-9/builds

https://builds.apache.org/computer/apache-beam-jenkins-10/builds

https://builds.apache.org/computer/apache-beam-jenkins-11/builds

https://builds.apache.org/computer/apache-beam-jenkins-13/builds

https://builds.apache.org/computer/apache-beam-jenkins-14/builds

https://builds.apache.org/computer/apache-beam-jenkins-15/builds

https://builds.apache.org/computer/apache-beam-jenkins-16/builds


On Fri, Mar 6, 2020 at 11:54 AM Ahmet Altay  wrote:


+Alan Myrvold  is doing a one time cleanup. I agree
that we need to have a solution to automate this task or address the root
cause of the buildup.

On Thu, Mar 5, 2020 at 2:47 AM Michał Walenia 
wrote:


Hi there,
it seems we have a problem with Jenkins workers again. Nodes 1 and 7
both fail jobs with "No space left on device".
Who is the best person to contact in these cases (someone with access
permissions to the workers).

I also noticed that such errors are becoming more and more frequent
recently and I'd like to discuss how this can be remedied. Can a cleanup
task be automated on Jenkins somehow?

Regards
Michal

--

Michał Walenia
Polidea  | Software Engineer

M: +48 791 432 002 <+48791432002>
E: michal.wale...@polidea.com

Unique Tech
Check out our projects! 







Re: [VOTE] Release 2.23.0, release candidate #1

2020-07-20 Thread Maximilian Michels
@Valentyn: Thank you for your transparency in the release process and 
for considering pending cherry-pick requests. No blockers from my side.


-Max

On 18.07.20 01:11, Ahmet Altay wrote:
Thank you Valentyn. Being a release manager is difficult. It requires 
balancing between stability, following the process, regressions, 
timelines. Thank you for following the process, thank you for asking the 
right questions, thank you for doing the release.



On Fri, Jul 17, 2020 at 3:59 PM Robert Bradshaw > wrote:


Thank you, Valentyn!

On Fri, Jul 17, 2020 at 3:25 PM Chamikara Jayalath
mailto:chamik...@google.com>> wrote:
 >
 >
 >
 > On Fri, Jul 17, 2020 at 3:01 PM Valentyn Tymofieiev
mailto:valen...@google.com>> wrote:
 >>
 >> As a general rule, fixes pertaining to new functionality are not
a good candidate for a cherry-pick.
 >>
 >> A case for an exception can be made for polishing features
related to major wide announcements with a hard deadline, which
appears to be the case for xlang on Dataflow.
 >>
 >> I will prepare an RC2 with xlang fixes and consider other
low-risk additions from issues that were brought to my attention.
 >
 >
 > Thanks Valentyn.
 >
 >>
 >>
 >> Thanks
 >>
 >>
 >> On Fri, Jul 17, 2020 at 10:36 AM Chamikara Jayalath
mailto:chamik...@google.com>> wrote:
 >>>
 >>>
 >>>
 >>> On Fri, Jul 17, 2020 at 10:01 AM Robert Bradshaw
mailto:rober...@google.com>> wrote:
 
  Taking a step back, the goal of avoiding cherry-picks is to reduce
  risk and increase the velocity of our releases, as otherwise the
  release manager gets inundated by a never ending list of features
  people want to get in that puts the releases further and further
  behind (increasing the desire to get features in in a vicious
cycle).
  On the flip side, the reason we have a release process with
candidates
  and voting (as opposed to just declaring a commit id every N
weeks to
  be "the release") is to give us the flexibility to achieve a
level of
  quality and polish that may not ever occur in HEAD itself.
 
  With regards to this specific cross-langauge fix, the
motivation is
  that those working on it at Google want to widely publish this
feature
  as newly available on Dataflow. The question to answer here
(Cham) is
  whether this bug is debilitating enough that were it not to be
in the
  release we would want to hold off advertising this (and related)
  features until the next release. (In my understanding, it
would result
  in a poor enough user experience that it is.)
 >>>
 >>>
 >>> Yes, I think we will have to either hold off on widely
publishing the feature or list this as a potential issue that will
be fixed in the next release for anybody who tries cross-language
pipelines and runs into this.
 >>> Note that we are getting in a Python Kafka example [1]. So
users will potentially try this out anyways.
 >>>
 >>> [1] https://github.com/apache/beam/pull/12188
 >>>
 >>>
 
 
  On the other hand, there's the question of the cost of getting
this
  fix into the release. The change is simple and well contained,
so I
  think the risk is low (and, in particular, the cost to include
it here
  is low enough that it's worth the value provided above).
 
  Looking at the other proposals,
  https://github.com/apache/beam/pull/12196 also seems to meet
this bar
  (there are possible xlang correctness issues at play here), as
does
  https://github.com/apache/beam/pull/12175 (mostly due to its
  simplicity and the fact that doing it later would be a backwards
  compatible change). I'm on the fence about
  https://github.com/apache/beam/pull/12171 (if an RC2 is in the
works
  anyway), and IMHO the others are less compelling as having to
be done
  now.
 >>>
 >>>
 >>> +1
 >>>
 
 
  (On the question of a point release, IMHO anything worth
considering
  for an x.y.1 release definitely meets the bar for inclusion
into an RC
  of an ongoing release.)
 
  - Robert
 
 
  On Thu, Jul 16, 2020 at 8:00 PM Chamikara Jayalath
mailto:chamik...@google.com>> wrote:
  >
  >
  >
  > On Thu, Jul 16, 2020 at 7:46 PM Chamikara Jayalath
mailto:chamik...@google.com>> wrote:
  >>
  >>
  >>
  >> On Thu, Jul 16, 2020 at 7:28 PM Valentyn Tymofieiev
mailto:valen...@google.com>> wrote:
  >>>
  >>>
  >>>
  >>> On Thu, Jul 16, 

Re:

2020-07-10 Thread Maximilian Michels

Welcome Emily! Looking forward to your questions.

Cheers,
Max

On 08.07.20 20:07, Emily Ye wrote:

Greetings, dev@beam! Just wanted to introduce myself - I'm a SWE at Google who 
will be contributing to Beam going forward. I'm pretty new to the data 
processing space but I'm excited to learn, and will probably be asking lots of 
questions here. Looking forward to getting to know the community!

- Emily

  



Re: Monitoring performance for releases

2020-07-09 Thread Maximilian Michels
Not yet, I just learned about the migration to a new frontend, including 
a new backend (InfluxDB instead of BigQuery).



 - Are the metrics available on metrics.beam.apache.org?


Is http://metrics.beam.apache.org online? I was never able to access it.


 - What is the feature delta between usinig metrics.beam.apache.org (much 
better UI) and using apache-beam-testing.appspot.com?


AFAIK it is an ongoing migration and the delta appears to be high.


 - Can we notice regressions faster than release cadence?


Absolutely! A report with the latest numbers including statistics about 
the growth of metrics would be useful.



 - Can we get automated alerts?


I think we could set up a Jenkins job to do this.

-Max

On 09.07.20 20:26, Kenneth Knowles wrote:

Questions:

  - Are the metrics available on metrics.beam.apache.org 
<http://metrics.beam.apache.org>?
  - What is the feature delta between usinig metrics.beam.apache.org 
<http://metrics.beam.apache.org> (much better UI) and using 
apache-beam-testing.appspot.com <http://apache-beam-testing.appspot.com>?

  - Can we notice regressions faster than release cadence?
  - Can we get automated alerts?

Kenn

On Thu, Jul 9, 2020 at 10:21 AM Maximilian Michels <mailto:m...@apache.org>> wrote:


Hi,

We recently saw an increase in latency migrating from Beam 2.18.0 to
2.21.0 (Python SDK with Flink Runner). This proved very hard to debug,
and it looks like each version in between the two led to
increased latency.

This is not the first time we saw issues when migrating, another
time we
had a decline in checkpointing performance and thus added a
checkpointing test [1] and dashboard [2] (see checkpointing widget).

That makes me wonder if we should monitor performance (throughput /
latency) for basic use cases as part of the release testing. Currently,
our release guide [3] mentions running examples but not evaluating the
performance. I think it would be good practice to check relevant charts
with performance measurements as part of of the release process. The
release guide should reflect that.

WDYT?

-Max

PS: Of course, this requires tests and metrics to be available. This PR
adds latency measurements to the load tests [4].


[1] https://github.com/apache/beam/pull/11558
[2]
https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
[3] https://beam.apache.org/contribute/release-guide/
[4] https://github.com/apache/beam/pull/12065



Monitoring performance for releases

2020-07-09 Thread Maximilian Michels

Hi,

We recently saw an increase in latency migrating from Beam 2.18.0 to 
2.21.0 (Python SDK with Flink Runner). This proved very hard to debug, 
and it looks like each version in between the two led to 
increased latency.


This is not the first time we saw issues when migrating, another time we 
had a decline in checkpointing performance and thus added a 
checkpointing test [1] and dashboard [2] (see checkpointing widget).


That makes me wonder if we should monitor performance (throughput / 
latency) for basic use cases as part of the release testing. Currently, 
our release guide [3] mentions running examples but not evaluating the 
performance. I think it would be good practice to check relevant charts 
with performance measurements as part of the release process. The 
release guide should reflect that.


WDYT?

-Max

PS: Of course, this requires tests and metrics to be available. This PR 
adds latency measurements to the load tests [4].



[1] https://github.com/apache/beam/pull/11558
[2] 
https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056

[3] https://beam.apache.org/contribute/release-guide/
[4] https://github.com/apache/beam/pull/12065


Re: RequiresStableInput on Spark runner

2020-07-08 Thread Maximilian Michels
Correct, for batch we rely on re-running the entire job which will 
produce stable input within each run.


For streaming, the Flink Runner buffers all input to a 
@RequiresStableInput DoFn until a checkpoint is complete; only then does 
it process the buffered data. Dataflow effectively does the same by going 
through the Shuffle service, which produces a consistent result.
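
For batch, where runners lack native support, the workaround discussed
further down in this thread is to place a Reshuffle in front of the
side-effecting DoFn so that its input gets materialized first. A minimal
Python sketch, assuming a hypothetical non-idempotent
WriteToExternalSystem DoFn (whether Reshuffle actually yields stable
input still depends on the runner):

  import apache_beam as beam

  class WriteToExternalSystem(beam.DoFn):
    # Hypothetical DoFn with side effects that must not observe
    # replayed, non-deterministic input on retries.
    def process(self, element):
      yield element

  with beam.Pipeline() as p:
    (p
     | beam.Create([1, 2, 3])
     | beam.Reshuffle()  # materializes input before the side effect
     | beam.ParDo(WriteToExternalSystem()))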


-Max

On 08.07.20 11:08, Jozef Vilcek wrote:

My last question was more towards the graph translation for batch mode.

Should DoFn with @RequiresStableInput be translated/expanded in some 
specific way (e.g. DoFn -> Reshuffle + DoFn) or is it not needed for batch?
Most runners fail in the presence of @RequiresStableInput for both batch 
and streaming. I cannot find a failure for Flink and Dataflow, but at the 
same time, I cannot find what those runners do with such a DoFn.


On Tue, Jul 7, 2020 at 9:18 PM Kenneth Knowles > wrote:


I hope someone who knows better than me can respond.

A long time ago, the SparkRunner added a call to materialize() at
every GroupByKey. This was to mimic Dataflow, since so many of the
initial IO transforms relied on using shuffle to create stable inputs.

The overall goal is to be able to remove these extra calls to
materialize() and only include them when @RequiresStableInput.

The intermediate state is to analyze whether input is already stable
from materialize() and add another materialize() only if it is not
stable.

I don't know the current state of the SparkRunner. This may already
have changed.

Kenn

On Thu, Jul 2, 2020 at 10:24 PM Jozef Vilcek mailto:jozo.vil...@gmail.com>> wrote:

I was trying to look for references on how other runners handle
@RequiresStableInput for batch cases, however I was not able to
find any.
In Flink I can see added support for streaming case and in
Dataflow I see that support for the feature was turned off
https://github.com/apache/beam/pull/8065

It seems to me that @RequiresStableInput is ignored for the
batch case and the runner relies on being able to recompute the
whole job in the worst case scenario.
Is this assumption correct?
Could I just change SparkRunner to crash on @RequiresStableInput
annotation for streaming mode and ignore it in batch?



On Wed, Jul 1, 2020 at 10:27 AM Jozef Vilcek
mailto:jozo.vil...@gmail.com>> wrote:

We have a component which we use in streaming and batch
jobs. Streaming we run on FlinkRunner and batch on
SparkRunner. Recently we needed to add @RequiresStableInput
to taht component because of streaming use-case. But now
batch case crash on SparkRunner with

Caused by: java.lang.UnsupportedOperationException: Spark runner 
currently doesn't support @RequiresStableInput annotation.
at 
org.apache.beam.runners.core.construction.UnsupportedOverrideFactory.getReplacementTransform(UnsupportedOverrideFactory.java:58)
at 
org.apache.beam.sdk.Pipeline.applyReplacement(Pipeline.java:556)
at org.apache.beam.sdk.Pipeline.replace(Pipeline.java:292)
at org.apache.beam.sdk.Pipeline.replaceAll(Pipeline.java:210)
at 
org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:168)
at 
org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:90)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:315)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:301)
at 
com.sizmek.dp.dsp.pipeline.driver.PipelineDriver$$anonfun$1.apply(PipelineDriver.scala:42)
at 
com.sizmek.dp.dsp.pipeline.driver.PipelineDriver$$anonfun$1.apply(PipelineDriver.scala:35)
at scala.util.Try$.apply(Try.scala:192)
at 
com.dp.pipeline.driver.PipelineDriver$class.main(PipelineDriver.scala:35)


We are using Beam 2.19.0. Is the @RequiresStableInput
problematic to support for both streaming and batch
use-case? What are the options here?
https://issues.apache.org/jira/browse/BEAM-5358



Re: Errorprone plugin fails for release branches <2.22.0

2020-07-08 Thread Maximilian Michels

Hi Alexey,

I also came across this issue when building a custom Beam version. I 
applied the same fix (https://github.com/apache/beam/pull/11527) which 
you have mentioned.


It appears that the Maven dependencies changed or are no longer 
available which causes the missing class files.


+1 for backporting the fix to the release branches.

Cheers,
Max

On 08.07.20 11:36, Alexey Romanenko wrote:

Hello,

Some days ago I noticed that I can’t build the project from old release 
branches . For example, I wanted to build and run Spark Job Server from 
“release-2.20.0” branch and it failed:


./gradlew :runners:spark:job-server:runShadow —stacktrace

* Exception is:
org.gradle.api.tasks.TaskExecutionException: Execution failed for task 
':model:pipeline:compileJava’.

…
Caused by: org.gradle.internal.UncheckedException: 
java.lang.ClassNotFoundException: 
com.google.errorprone.ErrorProneCompiler$Builder

…


I experienced the same issue for “release-2.19.0” and  “release-2.21.0” 
branches, I didn’t check older branches but seems it’s a global issue 
for “net.ltgt.gradle:gradle-errorprone-plugin:0.0.13".


This is already known issue and it was fixed for 2.22.0 [1] a while ago. 
By applying a fix from [2] on top of previous branch, for example, 
“release-2.20.0” branch I’ve managed to build it. Though, the problem 
for old branches (<2.22.0) is still there - it’s not possible to build 
them right after checkout without applying the fix.


So, there are two questions:

1. Is anyone aware why the old static version of 
gradle-errorprone-plugin fails for the branches that were successfully 
built before?
2. Do we have to fix it for release branches <2.22.0 (either cherry-pick 
the fix for 2.22.0 or somehow else if it’s possible)?


[1] https://issues.apache.org/jira/browse/BEAM-10263
[2] https://github.com/apache/beam/pull/11527



Re: Error in FlinkRunnerTest.test_external_transforms

2020-06-30 Thread Maximilian Michels

Is this a flake? I don't see this in master atm.

-Max

On 27.06.20 02:59, Alex Amato wrote:

Hi,

I was wondering if this is something wrong with my PR 
 or an issue in master.

Thanks for your help.

Seeing this in my PR's presubmit
https://ci-beam.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Commit/5382/

Logs 



==
ERROR: test_external_transforms (__main__.FlinkRunnerTest)
--
Traceback (most recent call last):
 Timed out after 60 seconds. 
   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Commit/src/sdks/python/apache_beam/runners/portability/flink_runner_test.py",
 line 204, in test_external_transforms

 assert_that(res, equal_to([i for i in range(1, 10)]))
# Thread: 
   File "apache_beam/pipeline.py", line 547, in __exit__
 self.run().wait_until_finish()

# Thread: 
   File "apache_beam/runners/portability/portable_runner.py", line 543, in 
wait_until_finish
 self._observe_state(message_thread)
   File "apache_beam/runners/portability/portable_runner.py", line 552, in 
_observe_state

 for state_response in self._state_stream:
# Thread: <_Worker(Thread-110, started daemon 140197924693760)>
   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py",
 line 413, in next
 return self._next()

   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py",
 line 697, in _next
# Thread: <_MainThread(MainThread, started 140200366741248)>
 _common.wait(self._state.condition.wait, _response_ready)
   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_common.py",
 line 138, in wait
 _wait_once(wait_fn, MAXIMUM_WAIT_TIMEOUT, spin_cb)

   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_common.py",
 line 103, in _wait_once
 wait_fn(timeout=timeout)
# Thread: 
   File "/usr/lib/python2.7/threading.py", line 359, in wait
 _sleep(delay)
   File "apache_beam/runners/portability/portable_runner_test.py", line 82, in 
handler
 raise BaseException(msg)
BaseException: Timed out after 60 seconds.


# Thread: <_Worker(Thread-18, started daemon 140198537066240)>

# Thread: 

--
# Thread: <_Worker(Thread-19, started daemon 140198528673536)>

Ran 82 tests in 461.409s

FAILED (errors=1, skipped=15)



Re: Is there an easy way to figure out why my build failed?

2020-06-30 Thread Maximilian Michels

Hi Alex,

Fully agree with you that it can be hard to find the cause for a failing 
build. You basically need to know the exact keyword to grep for. The 
reason is that Jenkins does not understand all build logs to display the 
error directly in the UI.


I often do the following for large logs:

  $ curl 
https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/12017/consoleText 
| less


Then I can use '/' to search in the log quickly without my browser 
slowing down.


In the linked build log, I searched for ' FAILED':

  09:18:26 > Task :sdks:java:io:rabbitmq:test FAILED
  09:18:26
  09:18:26 FAILURE: Build failed with an exception.
  09:18:26
  09:18:26 * What went wrong:
  09:18:26 Execution failed for task ':sdks:java:io:rabbitmq:test'.
  09:18:26 > Process 'Gradle Test Executor 110' finished with non-zero 
 exit value 143
  09:18:26   This problem might be caused by incorrect test process 
configuration.
  09:18:26   Please refer to the test execution section in the User 
Manual at 
https://docs.gradle.org/5.2.1/userguide/java_testing.html#sec:test_execution


Now, it appears that the rabbitmq tests are timing out but I'm not sure 
the issue is with rabbitmq because I'm also seeing:


  Build timed out (after 120 minutes). Marking the build as aborted.
  Build was aborted
  Recording test results

So maybe some other test slowed down the build and when it reached 
rabbitmq it was killed. That can probably be tested by running the build 
multiple times.


-Max

On 30.06.20 19:47, Alex Amato wrote:
Often I see the build failing, but on the next page there are no 
warnings and no errors.


Then when you dive into the full log, it slows down the browser and 
there is no obvious ctrl-f keyword to find the error ("error" yields 
over 100 results, and the error isn't always at the bottom). Is there a 
faster/better way to do it?


There is a log about the build timing out, but I don't really know what 
timed out or where to look next.


Is 120 min a long enough time? Did something recently happen? If so, can 
we increase the timeout until we debug the regression?


https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/12017/

https://issues.apache.org/jira/browse/BEAM-10390

Thanks, I would appreciate any ideas :)
Alex


Re: [ANNOUNCE] New committer: Aizhamal Nurmamat kyzy

2020-06-30 Thread Maximilian Michels

Congrats Aizhamal!

On 30.06.20 17:34, Jan Lukavský wrote:

Congratulations Aizhamal!

On 6/30/20 1:35 PM, Alexey Romanenko wrote:

Congratulations!
Well deserved and thank you for your hard work, Aizhamal!

On 30 Jun 2020, at 13:31, Reza Rokni > wrote:


Congratulations !

On Tue, Jun 30, 2020 at 2:46 PM Michał Walenia 
mailto:michal.wale...@polidea.com>> wrote:


Congratulations, Aizhamal! :)

On Tue, Jun 30, 2020 at 8:41 AM Tobiasz Kędzierski
mailto:tobiasz.kedzier...@polidea.com>> wrote:

Congratulations Aizhamal! :)

On Mon, Jun 29, 2020 at 11:50 PM Austin Bennett
mailto:whatwouldausti...@gmail.com>> wrote:

Congratulations, @Aizhamal Nurmamat kyzy
 !

On Mon, Jun 29, 2020 at 2:32 PM Valentyn Tymofieiev
mailto:valen...@google.com>> wrote:

Congratulations and big thank you for all the hard
work on Beam, Aizhamal!

On Mon, Jun 29, 2020 at 9:56 AM Kenneth Knowles
mailto:k...@apache.org>> wrote:

Please join me and the rest of the Beam PMC in
welcoming a new committer: Aizhamal Nurmamat kyzy

Over the last 15 months or so, Aizhamal has
driven many efforts in the Beam community and
contributed to others. Aizhamal started by
helping with the Beam newsletter [1] then
continued by contributing to meetup planning [2]
[3] and Beam Summit planning [4]. Aizhamal
created Beam's system for managing social media
[5] and contributed many tweets, coordinated the
vote and design of Beam's mascot [6] [7], drove
migration of Beam's site to a more i18n-friendly
infrastructure [8], kept on top of Beam's
enrollment in Season of Docs [9], and even
organized remote Beam Webinars during the
pandemic [10].

In consideration of Aizhamal's contributions, the
Beam PMC trusts her with
the responsibilities of a Beam committer [11].

Thank you, Aizhamal, for your contributions and
looking forward to many more!

Kenn, on behalf of the Apache Beam PMC

[1]

https://lists.apache.org/thread.html/447ae9fdf580ad88522aabc8a0f3703c51acd8885578bb422389a4b0%40%3Cdev.beam.apache.org%3E
[2]

https://lists.apache.org/thread.html/ebeeae53a64dca8bb491e26b8254d247226e6d770e33dbc9428202df%40%3Cdev.beam.apache.org%3E
[3]

https://lists.apache.org/thread.html/rc31d3d57b39e6cf12ea3b6da0e884f198f8cbef9a73f6a50199e0e13%40%3Cdev.beam.apache.org%3E
[4]

https://lists.apache.org/thread.html/99815d5cd047e302b0ef4b918f2f6db091b8edcf430fb62e4eeb1060%40%3Cdev.beam.apache.org%3E
[5]

https://lists.apache.org/thread.html/babceeb52624fd4dd129c259db8ee9017cb68cba069b68fca7480c41%40%3Cdev.beam.apache.org%3E
[6]

https://lists.apache.org/thread.html/60aa4b149136e6aa4643749731f4b5a041ae4952e7b7e57654888bed%40%3Cdev.beam.apache.org%3E
[7]

https://lists.apache.org/thread.html/r872ba2860319cbb5ca20de953c43ed7d750155ca805cfce3b70085b0%40%3Cdev.beam.apache.org%3E

[8]

https://lists.apache.org/thread.html/rfab4cc1411318c3f4667bee051df68f37be11846ada877f3576c41a9%40%3Cdev.beam.apache.org%3E

[9]

https://lists.apache.org/thread.html/r4df2e596751e263a83300818776fbb57cb1e84171c474a9fd016ec10%40%3Cdev.beam.apache.org%3E
[10]

https://lists.apache.org/thread.html/r81b93d700fedf3012b9f02f56b5d693ac4c1aac1568edf9e0767b15f%40%3Cuser.beam.apache.org%3E
[11]

https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer



-- 
Michał Walenia

Polidea  | Software Engineer

M: +48 791 432 002 
E: michal.wale...@polidea.com 

Unique Tech
Check out our projects! 





Re: Individual Parallelism support for Flink Runner

2020-06-29 Thread Maximilian Michels
We could allow parameterizing transforms by using transform identifiers 
from the pipeline, e.g.



  import apache_beam as beam
  from apache_beam.options.pipeline_options import PipelineOptions

  options = ['--parameterize=MyTransform;parallelism=5']
  with beam.Pipeline(options=PipelineOptions(options)) as p:
    p | beam.Create([1, 2, 3]) | 'MyTransform' >> beam.ParDo(..)


Those hints should always be optional, such that a pipeline continues to 
run on all runners.
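
For illustration, a sketch of how such a hint could be declared as a
custom pipeline option in the Python SDK (the flag and its semantics are
hypothetical; a runner would still have to interpret them):

  from apache_beam.options.pipeline_options import PipelineOptions

  class ParameterizeOptions(PipelineOptions):
    @classmethod
    def _add_argparse_args(cls, parser):
      # Hypothetical hint; runners that don't understand it ignore it.
      parser.add_argument(
          '--parameterize', action='append', default=[],
          help="Transform-scoped hints, e.g. 'MyTransform;parallelism=5'")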


-Max

On 28.06.20 14:30, Reuven Lax wrote:
However such a parameter would be specific to a single transform, 
whereas maxNumWorkers is a global parameter today.


On Sat, Jun 27, 2020 at 10:31 PM Daniel Collins > wrote:


I could imagine for example, a 'parallelismHint' field in the base
parameters that could be set to maxNumWorkers when running on
dataflow or an equivalent parameter when running on flink. It would
be useful to get a default value for the sharding in the Reshuffle
changes here https://github.com/apache/beam/pull/11919, but more
generally to have some decent guess on how to best shard work. Then
it would be runner-agnostic; you could set it to something like
numCpus on the local runner for instance.

On Sat, Jun 27, 2020 at 2:04 AM Reuven Lax mailto:re...@google.com>> wrote:

It's an interesting question - this parameter is clearly very
runner specific (e.g. it would be meaningless for the Dataflow
runner, where parallelism is not a static constant). How should
we go about passing runner-specific options per transform?

On Fri, Jun 26, 2020 at 1:14 PM Akshay Iyangar
mailto:aiyan...@godaddy.com>> wrote:

Hi beam community,


So I had brought this issue in our slack channel but I guess
this warrants a deeper discussion and if we do go about what
is the POA for it.


So basically currently for Flink Runner we don’t support
operator-level parallelism, which native Flink provides OOTB.
So I was wondering what the community feels about having
some way to pass parallelism for individual operators, especially
for some of the existing IOs.


Wanted to know what people think of this.


Thanks 

Akshay I



Re: Remove EOL'd Runners

2020-06-09 Thread Maximilian Michels
Thanks for the heads-up, Tyson! It's a sensible decision to remove
unsupported runners.

-Max

On 09.06.20 16:51, Tyson Hamilton wrote:
> Hi All,
> 
> As part of the Fixit [1] I'd like to remove EOL'd runners, Apex and Gearpump, 
> as described in BEAM- [2]. This will be a big PR I think and didn't want 
> anyone to be surprised. There is already some agreement in the linked Jira 
> issue. If there are no objections I'll get started later today or tomorrow.
> 
> -Tyson
> 
> 
> [1]: 
> https://lists.apache.org/thread.html/r9ddc77a8fee58ad02f68e2d9a7f054aab3e55717cc88ad1d5bc49311%40%3Cdev.beam.apache.org%3E
> [2]: https://issues.apache.org/jira/browse/BEAM-
> 


Re: SQL Windowing

2020-05-28 Thread Maximilian Michels
Thanks for the quick reply Brian! I've filed a JIRA for option (a):
https://jira.apache.org/jira/browse/BEAM-10143

Makes sense to define DATETIME as a logical type. I'll check out your
PR. We could work around this for now by doing a cast, e.g.:

  TUMBLE(CAST(f_timestamp AS DATETIME), INTERVAL '30' MINUTE)

Note that we may have to do a more sophisticated cast to convert the
Python micros into a DATETIME.
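
For reference, a rough end-to-end sketch of option (b) combined with the
cast workaround (untested; the field names and the use of beam.Row and
Timestamp.micros are assumptions, and as noted above the micros-to-DATETIME
conversion may need to be more sophisticated):

  import apache_beam as beam
  from apache_beam.transforms.sql import SqlTransform

  class AttachTimestampField(beam.DoFn):
    # Expose each element's event timestamp as a schema field so that
    # the SQL query can window on it.
    def process(self, event, timestamp=beam.DoFn.TimestampParam):
      yield beam.Row(field=event.field, f_timestamp=timestamp.micros)

  aggregated = (
      events  # a schema'd PCollection with a 'field' attribute
      | beam.ParDo(AttachTimestampField())
      | SqlTransform("""
          SELECT field, COUNT(field) AS cnt
          FROM PCOLLECTION
          GROUP BY field,
                   TUMBLE(CAST(f_timestamp AS DATETIME), INTERVAL '30' MINUTE)
        """))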

-Max

On 28.05.20 19:18, Brian Hulette wrote:
> Hey Max,
> Thanks for kicking the tires on SqlTransform in Python :)
> 
> We don't have any tests of windowing and Sql in Python yet, so I'm not
> that surprised you're running into issues here. Portable schemas don't
> support the DATETIME type, because we decided not to define it as one of
> the atomic types [1] and hope to add support via a logical type instead
> (see BEAM-7554 [2]). This was the motivation for the MillisInstant PR I
> put up, and the ongoing discussion [3].
> Regardless, that should only be an obstacle for option (b), where you'd
> need to have a DATETIME in the input and/or output PCollection of the
> SqlTransform. In theory option (a) should be possible, so I'd consider
> that a bug - can you file a jira for it?
> 
> Brian
> 
> [1] 
> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/schema.proto#L58
> [2] https://issues.apache.org/jira/browse/BEAM-7554
> [3] 
> https://lists.apache.org/thread.html/r2e05355b74fb5b8149af78ade1e3539ec08371a9a4b2b9e45737e6be%40%3Cdev.beam.apache.org%3E
> 
> On Thu, May 28, 2020 at 9:45 AM Maximilian Michels  <mailto:m...@apache.org>> wrote:
> 
> Hi,
> 
> I'm using the SqlTransform as an external transform from within a Python
> pipeline. The SQL docs [1] mention that you can either (a) window the
> input or (b) window in the SQL query.
> 
> Option (a):
> 
>   input
>       | "Window" >> beam.WindowInto(window.FixedWindows(30))
>       | "Aggregate" >>
>       SqlTransform("""Select field, count(field) from PCOLLECTION
>                       WHERE ...
>                       GROUP BY field
>                    """)
> 
> This results in an exception:
> 
>   Caused by: java.lang.ClassCastException:
>   org.apache.beam.sdk.transforms.windowing.IntervalWindow cannot be cast
>   to org.apache.beam.sdk.transforms.windowing.GlobalWindow
> 
> => Is this a bug?
> 
> 
> Let's try Option (b):
> 
>   input
>       | "Aggregate & Window" >>
>       SqlTransform("""Select field, count(field) from PCOLLECTION
>                       WHERE ...
>                       GROUP BY field,
>                                TUMBLE(f_timestamp, INTERVAL '30' MINUTE)
>                    """)
> 
> The issue that I'm facing here is that the timestamp is already assigned
> to my values but is not exposed as a field. So I need to use a DoFn to
> extract the timestamp as a new field:
> 
>   class GetTimestamp(beam.DoFn):
>     def process(self, event, timestamp=beam.DoFn.TimestampParam):
>       yield TimestampedRow(..., timestamp)
> 
>   input
>       | "Extract timestamp" >>
>       beam.ParDo(GetTimestamp())
>       | "Aggregate & Window" >>
>       SqlTransform("""Select field, count(field) from PCOLLECTION
>                       WHERE ...
>                       GROUP BY field,
>                                TUMBLE(f_timestamp, INTERVAL '30' MINUTE)
>                    """)
> 
> => It would be very convenient if there was a reserved field name which
> would point to the timestamp of an element. Maybe there is?
> 
> 
> -Max
> 
> 
> [1]
> 
> https://beam.apache.org/documentation/dsls/sql/extensions/windowing-and-triggering/
> 


SQL Windowing

2020-05-28 Thread Maximilian Michels
Hi,

I'm using the SqlTransform as an external transform from within a Python
pipeline. The SQL docs [1] mention that you can either (a) window the
input or (b) window in the SQL query.

Option (a):

  input
  | "Window >> beam.WindowInto(window.FixedWindows(30))
  | "Aggregate" >>
  SqlTransform("""Select field, count(field) from PCOLLECTION
  WHERE ...
  GROUP BY field
   """)

This results in an exception:

  Caused by: java.lang.ClassCastException:
  org.apache.beam.sdk.transforms.windowing.IntervalWindow cannot be cast
  to org.apache.beam.sdk.transforms.windowing.GlobalWindow

=> Is this a bug?


Let's try Option (b):

  input
  | "Aggregate & Window" >>
  SqlTransform("""Select field, count(field) from PCOLLECTION
  WHERE ...
  GROUP BY field,
   TUMBLE(f_timestamp, INTERVAL '30' MINUTE)
   """)

The issue that I'm facing here is that the timestamp is already assigned
to my values but is not exposed as a field. So I need to use a DoFn to
extract the timestamp as a new field:

  class GetTimestamp(beam.DoFn):
def process(self, event, timestamp=beam.DoFn.TimestampParam):
  yield TimestampedRow(..., timestamp)

  input
  | "Extract timestamp" >>
  beam.ParDo(GetTimestamp())
  | "Aggregate & Window" >>
  SqlTransform("""Select field, count(field) from PCOLLECTION
  WHERE ...
  GROUP BY field,
   TUMBLE(f_timestamp, INTERVAL '30' MINUTE)
   """)

=> It would be very convenient if there was a reserved field name which
would point to the timestamp of an element. Maybe there is?


-Max


[1]
https://beam.apache.org/documentation/dsls/sql/extensions/windowing-and-triggering/


Re: What's the purpose of version=2.20.0-RC2 in gradle.properties?

2020-05-28 Thread Maximilian Michels
> I would expect the release branch to have the next -SNAPSHOT version (not the 
> case currently):

Why would the release branch have the next version? It is created for
the sole purpose of releasing the current version. For example, the
release branch for 2.21.0 would have the version 2.21.0-SNAPSHOT. If we
were to release 2.21.1 or 2.22.0, we would create a new branch where the
same logic applies.

The release branch having a -SNAPSHOT version makes perfect sense
because it is a snapshot of what is going to be released (still subject
to changes). Contrary to what I said before, I don't think we should
remove the snapshot suffix from the release branch.

However, as pointed out, the source release and its tag should have a
non-snapshot version.

-Max

On 27.05.20 05:02, Thomas Weise wrote:
> 
> 
> I think the "set_version.sh" script could be called in the release
> scripts to remove the -SNAPSHOT suffix on the release branch.
> 
> 
> I would expect the release branch to have the next -SNAPSHOT version
> (not the case currently):
> 
> https://github.com/apache/beam/blob/release-2.20.0/gradle.properties#L26
> 
> Release tag and the source archive should have the actually released
> version (not -RC):
> 
> https://github.com/apache/beam/blob/v2.20.0/gradle.properties#L26
> 
> 
>  
> 
> Btw, in case you haven't seen it, here is our release guide:
> https://beam.apache.org/contribute/release-guide/
> 
> -Max
> 
> On 26.05.20 19:02, Jacek Laskowski wrote:
> > Hi Max,
> >
> >> I think you bring up a good point, for the sake of release build
> > reproducibility, we may want to remove the snapshot suffix for the
> > source release.
> >
> > Wish I could be as clear as yourself with this. Yes, that's what I've
> > been bothered about. Is there a JIRA issue for this already? I've
> never
> > been good at releases but certainly could help a bit here and there
> > since I'm interested in having reproducible builds (from the tags).
> >
> > Pozdrawiam,
> > Jacek Laskowski
> > 
> > https://about.me/JacekLaskowski
> > "The Internals Of" Online Books <https://books.japila.pl/>
> > Follow me on https://twitter.com/jaceklaskowski
> >
> > <https://twitter.com/jaceklaskowski>
> >
> >
> > On Tue, May 26, 2020 at 5:37 PM Maximilian Michels  <mailto:m...@apache.org>
> > <mailto:m...@apache.org <mailto:m...@apache.org>>> wrote:
> >
> >     If you really want to work with the source code, I'd recommend
> using the
> >     released source code:
> >     https://beam.apache.org/get-started/downloads/#releases
> >
> >     Even there the version in gradle.properties says
> x.y.z-SNAPSHOT. You may
> >     want to remove the -SNAPSHOT suffix. I understand that this is
> confusing
> >     but that's how our release tooling currently works; it removes the
> >     snapshot suffix during publishing the artifacts.
> >
> >     I think you bring up a good point, for the sake of release build
> >     reproducibility, we may want to remove the snapshot suffix for the
> >     source release.
> >
> >     Best,
> >     Max
> >
> >     On 26.05.20 17:20, Kyle Weaver wrote:
> >     >> When we release the version, the RC suffix is dropped.
> >     >
> >     > I think this might not actually be true, at least for the
> git tag,
> >     since
> >     > we just copy the tag from the accepted RC without changing
> anything.
> >     > However, it might not matter because RC2 artifacts should be
> identical
> >     > to the final release artifacts.
> >     >
> >     >> In other words, how to check out the sources of Beam 2.20.0
> and build
> >     > them to get the released artifacts?
> >     >
> >     > As Max said, we build and publish artifacts (Jars, Docker
> containers,
> >     > Python wheels, etc.) for each release, so it usually isn't
> >     necessary to
> >     > build them oneself unless you are testing on head or other
> >     unreleased code.
> >     >
> >     > On Tue, May 26, 2020 at 6:02 AM Jacek Laskowski
> mailto:ja...@japila.pl>
> >     <mailto:ja...@japila.pl <mailto:ja...@japila.pl>>
> >     > <mailto:ja...@japila.pl &l

Re: What's the purpose of version=2.20.0-RC2 in gradle.properties?

2020-05-26 Thread Maximilian Michels
Don't think so. Feel free to create one.

We already have a script which updates the version to a non-snapshot
version:
https://github.com/apache/beam/blob/master/release/src/main/scripts/set_version.sh

However, it seems that this is merely a variant of the script which we
use to cut the release branch:
https://github.com/apache/beam/blob/master/release/src/main/scripts/cut_release_branch.sh

I think the "set_version.sh" script could be called in the release
scripts to remove the -SNAPSHOT suffix on the release branch.

Btw, in case you haven't seen it, here is our release guide:
https://beam.apache.org/contribute/release-guide/

-Max

On 26.05.20 19:02, Jacek Laskowski wrote:
> Hi Max,
> 
>> I think you bring up a good point, for the sake of release build
> reproducibility, we may want to remove the snapshot suffix for the
> source release.
> 
> Wish I could be as clear as yourself with this. Yes, that's what I've
> been bothered about. Is there a JIRA issue for this already? I've never
> been good at releases but certainly could help a bit here and there
> since I'm interested in having reproducible builds (from the tags).
> 
> Pozdrawiam,
> Jacek Laskowski
> 
> https://about.me/JacekLaskowski
> "The Internals Of" Online Books <https://books.japila.pl/>
> Follow me on https://twitter.com/jaceklaskowski
> 
> <https://twitter.com/jaceklaskowski>
> 
> 
> On Tue, May 26, 2020 at 5:37 PM Maximilian Michels  <mailto:m...@apache.org>> wrote:
> 
> If you really want to work with the source code, I'd recommend using the
> released source code:
> https://beam.apache.org/get-started/downloads/#releases
> 
> Even there the version in gradle.properties says x.y.z-SNAPSHOT. You may
> want to remove the -SNAPSHOT suffix. I understand that this is confusing
> but that's how our release tooling currently works; it removes the
> snapshot suffix during publishing the artifacts.
> 
> I think you bring up a good point, for the sake of release build
> reproducibility, we may want to remove the snapshot suffix for the
> source release.
> 
> Best,
> Max
> 
> On 26.05.20 17:20, Kyle Weaver wrote:
> >> When we release the version, the RC suffix is dropped.
> >
> > I think this might not actually be true, at least for the git tag,
> since
> > we just copy the tag from the accepted RC without changing anything.
> > However, it might not matter because RC2 artifacts should be identical
> > to the final release artifacts.
> >
> >> In other words, how to check out the sources of Beam 2.20.0 and build
> > them to get the released artifacts?
> >
> > As Max said, we build and publish artifacts (Jars, Docker containers,
> > Python wheels, etc.) for each release, so it usually isn't
> necessary to
> > build them oneself unless you are testing on head or other
> unreleased code.
> >
> > On Tue, May 26, 2020 at 6:02 AM Jacek Laskowski  <mailto:ja...@japila.pl>
> > <mailto:ja...@japila.pl <mailto:ja...@japila.pl>>> wrote:
> >
> >     Hi Max,
> >
> >     > You probably want to work with the release artifacts, instead of
> >     cloning
> >     > the development branch.
> >
> >     I'm not sure I understand.
> >
> >     I did the following to work with the sources of v2.20.0. Am
> >     I missing something?
> >
> >     git fetch --all --tags --prune
> >     git checkout -b v2.20.0 v2.20.0
> >
> >     The last commit on the branch
> >     is 9f0cb649d39ee6236ea27f111acb4b66591a80ec that matches the repo.
> >
> >   
>  
> https://github.com/apache/beam/commit/9f0cb649d39ee6236ea27f111acb4b66591a80ec
> >
> >     commit 9f0cb649d39ee6236ea27f111acb4b66591a80ec (HEAD -> v2.20.0,
> >     tag: v2.20.0-RC2, tag: v2.20.0)
> >     Author: amaliujia  <mailto:ruw...@google.com> <mailto:ruw...@google.com
> <mailto:ruw...@google.com>>>
> >     Date:   Wed Apr 8 14:38:47 2020 -0700
> >
> >         [Gradle Release Plugin] - pre tag commit:  'v2.20.0-RC2'.
> >
> >      gradle.properties | 2 +-
> >      1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >     That commit introduced the RC2:
> >
> >     -version=2.20.0-SNAPSHOT
> >     +version=2.20.0-RC2
> >
> >     Why is there no 2.20.0 only commit? One that wo

Re: What's the purpose of version=2.20.0-RC2 in gradle.properties?

2020-05-26 Thread Maximilian Michels
If you really want to work with the source code, I'd recommend using the
released source code:
https://beam.apache.org/get-started/downloads/#releases

Even there the version in gradle.properties says x.y.z-SNAPSHOT. You may
want to remove the -SNAPSHOT suffix. I understand that this is confusing
but that's how our release tooling currently works; it removes the
snapshot suffix during publishing the artifacts.

I think you bring up a good point, for the sake of release build
reproducibility, we may want to remove the snapshot suffix for the
source release.

Best,
Max

On 26.05.20 17:20, Kyle Weaver wrote:
>> When we release the version, the RC suffix is dropped.
> 
> I think this might not actually be true, at least for the git tag, since
> we just copy the tag from the accepted RC without changing anything.
> However, it might not matter because RC2 artifacts should be identical
> to the final release artifacts.
> 
>> In other words, how to check out the sources of Beam 2.20.0 and build
> them to get the released artifacts?
> 
> As Max said, we build and publish artifacts (Jars, Docker containers,
> Python wheels, etc.) for each release, so it usually isn't necessary to
> build them oneself unless you are testing on head or other unreleased code.
> 
> On Tue, May 26, 2020 at 6:02 AM Jacek Laskowski  <mailto:ja...@japila.pl>> wrote:
> 
> Hi Max,
> 
> > You probably want to work with the release artifacts, instead of
> cloning
> > the development branch.
> 
> I'm not sure I understand.
> 
> I did the following to work with the sources of v2.20.0. Am
> I missing something?
> 
> git fetch --all --tags --prune
> git checkout -b v2.20.0 v2.20.0
> 
> The last commit on the branch
> is 9f0cb649d39ee6236ea27f111acb4b66591a80ec that matches the repo.
> 
> 
> https://github.com/apache/beam/commit/9f0cb649d39ee6236ea27f111acb4b66591a80ec
> 
> commit 9f0cb649d39ee6236ea27f111acb4b66591a80ec (HEAD -> v2.20.0,
> tag: v2.20.0-RC2, tag: v2.20.0)
> Author: amaliujia mailto:ruw...@google.com>>
> Date:   Wed Apr 8 14:38:47 2020 -0700
> 
>     [Gradle Release Plugin] - pre tag commit:  'v2.20.0-RC2'.
> 
>  gradle.properties | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> That commit introduced the RC2:
> 
> -version=2.20.0-SNAPSHOT
> +version=2.20.0-RC2
> 
> Why is there no 2.20.0 only commit? One that would be like this for
> Spark 2.4.5 [1] or Kafka 2.5.0 [2]?
> 
> [1] 
> https://github.com/apache/spark/commit/cee4ecbb16917fa85f02c635925e2687400aa56b
> [2] 
> https://github.com/apache/kafka/commit/66563e712b0b9f84f673b262f2fb87c03110084d
> 
> In other words, how to check out the sources of Beam 2.20.0 and
> build them to get the released artifacts?
> 
> Pozdrawiam,
> Jacek Laskowski
> 
> https://about.me/JacekLaskowski
> "The Internals Of" Online Books <https://books.japila.pl/>
> Follow me on https://twitter.com/jaceklaskowski
> 
> <https://twitter.com/jaceklaskowski>
> 
> 
> On Mon, May 25, 2020 at 12:00 PM Maximilian Michels  <mailto:m...@apache.org>> wrote:
> 
> Hi Jacek,
> 
> The Gradle property is the source of truth for the Beam version.
> When we
> release the version, the RC suffix is dropped.
> 
> The use of snapshot versions is normal during the development
> process.
> You probably want to work with the release artifacts, instead of
> cloning
> the development branch.
> 
> -Max
> 
> On 24.05.20 12:45, Jacek Laskowski wrote:
> > Hi,
> >
> > I git cloned https://github.com/apache/beam/tree/v2.20.0 and
> > found version=2.20.0-RC2 in gradle.properties. What's the
> purpose of the
> > version property?
> >
> > (The main reason I'm asking is that I try to find out why
> gradle / IDEA
> > attaches 2.20.0-SNAPSHOT dependencies to projects. How is that
> possible
> > that any of the two would ever consider SNAPSHOT as a dependency?)
> >
> > Pozdrawiam,
> > Jacek Laskowski
> > 
> > https://about.me/JacekLaskowski
> > "The Internals Of" Online Books <https://books.japila.pl/>
> > Follow me on https://twitter.com/jaceklaskowski
> >
> > <https://twitter.com/jaceklaskowski>
> 


[BEAM-10054] Pipeline stalls with DirectRunner

2020-05-26 Thread Maximilian Michels
Could somebody familiar with the Python SDK take a look at this problem?
It manifests in the Direct Runner stalling execution.

Tests are passing but I'm unsure about the context of the commit which
introduced the change (linked in the PR):
https://github.com/apache/beam/pull/11777

Thanks,
Max


Re: What's the purpose of version=2.20.0-RC2 in gradle.properties?

2020-05-25 Thread Maximilian Michels
Hi Jacek,

The Gradle property is the source of truth for the Beam version. When we
release the version, the RC suffix is dropped.

The use of snapshot versions is normal during the development process.
You probably want to work with the release artifacts, instead of cloning
the development branch.

-Max

On 24.05.20 12:45, Jacek Laskowski wrote:
> Hi,
> 
> I git cloned https://github.com/apache/beam/tree/v2.20.0 and
> found version=2.20.0-RC2 in gradle.properties. What's the purpose of the
> version property?
> 
> (The main reason I'm asking is that I try to find out why gradle / IDEA
> attaches 2.20.0-SNAPSHOT dependencies to projects. How is that possible
> that any of the two would ever consider SNAPSHOT as a dependency?)
> 
> Pozdrawiam,
> Jacek Laskowski
> 
> https://about.me/JacekLaskowski
> "The Internals Of" Online Books 
> Follow me on https://twitter.com/jaceklaskowski
> 
> 


Re: Event Calendar?

2020-05-21 Thread Maximilian Michels
Would it make sense to combine it with the Apache Beam release calendar?

https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com&ctz=America%2FLos_Angeles

On 20.05.20 19:27, Tyson Hamilton wrote:
> +1 a calendar would be nice. 
> 
> On Tue, May 19, 2020 at 3:51 PM Austin Bennett
> mailto:whatwouldausti...@gmail.com>> wrote:
> 
> Hi All,
> 
> As we now have more frequent and more accessible (digital) events,
> I am wondering whether others see value in adding a calendar to the
> website?
> 
> Perhaps related, is it worth
> updating https://beam.apache.org/community/in-person/ <- to
> something that isn't 'in-person' since doing things in-person is
> perhaps (hopefully not completely) a vestige of the past.  
> 
> Cheers,
> Austin
> 


Re: New Dates For Beam Summit Digital 2020

2020-05-21 Thread Maximilian Michels
Thanks Matthias!

We realized we want this to be a much more community-driven process.
That's why we are planning to be more transparent to give everyone a
chance to get involved in the summit. Given that we now have more time,
this will be much more feasible.

Cheers,
Max

On 18.05.20 20:00, Matthias Baetens wrote:
> Dear Beam community,
> 
> A few weeks ago, we announced the dates for the Beam Digital Summit and
> we know the community received this news with excitement. This is a
> great opportunity to create and share content about streaming analytics
> and the solutions that teams around the world have created using Apache
> Beam and its ecosystem. 
> 
> We have chosen August 24-28th as the new dates. While this has been a
> difficult decision, we think it’s the right decision to ensure we
> produce the best possible event. We encourage you to send your talk
> proposals, anything from use cases, lightning talks, or workshop ideas.
> 
> Based on this change, the CFP will remain open until June 15th. We would
> love to hear about what you are doing with Beam, how to improve it, and
> how to strengthen our community.
> 
> We thank you for your understanding! See you soon!
> 
> -Griselda Cuevas, Brittany Hermann, Maximilian Michels, Austin Bennett,
> Matthias Baetens, Alex Van Boxel
> 


Re: Transparency to Beam Digital Summit Planning

2020-05-21 Thread Maximilian Michels
+1 for making the notes publicly available. This list is free to join by
anyone.

On 21.05.20 00:17, Austin Bennett wrote:
> Should the link/meeting notes be publicly available?  Not just available
> to individuals plus all of @google?  
> 
> 
> 
> On Wed, May 20, 2020 at 2:06 PM Brittany Hermann wrote:
> 
> Hi folks,
> 
> I wanted to provide a few different ways of transparency to you
> during the planning of the Beam Digital Summit. 
> 
> 1) *Beam Summit Status Reports:* I will be sending out weekly Beam
> Summit Status Reports which will include the goals, attendees,
> topics discussed, and decisions made every Wednesday. 
> 
> 2) *Community Guests on Committee Planning Calls:* We would like to
> invite you to join as a guest to these planning calls. This would
> allow for observation of the planning process and to see if there
> are ways for future collaboration on promotions, etc. for the event.
> If you are interested in joining the first bi-weekly meeting
> starting next week, please reach out to me and I will send the
> invite with call-in information directly to you. 
> 
> In the meantime, I have attached this week's Beam Summit Status
> report below. 
> 
> 
> https://docs.google.com/document/d/1_jLhKvW5MTtkHOZDJyzCTSLUDiD4RjlJmU35rXV-3n0/edit?usp=sharing
> 
> Have a great rest of your week! 
> 
> -- 
> 
>   
> 
> Brittany Hermann
> 
> Open Source Program Manager (Provided by Adecco Staffing)
> 
> 1190 Bordeaux Drive , Building 4, Sunnyvale, CA 94089
> 
> 
> 
> 


Re: Running NexMark Tests

2020-05-19 Thread Maximilian Michels
Looks like an accidental change to me. Running with either version, 1.9
or 1.10, works, but the documentation should be changed back to the latest version.

Do you mind creating a PR?

Thanks,
Max

On 19.05.20 13:02, Sruthi Sree Kumar wrote:
> On the documentation, the version of Flink runner is changed to 1.9
> which was 1.10(latest)
> before 
> https://github.com/apache/beam/commit/1d2700818474c008eaa324ac1b5c49c9d2857298#diff-0e75160f4b09a1a300671557930589d9.
> 
> Is this an accidental change or is there any particular reason for this
> downgrade of version?
> 
> Regards,
> Sruthi
> 
> On Tue, May 12, 2020 at 7:21 PM Maximilian Michels <m...@apache.org> wrote:
> 
> A heads-up if anybody else sees this, we have removed the flag:
> https://jira.apache.org/jira/browse/BEAM-9900
> 
> Further contributions are very welcome :)
> 
> -Max
> 
> On 11.05.20 17:05, Sruthi Sree Kumar wrote:
> > I have opened a PR with the documentation change.
> > https://github.com/apache/beam/pull/11662
> >
> > Regards,
> > Sruthi
> >
> > On 2020/04/21 20:22:17, Ismaël Mejía <ieme...@gmail.com> wrote:
> >> You need to instruct the Flink runner to shut down the source,
> >> otherwise it will stay waiting.
> >> You can do this by adding the extra
> >> argument`--shutdownSourcesOnFinalWatermark=true`
> >> And if that works and you want to open a PR to update our
> >> documentation that would be greatly appreciated.
> >>
> >> Regards,
> >> Ismaël
> >>
> >>
> >> On Tue, Apr 21, 2020 at 10:04 PM Sruthi Sree Kumar
> >> <sruthisreekumar2...@gmail.com> wrote:
> >>>
> >>> Hello,
> >>>
> >>> I am trying to run nexmark queries using flink runner streaming.
> Followed the documentation and used the command
> >>> ./gradlew :sdks:java:testing:nexmark:run \
> >>>
> >>>     -Pnexmark.runner=":runners:flink:1.10" \
> >>>     -Pnexmark.args="
> >>>         --runner=FlinkRunner
> >>>         --suite=SMOKE
> >>>         --streamTimeout=60
> >>>         --streaming=true
> >>>         --manageResources=false
> >>>         --monitorJobs=true
> >>>         --flinkMaster=[local]"
> >>>
> >>>
> >>> But after the events are read from the source, there is no
> further progress and the job is always stuck at 99%. Is there any
> configuration that I am missing?
> >>>
> >>> Regards,
> >>> Sruthi
> >>
> 


Re: TextIO. Writing late files

2020-05-19 Thread Maximilian Michels
> This is still confusing to me - why would the messages be dropped as late in 
> this case?

Since you previously mentioned that the bug is due to the pane info
missing, I just pointed out that the WriteFiles logic is expected to
drop the pane info.

@Jose Would it make sense to file a JIRA and summarize all the findings
here?

@Jozef What you describe in
https://www.mail-archive.com/dev@beam.apache.org/msg20186.html is
expected because Flink does not do a GroupByKey on Reshuffle but just
redistributes the elements.

Thanks,
Max

On 18.05.20 21:59, Jose Manuel wrote:
> Hi Reuven, 
> 
> I can try to explain what I think is happening.
> 
> - There is a source which is reading data entries and updating the
> watermark.
> - Then, data entries are grouped and stored in files. 
> - The window information of these data entries is used to emit
> filenames (the data entries' window and timestamp; the PaneInfo is empty).
> - When a second window is applied to the filenames, if the allowed lateness
> is zero or lower than the time spent in the previous reading/writing, the
> filenames are discarded as late.
> 
> I guess, the key is in 
> https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/LateDataDroppingDoFnRunner.java#L168
> 
> My assumption is that the global watermark (or source watermark, I am not sure
> about the name) is used to evaluate the filenames, which are in an
> already emitted window.
> 
> Thanks
> Jose
> 
> 
> On Mon, May 18, 2020 at 18:37, Reuven Lax (<re...@google.com>) wrote:
> 
> This is still confusing to me - why would the messages be dropped as
> late in this case?
> 
> On Mon, May 18, 2020 at 6:14 AM Maximilian Michels <m...@apache.org> wrote:
> 
> All runners which use the Beam reference implementation drop the
> PaneInfo for
> WriteFilesResult#getPerDestinationOutputFilenames(). That's
> why we can observe this behavior not only in Flink but also Spark.
> 
> The WriteFilesResult is returned here:
> 
> https://github.com/apache/beam/blob/d773f8ca7a4d63d01472b5eaef8b67157d60f40e/sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java#L363
> 
> GatherBundlesPerWindow will discard the pane information because all
> buffered elements are emitted in the FinishBundle method which
> always
> has a NO_FIRING (unknown) pane info:
> 
> https://github.com/apache/beam/blob/d773f8ca7a4d63d01472b5eaef8b67157d60f40e/sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java#L895
> 
> So this seems expected behavior. We would need to preserve the
> panes in
> the Multimap buffer.
> 
> -Max
> 
> On 15.05.20 18:34, Reuven Lax wrote:
> > Lateness should never be introduced inside a pipeline -
> generally late
> > data can only come from a source.  If data was not dropped as late
> > earlier in the pipeline, it should not be dropped after the
> file write.
> > I suspect that this is a bug in how the Flink runner handles the
> > Reshuffle transform, but I'm not sure what the exact bug is.
> >
> > Reuven
> >
> > On Fri, May 15, 2020 at 2:23 AM Jozef Vilcek <jozo.vil...@gmail.com>
> > wrote:
> >
> >     Hi Jose,
> >
> >     thank you for putting the effort to get example which
> >     demonstrate your problem. 
> >
> >     You are using a streaming pipeline and it seems that
> watermark in
> >     downstream already advanced further, so when your File
> pane arrives,
> >     it is already late. Since you define that lateness is not
> tolerated,
> >     it is dropped.
> >     I myself never had requirement to specify zero allowed
> lateness for
> >     streaming. It feels dangerous. Do you have a specific use
> case?
> >     Also, in many cases, after windowed files are written, I
> usually
> >     collect them into global window and specify a different
> triggering
> >     policy for collecting them. Both cases are why I never
> came across
> >     this situation.
> >
> >     I do not have an explanation if it is a bug or not. I
> would guess
> >     that watermark can advance further, e.g. because 

Re: TextIO. Writing late files

2020-05-18 Thread Maximilian Michels
All runners which use the Beam reference implementation drop the
PaneInfo for WriteFilesResult#getPerDestinationOutputFilenames(). That's
why we can observe this behavior not only in Flink but also Spark.

The WriteFilesResult is returned here:
https://github.com/apache/beam/blob/d773f8ca7a4d63d01472b5eaef8b67157d60f40e/sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java#L363

GatherBundlesPerWindow will discard the pane information because all
buffered elements are emitted in the FinishBundle method which always
has a NO_FIRING (unknown) pane info:
https://github.com/apache/beam/blob/d773f8ca7a4d63d01472b5eaef8b67157d60f40e/sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java#L895

So this seems expected behavior. We would need to preserve the panes in
the Multimap buffer.
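
For illustration, a minimal hypothetical sketch (not the actual WriteFiles
code) of the buffering pattern described above. Output produced in
@FinishBundle is not associated with any input element; only a timestamp
and a window can be attached to it, so downstream consumers observe an
unknown (NO_FIRING) pane:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
    import org.joda.time.Instant;

    class BufferingFn extends DoFn<String, String> {
      private static class Buffered {
        final String value;
        final Instant timestamp;
        final BoundedWindow window;

        Buffered(String value, Instant timestamp, BoundedWindow window) {
          this.value = value;
          this.timestamp = timestamp;
          this.window = window;
        }
      }

      private transient List<Buffered> buffer;

      @StartBundle
      public void startBundle() {
        buffer = new ArrayList<>();
      }

      @ProcessElement
      public void process(ProcessContext ctx, BoundedWindow window) {
        // The element's pane (ctx.pane()) cannot be carried over to the
        // finishBundle output below; only timestamp and window survive.
        buffer.add(new Buffered(ctx.element(), ctx.timestamp(), window));
      }

      @FinishBundle
      public void finishBundle(FinishBundleContext ctx) {
        for (Buffered b : buffer) {
          // FinishBundleContext#output takes only a timestamp and a window;
          // there is no way to specify a pane here.
          ctx.output(b.value, b.timestamp, b.window);
        }
      }
    }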

-Max

On 15.05.20 18:34, Reuven Lax wrote:
> Lateness should never be introduced inside a pipeline - generally late
> data can only come from a source.  If data was not dropped as late
> earlier in the pipeline, it should not be dropped after the file write.
> I suspect that this is a bug in how the Flink runner handles the
> Reshuffle transform, but I'm not sure what the exact bug is.
> 
> Reuven
> 
> On Fri, May 15, 2020 at 2:23 AM Jozef Vilcek wrote:
> 
> Hi Jose,
> 
> thank you for putting in the effort to produce an example which
> demonstrates your problem.
> 
> You are using a streaming pipeline and it seems that watermark in
> downstream already advanced further, so when your File pane arrives,
> it is already late. Since you define that lateness is not tolerated,
> it is dropped.
> I myself never had a requirement to specify zero allowed lateness for
> streaming. It feels dangerous. Do you have a specific use case?
> Also, in many cases, after windowed files are written, I usually
> collect them into global window and specify a different triggering
> policy for collecting them. Both cases are why I never came across
> this situation.
> 
> I do not have an explanation for whether it is a bug or not. I would guess
> that the watermark can advance further, e.g. because elements can be
> processed in arbitrary order. Not saying this is the case.
> It needs someone with a better understanding of how watermark advancement
> is / should be handled within pipelines.
> 
> 
> P.S.: you can add `.withTimestampFn()` to your generate sequence, to
> get more stable timing, which is also easier to reason about:
> 
> Dropping element at 1970-01-01T00:00:19.999Z for key
> ... window:[1970-01-01T00:00:15.000Z..1970-01-01T00:00:20.000Z)
> since too far behind inputWatermark:1970-01-01T00:00:24.000Z;
> outputWatermark:1970-01-01T00:00:24
> .000Z
> 
>            instead of
> 
> Dropping element at 2020-05-15T08:52:34.999Z for key ...
> window:[2020-05-15T08:52:30.000Z..2020-05-15T08:52:35.000Z) since
> too far behind inputWatermark:2020-05-15T08:52:39.318Z;
> outputWatermark:2020-05-15T08:52:39.318Z
> 
> 
> 
> 
> In my
> 
> 
> 
> On Thu, May 14, 2020 at 10:47 AM Jose Manuel wrote:
> 
> Hi again, 
> 
> I have simplify the example to reproduce the data loss. The
> scenario is the following:
> 
> - TextIO write files. 
> - getPerDestinationOutputFilenames emits file names 
> - File names are processed by an aggregator (combine, distinct,
> groupByKey...) with a window **without allowed lateness**
> - File names are discarded as late
> 
> Here you can see the data loss in the picture
> in 
> https://github.com/kiuby88/windowing-textio/blob/master/README.md#showing-data-loss
> 
> Please, follow README to run the pipeline and find log traces
> that say data are dropped as late.
> Remember, you can run the pipeline with another
> window's  lateness values (check README.md)
> 
> Kby.
> 
> On Tue, May 12, 2020 at 17:16, Jose Manuel
> (<kiuby88@gmail.com>) wrote:
> 
> Hi,
> 
> I would like to clarify that while TextIO is writing, all
> data are in the files (shards). The loss happens when file
> names emitted by getPerDestinationOutputFilenames are
> processed by a window.
> 
> I have created a pipeline to reproduce the scenario in which
> some filenames are lost after the
> getPerDestinationOutputFilenames. Please, note I tried to
> simplify the code as much as possible, but the scenario is
> not easy to reproduce.
> 
> Please check this project
> https://github.com/kiuby88/windowing-textio
> Check readme to build and run
> (https://github.com/kiuby88/windowing-textio#build-and-run)
> Project contains only a class with the
>  

Re: Running NexMark Tests

2020-05-12 Thread Maximilian Michels
A heads-up if anybody else sees this, we have removed the flag:
https://jira.apache.org/jira/browse/BEAM-9900

Further contributions are very welcome :)

-Max

On 11.05.20 17:05, Sruthi Sree Kumar wrote:
> I have opened a PR with the documentation change.
> https://github.com/apache/beam/pull/11662
> 
> Regards,
> Sruthi
> 
> On 2020/04/21 20:22:17, Ismaël Mejía wrote:
>> You need to instruct the Flink runner to shut down the source,
>> otherwise it will stay waiting.
>> You can do this by adding the extra
>> argument`--shutdownSourcesOnFinalWatermark=true`
>> And if that works and you want to open a PR to update our
>> documentation that would be greatly appreciated.
>>
>> Regards,
>> Ismaël
>>
>>
>> On Tue, Apr 21, 2020 at 10:04 PM Sruthi Sree Kumar
>>  wrote:
>>>
>>> Hello,
>>>
>>> I am trying to run nexmark queries using flink runner streaming. Followed 
>>> the documentation and used the command
>>> ./gradlew :sdks:java:testing:nexmark:run \
>>>
>>> -Pnexmark.runner=":runners:flink:1.10" \
>>> -Pnexmark.args="
>>> --runner=FlinkRunner
>>> --suite=SMOKE
>>> --streamTimeout=60
>>> --streaming=true
>>> --manageResources=false
>>> --monitorJobs=true
>>> --flinkMaster=[local]"
>>>
>>>
>>> But after the events are read from the source, there is no further progress 
>>> and the job is always stuck at 99%. Is there any configuration that I am 
>>> missing?
>>>
>>> Regards,
>>> Sruthi
>>


Re: Beam 2.21 release update

2020-05-11 Thread Maximilian Michels
FYI I've created this issue and marked it as a blocker:
https://jira.apache.org/jira/browse/BEAM-9947

Essentially, the timer encoding is broken for all non-standard key
coders. The fix can be found here: https://github.com/apache/beam/pull/11658

-Max

On 08.05.20 18:53, Udi Meiri wrote:
> +Chad Dombrova, who added _find_protoc_gen_mypy.
> 
> I'm guessing that the code
> in _install_grpcio_tools_and_generate_proto_files creates a kind of
> virtualenv, but it only works well for staging Python modules and not
> binaries like protoc-gen-mypy.
> (I assume there's a reason why it doesn't invoke virtualenv, probably
> since the list of things setup.py can expect to be installed is very
> minimal (setuptools).)
> 
> One solution would be to make these setup.py dependencies explicit in
> pyproject.toml, such that pip installs them before running
> setup.py: https://pip.pypa.io/en/stable/reference/pip/#pep-517-and-518-support
> It would help when using tools like pip ("pip wheel"), but I'm not sure
> what the alternative for "python setup.py sdist" is.
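
A minimal sketch of the PEP 518 approach described above (the build
backend and package list here are illustrative assumptions, not Beam's
actual configuration; mypy-protobuf is the package that provides
protoc-gen-mypy):

    [build-system]
    requires = ["setuptools", "wheel", "grpcio-tools", "mypy-protobuf"]
    build-backend = "setuptools.build_meta"

With such a section in place, pip builds the package in an isolated
environment that has these dependencies installed before setup.py runs.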
> 
> 
> On Thu, May 7, 2020 at 10:40 PM Thomas Weise wrote:
> 
> No additional stacktraces. Full error output below.
> 
> It's not clear what is going wrong.
> 
> There isn't any exception from the subprocess execution since the
> "WARNING:root:Installing grpcio-tools took 305.39 seconds." is printed.
> 
> Also, the time it takes to perform the install is equivalent to
> successfully running the pip command.
> 
> I will report back if I find anything else. Currently doing the
> explicit install via pip install -r sdks/python/build-requirements.txt
> 
> Thanks,
> Thomas
> 
> WARNING:root:Installing grpcio-tools took 269.27 seconds.
> INFO:gen_protos:Regenerating Python proto definitions (no output files).
> Process Process-1:
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in
> _bootstrap
>     self.run()
>   File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
>     self._target(*self._args, **self._kwargs)
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 378, in _install_grpcio_tools_and_generate_proto_files
>     generate_proto_files(force=force)
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 315, in generate_proto_files
>     protoc_gen_mypy = _find_protoc_gen_mypy()
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 233, in _find_protoc_gen_mypy
>     (fname, ', '.join(search_paths)))
> RuntimeError: Could not find protoc-gen-mypy in
> /code/venvs/venv2/bin, /code/venvs/venv2/bin, /code/venvs/venv3/bin,
> /usr/local/sbin, /usr/local/bin, /usr/sbin, /usr/bin, /sbin, /bin
> Traceback (most recent call last):
>   File "setup.py", line 311, in 
>     'mypy': generate_protos_first(mypy),
>   File
> 
> "/code/venvs/venv2/local/lib/python2.7/site-packages/setuptools/__init__.py",
> line 129, in setup
>     return distutils.core.setup(**attrs)
>   File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
>     dist.run_commands()
>   File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
>     self.run_command(cmd)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
>     cmd_obj.run()
>   File
> 
> "/code/venvs/venv2/local/lib/python2.7/site-packages/wheel/bdist_wheel.py",
> line 204, in run
>     self.run_command('build')
>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
>     self.distribution.run_command(command)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
>     cmd_obj.run()
>   File "/usr/lib/python2.7/distutils/command/build.py", line 128, in run
>     self.run_command(cmd_name)
>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
>     self.distribution.run_command(command)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
>     cmd_obj.run()
>   File "setup.py", line 235, in run
>     gen_protos.generate_proto_files()
>   File
> "/src/streamingplatform/beam-release/beam/sdks/python/gen_protos.py", line
> 310, in generate_proto_files
>     raise ValueError("Proto generation failed (see log for details).")
> ValueError: Proto generation failed (see log for details).
> 
> 
> On Thu, May 7, 2020 at 2:25 PM Udi Meiri wrote:
> 
> It's hard to say without more details what's going on. Ahmet
> you're right that it installs build-requirements.txt and retries
> calling generate_proto_files().
> 
> Thomas, were there additional stacktraces? (after a 

Re: Flink Runner with RequiresStableInput fails after a certain number of checkpoints

2020-05-05 Thread Maximilian Michels
Hey Eleanore,

The change will be part of the 2.21.0 release.

-Max

On 04.05.20 19:14, Eleanore Jin wrote:
> Hi Max, 
> 
> Thanks for the information. I saw this PR is already merged; I just
> wonder, is it backported to the affected versions already
> (i.e. 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 2.19.0, 2.20.0)? Or I have
> to wait for the 2.20.1 release? 
> 
> Thanks a lot!
> Eleanore
> 
> On Wed, Apr 22, 2020 at 2:31 AM Maximilian Michels <m...@apache.org> wrote:
> 
> Hi Eleanore,
> 
> Exactly-once is not affected but the pipeline can fail to checkpoint
> after the maximum number of state cells have been reached. We are
> working on a fix [1].
> 
> Cheers,
> Max
> 
> [1] https://github.com/apache/beam/pull/11478
> 
> On 22.04.20 07:19, Eleanore Jin wrote:
> > Hi Maxi, 
> >
> > I assume this will impact the Exactly Once Semantics that beam
> provided
> > as in the KafkaExactlyOnceSink, the processElement method is also
> > annotated with @RequiresStableInput?
> >
> > Thanks a lot!
> > Eleanore
> >
> > On Tue, Apr 21, 2020 at 12:58 AM Maximilian Michels
> > <m...@apache.org> wrote:
> >
> >     Hi Stephen,
> >
> >     Thanks for reporting the issue! David, good catch!
> >
> >     I think we have to resort to only using a single state cell for
> >     buffering on checkpoints, instead of using a new one for every
> >     checkpoint. I was under the assumption that, if the state cell was
> >     cleared, it would not be checkpointed but that does not seem to be
> >     the case.
> >
> >     Thanks,
> >     Max
> >
> >     On 21.04.20 09:29, David Morávek wrote:
> >     > Hi Stephen,
> >     >
> >     > nice catch and awesome report! ;) This definitely needs a
> proper fix.
> >     > I've created a new JIRA to track the issue and will try to
> resolve it
> >     > soon as this seems critical to me.
> >     >
> >     > https://issues.apache.org/jira/browse/BEAM-9794
> >     >
> >     > Thanks,
> >     > D.
> >     >
> >     > On Mon, Apr 20, 2020 at 10:41 PM Stephen Patel
> >     > <stephenpate...@gmail.com> wrote:
> >     >
> >     >     I was able to reproduce this in a unit test:
> >     >
> >     >         @Test
> >     >         public void test() throws InterruptedException, ExecutionException {
> >     >           FlinkPipelineOptions options =
> >     >               PipelineOptionsFactory.as(FlinkPipelineOptions.class);
> >     >           options.setCheckpointingInterval(10L);
> >     >           options.setParallelism(1);
> >     >           options.setStreaming(true);
> >     >           options.setRunner(FlinkRunner.class);
> >     >           options.setFlinkMaster("[local]");
> >     >           options.setStateBackend(new MemoryStateBackend(Integer.MAX_VALUE));
> >     >           Pipeline pipeline = Pipeline.create(options);
> >     >           pipeline
> >     >               .apply(Create.of((Void) null))
> >     >               .apply(
> >     >                   ParDo.of(
> >     >                       new DoFn<Void, Void>() {
> >     >                         private static final long serialVersionUID = 1L;
> >     >
> >     >                         @RequiresStableInput
> 

Re: Python 3.7 docker container fails to build

2020-04-30 Thread Maximilian Michels
On 30.04.20 21:48, Hannah Jiang wrote:
> The --info flag was passed to the docker image build commands in the
> PythonDocker Precommit to capture more logs. Without it, errors from
> the Dockerfile steps are not printed to the console.

Thanks for the info (pun intended).

On 30.04.20 21:48, Hannah Jiang wrote:
> Indeed, I can see the no space left on device in the following but
> not in the log above:
> 
> The --info flag was passed to the docker image build commands in the
> PythonDocker Precommit to capture more logs. Without it, errors from
> the Dockerfile steps are not printed to the console.
> 
> On Thu, Apr 30, 2020 at 11:19 AM Udi Meiri <eh...@google.com> wrote:
> 
> I checked node 8 and it had over 40GB space available. Does your job
> require more than that?
> 
> Long term, I'm thinking we could clean up workspaces for successful
> jobs. This should free up additional space (I guess at least 100GB).
> https://plugins.jenkins.io/ws-cleanup/ - we already use this plugin
> to clean workspaces at job start.
> 
> 
> On Thu, Apr 30, 2020, 07:33 Maximilian Michels <m...@apache.org> wrote:
> 
> It's working again, probably because it's running on a different
> machine now.
> 
> Who can check the disk space of the Jenkins hosts?
> 
> Thanks,
> Max
> 
> On 30.04.20 11:55, Maximilian Michels wrote:
> > Sorry, I meant to include the Jenkins log:
> >
> 
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/5/console
> >
> > Thanks for investigating Hannah! Indeed, I can see the no
> space left on
> > device in the following but not in the log above:
> >
> 
> https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/473/console
> >
> > I'm going to try running the build again. Do you think we
> could add more
> > storage to our Jenkins hosts or delete old build data?
> >
> > Thanks,
> > Max
> >
> > On 30.04.20 08:43, Hannah Jiang wrote:
> >> Max, I found a link from your PR and noticed below errors.
> This would be
> >> the true error.
> >>
> >> *07:57:03* >*Task :sdks:python:container:py37:docker*
> >> *07:57:03*  [91mERROR: Could not install packages due to an
> EnvironmentError: [Errno 28] No space left on device
> >> *07:57:03*
> >> *07:57:03*  [0m
> >> *07:57:03* >*Task :sdks:python:container:py35:docker*
> >> *07:57:03*  [91mERROR: Could not install packages due to an
> EnvironmentError: [Errno 28] No space left on device
> >>
> >>
> >>
> >> On Wed, Apr 29, 2020 at 5:59 PM Hannah Jiang
> mailto:hannahji...@google.com>
> >> <mailto:hannahji...@google.com
> <mailto:hannahji...@google.com>>> wrote:
> >>
> >>     There is a PythonDocker Precommit test running for PRs
> with Python
> >>     changes. It seems running well.[1]
> >>     Max, can you please give me a link so I can check more
> details? Do
> >>     other images with different Python versions fail as well?
> >>
> >>   
>  1. https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/
> >>
> >>
> >>     On Wed, Apr 29, 2020 at 2:44 PM Ahmet Altay
>     mailto:al...@google.com>
> >>     <mailto:al...@google.com <mailto:al...@google.com>>> wrote:
> >>
> >>         +Valentyn Tymofieiev <mailto:valen...@google.com
> <mailto:valen...@google.com>> +Hannah Jiang
> >>         <mailto:hannahji...@google.com
> <mailto:hannahji...@google.com>> -- in case they have relevant
> >>         information.
> >>
> >>         On Wed, Apr 29, 2020 at 12:35 PM Maximilian Michels
> >>         mailto:m...@apache.org>
> <mailto:m...@apache.org <mailto:m...@apache.org>>> wrote:
> >>
> >>             Hi,
> >>
> >>             has anyone noticed the Python 3.7 Docker
> container fails to
> >>             build? I
> >>             haven't b

Re: Python 3.7 docker container fails to build

2020-04-30 Thread Maximilian Michels
Is the issue that the workspace grows over time? Couldn't we delete it
daily to ensure it does not grow too much? Always deleting it on
successful runs may be too costly because we have to recreate the
workspace every time.

Logs are stored separately. I suppose they could also add up over time.

On 30.04.20 21:48, Hannah Jiang wrote:
> Indeed, I can see the no space left on device in the following but
> not in the log above:
> 
> The --info flag was passed to the docker image build commands in the
> PythonDocker Precommit to capture more logs. Without it, errors from
> the Dockerfile steps are not printed to the console.
> 
> On Thu, Apr 30, 2020 at 11:19 AM Udi Meiri <eh...@google.com> wrote:
> 
> I checked node 8 and it had over 40GB space available. Does your job
> require more than that?
> 
> Long term, I'm thinking we could clean up workspaces for successful
> jobs. This should free up additional space (I guess at least 100GB).
> https://plugins.jenkins.io/ws-cleanup/ - we already use this plugin
> to clean workspaces at job start.
> 
> 
> On Thu, Apr 30, 2020, 07:33 Maximilian Michels <m...@apache.org> wrote:
> 
> It's working again, probably because it's running on a different
> machine now.
> 
> Who can check the disk space of the Jenkins hosts?
> 
> Thanks,
> Max
> 
> On 30.04.20 11:55, Maximilian Michels wrote:
> > Sorry, I meant to include the Jenkins log:
> >
> 
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/5/console
> >
> > Thanks for investigating Hannah! Indeed, I can see the no
> space left on
> > device in the following but not in the log above:
> >
> 
> https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/473/console
> >
> > I'm going to try running the build again. Do you think we
> could add more
> > storage to our Jenkins hosts or delete old build data?
> >
> > Thanks,
> > Max
> >
> > On 30.04.20 08:43, Hannah Jiang wrote:
> >> Max, I found a link from your PR and noticed below errors.
> This would be
> >> the true error.
> >>
> >> *07:57:03* >*Task :sdks:python:container:py37:docker*
> >> *07:57:03*  [91mERROR: Could not install packages due to an
> EnvironmentError: [Errno 28] No space left on device
> >> *07:57:03*
> >> *07:57:03*  [0m
> >> *07:57:03* >*Task :sdks:python:container:py35:docker*
> >> *07:57:03*  [91mERROR: Could not install packages due to an
> EnvironmentError: [Errno 28] No space left on device
> >>
> >>
> >>
> >> On Wed, Apr 29, 2020 at 5:59 PM Hannah Jiang
> mailto:hannahji...@google.com>
> >> <mailto:hannahji...@google.com
> <mailto:hannahji...@google.com>>> wrote:
> >>
> >>     There is a PythonDocker Precommit test running for PRs
> with Python
> >>     changes. It seems running well.[1]
> >>     Max, can you please give me a link so I can check more
> details? Do
> >>     other images with different Python versions fail as well?
> >>
> >>   
>  1. https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/
> >>
> >>
> >>     On Wed, Apr 29, 2020 at 2:44 PM Ahmet Altay
> mailto:al...@google.com>
> >>     <mailto:al...@google.com <mailto:al...@google.com>>> wrote:
> >>
> >>         +Valentyn Tymofieiev <mailto:valen...@google.com
> <mailto:valen...@google.com>> +Hannah Jiang
> >>         <mailto:hannahji...@google.com
> <mailto:hannahji...@google.com>> -- in case they have relevant
> >>         information.
> >>
> >>         On Wed, Apr 29, 2020 at 12:35 PM Maximilian Michels
> >>         mailto:m...@apache.org>
> <mailto:m...@apache.org <mailto:m...@apache.org>>> wrote:
> >>
> >>             Hi,
> >>
> >>             has anyone noticed the Python 3.7 Docker
> container fails to
> >>             build? I
> >>   

"DNS resolution failed"

2020-04-30 Thread Maximilian Michels
Hi,

Is anyone familiar with this GRPC error? The build logs are full of it.
Also getting it on my machine when I run tests:

23:17:02 ERROR:apache_beam.runners.worker.data_plane:Failed to read inputs in 
the data plane.
23:17:02 Traceback (most recent call last):
23:17:02   File "apache_beam/runners/worker/data_plane.py", line 528, in 
_read_inputs
23:17:02 for elements in elements_iterator:
23:17:02   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Phrase/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py",
 line 413, in next
23:17:02 return self._next()
23:17:02   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Phrase/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py",
 line 689, in _next
23:17:02 raise self
23:17:02 _MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that 
terminated with:
23:17:02status = StatusCode.UNAVAILABLE
23:17:02details = "DNS resolution failed"
23:17:02debug_error_string = 
"{"created":"@1588108621.907750662","description":"Failed to pick 
subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3981,"referenced_errors":[{"created":"@1588108621.907745000","description":"Resolver
 transient 
failure","file":"src/core/ext/filters/client_channel/resolving_lb_policy.cc","file_line":214,"referenced_errors":[{"created":"@1588108621.907743049","description":"DNS
 resolution 
failed","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/dns_resolver_ares.cc","file_line":357,"grpc_status":14,"referenced_errors":[{"created":"@1588108621.907719737","description":"C-ares
 status is not ARES_SUCCESS: Misformatted domain 
name","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc","file_line":244,"referenced_errors":[{"created":"@1588108621.907691960","description":"C-ares
 status is not ARES_SUCCESS: Misformatted domain 
name","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc","file_line":244}]}]}]}]}"

https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Phrase/158/console

Looks like a recent regression. Tracked here:
https://jira.apache.org/jira/browse/BEAM-9851

Thanks,
Max



Re: Python 3.7 docker container fails to build

2020-04-30 Thread Maximilian Michels
It's working again, probably because it's running on a different
machine now.

Who can check the disk space of the Jenkins hosts?

Thanks,
Max

On 30.04.20 11:55, Maximilian Michels wrote:
> Sorry, I meant to include the Jenkins log:
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/5/console
> 
> Thanks for investigating Hannah! Indeed, I can see the no space left on
> device in the following but not in the log above:
> https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/473/console
> 
> I'm going to try running the build again. Do you think we could add more
> storage to our Jenkins hosts or delete old build data?
> 
> Thanks,
> Max
> 
> On 30.04.20 08:43, Hannah Jiang wrote:
>> Max, I found a link from your PR and noticed below errors. This would be
>> the true error.
>>
>> *07:57:03* >*Task :sdks:python:container:py37:docker*
>> *07:57:03*  [91mERROR: Could not install packages due to an 
>> EnvironmentError: [Errno 28] No space left on device
>> *07:57:03*
>> *07:57:03*  [0m
>> *07:57:03* >*Task :sdks:python:container:py35:docker*
>> *07:57:03*  [91mERROR: Could not install packages due to an 
>> EnvironmentError: [Errno 28] No space left on device
>>
>>
>>
>> On Wed, Apr 29, 2020 at 5:59 PM Hannah Jiang <hannahji...@google.com> wrote:
>>
>> There is a PythonDocker Precommit test running for PRs with Python
>> changes. It seems running well.[1]
>> Max, can you please give me a link so I can check more details? Do
>> other images with different Python versions fail as well?
>>
>> 1. https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/
>>
>>
>> On Wed, Apr 29, 2020 at 2:44 PM Ahmet Altay <al...@google.com> wrote:
>>
>> +Valentyn Tymofieiev <valen...@google.com> +Hannah Jiang
>> <hannahji...@google.com> -- in case they have relevant
>> information.
>>
>> On Wed, Apr 29, 2020 at 12:35 PM Maximilian Michels
>> <m...@apache.org> wrote:
>>
>> Hi,
>>
>> has anyone noticed the Python 3.7 Docker container fails to
>> build? I
>> haven't been able to build the Python 3.7 container, neither
>> locally nor
>> on Jenkins.
>>
>> I get:
>>
>> 17:48:10 > Task :sdks:python:container:py37:docker
>> 17:49:36 The command '/bin/sh -c pip install -r
>> /tmp/base_image_requirements.txt && python -c "from
>> google.protobuf.internal import api_implementation; assert
>> api_implementation._default_implementation_type == 'cpp'; print
>> ('Verified fast protobuf used.')" && rm -rf
>> /root/.cache/pip' returned a
>> non-zero code: 1
>> 17:49:36
>> 17:49:36 > Task :sdks:python:container:py37:docker FAILED
>>
>>
>> Cheers,
>> Max
>>


Re: Python 3.7 docker container fails to build

2020-04-30 Thread Maximilian Michels
Sorry, I meant to include the Jenkins log:
https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/5/console

Thanks for investigating, Hannah! Indeed, I can see the "no space left on
device" error in the following log but not in the one above:
https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/473/console

I'm going to try running the build again. Do you think we could add more
storage to our Jenkins hosts or delete old build data?

Thanks,
Max

On 30.04.20 08:43, Hannah Jiang wrote:
> Max, I found a link from your PR and noticed the errors below. This would be
> the true error.
> 
> *07:57:03* >*Task :sdks:python:container:py37:docker*
> *07:57:03*  [91mERROR: Could not install packages due to an EnvironmentError: 
> [Errno 28] No space left on device
> *07:57:03*
> *07:57:03*  [0m
> *07:57:03* >*Task :sdks:python:container:py35:docker*
> *07:57:03*  [91mERROR: Could not install packages due to an EnvironmentError: 
> [Errno 28] No space left on device
> 
> 
> 
> On Wed, Apr 29, 2020 at 5:59 PM Hannah Jiang <hannahji...@google.com> wrote:
> 
> There is a PythonDocker Precommit test running for PRs with Python
> changes. It seems running well.[1]
> Max, can you please give me a link so I can check more details? Do
> other images with different Python versions fail as well?
> 
> 1. https://builds.apache.org/job/beam_PreCommit_PythonDocker_Commit/
> 
> 
> On Wed, Apr 29, 2020 at 2:44 PM Ahmet Altay <al...@google.com> wrote:
> 
> +Valentyn Tymofieiev <valen...@google.com> +Hannah Jiang
> <hannahji...@google.com> -- in case they have relevant
> information.
> 
> On Wed, Apr 29, 2020 at 12:35 PM Maximilian Michels
> <m...@apache.org> wrote:
> 
> Hi,
> 
> has anyone noticed the Python 3.7 Docker container fails to
> build? I
> haven't been able to build the Python 3.7 container, neither
> locally nor
> on Jenkins.
> 
> I get:
> 
> 17:48:10 > Task :sdks:python:container:py37:docker
> 17:49:36 The command '/bin/sh -c pip install -r
> /tmp/base_image_requirements.txt && python -c "from
> google.protobuf.internal import api_implementation; assert
> api_implementation._default_implementation_type == 'cpp'; print
> ('Verified fast protobuf used.')" && rm -rf
> /root/.cache/pip' returned a
> non-zero code: 1
> 17:49:36
> 17:49:36 > Task :sdks:python:container:py37:docker FAILED
> 
> 
> Cheers,
> Max
> 


Python 3.7 docker container fails to build

2020-04-29 Thread Maximilian Michels
Hi,

has anyone noticed the Python 3.7 Docker container fails to build? I
haven't been able to build the Python 3.7 container, neither locally nor
on Jenkins.

I get:

17:48:10 > Task :sdks:python:container:py37:docker
17:49:36 The command '/bin/sh -c pip install -r
/tmp/base_image_requirements.txt && python -c "from
google.protobuf.internal import api_implementation; assert
api_implementation._default_implementation_type == 'cpp'; print
('Verified fast protobuf used.')" && rm -rf /root/.cache/pip' returned a
non-zero code: 1
17:49:36
17:49:36 > Task :sdks:python:container:py37:docker FAILED


Cheers,
Max



Re: JIRA Committer Permissions

2020-04-28 Thread Maximilian Michels
FWIW committer and contributor roles in Beam's JIRA have practically
identical permissions[1]. The only difference is the "Set Issue Security"
and "Manage Watchers" permissions, both of which I have never used before.

Thanks for updating the wiki page!

-Max

[1]
https://jira.apache.org/jira/plugins/servlet/project-config/BEAM/permissions

On 28.04.20 06:09, Kenneth Knowles wrote:
> I think it would be very valuable to have a committer onboarding guide,
> with info for both the committer and steps for PMC to take. I think the
> wiki is the right place for it...
> 
> (two seconds of checking later)
> 
> It
> exists! 
> https://cwiki.apache.org/confluence/display/BEAM/Committer+onboarding+guide
> 
> By the time you read this, I hope to have added the links that Luke
> suggested. We do need to remember to send this guide to new committers.
> And potentially announce changes that existing committers may not have
> followed.
> 
> Kenn
> 
> On Mon, Apr 27, 2020 at 7:23 PM Luke Cwik wrote:
> 
> The Beam committer guide is about reviewing code and the "become a
> committer" is more about what we look for and not the process.
> 
> Since this is common for all ASF projects, I suspect an ASF page may
> have this documented or should be updated to have this covered as
> well but didn't see it on the ASF new committers resources page[1]
> or on the developers & contributors overview[2].
> 
> 1: https://www.apache.org/dev/new-committers-guide.html
> 2: https://www.apache.org/dev/index.html
> 
> On Mon, Apr 27, 2020 at 3:32 PM Udi Meiri wrote:
> 
> Should this step be added to our new committer guide?
> 
> On Fri, Apr 24, 2020 at 6:21 PM Luke Cwik wrote:
> 
> I noticed that several committers only had contributor level
> permissions and I went and updated your account permissions
> for the Beam project to be committer level. Feel free to let
> me know If you run into any issues.
> 
> There were about ~25 accounts like this.
> 


Github PR links in JIRA

2020-04-27 Thread Maximilian Michels
Hi everyone,

Did anyone notice that GitHub PRs tagged with a JIRA issue
("[BEAM-XXX]") do not automatically get linked anymore in JIRA?

Does anyone know how that stuff works?

Thanks,
Max


Re: [RESULT][VOTE] Accept the Firefly design donation as Beam Mascot - Deadline Mon April 6

2020-04-26 Thread Maximilian Michels
Hey Maria,

I can testify :)

Cheers,
Max

On 23.04.20 20:49, María Cruz wrote:
> Hi everyone!
> It is amazing to see how this process developed to collaboratively
> create Apache Beam's mascot. Thank you to everyone who got involved! 
> I would like to write a blogpost for the Beam website, and I wanted to
> ask you: would anyone like to offer their testimony about the process of
> creating the Beam mascot, and what this means to you? Everyone's
> testimony is welcome! If you witnessed the development of a mascot for
> another open source project, even better =) 
> 
> Please feel free to express interest on this thread, and I'll reach out
> to you off-list. 
> 
> Thanks, 
> 
> María
> 
> On Fri, Apr 17, 2020 at 6:19 AM Jeff Klukas <jklu...@mozilla.com> wrote:
> 
> I personally like the sound of "Datum" as a name. I also like the
> idea of not assigning them a gender.
> 
> As a counterpoint on the naming side, one of the slide decks
> provided while iterating on the design mentions:
> 
> > Mascot can change colors when it is “full of data” or has a “batch
> of data” to process.  Yellow is supercharged and ready to process!
> 
> Based on that, I'd argue that the mascot maps to the concept of a
> bundle in the beam execution model and we should consider a name
> that's a play on "bundle" or perhaps a play on "checkpoint".
> 
> On Thu, Apr 16, 2020 at 3:44 PM Julian Bruno <juliangbr...@gmail.com> wrote:
> 
> Hi all,
> 
> While working on the design of our Mascot
> Some ideas showed up and I wish to share them.
> In regard to Alex Van Boxel's question about the name of our Mascot.
>  
> I was thinking about this yesterday night and feel it could be a
> great idea to name the Mascot "*Data*" or "*Datum*". Both names
> sound cute and make sense to me. I prefer the later. Datum means
> a single piece of information. The Mascot is the first piece of
> information and its job is to collect batches of data and
> process it. Datum is in charge of linking information together.
> 
> In addition, our Mascot should have no gender, rendering it
> accessible to all users.
> 
> Beam as a name for the mascot is pretty straightforward, but I
> think there are many things carrying that same name already.
> 
> What do you think?
> 
> Looking forward to hearing your feedback. Names are important
> and I feel it can expand the personality and create a cool
> background for our Mascot.
> 
> Cheers!
> 
> Julian 
> 
> On Mon, Apr 13, 2020, 3:40 PM Kyle Weaver <kcwea...@google.com> wrote:
> 
> Beam Firefly is fine with me (I guess people tend to forget
> mascot names anyway). But if anyone comes up with something
> particularly cute/clever we can consider it.
> 
> On Mon, Apr 13, 2020 at 6:33 PM Aizhamal Nurmamat kyzy
> <aizha...@apache.org> wrote:
> 
> @Alex, Beam Firefly?
> 
> On Thu, Apr 9, 2020 at 10:57 PM Alex Van Boxel
> <a...@vanboxel.be> wrote:
> 
> We forgot something      
> 
> ...
> 
> ...
> 
> it/she/he needs a *name*!
> 
> 
>  _/
> _/ Alex Van Boxel
> 
> 
> On Fri, Apr 10, 2020 at 6:19 AM Kenneth Knowles
> <k...@apache.org> wrote:
> 
> Looking forward to the guide. I enjoy doing
> (bad) drawings as a way to relax. And I want
> them to be properly on brand :-)
> 
> Kenn
> 
> On Thu, Apr 9, 2020 at 10:35 AM Maximilian
> Michels <m...@apache.org>
> wrote:
> 
> Awesome. What a milestone! The mascot is a
> real eye catcher. Thank you
> Julian and Aizhamal for making it happen.
> 
> On 06.04.20 22:05, Aizhamal Nurmamat kyzy wrote:
> > I am happy to announce that this vote has
> passed, with 13 approving +1
> > votes, 5 of which are binding PMC votes.
> >
>

Re: [ANNOUNCE] Beam 2.20.0 Released

2020-04-24 Thread Maximilian Michels
Thanks Rui for getting this one out!

-Max

On 24.04.20 15:03, Jan Lukavský wrote:
> Hi Rui,
> 
> thanks for making this release! Is it possible we are missing the git tag
> for this release? I cannot find it.
> 
> Thanks,
> 
>  Jan
> 
> On 4/16/20 8:47 PM, Rui Wang wrote:
>> Note that due to an infrastructure bug, the website change failed to
>> publish. But 2.20.0 artifacts are available to use right now.
>>
>>
>>
>> -Rui
>>
>> On Thu, Apr 16, 2020 at 11:45 AM Rui Wang > > wrote:
>>
>> The Apache Beam team is pleased
>> to announce the release of version 2.20.0.
>>
>> Apache Beam is an open source unified programming model to define and
>> execute data processing pipelines, including ETL, batch and stream
>> (continuous) processing. See https://beam.apache.org
>> 
>>
>> You can download the release here:
>>
>>     https://beam.apache.org/get-started/downloads/ 
>>
>> This release includes bug fixes, features, and improvements
>> detailed on
>> the Beam
>> blog: https://beam.apache.org/blog/2020/04/15/beam-2.20.0.html
>>
>> Thanks to everyone who contributed to this release, and we hope
>> you enjoy
>> using Beam 2.20.0.
>> -- Rui Wang, on behalf of The Apache Beam team
>>


Re: Beam Digital Summit 2020 -- JUNE 2020!

2020-04-22 Thread Maximilian Michels
 Looking forward to this!

Cheers,
Max

On 22.04.20 21:09, Austin Bennett wrote:
> Hi All,
> 
> We are excited to announce the Beam Digital Summit 2020!
> 
> This will occur for partial days during the week of 15-19 June.
> 
> CfP is open and found: https://sessionize.com/beam-digital-summit-2020/
> 
> CfP closes on 20 May 2020.  Do not hesitate to reach out to the
> organizers with any questions.  
> 
> See you there (online)!
> Austin, on behalf of the Beam Summit Steering Committee 
> 


Re: [Python] Setting a timer from a timer callback

2020-04-22 Thread Maximilian Michels
Attempting to fix this here, if somebody could have a look:
https://github.com/apache/beam/pull/11492

On 22.04.20 17:10, Maximilian Michels wrote:
> Hi,
> 
> I'm trying to set a timer from a timer callback in the Python SDK:
> 
> class MyFn(beam.DoFn):
>   timer_spec = userstate.TimerSpec('timer', userstate.TimeDomain.WATERMARK)
> 
>   def process(self, element, timer=beam.DoFn.TimerParam(timer_spec)):
> self.key = element[0]
> timer.set(0)
> 
>   @userstate.on_timer(timer_spec)
>   def process_timer(self, timer=beam.DoFn.TimerParam(timer_spec)):
> timer.set(0)
> 
> This yields the following Python stack trace:
> 
> INFO:apache_beam.utils.subprocess_server:Caused by:
> java.lang.RuntimeException: Error received from SDK harness for
> instruction 4: Traceback (most recent call last):
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/runners/worker/sdk_worker.py", line 245, in _execute
> INFO:apache_beam.utils.subprocess_server: response = task()
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/runners/worker/sdk_worker.py", line 302, in 
> INFO:apache_beam.utils.subprocess_server: lambda:
> self.create_worker().do_instruction(request), request)
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/runners/worker/sdk_worker.py", line 471, in do_instruction
> INFO:apache_beam.utils.subprocess_server: getattr(request,
> request_type), request.instruction_id)
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/runners/worker/sdk_worker.py", line 506, in process_bundle
> INFO:apache_beam.utils.subprocess_server:
> bundle_processor.process_bundle(instruction_id))
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/runners/worker/bundle_processor.py", line 910, in
> process_bundle
> INFO:apache_beam.utils.subprocess_server: element.timer_family_id,
> timer_data)
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/runners/worker/operations.py", line 688, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer_data.fire_timestamp)
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/runners/common.py", line 990, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: self._reraise_augmented(exn)
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/runners/common.py", line 1043, in _reraise_augmented
> INFO:apache_beam.utils.subprocess_server: raise_with_traceback(new_exn)
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/runners/common.py", line 988, in process_user_timer
> INFO:apache_beam.utils.subprocess_server:
> self.do_fn_invoker.invoke_user_timer(timer_spec, key, window, timestamp)
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/runners/common.py", line 517, in invoke_user_timer
> INFO:apache_beam.utils.subprocess_server: self.user_state_context, key,
> window, timestamp))
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/runners/common.py", line 1093, in process_outputs
> INFO:apache_beam.utils.subprocess_server: for result in results:
> INFO:apache_beam.utils.subprocess_server: File
> "/Users/max/Dev/beam/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
> line 185, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer.set(0)
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/runners/worker/bundle_processor.py", line 589, in set
> INFO:apache_beam.utils.subprocess_server:
> self._timer_coder_impl.encode_to_stream(timer, self._output_stream, True)
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/coders/coder_impl.py", line 651, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: value.hold_timestamp, out, True)
> INFO:apache_beam.utils.subprocess_server: File
> "apache_beam/coders/coder_impl.py", line 608, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: millis = value.micros // 1000
> INFO:apache_beam.utils.subprocess_server:AttributeError: 'NoneType'
> object has no attribute 'micros' [while running 'GenerateLoad']
> 
> Looking at the code base, I'm not sure we have tests for timer output
> timestamps. Am I missing something?
> 
> -Max
> 


[Python] Setting a timer from a timer callback

2020-04-22 Thread Maximilian Michels
Hi,

I'm trying to set a timer from a timer callback in the Python SDK:

import apache_beam as beam
from apache_beam.transforms import userstate

class MyFn(beam.DoFn):
  timer_spec = userstate.TimerSpec('timer', userstate.TimeDomain.WATERMARK)

  def process(self, element, timer=beam.DoFn.TimerParam(timer_spec)):
self.key = element[0]
timer.set(0)

  @userstate.on_timer(timer_spec)
  def process_timer(self, timer=beam.DoFn.TimerParam(timer_spec)):
timer.set(0)

This yields the following Python stack trace:

INFO:apache_beam.utils.subprocess_server:Caused by:
java.lang.RuntimeException: Error received from SDK harness for
instruction 4: Traceback (most recent call last):
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/runners/worker/sdk_worker.py", line 245, in _execute
INFO:apache_beam.utils.subprocess_server: response = task()
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/runners/worker/sdk_worker.py", line 302, in 
INFO:apache_beam.utils.subprocess_server: lambda:
self.create_worker().do_instruction(request), request)
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/runners/worker/sdk_worker.py", line 471, in do_instruction
INFO:apache_beam.utils.subprocess_server: getattr(request,
request_type), request.instruction_id)
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/runners/worker/sdk_worker.py", line 506, in process_bundle
INFO:apache_beam.utils.subprocess_server:
bundle_processor.process_bundle(instruction_id))
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/runners/worker/bundle_processor.py", line 910, in
process_bundle
INFO:apache_beam.utils.subprocess_server: element.timer_family_id,
timer_data)
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/runners/worker/operations.py", line 688, in process_timer
INFO:apache_beam.utils.subprocess_server: timer_data.fire_timestamp)
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/runners/common.py", line 990, in process_user_timer
INFO:apache_beam.utils.subprocess_server: self._reraise_augmented(exn)
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/runners/common.py", line 1043, in _reraise_augmented
INFO:apache_beam.utils.subprocess_server: raise_with_traceback(new_exn)
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/runners/common.py", line 988, in process_user_timer
INFO:apache_beam.utils.subprocess_server:
self.do_fn_invoker.invoke_user_timer(timer_spec, key, window, timestamp)
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/runners/common.py", line 517, in invoke_user_timer
INFO:apache_beam.utils.subprocess_server: self.user_state_context, key,
window, timestamp))
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/runners/common.py", line 1093, in process_outputs
INFO:apache_beam.utils.subprocess_server: for result in results:
INFO:apache_beam.utils.subprocess_server: File
"/Users/max/Dev/beam/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
line 185, in process_timer
INFO:apache_beam.utils.subprocess_server: timer.set(0)
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/runners/worker/bundle_processor.py", line 589, in set
INFO:apache_beam.utils.subprocess_server:
self._timer_coder_impl.encode_to_stream(timer, self._output_stream, True)
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/coders/coder_impl.py", line 651, in encode_to_stream
INFO:apache_beam.utils.subprocess_server: value.hold_timestamp, out, True)
INFO:apache_beam.utils.subprocess_server: File
"apache_beam/coders/coder_impl.py", line 608, in encode_to_stream
INFO:apache_beam.utils.subprocess_server: millis = value.micros // 1000
INFO:apache_beam.utils.subprocess_server:AttributeError: 'NoneType'
object has no attribute 'micros' [while running 'GenerateLoad']

Looking at the code base, I'm not sure we have tests for timer output
timestamps. Am I missing something?

-Max



Re: Running NexMark Tests

2020-04-22 Thread Maximilian Michels
The flag is needed when checkpointing is enabled because Flink is unable
to create a new checkpoint when not all operators are running.

By default, operators shut down when all input has been read. That will
trigger sending out the maximum (final) watermark at the sources. The
flag name is a bit confusing in this regard because shutting down the
sources triggers sending out the watermark, not the other way around.

-Max
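
For reference, the streaming Nexmark invocation from the report below, with only the shutdown flag added, would look roughly like this:

```
./gradlew :sdks:java:testing:nexmark:run \
    -Pnexmark.runner=":runners:flink:1.10" \
    -Pnexmark.args="
        --runner=FlinkRunner
        --suite=SMOKE
        --streamTimeout=60
        --streaming=true
        --manageResources=false
        --monitorJobs=true
        --shutdownSourcesOnFinalWatermark=true
        --flinkMaster=[local]"
```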

On 22.04.20 06:26, Kenneth Knowles wrote:
> We should always want to shut down sources on final watermark. All
> incoming data should be dropped anyhow.
> 
> Kenn
> 
> On Tue, Apr 21, 2020 at 1:34 PM Luke Cwik wrote:
> 
> +dev
> 
> When would we not want --shutdownSourcesOnFinalWatermark=true ?
> 
> On Tue, Apr 21, 2020 at 1:22 PM Ismaël Mejía wrote:
> 
> You need to instruct the Flink runner to shut down the source,
> otherwise it will stay waiting.
> You can do this by adding the extra
> argument `--shutdownSourcesOnFinalWatermark=true`.
> And if that works and you want to open a PR to update our
> documentation, that would be greatly appreciated.
> 
> Regards,
> Ismaël
> 
> 
> On Tue, Apr 21, 2020 at 10:04 PM Sruthi Sree Kumar wrote:
> >
> > Hello,
> >
> > I am trying to run nexmark queries using flink runner
> streaming. Followed the documentation and used the command
> > ./gradlew :sdks:java:testing:nexmark:run \
> >
> >     -Pnexmark.runner=":runners:flink:1.10" \
> >     -Pnexmark.args="
> >         --runner=FlinkRunner
> >         --suite=SMOKE
> >         --streamTimeout=60
> >         --streaming=true
> >         --manageResources=false
> >         --monitorJobs=true
> >         --flinkMaster=[local]"
> >
> >
> > But after the events are read from the source, there is no
> further progress and the job is always stuck at 99%. Is there
> any configuration that I am missing?
> >
> > Regards,
> > Sruthi
> 


Re: Flink Runner with RequiresStableInput fails after a certain number of checkpoints

2020-04-22 Thread Maximilian Michels
Hi Eleanore,

Exactly-once is not affected but the pipeline can fail to checkpoint
after the maximum number of state cells have been reached. We are
working on a fix [1].

Cheers,
Max

[1] https://github.com/apache/beam/pull/11478
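
To make the fix direction concrete, here is a toy model of the single-buffer idea — plain Java standing in for Flink state, with illustrative names, not Beam's actual internals. All in-flight elements live in one buffer keyed by checkpoint id, so no new state cell is created per checkpoint:

```
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model: one buffer for all pending checkpoints instead of one state cell each. */
public class SingleCellBuffer {

  // Checkpoint id -> elements that may only be processed once that checkpoint completes.
  private final TreeMap<Long, List<String>> buffered = new TreeMap<>();

  void bufferElement(long pendingCheckpointId, String element) {
    buffered.computeIfAbsent(pendingCheckpointId, id -> new ArrayList<>()).add(element);
  }

  /** Returns all elements whose input became stable with this checkpoint and drops them. */
  List<String> onCheckpointComplete(long completedCheckpointId) {
    List<String> stable = new ArrayList<>();
    Map<Long, List<String>> done = buffered.headMap(completedCheckpointId, true);
    for (List<String> elements : done.values()) {
      stable.addAll(elements);
    }
    done.clear(); // the view clears the entries in the backing map, too
    return stable;
  }
}
```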

On 22.04.20 07:19, Eleanore Jin wrote:
> Hi Maxi, 
> 
> I assume this will impact the Exactly Once Semantics that beam provided
> as in the KafkaExactlyOnceSink, the processElement method is also
> annotated with @RequiresStableInput?
> 
> Thanks a lot!
> Eleanore
> 
> On Tue, Apr 21, 2020 at 12:58 AM Maximilian Michels <m...@apache.org> wrote:
> 
> Hi Stephen,
> 
> Thanks for reporting the issue! David, good catch!
> 
> I think we have to resort to only using a single state cell for
> buffering on checkpoints, instead of using a new one for every
> checkpoint. I was under the assumption that, if the state cell was
> cleared, it would not be checkpointed but that does not seem to be
> the case.
> 
> Thanks,
> Max
> 
> On 21.04.20 09:29, David Morávek wrote:
> > Hi Stephen,
> >
> > nice catch and awesome report! ;) This definitely needs a proper fix.
> > I've created a new JIRA to track the issue and will try to resolve it
> > soon as this seems critical to me.
> >
> > https://issues.apache.org/jira/browse/BEAM-9794
> >
> > Thanks,
> > D.
> >
> > On Mon, Apr 20, 2020 at 10:41 PM Stephen Patel <stephenpate...@gmail.com> wrote:
> >
> >     I was able to reproduce this in a unit test:
> >
> >     @Test
> >     public void test() throws InterruptedException, ExecutionException {
> >       FlinkPipelineOptions options =
> >           PipelineOptionsFactory.as(FlinkPipelineOptions.class);
> >       options.setCheckpointingInterval(10L);
> >       options.setParallelism(1);
> >       options.setStreaming(true);
> >       options.setRunner(FlinkRunner.class);
> >       options.setFlinkMaster("[local]");
> >       options.setStateBackend(new MemoryStateBackend(Integer.MAX_VALUE));
> >
> >       Pipeline pipeline = Pipeline.create(options);
> >       pipeline
> >           .apply(Create.of((Void) null))
> >           .apply(
> >               ParDo.of(
> >                   new DoFn<Void, Void>() {
> >                     private static final long serialVersionUID = 1L;
> >
> >                     @RequiresStableInput
> >                     @ProcessElement
> >                     public void processElement() {}
> >                   }));
> >       pipeline.run();
> >     }
> >
> >
> >     It took a while to get to checkpoint 32,767, but eventually it did,
> >     and it failed with the same error I listed above.
> >
> >     On Thu, Apr 16, 2020 at 11:26 AM Stephen Patel
> >     <stephenpate...@gmail.com> wrote:
> >
> >         I have a Beam Pipeline (2.14) running on Flink (1.8.0,
> >         emr-5.26.0) that uses the RequiresStableInput feature.
> >
> >         Currently it's configured to checkpoint once a minute, and
> after
> >         around 32000-33000 checkpoints, it fails with: 
> >
> >             2020-04-15 13:15:02,920 INFO
> >           
>   org.apache.flink.runtime.checkpoint.CheckpointCoordinator  
> >               - Triggering checkpoint 32701 @ 1586956502911 for job
> >             9953424f21e240112dd23ab4f8320b60.
> >             2020-04-15 13:15:05,762 INFO
> >           
>   org.apache.flink.runtime.checkpoint.CheckpointCoordinator  
> >               - Completed checkpoint 32701 for job
> >             9953424f21e240112dd23ab4f8320b60 (795385496 

Re: Flink Runner with RequiresStableInput fails after a certain number of checkpoints

2020-04-21 Thread Maximilian Michels
Hi Stephen,

Thanks for reporting the issue! David, good catch!

I think we have to resort to only using a single state cell for
buffering on checkpoints, instead of using a new one for every
checkpoint. I was under the assumption that, if the state cell was
cleared, it would not be checkpointed but that does not seem to be the case.

Thanks,
Max

On 21.04.20 09:29, David Morávek wrote:
> Hi Stephen,
> 
> nice catch and awesome report! ;) This definitely needs a proper fix.
> I've created a new JIRA to track the issue and will try to resolve it
> soon as this seems critical to me.
> 
> https://issues.apache.org/jira/browse/BEAM-9794
> 
> Thanks,
> D.
> 
> On Mon, Apr 20, 2020 at 10:41 PM Stephen Patel wrote:
> 
> I was able to reproduce this in a unit test:
> 
> @Test
> public void test() throws InterruptedException, ExecutionException {
>   FlinkPipelineOptions options =
>       PipelineOptionsFactory.as(FlinkPipelineOptions.class);
>   options.setCheckpointingInterval(10L);
>   options.setParallelism(1);
>   options.setStreaming(true);
>   options.setRunner(FlinkRunner.class);
>   options.setFlinkMaster("[local]");
>   options.setStateBackend(new MemoryStateBackend(Integer.MAX_VALUE));
>
>   Pipeline pipeline = Pipeline.create(options);
>   pipeline
>       .apply(Create.of((Void) null))
>       .apply(
>           ParDo.of(
>               new DoFn<Void, Void>() {
>                 private static final long serialVersionUID = 1L;
>
>                 @RequiresStableInput
>                 @ProcessElement
>                 public void processElement() {}
>               }));
>   pipeline.run();
> }
> 
> 
> It took a while to get to checkpoint 32,767, but eventually it did,
> and it failed with the same error I listed above.
> 
> On Thu, Apr 16, 2020 at 11:26 AM Stephen Patel <stephenpate...@gmail.com> wrote:
> 
> I have a Beam Pipeline (2.14) running on Flink (1.8.0,
> emr-5.26.0) that uses the RequiresStableInput feature.
> 
> Currently it's configured to checkpoint once a minute, and after
> around 32000-33000 checkpoints, it fails with: 
> 
> 2020-04-15 13:15:02,920 INFO
>  org.apache.flink.runtime.checkpoint.CheckpointCoordinator  
>   - Triggering checkpoint 32701 @ 1586956502911 for job
> 9953424f21e240112dd23ab4f8320b60.
> 2020-04-15 13:15:05,762 INFO
>  org.apache.flink.runtime.checkpoint.CheckpointCoordinator  
>   - Completed checkpoint 32701 for job
> 9953424f21e240112dd23ab4f8320b60 (795385496 bytes in 2667 ms).
> 2020-04-15 13:16:02,919 INFO
>  org.apache.flink.runtime.checkpoint.CheckpointCoordinator  
>   - Triggering checkpoint 32702 @ 1586956562911 for job
> 9953424f21e240112dd23ab4f8320b60.
> 2020-04-15 13:16:03,147 INFO
>  org.apache.flink.runtime.executiongraph.ExecutionGraph    
>    -  (1/2)
> (f4737add01961f8b42b8eb4e791b83ba) switched from RUNNING to
> FAILED.
> AsynchronousException{java.lang.Exception: Could not
> materialize checkpoint 32702 for operator 
> (1/2).}
> at
> 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointExceptionHandler.tryHandleCheckpointException(StreamTask.java:1153)
> at
> 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.handleExecutionException(StreamTask.java:947)
> at
> 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:884)
> at
> 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.Exception: Could not materialize
> checkpoint 32702 for operator  (1/2).
> at
> 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.handleExecutionException(StreamTask.java:942)
> ... 6 more
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.IllegalArgumentException
> at 

Re: [Proposal] Requesting PMC approval to start planning for Beam Summits 2019

2020-04-15 Thread Maximilian Michels
The linked Google Doc is not accessible to everybody. I've made a copy,
so let's continue any discussion in there:
https://docs.google.com/document/d/1OddPOvP36mTTWEXV0DWtyS3MgfXyWOS3YXiZGeLWfSI/

Thanks,
Max

On 15.04.20 16:44, Maximilian Michels wrote:
> I've sent a request for approval to trademark with @private and the
> committee in CC.
> 
> -Max
> 
> On 15.04.20 05:58, Kenneth Knowles wrote:
>> Yes, here are links to the process [1] and contact info [2]. The proposal
>> already is in very good shape and answers the needed questions. Send to
>> tradema...@apache.org and CC
>> priv...@beam.apache.org. It is good
>> that you are sending this early before dates are firm. That is helpful.
>> You should also check http://community.apache.org/calendars/.
>>
>> Kenn
>>
>> [1] https://www.apache.org/foundation/marks/events#approval
>> [2] https://www.apache.org/foundation/marks/contact#events
>>
>> On Tue, Apr 14, 2020 at 11:56 AM Ahmet Altay <al...@google.com> wrote:
>>
>> Thank you for this proposal and shifting this to a digital event.
>>
>> Kenn, do you know what formal approval is required?
>>
>> On Fri, Apr 10, 2020 at 8:17 PM Kenneth Knowles <k...@apache.org> wrote:
>>
>> Looks good to me. We'll have to see what ASF is doing with their
>> own events. We haven't gotten to dates, but I wonder if the
>> usual conflict avoidance need not apply, since digital events
>> don't require travel so are less of an imposition, plus have
>> lots of viewership after the live event is done.
>>
>> On Fri, Apr 10, 2020 at 11:13 AM Brittany Hermann
>> mailto:herma...@google.com>> wrote:
>>
>> Happy Friday! 
>>
>> I just wanted to follow up with the PMC regarding the 2020
>> Digital Beam Summit proposal. 
>>
>> Please let me know if you have any questions! 
>>
>> On Wed, Apr 1, 2020 at 6:22 PM Brittany Hermann
>> mailto:herma...@google.com>> wrote:
>>
>> My apologies all, I meant to say 2020 in the subject line. 
>>
>> On Wed, Apr 1, 2020 at 5:16 PM Tomo Suzuki
>> mailto:suzt...@google.com>> wrote:
>>
>> 2020 in email subject?
>>
>>
>>
>> -- 
>>
>>  
>>
>> Brittany Hermann
>>
>> Open Source Program Manager (Provided by Adecco Staffing)
>>
>> 1190 Bordeaux Drive , Building 4, Sunnyvale, CA 94089
>> 
>>
>>
>>
>>
>> -- 
>>
>>  
>>
>> Brittany Hermann
>>
>> Open Source Program Manager (Provided by Adecco Staffing)
>>
>> 1190 Bordeaux Drive , Building 4, Sunnyvale, CA 94089
>> 
>>
>>


Re: [Proposal] Requesting PMC approval to start planning for Beam Summits 2019

2020-04-15 Thread Maximilian Michels
I've sent a request for approval to trademark with @private and the
committee in CC.

-Max

On 15.04.20 05:58, Kenneth Knowles wrote:
> Yes, here are links to the process [1] and contact info [2]. The proposal
> already is in very good shape and answers the needed questions. Send to
> tradema...@apache.org  and CC
> priv...@beam.apache.org . It is good
> that you are sending this early before dates are firm. That is helpful.
> You should also check http://community.apache.org/calendars/.
> 
> Kenn
> 
> [1] https://www.apache.org/foundation/marks/events#approval
> [2] https://www.apache.org/foundation/marks/contact#events
> 
> On Tue, Apr 14, 2020 at 11:56 AM Ahmet Altay  > wrote:
> 
> Thank you for this proposal and shifting this to a digital event.
> 
> Kenn, do you know what formal approval is required?
> 
> On Fri, Apr 10, 2020 at 8:17 PM Kenneth Knowles  > wrote:
> 
> Looks good to me. We'll have to see what ASF is doing with their
> own events. We haven't gotten to dates, but I wonder if the
> usual conflict avoidance need not apply, since digital events
> don't require travel so are less of an imposition, plus have
> lots of viewership after the live event is done.
> 
> On Fri, Apr 10, 2020 at 11:13 AM Brittany Hermann
> mailto:herma...@google.com>> wrote:
> 
> Happy Friday! 
> 
> I just wanted to follow up with the PMC regarding the 2020
> Digital Beam Summit proposal. 
> 
> Please let me know if you have any questions! 
> 
> On Wed, Apr 1, 2020 at 6:22 PM Brittany Hermann
> mailto:herma...@google.com>> wrote:
> 
> My apologies all, I meant to say 2020 in the subject line. 
> 
> On Wed, Apr 1, 2020 at 5:16 PM Tomo Suzuki
> mailto:suzt...@google.com>> wrote:
> 
> 2020 in email subject?
> 
> 
> 
> -- 
> 
>   
> 
> Brittany Hermann
> 
> Open Source Program Manager (Provided by Adecco Staffing)
> 
> 1190 Bordeaux Drive , Building 4, Sunnyvale, CA 94089
> 
> 
> 
> 
> 
> 
> -- 
> 
>   
> 
> Brittany Hermann
> 
> Open Source Program Manager (Provided by Adecco Staffing)
> 
> 1190 Bordeaux Drive , Building 4, Sunnyvale, CA 94089
> 
> 
> 
> 


Re: Portable timer loops

2020-04-14 Thread Maximilian Michels
Hey Jan,

Just saw your message since you posted right before I replied. What you
describe is precisely what I was experiencing. I also solved it the same
way, i.e. pushing back a newly set timer to the next bundle. Note that
there is no other way in portability because we can't fire timers once
we have closed the current bundle; we need to close the bundle to
receive all output which includes timers. It's definitely not efficient
but it appears that this behavior is even desired by some runners, e.g.
Dataflow.

Thanks,
Max

On 13.04.20 18:58, Jan Lukavský wrote:
> This is probably related to issue I was having with Direct runner and
> timer ordering. The problem is that there might be multiple timers (for
> given key) inside bundle and that each timer might set another timer. To
> ensure timer ordering, timers must be fired one at a time and when fired
> timer sets timer for time preceding current input watermark, the new
> timer and all remaining timers are pushed back to next bundle. That was
> the simplest yet efficient enough implementation for direct runner (see
> [1]), for different runners might exist better alternatives (e.g. what
> was discussed in [2]).
> 
> Jan
> 
> [1]
> https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/StatefulParDoEvaluatorFactory.java#L253
> 
> [2] https://github.com/apache/beam/pull/9190
> 
> On 4/13/20 6:01 PM, Reuven Lax wrote:
>> I'm not sure I understand the difference - do any "classic" runners
>> add new timers to the bundle? I know that at least the Dataflow runner
>> would end up the new timer in a new bundle.
>>
>> One thing we do need to ensure is that modifications to timers are
>> reflected in a bundle. So if a bundle contains a processElement and a
>> processTimer and the processElement modifies the timer, that should be
>> reflected in timer firing.
>>
>> On Mon, Apr 13, 2020 at 8:53 AM Luke Cwik <lc...@google.com> wrote:
>>
>> In non portable implementations you would have to wait till the
>> element/timer was finished processing before you could process any
>> newly created timers. Timers that are created with the same
>> key+window+timer family overwrite existing timers that have been
>> set which can lead to a timer being overwritten multiple times
>> while an element/timer is being processed and you wouldn't want to
>> process a newly created timer with the incorrect values or process
>> a timer you shouldn't have.
>>
>> In portable implementations, you can only safely say that element
>> processing is done when the bundle completes. There could be value
>> in exposing when an element is done since this could have usage in
>> other parts of the system such as when a large GBK value is done.
>>
>> On Mon, Apr 13, 2020 at 8:06 AM Maximilian Michels <m...@apache.org> wrote:
>>
>> Hi,
>>
>> In the "classic" Java mode one can set timers which in turn
>> can set
>> timers which enables to create a timer loop, e.g.:
>>
>> @ProcessElement
>> public void processElement(
>>     ProcessContext context, @TimerId("timer") Timer timer) {
>>   // Initial timer
>>   timer.withOutputTimestamp(new
>> Instant(0)).set(context.timestamp());
>> }
>>
>> @OnTimer("timer")
>> public void onTimer(
>>     OnTimerContext context,
>>     @TimerId("timer") Timer timer) {
>>   // Trigger again and start endless loop
>>   timer.withOutputTimestamp(new
>> Instant(0)).set(context.timestamp());
>> }
>>
>>
>> In portability, since we are only guaranteed to receive the
>> timers at
>> the end of a bundle when we flush the outputs and have closed the
>> inputs, it looks like this behavior can only be supported by
>> starting a
>> new bundle and executing the deferred timers. This also means
>> to hold
>> back the watermark to allow for these loops. Plus, starting a
>> new bundle
>> comes with some cost.
>>
>> Another possibility would be to allow a direct feedback loop,
>> i.e. when
>> the bundles closes and the timers fire, we can still set new
>> timers.
>>
>> I wonder do we want to allow a timer loop to execute within a
>> bundle?
>>
>> It would be possible to limit the number of iterations to
>> perform during
>> one bundle similar to how runners limit the number of elements
>> or the
>> duration of a bundle.
>>
>> -Max
>>


Re: [Proposal] Requesting PMC approval to start planning for Beam Summits 2020

2020-04-13 Thread Maximilian Michels
I suggest we move forward with a vote on the dev mailing list to
formally approve the proposal and copy ASF legal on the use of the Beam
trademark.

Does that sound good?

@Kenneth We are trying not to conflict with any ASF events. Also, we'll
have half-day sessions which gives us additional flexibility.

-Max

On 11.04.20 05:17, Kenneth Knowles wrote:
> Ah, I just replied on the other thread. Still looks good to me :-)
> 
> On Fri, Apr 10, 2020 at 11:40 AM Pablo Estrada <pabl...@google.com> wrote:
> 
> copying Brittany to the email with the 2020 subject to continue the
> discussion.
> 
> Max / Brittany, what are the steps for the PMC to approve y'all to
> move forward?
> 
> I took a look at the proposal, and it looks great. I don't have any
> questions for now.
> Best
> -P.
> 
> On Thu, Apr 2, 2020 at 1:50 AM Maximilian Michels <m...@apache.org> wrote:
> 
> Hi Brittany,
> 
> Thanks for sharing the proposal. After we organized Beam Summit
> Europe
> and Beam Summit NA in 2019, I'm really looking forward to the
> digital
> 2020 version of the summit!
> 
> tl;dr
> For it to be an official Beam event, the Beam trademark use
> needs to be
> approved by the PMC. We'll have a separate vote thread for that once
> we've answered questions here. We are also open to suggestions
> and ideas
> for the summit.
> 
> Note that the exact schedule is to-be-determined but we are thinking
> about multiple days with half-day morning (PDT) sessions, such that
> there will be enough time to work on other things that day.
> 
> Cheers,
> Max
> 
> PS: Yes, it's 2020 :)
> 
> On 01.04.20 23:56, Brittany Hermann wrote:
> > Dear Project Management Committee,
> >
> >
> > The Beam Summits are community events funded by a group of
> sponsors and
> > organized by a Steering Committee formed by members of the Beam
> > community and who have participated in past editions. I'd like
> to get
> > the following approval:
> >
> >
> > To organize and host the Summit under the name of Beam Summit, i.e. Digital Beam Summit 2020.
> >
> >
> > Approval to host this year’s edition on the following dates:
> >
> >
> >   *
> >
> >     Digital Beam Summit, an online, multi-day event to be
> hosted in a
> >     selected video conference platform, and to be organized in
> the date
> >     range of mid-July to early-August
> >
> >   *
> >
> >     Expected attendees: 200 attendees live
> >
> >
> > The events will provide educational content selected by the
> Steering
> > Committee, and will be a not-for-profit venture. This will be
> a free
> > event. The event will be advertised on various channels,
> including the
> > Apache Beam's and Summit sponsor's social media accounts. 
> >
> >
> > The Organizing Committee will acknowledge the Apache Software
> > Foundation's ownership of the Apache Beam trademark and will add
> > attribution required by the foundation's policy on all marketing
> > channels. The Apache Beam branding will be used in accordance
> with the 
> >
> > foundation's trademark and events policies specifically as
> outlined in
> > Third Party Event Branding Policy. The Organizing Committee
> does not
> > request the ASF to become a Community Partner of the event. 
> >
> >
> > Attached is a full proposal with the event details for your
> > reference[1]. Please feel free to request further information
> if needed. 
> >
> >
> > Kind Regards,
> >
> >
> > Brittany Hermann on behalf of the Beam Summit Steering Committee 
> >
> >
> > [1]
> >
> 
> https://docs.google.com/document/d/1OezrSYARKnbu1pu2KVA5tDRbXKE8T3Q84DY6xXAKXXs/edit?usp=sharing
> >
> >
> >
> > --
> >
> >       
> >
> > Brittany Hermann
> >
> > Open Source Program Manager (Provided by Adecco Staffing)
> >
> > 1190 Bordeaux Drive , Building 4, Sunnyvale, CA 94089
> >
> 
> >
> >
> 


Re: Portable timer loops

2020-04-13 Thread Maximilian Michels
On Mon, Apr 13, 2020 at 8:53 AM Luke Cwik  wrote:
> In non portable implementations you would have to wait till the element/timer 
> was finished processing before you could process any newly created timers. 
> Timers that are created with the same key+window+timer family overwrite 
> existing timers that have been set which can lead to a timer being 
> overwritten multiple times while an element/timer is being processed and you 
> wouldn't want to process a newly created timer with the incorrect values or 
> process a timer you shouldn't have.

Timer overwrites/updates/deletions are a good argument for waiting until
the bundle finishes.

On 13.04.20 18:01, Reuven Lax wrote:
> I'm not sure I understand the difference - do any "classic" runners add new 
> timers to the bundle? I know that at least the Dataflow runner would end up 
> the new timer in a new bundle.

Yes, the "classic" Flink Runner has a timer processing loop which runs
for as long as there are still timers applicable for firing. New timers
can be added and existing ones can be updated all within a bundle.

I wasn't aware that Dataflow only includes new timers in the next bundle.

> One thing we do need to ensure is that modifications to timers are reflected 
> in a bundle. So if a bundle contains a processElement and a processTimer and 
> the processElement modifies the timer, that should be reflected in timer 
> firing.

That is the case for the Flink Runner.

-Max
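
For illustration, here is a toy model of that loop — plain Java with illustrative names, not Beam's or Flink's actual code. Timers set by a firing callback go back into the same queue, so they can fire within the same bundle as long as they are eligible:

```
import java.util.PriorityQueue;
import java.util.function.LongConsumer;

/** Toy model of the classic-runner timer loop described above. */
public class TimerLoop {

  private final PriorityQueue<Long> timers = new PriorityQueue<>();

  void setTimer(long timestamp) {
    timers.add(timestamp);
  }

  /** Fires all timers eligible at the watermark, including ones set by the callback. */
  void advanceWatermarkTo(long watermark, LongConsumer onTimer) {
    while (!timers.isEmpty() && timers.peek() <= watermark) {
      onTimer.accept(timers.poll()); // the callback may call setTimer(...) again
    }
  }
}
```

A portable runner cannot do this within a bundle because new timers only arrive once the bundle is closed, hence the push-back to the next bundle.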

On 13.04.20 18:01, Reuven Lax wrote:
> I'm not sure I understand the difference - do any "classic" runners add
> new timers to the bundle? I know that at least the Dataflow runner would
> end up the new timer in a new bundle.
> 
> One thing we do need to ensure is that modifications to timers are
> reflected in a bundle. So if a bundle contains a processElement and a
> processTimer and the processElement modifies the timer, that should be
> reflected in timer firing.
> 
> On Mon, Apr 13, 2020 at 8:53 AM Luke Cwik <lc...@google.com> wrote:
> 
> In non portable implementations you would have to wait till the
> element/timer was finished processing before you could process any
> newly created timers. Timers that are created with the same
> key+window+timer family overwrite existing timers that have been set
> which can lead to a timer being overwritten multiple times while an
> element/timer is being processed and you wouldn't want to process a
> newly created timer with the incorrect values or process a timer you
> shouldn't have.
> 
> In portable implementations, you can only safely say that element
> processing is done when the bundle completes. There could be value
> in exposing when an element is done since this could have usage in
> other parts of the system such as when a large GBK value is done.
> 
> On Mon, Apr 13, 2020 at 8:06 AM Maximilian Michels <m...@apache.org> wrote:
> 
> Hi,
> 
> In the "classic" Java mode one can set timers which in turn can set
> timers which enables to create a timer loop, e.g.:
> 
> @ProcessElement
> public void processElement(
>     ProcessContext context, @TimerId("timer") Timer timer) {
>   // Initial timer
>   timer.withOutputTimestamp(new
> Instant(0)).set(context.timestamp());
> }
> 
> @OnTimer("timer")
> public void onTimer(
>     OnTimerContext context,
>     @TimerId("timer") Timer timer) {
>   // Trigger again and start endless loop
>   timer.withOutputTimestamp(new
> Instant(0)).set(context.timestamp());
> }
> 
> 
> In portability, since we are only guaranteed to receive the
> timers at
> the end of a bundle when we flush the outputs and have closed the
> inputs, it looks like this behavior can only be supported by
> starting a
> new bundle and executing the deferred timers. This also means to
> hold
> back the watermark to allow for these loops. Plus, starting a
> new bundle
> comes with some cost.
> 
> Another possibility would be to allow a direct feedback loop,
> i.e. when
> the bundles closes and the timers fire, we can still set new timers.
> 
> I wonder do we want to allow a timer loop to execute within a
> bundle?
> 
> It would be possible to limit the number of iterations to
> perform during
> one bundle similar to how runners limit the number of elements
> or the
> duration of a bundle.
> 
> -Max
> 


Portable timer loops

2020-04-13 Thread Maximilian Michels
Hi,

In the "classic" Java mode one can set timers which in turn can set
timers which enables to create a timer loop, e.g.:

// (Assumes the corresponding timer declaration in the DoFn.)
@TimerId("timer")
private final TimerSpec timerSpec = TimerSpecs.timer(TimeDomain.EVENT_TIME);

@ProcessElement
public void processElement(
    ProcessContext context, @TimerId("timer") Timer timer) {
  // Initial timer
  timer.withOutputTimestamp(new Instant(0)).set(context.timestamp());
}

@OnTimer("timer")
public void onTimer(
    OnTimerContext context,
    @TimerId("timer") Timer timer) {
  // Trigger again and start an endless loop
  timer.withOutputTimestamp(new Instant(0)).set(context.timestamp());
}


In portability, since we are only guaranteed to receive the timers at
the end of a bundle, when we flush the outputs and have closed the
inputs, it looks like this behavior can only be supported by starting a
new bundle and executing the deferred timers. This also means holding
back the watermark to allow for these loops. Plus, starting a new bundle
comes with some cost.

Another possibility would be to allow a direct feedback loop, i.e. when
the bundle closes and the timers fire, we can still set new timers.

I wonder, do we want to allow a timer loop to execute within a bundle?

It would be possible to limit the number of iterations to perform during
one bundle, similar to how runners limit the number of elements or the
duration of a bundle.

-Max
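
To make the last point concrete, a variant of the toy timer loop sketched in the reply above, with a per-bundle firing budget (the budget parameter is illustrative):

```
/** Variant of TimerLoop.advanceWatermarkTo (see the reply above) with a firing budget. */
void advanceWatermarkTo(long watermark, LongConsumer onTimer, int maxFiringsPerBundle) {
  int fired = 0;
  while (!timers.isEmpty() && timers.peek() <= watermark && fired < maxFiringsPerBundle) {
    onTimer.accept(timers.poll()); // the callback may set further timers
    fired++;
  }
  // Anything still in `timers` is deferred to the next bundle, holding
  // back the watermark as described above.
}
```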


Re: [RESULT][VOTE] Accept the Firefly design donation as Beam Mascot - Deadline Mon April 6

2020-04-09 Thread Maximilian Michels
Awesome. What a milestone! The mascot is a real eye-catcher. Thank you
Julian and Aizhamal for making it happen.

On 06.04.20 22:05, Aizhamal Nurmamat kyzy wrote:
> I am happy to announce that this vote has passed, with 13 approving +1
> votes, 5 of which are binding PMC votes.
> 
> We have the final design for the Beam Firefly! Yahoo!
> 
> Everyone have a great week!
> 
> 
> 
> On Mon, Apr 6, 2020 at 9:57 AM David Morávek <d...@apache.org> wrote:
> 
> +1 (non-binding)
> 
> On Mon, Apr 6, 2020 at 12:51 PM Reza Rokni <r...@google.com> wrote:
> 
> +1(non-binding)
> 
> On Mon, Apr 6, 2020 at 5:24 PM Alexey Romanenko <aromanenko@gmail.com> wrote:
> 
>     +1 (non-binding).
> 
> > On 3 Apr 2020, at 14:53, Maximilian Michels <m...@apache.org> wrote:
> >
> > +1 (binding)
> >
> > On 03.04.20 10:33, Jan Lukavský wrote:
> >> +1 (non-binding).
> >>
> >> On 4/2/20 9:24 PM, Austin Bennett wrote:
> >>> +1 (nonbinding)
> >>>
> >>> On Thu, Apr 2, 2020 at 12:10 PM Luke Cwik <lc...@google.com> wrote:
> >>>
> >>>    +1 (binding)
> >>>
> >>>        On Thu, Apr 2, 2020 at 11:54 AM Pablo Estrada <pabl...@google.com> wrote:
> >>>
> >>>        +1! (binding)
> >>>
> >>>            On Thu, Apr 2, 2020 at 11:19 AM Alex Van Boxel <a...@vanboxel.be> wrote:
> >>>
> >>>            Thanks for clearing this up Aizhamal.
> >>>
> >>>            +1 (non binding)
> >>>
> >>>            _/
> >>>            _/ Alex Van Boxel
> >>>
> >>>
> >>>                On Thu, Apr 2, 2020 at 8:14 PM Aizhamal Nurmamat kyzy <aizha...@apache.org> wrote:
> >>>
> >>>                Good point, Alex. Actually Julian and I
> have talked
> >>>                about producing this kind of guide. It
> will be
> >>>                delivered as an additional contribution
> in the follow
> >>>                up. We think this will be a derivative of
> the original
> >>>                design, and be done after the original is
> officially
> >>>                accepted.
> >>>
> >>>                With this vote, we want to accept the
> Firefly donation
> >>>                as designed [1], and let Julian produce other
> >>>                artifacts using the official Beam mascot
> later on.
> >>>
> >>>                [1]
> 
> https://docs.google.com/document/d/1zK8Cm8lwZ3ALVFpD1aY7TLCVNwlyTS3PXxTV2qQCAbk/edit?usp=sharing
> >>>
> >>>
> >>>                    On Thu, Apr 2, 2020 at 10:37 AM Alex Van Boxel <a...@vanboxel.be> wrote:
> >>>
> >>>                    I don't want to be a spoiler... but
> this vote
> >>>                    feels like a final deliverable... but
> without a
> >>>                    style guide as Kenn originally
> suggested most of
> >>>                    use will not be able to adapt the
> design. 

Re: [VOTE] Release 2.20.0, release candidate #1

2020-04-08 Thread Maximilian Michels
Just a reminder that for ASF releases we are voting on releasing the
source; the binaries are simply a nice-to-have.

Nevertheless, this is the right call since the majority of users will
use the binaries, not the source. As Steve pointed out, we don't want to
release something broken, even if it is not contained in the source.

We should make sure to notify the release manager when merging
release-critical changes.

-Max

On 07.04.20 01:24, Robert Bradshaw wrote:
> Thanks. It sounds like this is enough of a blocker for me to vote -1
> for RC1 as well. We'll keep an eye out for RC2. 
> 
> (If this is the only change, the Python artifacts are still good. I
> would encourage folks to keep testing RC1 to see if there are any other
> issues, so we can have quick resolution on RC2.)
> 
> On Mon, Apr 6, 2020 at 3:53 PM Rui Wang wrote:
> 
> Ok, I will abort RC1 and go toward RC2 for known issues. Thanks
> everyone who has helped!
> 
> 
> 
> -Rui  
> 
> On Mon, Apr 6, 2020 at 3:28 PM Reuven Lax wrote:
> 
> -1, as that PR does fix a critical bug. The fact that no unit
> test broke before was more a signal that our unit testing was
> deficient in this area.
> 
> My fix for the bug is pr/11226, which did include a unit test
> (which fails without the fix). However it appears that 11252
> forked off just the main code files from my pr, and not the unit
> test. If we're recutting, we should include the unit test as well.
> 
> On Mon, Apr 6, 2020 at 3:11 PM Rui Wang wrote:
> 
> I see. I will also leave the community to decide.
> 
> With the unit tests in [1], the fix becomes sufficient (e.g.
> if the community decides that the fix is critical, I will
> also need to include those tests in the release).
> 
> 
> [1] https://github.com/apache/beam/pull/11226
> 
> 
> -Rui
> 
> 
> On Mon, Apr 6, 2020 at 3:05 PM Steve Niemitz <sniem...@apache.org> wrote:
> 
> My opinion doesn't matter much, since we're just going
> to cherry pick the fix into our fork anyways, but you're
> essentially proposing releasing a build that *WILL*
> cause data loss to anyone who uses processing time timers.
> 
> I'll leave it up to the community to decide, but it
> seems like a pretty big bug.
> 
> Also, fwiw, there is a PR open that adds a test for this
> [1], but it was never merged (it's been open for 12 days).
> 
> [1] https://github.com/apache/beam/pull/11226
> 
> On Mon, Apr 6, 2020 at 5:52 PM Rui Wang <ruw...@google.com> wrote:
> 
> My opinion is, even though that commit was missing,
> no test/validation gave a signal that something
> relevant was broken. Plus that fix didn't include a
> test.  
> 
> I will hesitate to say such a fix is critical for a
> release, unless there is something to test or
> validate it.
> 
> 
> -Rui
> 
> On Mon, Apr 6, 2020 at 2:46 PM Steve Niemitz <sniem...@apache.org> wrote:
> 
> timers are essentially broken without it, so I'd
> say -1
> 
> On Mon, Apr 6, 2020 at 5:45 PM Rui Wang <ruw...@google.com> wrote:
> 
> ok so the source is consistent with the
> binary. What undecided is if missing that
> commit is -1, or that can be marked as a
> known issue in release note.
> 
> 
> -Rui
> 
> On Mon, Apr 6, 2020 at 2:38 PM Steve Niemitz wrote:
> 
> I can confirm that the artifact on maven
> central [1] does not have the change in
> it either, I disassembled it with javap.
> 
> [1]
> 
> https://repository.apache.org/content/repositories/orgapachebeam-1100/org/apache/beam/beam-runners-core-java/2.20.0/beam-runners-core-java-2.20.0.jar
> 
> 
> On Mon, Apr 6, 2020 at 5:28 PM Luke Cwik wrote:
> 
> If the source doesn't represent the
>   

Re: [PROPOSAL] Preparing for Beam 2.21 release

2020-04-06 Thread Maximilian Michels
Sounds good! +1

On 02.04.20 20:11, Luke Cwik wrote:
> Thanks for picking this up.
> 
> On Thu, Apr 2, 2020 at 10:09 AM Kyle Weaver wrote:
> 
> Hi all,
> 
> The next (2.21) release branch cut is scheduled for Apr 8, according
> to the calendar.
> I would like to volunteer myself to do this release.
> The plan is to cut the branch on that date,
> and cherrypick release-blocking fixes afterwards if any.
> 
> Any unresolved release blocking JIRA issues for 2.21 should have
> their "Fix Version/s" marked as "2.21.0".
> 
> Any comments or objections? 
> 
> Kyle
> 


Re: [VOTE] Accept the Firefly design donation as Beam Mascot - Deadline Mon April 6

2020-04-03 Thread Maximilian Michels
+1 (binding)

On 03.04.20 10:33, Jan Lukavský wrote:
> +1 (non-binding).
> 
> On 4/2/20 9:24 PM, Austin Bennett wrote:
>> +1 (nonbinding)
>>
>> On Thu, Apr 2, 2020 at 12:10 PM Luke Cwik wrote:
>>
>> +1 (binding)
>>
>> On Thu, Apr 2, 2020 at 11:54 AM Pablo Estrada wrote:
>>
>> +1! (binding)
>>
>> On Thu, Apr 2, 2020 at 11:19 AM Alex Van Boxel <a...@vanboxel.be> wrote:
>>
>> Thanks for clearing this up Aizhamal.
>>
>> +1 (non binding)
>>
>> _/
>> _/ Alex Van Boxel
>>
>>
>> On Thu, Apr 2, 2020 at 8:14 PM Aizhamal Nurmamat kyzy <aizha...@apache.org> wrote:
>>
>> Good point, Alex. Actually Julian and I have talked
>> about producing this kind of guide. It will be
>> delivered as an additional contribution in the follow
>> up. We think this will be a derivative of the original
>> design, and be done after the original is officially
>> accepted. 
>>
>> With this vote, we want to accept the Firefly donation
>> as designed [1], and let Julian produce other
>> artifacts using the official Beam mascot later on.
>>
>> [1] 
>> https://docs.google.com/document/d/1zK8Cm8lwZ3ALVFpD1aY7TLCVNwlyTS3PXxTV2qQCAbk/edit?usp=sharing
>>
>>
>> On Thu, Apr 2, 2020 at 10:37 AM Alex Van Boxel <a...@vanboxel.be> wrote:
>>
>> I don't want to be a spoiler... but this vote
>> feels like a final deliverable... but without a
>> style guide as Kenn originally suggested most of
>> use will not be able to adapt the design. This
>> would include:
>>
>>   * frontal view
>>   * side view
>>   * back view
>>
>> actually different posses so we can mix and match.
>> Without this it will never reach the potential of
>> the Go gopher or gRPC Pancakes.
>>
>> Note this is *not* a negative vote but I'm afraid
>> that the use without a guide will be fairly
>> limited as most of use are not designers. Just a
>> concern.
>>
>>  _/
>> _/ Alex Van Boxel
>>
>>
>> On Thu, Apr 2, 2020 at 7:27 PM Andrew Pilloud <apill...@apache.org> wrote:
>>
>> +1, Accept the donation of the Firefly design
>> as Beam Mascot
>>
>> On Thu, Apr 2, 2020 at 10:19 AM Julian Bruno wrote:
>>
>> Hello Apache Beam Community, 
>>
>> Please vote on the acceptance of the final
>> design of the Firefly as Beam's mascot
>> [1]. Please share your input no later than
>> Monday, April 6, at noon Pacific Time. 
>>
>>
>> [ ] +1, Accept the donation of the Firefly
>> design as Beam Mascot
>>
>> [ ] -1, Decline the donation of the
>> Firefly design as Beam Mascot
>>
>>
>> Vote is adopted by at least 3 PMC +1
>> approval votes, with no PMC -1 disapproval
>>
>> votes. Non-PMC votes are still encouraged.
>>
>> PMC voters, please help by indicating your
>> vote as "(binding)"
>>
>>
>> The vote and input phase will be open
>> until Monday, April 6, at 12 pm Pacific Time.
>>
>>
>> Thank you very much for your feedback and
>> ideas,
>>
>> Julian
>>
>>
>> [1]
>> 
>> https://docs.google.com/document/d/1zK8Cm8lwZ3ALVFpD1aY7TLCVNwlyTS3PXxTV2qQCAbk/edit?usp=sharing
>>   
>>
>> -- 
>> Julian Bruno // Visual Artist & Graphic
>> Designer
>>  (510) 367-0551  /
>> SF Bay Area, CA
>> www.instagram.com/julbro.art
>> 
>>


Re: [Proposal] Requesting PMC approval to start planning for Beam Summits 2020

2020-04-02 Thread Maximilian Michels
Hi Brittany,

Thanks for sharing the proposal. After we organized Beam Summit Europe
and Beam Summit NA in 2019, I'm really looking forward to the digital
2020 version of the summit!

tl;dr
For it to be an official Beam event, the Beam trademark use needs to be
approved by the PMC. We'll have a separate vote thread for that once
we've answered questions here. We are also open to suggestions and ideas
for the summit.

Note that the exact schedule is to-be-determined but we are thinking
about multiple days with half-day morning (PDT) sessions, such that
there will be enough time to work on other things that day.

Cheers,
Max

PS: Yes, it's 2020 :)

On 01.04.20 23:56, Brittany Hermann wrote:
> Dear Project Management Committee,
> 
> 
> The Beam Summits are community events funded by a group of sponsors and
> organized by a Steering Committee formed by members of the Beam
> community and who have participated in past editions. I'd like to get
> the following approval:
> 
> 
> To organize and host the Summit under the name of Beam Summit, i.e. Digital Beam Summit 2020.
> 
> 
> Approval to host this year’s edition on the following dates:
> 
> 
>   *
> 
> Digital Beam Summit, an online, multi-day event to be hosted in a
> selected video conference platform, and to be organized in the date
> range of mid-July to early-August
> 
>   *
> 
> Expected attendees: 200 attendees live
> 
> 
> The events will provide educational content selected by the Steering
> Committee, and will be a not-for-profit venture. This will be a free
> event. The event will be advertised on various channels, including the
> Apache Beam's and Summit sponsor's social media accounts. 
> 
> 
> The Organizing Committee will acknowledge the Apache Software
> Foundation's ownership of the Apache Beam trademark and will add
> attribution required by the foundation's policy on all marketing
> channels. The Apache Beam branding will be used in accordance with the 
> 
> foundation's trademark and events policies specifically as outlined in
> Third Party Event Branding Policy. The Organizing Committee does not
> request the ASF to become a Community Partner of the event. 
> 
> 
> Attached is a full proposal with the event details for your
> reference[1]. Please feel free to request further information if needed. 
> 
> 
> Kind Regards,
> 
> 
> Brittany Hermann on behalf of the Beam Summit Steering Committee 
> 
> 
> [1]
> https://docs.google.com/document/d/1OezrSYARKnbu1pu2KVA5tDRbXKE8T3Q84DY6xXAKXXs/edit?usp=sharing
> 
> 
> 
> -- 
> 
>   
> 
> Brittany Hermann
> 
> Open Source Program Manager (Provided by Adecco Staffing)
> 
> 1190 Bordeaux Drive , Building 4, Sunnyvale, CA 94089
> 
> 
> 


Python SDK performance regression with UnboundedThreadPoolExecutor

2020-04-01 Thread Maximilian Michels
Hi everyone,

We're seeing a dramatic regression in the Python SDK with Flink + Beam
2.18.0 after upgrading from 2.16.0. Note that the Flink version is
unchanged. After bisecting the problem, we found that
https://jira.apache.org/jira/browse/BEAM-8944 is the cause for this.

The dynamic thread creation adds a significant delay to the bundle
completion, which causes checkpoint times to increase 2-3x. I've tried
increasing the thread lifetime but that did not change anything.

Reverting to the static thread pool gives us 2.16.0 performance.

Looking at the code, I don't see something obviously wrong, but the new
locks and the use of threading.Event/threading.Condition could be too
much for the Python interpreter.

-Max
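
For anyone who wants to reproduce this outside a pipeline, here is a rough micro-benchmark sketch (not Beam's code; the import path is the one BEAM-8944 introduced in 2.18, to the best of my knowledge) that approximates the many-short-tasks-per-bundle pattern:

```
import time
from concurrent.futures import ThreadPoolExecutor

from apache_beam.utils.thread_pool_executor import UnboundedThreadPoolExecutor


def work():
  return sum(range(1000))


def bench(executor, rounds=200, tasks=50):
  # Each round submits a burst of short tasks and waits for all of them,
  # mimicking the work done at bundle completion.
  start = time.perf_counter()
  for _ in range(rounds):
    futures = [executor.submit(work) for _ in range(tasks)]
    for future in futures:
      future.result()
  return time.perf_counter() - start


print('static pool:   ', bench(ThreadPoolExecutor(max_workers=12)))
print('unbounded pool:', bench(UnboundedThreadPoolExecutor()))
```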


Re: [VOTE + INPUT] Beam Mascot Designs, 3rd iteration - Deadline Wednesday, April 1

2020-03-31 Thread Maximilian Michels
Hi Julian!

Perfect, thanks for incorporating all the suggestions.

> 1. Do you prefer stripes or no stripes?

No stripes.

Cheers,
Max

On 31.03.20 08:11, Alex Van Boxel wrote:
> Nooo stripes
> 
>  _/
> _/ Alex Van Boxel
> 
> 
> On Tue, Mar 31, 2020 at 6:06 AM Joshua B. Harrison <josh.harri...@gmail.com> wrote:
> 
> 
> 
> On Mon, Mar 30, 2020 at 8:09 PM Julian Bruno wrote:
> 
> Hello Apache Beam Community,
> 
> We need a third input from the community to finish the design.
> Please share your input no later than Wednesday, April 1st, at
> noon Pacific Time. Below you will find a link to the
> presentation of the work process and we are eager to know what
> you think of the current design [1].
> 
> 
> Our question to you:
> 
> 
> 1. Do you prefer stripes or no stripes?
> 
> I prefer the logo with stripes. 
> 
> 
> Please reply inline, so it is clear what exactly you are
> referring to. The vote and input phase will be open until
> Wednesday, April 1st, at 12 pm Pacific Time. We will incorporate
> the feedback into the final design iteration of the mascot. 
> 
> 
> Please find below the attached source file (.SVG) and a
> High-Quality Image (.PNG).
> 
> 
> Thank you,
> 
> 
> -- 
> Julian Bruno // Visual Artist & Graphic Designer
>  (510) 367-0551 / SF Bay Area, CA
> www.instagram.com/julbro.art 
> 
> [1]
> 
>  3/30 - Mascot Weekly Update
> 
> 
> 
> 
> 
> ᐧ
> 
> 
> 
> -- 
> Joshua Harrison |  Software Engineer | joshharri...@gmail.com
>  | 404-433-0242
> 


Re: [VOTE + INPUT] Beam Mascot Designs, 2nd iteration - Deadline Friday, March 27

2020-03-26 Thread Maximilian Michels
Thanks for the update Daniel!

> 1. Do you prefer red or black colored line art?

Black.

> 2. Do you have any additional feedback about the mascot's shape or features?

Great improvement! The tail looks a bit sharp to me. I like the new
shape but I'd prefer if it looked less like a sting.

Cheers,
Max

On 26.03.20 20:07, Alex Amato wrote:
> 1. Do you prefer red or black colored line art?
> 
> Black
> 
> 
> 
> On Thu, Mar 26, 2020 at 11:07 AM Pablo Estrada wrote:
> 
> 1. I am slightly inclined for black lines
> 2. I have two pieces of feedback:
> - It feels like the head and the eyes on this design are less
> elongated. I think I liked the more oval-like eyes from previous
> designs. (And the head also became less oval-like?, maybe?)
> - I like the white hand tips and stripes in the body from slide 22
> of your previous deck. These lines are very easy to draw but I think
> they make the Firefly less flat.
> 
> That's it from me!
> Thanks Julian, it's looking really good.
> Best
> -P.
> 
> On Wed, Mar 25, 2020 at 10:00 PM Kenneth Knowles wrote:
> 
> I assume that when this bug moves fast the tail will leave a
> cool light trail.
> 
> Kenn
> 
> On Wed, Mar 25, 2020 at 5:45 PM Daniel Oliveira <danolive...@google.com> wrote:
> 
> 1. Do you prefer red or black colored line art?
> 
> 
> Red.
>  
> 
> 2. Do you have any additional feedback about the
> mascot's shape or features?
> 
> 
> Love the new tail and new shadows.
> 
> I like the wings better with color, but they still feel a
> bit dull to me. I feel they would be improved by having more
> vibrant colors near the tips, and possibly by going with
> more yellow-ish colors closer to the Beam logo. Compare with
> the wings from slide 10 of your previous deck,
> which I like much better. Having the more vibrant color near
> the tips of the wings also pairs well with the new tail,
> which does the same thing with its yellow light.
> 
> On Wed, Mar 25, 2020 at 12:11 PM Julian Bruno <juliangbr...@gmail.com> wrote:
> 
> Hello Apache Beam Community, 
> 
> 
> Together with Aizhamal and her team, we have been
> working on the design of the Apache Beam mascot.
> 
> We now need input from the community to continue moving
> forward with the design. Please share your input no
> later than Friday, March 27, at noon Pacific Time. Below
> you will find a link to the presentation of the work
> process and we are eager to know what you think of the
> current design [1].
> 
> 
> Our questions to you:
> 
> 
> 1. Do you prefer red or black colored line art?
> 
> 2. Do you have any additional feedback about the
> mascot's shape or features?
> 
> 
> Please reply inline, so it is clear what exactly you are
> referring to. The vote and input phase will be open
> until Friday, March 27, at 12 pm Pacific Time. We will
> incorporate the feedback to the next design iteration of
> the mascot. 
> 
> 
> Thank you,
> 
> 
> Julian Bruno // Visual Artist & Graphic Designer
>  (510) 367-0551  / SF Bay Area, CA
> www.instagram.com/julbro.art
> 
> 
> [1]
> 
>  Mascot Weekly Update - 3/25
> 
> 
> 
> 
> 
> ᐧ
> 


Re: [PROPOSAL] Preparing for Beam 2.20.0 release

2020-03-23 Thread Maximilian Michels
Just mentioning that we have discovered
https://issues.apache.org/jira/browse/BEAM-9566 and
https://issues.apache.org/jira/browse/BEAM-9573 which are blockers for
the release. I'm currently fixing those. PR will be out later today.

-Max

On 28.02.20 20:11, Rui Wang wrote:
> Release branch 2.20.0 is already cut. 
> 
> Currently there should be only one blocking
> Jira: https://issues.apache.org/jira/browse/BEAM-9288
> 
> But there is a newly added
> Jira: https://issues.apache.org/jira/browse/BEAM-9322
> 
> 
> I will coordinate on those two Jira.
> 
> 
> 
> -Rui
> 
> 
> On Thu, Feb 27, 2020 at 3:18 PM Rui Wang wrote:
> 
> Hi community,
> 
> Just fyi:
> 
> The 2.20.0 release branch should be cut yesterday (02/26) per
> schedule. However as our python precommit was broken so I didn't cut
> the branch.
> 
> I am closely working with PR [1] owner to fix the python precommit.
> Once the fix is in, I will cut the release branch immediately.
> 
> 
> [1]: https://github.com/apache/beam/pull/10982
> 
> 
> -Rui
> 
> On Thu, Feb 20, 2020 at 7:06 AM Ismaël Mejía wrote:
> 
> Not yet, up to last check nobody is tackling it, it is still
> unassigned. Let's
> not forget that the fix of this one requires an extra release of
> the grpc
> vendored dependency (the source of the issue).
> 
> And yes this is a release blocker for the open source runners
> because people
> tend to package their projects with the respective runners in a
> jar and this is
> breaking at the moment.
> 
> Kenn changed the priority of BEAM-9252 from Blocker to Critical
> to follow the
> conventions in [1], and from those definitions  'most critical
> bugs should
> block release'.
> 
> [1] https://beam.apache.org/contribute/jira-priorities/
> 
> On Thu, Feb 20, 2020 at 3:42 AM Ahmet Altay wrote:
> 
> Curions, was there a resolution on BEAM-9252? Would it be a
> release blocker?
> 
> On Fri, Feb 14, 2020 at 12:42 AM Ismaël Mejía <ieme...@gmail.com> wrote:
> 
> Thanks Rui for volunteering and for keeping the release
> pace!
> 
> Since we are discussing the next release, I would like
> to highlight that nobody
> apparently is working on this blocker issue:
> 
> BEAM-9252 Problem shading Beam pipeline with Beam
> 2.20.0-SNAPSHOT
> https://issues.apache.org/jira/browse/BEAM-9252
> 
> This is a regression introduced by the move to vendored
> gRPC 1.26.0 and it
> probably will require an extra vendored gRPC release so
> better to give it
> some priority.
> 
> 
> On Wed, Feb 12, 2020 at 6:48 PM Ahmet Altay <al...@google.com> wrote:
> 
> +1. Thank you.
> 
> On Tue, Feb 11, 2020 at 11:01 PM Rui Wang <ruw...@google.com> wrote:
> 
> Hi all,
> 
> The next (2.20.0) release branch cut is
> scheduled for 02/26, according to the calendar.
> I would like to volunteer myself to do this release.
> The plan is to cut the branch on that date,
> and cherrypick release-blocking fixes afterwards
> if any.
> 
> Any unresolved release blocking JIRA issues for
> 2.20.0 should have their "Fix Version/s" marked
> as "2.20.0".
> 
> Any comments or objections? 
> 
> 
> -Rui
> 


Re: [Interactive Runner] now available on master

2020-03-19 Thread Maximilian Michels
Great work! This will also be super handy for demoing Beam. Looking
forward to playing around with this :)

On 19.03.20 00:52, Kenneth Knowles wrote:
> Nice!
> 
> On Wed, Mar 18, 2020 at 2:58 PM Ahmet Altay wrote:
> 
> Great to see this progress! :)
> 
> On Wed, Mar 18, 2020 at 2:57 PM Reza Rokni wrote:
> 
> Awesome !
> 
> On Thu, 19 Mar 2020, 05:38 Sam Rohde wrote:
> 
> Hi All!
> 
>  
> 
> I am happy to announce that an improved Interactive Runner
> is now available on master. This Python runner allows for
> the interactive development of Beam pipelines in a notebook
> (and IPython) environment.
> 
>  
> 
> The runner still has some bugs that need to be fixed as well
> as some refactoring, but it is in a good enough shape to
> start using it.
> 
>  
> 
> Here are the new things you can do with the Interactive Runner:
> 
>   *
> 
> Create and execute pipelines within a REPL
> 
>   *
> 
> Visualize elements as the pipeline is running
> 
>   *
> 
> Materialize PCollections to DataFrames
> 
>   *
> 
> Record unbounded sources for deterministic replay
> 
>   *
> 
> Replay cached unbounded sources including watermark
> advancements
> 
> The code lives in sdks/python/apache_beam/runners/interactive
> and example notebooks are
> in sdks/python/apache_beam/runners/interactive/examples.
> 
>  
> 
> To install, use `pip install -e .[interactive]` in your
> /sdks/python directory.
> 
> To run, here’s a quick example:
> 
> ```
> 
> import apache_beam as beam
> 
> from apache_beam.runners.interactive.interactive_runner
> import InteractiveRunner
> 
> import apache_beam.runners.interactive.interactive_beam as ib
> 
>  
> 
> p = beam.Pipeline(InteractiveRunner())
> 
> words = p | beam.Create(['To', 'be', 'or', 'not', 'to', 'be'])
> 
> counts = words | 'count' >> beam.combiners.Count.PerElement()
> 
>  
> 
> # Shows a dynamically updating display of the PCollection
> elements
> 
> ib.show(counts)
> 
>  
> 
> # We can now visualize the data using standard pandas
> operations.
> 
> df = ib.collect(counts)
> 
> print(df.info())
> 
> print(df.describe())
> 
>  
> 
> # Plot the top-10 counted words
> 
> df = df.sort_values(by=1, ascending=False)
> 
> df.head(n=10).plot(x=0, y=1)
> 
> ```
> 
>  
> 
> Currently, Batch is supported on any runner. Streaming is
> only supported on the DirectRunner (non-FnAPI).
> 
>  
> 
> I would like to thank the great work of Sindy (@sindyli) and
> Harsh (@ananvay) for the initial implementation,
> 
> David Yan (@davidyan) who led the project, Ning (@ningk) and
> myself (@srohde) for the implementation and design, and
> Ahmet (@altay), Daniel (@millsd), Pablo (@pabloem), and
> Robert (@robertwb) who all contributed a lot of their time
> to help with the design and code reviews.
> 
>  
> 
> It was a team effort and we wouldn't have been able to
> complete it without the help of everyone involved.
> 
>  
> 
> Regards,
> 
> Sam
> 
> 


Re: Jenkins jobs not running for my PR 10438

2020-03-18 Thread Maximilian Michels
Sure. Done.

On 18.03.20 12:24, Rehman Murad Ali wrote:
> Hello Committers,
> 
> I would appreciate it if you could run the jobs for this PR:
> https://github.com/apache/beam/pull/11154
> 
> Thanks
> 
> *Rehman Murad Ali*
> Software Engineer
> Mobile: +92 3452076766
> Skype: rehman.muradali
> 
> 
> 
> On Wed, Mar 11, 2020 at 12:24 AM Ahmet Altay wrote:
> 
> Done.
> 
> On Tue, Mar 10, 2020 at 12:21 PM Tomo Suzuki wrote:
> 
> Hi Beam committers,
> 
> Would you trigger precommit checks
> for https://github.com/apache/beam/pull/11095 with the following
> 6 commands?
> Run Java PostCommit
> Run Java HadoopFormatIO Performance Test
> Run BigQueryIO Streaming Performance Test Java
> Run Dataflow ValidatesRunner
> Run Spark ValidatesRunner
> Run SQL Postcommit
> 
> Regards,
> Tomo
> 


Re: [DISCUSS] Drop support for Flink 1.7

2020-03-12 Thread Maximilian Michels
This is now done. We first added Flink 1.10 and then removed 1.7
support. This will be reflected in Beam 2.21.0.

Thanks Jincheng for your initiative. It was a pleasure working with you
on the PRs.

Cheers,
Max

On 12.03.20 02:33, jincheng sun wrote:
> Hi all,
> I would like to drop the Flink 1.7 support soon, as Flink 1.10 is already
> supported as of this
> commit: https://github.com/apache/beam/commit/f91b390c8bbab4afe14734c1266da51dcc7558c9
> 
> Best,
> Jincheng
> 
> 
> 
On Mon, Feb 24, 2020 at 11:22 AM, jincheng sun wrote:
> 
> Thanks for all of your feedback. Will drop the 1.7 support
> after https://github.com/apache/beam/pull/10945 has been merged.
> 
> Best,
> Jincheng
> 
> 
On Wed, Feb 19, 2020 at 7:08 PM, David Morávek wrote:
> 
> +1 for dropping 1.7, once we have 1.10 support ready
> 
> D.
> 
> On Tue, Feb 18, 2020 at 7:01 PM Jan wrote:
> 
> Hi Ismael,
> yes, sure. The proposal would be to have snapshot dependency
> in the feature branch. The snapshot must be changed to
> release before merge to master.
> Jan
> 
> On 18. 2. 2020 at 17:55, Ismaël Mejía <ieme...@gmail.com> wrote:
> 
> Just to be sure, you mean Flink 1.11.0-SNAPSHOT ONLY in
> the next branch dependency?
> We should not have any SNAPSHOT dependency from other
> projects in Beam.
> 
> On Tue, Feb 18, 2020 at 5:34 PM  wrote:
> 
> Hi Jincheng,
> I think there should be a "process" for this. I
> would propose to:
>  a) create a new branch with support for the new
> (snapshot) Flink - currently that would mean Flink 1.11
>  b) as part of this branch, drop support for all
> versions up to N - 3
> I think that dropping all versions and adding the
> new version should be atomic, otherwise we risk
> releasing a Beam version with less than three supported
> Flink versions.
> I'd suggest starting with the 1.10 branch support and
> including the drop of 1.7 in that branch. Once 1.10
> gets merged, we should create 1.11 with a snapshot
> dependency to be able to keep up with the release
> cadence of Flink.
> WDYT?
>  Jan
> 
> On Tue, Feb 18, 2020 at 3:34 PM jincheng sun wrote:
> 
> Hi folks,
> 
> Apache Flink 1.10 has been released [1], and we would
> like to add a Flink 1.10 build target and make the Flink
> Runner compatible with Flink 1.10 [2]. Following the
> update policy for Apache Flink releases [3], I would
> suggest that the Apache Beam community maintain at most
> three versions of the Flink runner, i.e. the three
> versions 1.8/1.9/1.10 after the Flink 1.10 build target
> is added to the Flink runner.
> 
> Keeping Flink runner 1.7 around holds back upgrades of
> the Flink runner 1.8.x and 1.9.x code because the Flink
> 1.7 code is too old; more detail can be found in [4].
> So we need to drop support for Flink runner 1.7 as soon
> as possible.
> 
> This discussion is also CC'd to @user, since the change
> will affect our users. I would appreciate it if you
> could review the PR [5].
> 
> Welcome any feedback!
> 
> Best,
> Jincheng
> 
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-1-10-0-released-td37564.html
> [2] https://issues.apache.org/jira/browse/BEAM-9295
> [3]
> https://flink.apache.org/downloads.html#update-policy-for-old-releases
> [4] https://issues.apache.org/jira/browse/BEAM-9299
> [5] https://github.com/apache/beam/pull/10884
> 
> 
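The retention policy proposed above - keep build targets for at most the three most recent Flink minor versions - is easy to state as code; here is a minimal illustrative sketch, where the helper name and version list are made up for this example and are not part of the Beam build:

```python
# Illustrative sketch of the "support the three most recent Flink
# minor versions" policy discussed above.
def supported_flink_targets(all_targets, keep=3):
    """Return the `keep` newest versions, ordered oldest to newest."""
    ordered = sorted(all_targets, key=lambda v: tuple(map(int, v.split('.'))))
    return ordered[-keep:]

# Adding the 1.10 build target pushes 1.7 out of the support window:
print(supported_flink_targets(['1.7', '1.8', '1.9', '1.10']))
# -> ['1.8', '1.9', '1.10']
```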

Re: [EXTERNAL] Re: Java Build broken

2020-03-05 Thread Maximilian Michels
Good find, Thomas! It looks like it is for testing releases because they 
are staged to this repository. IMHO there is no need for it to be 
enabled by default.


-Max

On 04.03.20 23:06, Thomas Weise wrote:
I ran into this problem today and found that removing
https://oss.sonatype.org/content/repositories/staging/ from buildSrc/src/main/groovy/org/apache/beam/gradle/Repositories.groovy
also resolves the issue.


Is it possible that a flaky repository can poison the Gradle cache? Do
we need this repository entry at all?
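One quick way to confirm that the staging repository is the problem is to probe it directly for one of the failing artifacts; a minimal sketch using only the Python standard library, where the artifact coordinates are the error_prone ones from further down this thread and the comparison repository URL is an assumption of this sketch:

```python
import urllib.error
import urllib.request

# Probe the same artifact path on both repositories to see which one 404s.
ARTIFACT = ('com/google/errorprone/error_prone_check_api/'
            '2.3.4/error_prone_check_api-2.3.4.pom')
REPOS = [
    'https://oss.sonatype.org/content/repositories/staging/',
    'https://repo.maven.apache.org/maven2/',
]

for repo in REPOS:
    url = repo + ARTIFACT
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            print(resp.status, url)
    except urllib.error.HTTPError as e:
        print(e.code, url)
```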



On Tue, Mar 3, 2020 at 7:06 AM Pulasthi Supun Wickramasinghe wrote:


Thanks, that seems to have fixed the issue.

Best Regards,
Pulasthi

On Tue, Mar 3, 2020 at 5:47 AM Kamil Wasilewski wrote:

I had the same problem; it seems that removing Gradle's cache
(`rm -rf ~/.gradle/caches`) solved the issue.

On Tue, Feb 25, 2020 at 4:33 PM Pulasthi Supun Wickramasinghe wrote:

Hi Stefan,

Yes, I am also still getting this error on my local setup.
Strangely, though, I am not getting it on my laptop. I
tried manually installing the missing 'error_prone'
dependencies into Maven but then got some other error.
Might this be some kind of cache issue?

Best Regards,
Pulasthi

On Tue, Feb 25, 2020 at 5:38 AM Stefan Djelekar wrote:

Hi all,

No, this is not yet fixed.

@Pulasthi do you still get the same error?

@Maximilian I don't have any overrides.

It looks like the localhost build references
https://oss.sonatype.org/content/repositories/staging/com/google/errorprone/error_prone_check_api/2.3.4/
instead of
https://mvnrepository.com/artifact/com.google.errorprone/error_prone_check_api/2.3.4
and the first link returns 404.

Can you please advise?

All the best,

Stefan

*From:* Pulasthi Supun Wickramasinghe
*Sent:* Tuesday, February 18, 2020 5:11 PM
*To:* dev <dev@beam.apache.org>
*Cc:* Stefan Djelekar
*Subject:* [EXTERNAL] Re: Java Build broken

Hi All,

Was this issue resolved? I started to get the same error
on my local build suddenly.

Best Regards,

Pulasthi

    On Thu, Jan 23, 2020 at 10:17 AM Maximilian Michels wrote:

Do you have any overrides in your
~/.m2/settings.xml? The artifacts
should be found as part of Maven Central, e.g.

https://mvnrepository.com/artifact/com.google.errorprone/error_prone_check_api

Cheers,
Max
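To check quickly for such overrides, here is a minimal diagnostic sketch; it assumes the standard settings.xml namespace, and a settings file without that namespace declaration would need the tags matched unqualified:

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Illustrative diagnostic: list any <mirror> or <repository> URLs
# configured in the local Maven settings, since such overrides can
# redirect artifact resolution away from Maven Central.
SETTINGS = Path.home() / '.m2' / 'settings.xml'
NS = {'m': 'http://maven.apache.org/SETTINGS/1.0.0'}

if SETTINGS.exists():
    root = ET.parse(SETTINGS).getroot()
    for tag in ('mirror', 'repository'):
        for node in root.iter('{%s}%s' % (NS['m'], tag)):
            url = node.find('m:url', NS)
            print(tag, url.text if url is not None else '(no url)')
else:
    print('no', SETTINGS, 'found')
```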

On 23.01.20 11:11, Stefan Djelekar wrote:
 > Hi guys,
 >
 > It's been days now since the build for the Java SDK broke.
 >
 > Namely, the pipeline is successful on Jenkins, but it fails on my
 > localhost with an error in the task model:pipeline:compileJava. As
 > I've seen, the last couple of builds were served from the cache, so
 > maybe that is the reason why it's green. I confirmed the same thing
 > happened to other devs as well.
 >
 > 22:49:34 > Task :model:pipeline:compileJava FROM-CACHE
 >
 > It looks like it's related to a version mismatch of the
 > com.google.errorprone library. Can someone please take a look, as
 > this is a blocker for localhost development?
 >
 > Cheers,
 >
 > *Stefan Đelekar*
 >
 > Software Engineer
