Not directly related to the 'flakiness' discussion of this thread, but I think it would be good if pre-commit checks could be run locally without GCP credentials.
On 8/16/21 6:24 PM, Luke Cwik wrote:
The fix was inadvertently run in dry-run mode, so it didn't make any changes. Since the fix was taking a couple of hours or so and it was getting late on Friday, people didn't want to start it again until today (after the weekend).
I don't think removing the few tests that run an unbounded pipeline on Dataflow for the long term is a good idea. Sure, we can disable them when there is an issue blocking folks and re-enable them once it is resolved.
On Mon, Aug 16, 2021 at 9:19 AM Andrew Pilloud <[email protected]> wrote:
The estimated two hours to a fix has long passed and we are now at 18 days since the last successful run. What is the latest estimate?

It sounds like these tests are primarily testing Dataflow, not Beam. They seem like good candidates to remove from the precommit (or limit to Dataflow runner changes) even after they are fixed.
On Fri, Aug 13, 2021 at 6:48 PM Luke Cwik <[email protected]
<mailto:[email protected]>> wrote:
The failure is due to data associated with the apache-beam-testing project, which is impacting all the Dataflow streaming tests.

Yes, disabling the tests should have happened weeks ago if:
1) The fix seemed like it was going to take a long time (unknown at the time)
2) We had confidence in test coverage minus Dataflow streaming test coverage (which I believe we did)
On Fri, Aug 13, 2021 at 6:27 PM Andrew Pilloud <[email protected]> wrote:
Or if a rollback won't fix this, can we disable the broken tests?
On Fri, Aug 13, 2021 at 6:25 PM Andrew Pilloud <[email protected]> wrote:
So you can roll back in two hours. Beam has been broken for two weeks. Why isn't a rollback appropriate?
On Fri, Aug 13, 2021 at 6:06 PM Luke Cwik <[email protected]> wrote:
From the test failures that I have seen, they have been because of BEAM-12676 [1], which is due to a bug impacting Dataflow streaming pipelines for the apache-beam-testing project. The fix is rolling out now from my understanding and should take another 2 hours or so. Rolling back master doesn't seem like what we should be doing at the moment.

1: https://issues.apache.org/jira/projects/BEAM/issues/BEAM-12676
On Fri, Aug 13, 2021 at 5:51 PM Andrew Pilloud <[email protected]> wrote:
Both the Java and Python precommits are reporting the last successful run as being in July (for both Cron and Precommit), so it looks like changes are being submitted without successful test runs. We probably shouldn't be doing that?
https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/
https://ci-beam.apache.org/job/beam_PreCommit_Python_Commit/
https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Cron/
https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Commit/
Is there a plan to get this fixed? Should we roll master back to July?
On Tue, Aug 3, 2021 at 12:24 PM Tyson Hamilton <[email protected]> wrote:
I only realized after sending that I used the IP for the link; that was by accident. Here is the proper domain link:

http://metrics.beam.apache.org/d/D81lW0pmk/post-commit-test-reliability?orgId=1
On Tue, Aug 3, 2021 at 3:22 PM Tyson Hamilton <[email protected]> wrote:
The way I've investigated precommit flake stability is by looking at the 'Post-commit Test Reliability' [1] dashboard (hah!). There is a cron job that runs precommits and, confusingly, those results are tracked in the post-commit dashboard. This week, Java is about 50% green for the pre-commit cron job, which is not great.

The plugin we installed for tracking the most flaky tests for a job doesn't handle the number of tests present in the precommit cron job well. This could be an area of improvement to help add granularity and visibility to the flakiest tests over some period of time.
[1]: http://104.154.241.245/d/D81lW0pmk/post-commit-test-reliability?orgId=1
(look for "PreCommit_Java_Cron")
On Tue, Aug 3, 2021 at 2:24 PM Andrew Pilloud <[email protected]> wrote:
Our metrics show that Java is nearly free from flakes, that Go has significant flakes, and that Python is effectively broken. It appears the metrics may be missing coverage on the Java side. The dashboard is here:

http://104.154.241.245/d/McTAiu0ik/stability-critical-jobs-status?orgId=1
I agree that this is important to address. I haven't submitted any code recently but I spent a significant amount of time on the 2.31.0 release investigating flakes in the release validation tests.
Andrew
On Tue, Aug 3, 2021 at 10:43 AM Reuven Lax <[email protected]> wrote:
I've noticed recently that our precommit tests are getting flakier and flakier. Recently I had to run Java PreCommit 5 times before I was able to get a clean run. This is frustrating for us as developers, but it is also extremely wasteful of our compute resources.
I started making a list of the flaky tests I've seen. Here are some of the ones I've dealt with just in the past few days; this is not nearly an exhaustive list - I've seen many others before I started recording them. Of the below, failures in ElasticsearchIOTest are by far the most common!
We need to try to make these tests not flaky. Barring that, I think the extremely flaky tests need to be excluded from our presubmit until they can be fixed. Rerunning the precommit over and over again until green is not a good testing strategy.
* org.apache.beam.runners.flink.ReadSourcePortableTest.testExecution[streaming: false]
  <https://ci-beam.apache.org/job/beam_PreCommit_Java_Phrase/3901/testReport/junit/org.apache.beam.runners.flink/ReadSourcePortableTest/testExecution_streaming__false_/>
* org.apache.beam.sdk.io.jms.JmsIOTest.testCheckpointMarkSafety
  <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/18485/testReport/junit/org.apache.beam.sdk.io.jms/JmsIOTest/testCheckpointMarkSafety/>
* org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInFinishBundleStateful
  <https://ci-beam.apache.org/job/beam_PreCommit_Java_Phrase/3903/testReport/junit/org.apache.beam.sdk.transforms/ParDoLifecycleTest/testTeardownCalledAfterExceptionInFinishBundleStateful/>
* org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest.testSplit
  <https://ci-beam.apache.org/job/beam_PreCommit_Java_Phrase/3903/testReport/junit/org.apache.beam.sdk.io.elasticsearch/ElasticsearchIOTest/testSplit/>
* org.apache.beam.sdk.io.gcp.datastore.RampupThrottlingFnTest.testRampupThrottler
  <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/18501/testReport/junit/org.apache.beam.sdk.io.gcp.datastore/RampupThrottlingFnTest/testRampupThrottler/>
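For illustration only, here is a minimal sketch of what temporarily excluding one of the tests above could look like with JUnit 4's @Ignore annotation until the underlying flake is fixed; the choice of test and the reason string are assumptions for the example, not an actual change in the repository:

    import org.junit.Ignore;
    import org.junit.Test;

    public class JmsIOTest {

      // Hypothetical: skip the known-flaky test and point at the tracking issue,
      // then remove the annotation once the flake is resolved.
      @Ignore("Flaky checkpoint test; re-enable once the tracking JIRA is fixed")
      @Test
      public void testCheckpointMarkSafety() {
        // ... existing test body ...
      }
    }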