Flaky test issue report (51)
This is your daily summary of Beam's current flaky tests (https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done%20AND%20labels%20%3D%20flake). These are P1 issues because they have a major negative impact on the community and make it hard to determine the quality of the software.

https://issues.apache.org/jira/browse/BEAM-14172: beam_PreCommit_PythonDocs failing (jinja2) (created 2022-03-24)
https://issues.apache.org/jira/browse/BEAM-13952: Dataflow streaming tests failing new AfterSynchronizedProcessingTime test (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13859: Test flake: test_split_half_sdf (created 2022-02-09)
https://issues.apache.org/jira/browse/BEAM-13850: beam_PostCommit_Python_Examples_Dataflow failing (created 2022-02-08)
https://issues.apache.org/jira/browse/BEAM-13822: GBK and CoGBK streaming Java load tests failing (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13810: Flaky tests: Gradle build daemon disappeared unexpectedly (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13809: beam_PostCommit_XVR_Flink flaky: Connection refused (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13797: Flakes: Failed to load cache entry (created 2022-02-01)
https://issues.apache.org/jira/browse/BEAM-13783: apache_beam.transforms.combinefn_lifecycle_test.LocalCombineFnLifecycleTest.test_combine is flaky (created 2022-02-01)
https://issues.apache.org/jira/browse/BEAM-13741: :sdks:java:extensions:sql:hcatalog:compileJava failing in beam_Release_NightlySnapshot (created 2022-01-25)
https://issues.apache.org/jira/browse/BEAM-13708: flake: FlinkRunnerTest.testEnsureStdoutStdErrIsRestored (created 2022-01-20)
https://issues.apache.org/jira/browse/BEAM-13575: Flink testParDoRequiresStableInput flaky (created 2021-12-28)
https://issues.apache.org/jira/browse/BEAM-13519: Java precommit flaky (timing out) (created 2021-12-22)
https://issues.apache.org/jira/browse/BEAM-13500: NPE in Flink Portable ValidatesRunner streaming suite (created 2021-12-21)
https://issues.apache.org/jira/browse/BEAM-13453: Flake in org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in use (created 2021-12-13)
https://issues.apache.org/jira/browse/BEAM-13393: GroupIntoBatchesTest is failing (created 2021-12-07)
https://issues.apache.org/jira/browse/BEAM-13367: [beam_PostCommit_Python36] [ apache_beam.io.gcp.experimental.spannerio_read_it_test] Failure summary (created 2021-12-01)
https://issues.apache.org/jira/browse/BEAM-13312: org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundle is flaky in Java Spark ValidatesRunner suite (created 2021-11-23)
https://issues.apache.org/jira/browse/BEAM-13311: org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful is flaky in Java ValidatesRunner Flink suite. (created 2021-11-23)
https://issues.apache.org/jira/browse/BEAM-13237: org.apache.beam.sdk.transforms.CombineTest$WindowingTests.testWindowedCombineGloballyAsSingletonView flaky on Dataflow Runner V2 (created 2021-11-12)
https://issues.apache.org/jira/browse/BEAM-13234: Flake in StreamingWordCountIT.test_streaming_wordcount_it (created 2021-11-12)
https://issues.apache.org/jira/browse/BEAM-13025: pubsublite.ReadWriteIT flaky in beam_PostCommit_Java_DataflowV2 (created 2021-10-08)
https://issues.apache.org/jira/browse/BEAM-12928: beam_PostCommit_Python36 - CrossLanguageSpannerIOTest - flakey failing (created 2021-09-21)
https://issues.apache.org/jira/browse/BEAM-12859: org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer is flaky (created 2021-09-08)
https://issues.apache.org/jira/browse/BEAM-12858: org.apache.beam.sdk.io.gcp.datastore.RampupThrottlingFnTest.testRampupThrottler is flaky (created 2021-09-08)
https://issues.apache.org/jira/browse/BEAM-12809: testTwoTimersSettingEachOtherWithCreateAsInputBounded flaky (created 2021-08-26)
https://issues.apache.org/jira/browse/BEAM-12794: PortableRunnerTestWithExternalEnv.test_pardo_timers flaky (created 2021-08-24)
https://issues.apache.org/jira/browse/BEAM-12793: beam_PostRelease_NightlySnapshot failed (created 2021-08-24)
https://issues.apache.org/jira/browse/BEAM-12766: Already Exists: Dataset apache-beam-testing:python_bq_file_loads_NNN (created 2021-08-16)
https://issues.apache.org/jira/browse/BEAM-12673: apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT.test_streaming_wordcount_it flakey (created 2021-07-28)
https://issues.apache.org/jira/browse/BEAM-12515: Python PreCommit flaking in PipelineOptionsTest.test_display_data (created 2021-06-18)
https://issues.apache.org/jira/browse/BEAM-12322: Python precommit
P1 issues report (73)
This is your daily summary of Beam's current P1 issues, not including flaky tests (https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done%20AND%20priority%20%3D%20P1%20AND%20(labels%20is%20EMPTY%20OR%20labels%20!%3D%20flake)). See https://beam.apache.org/contribute/jira-priorities/#p1-critical for the meaning and expectations around P1 issues.

https://issues.apache.org/jira/browse/BEAM-14191: CrossLanguageJdbcIOTest broken with "Cannot load JDBC driver class 'com.mysql.cj.jdbc.Driver'" (created 2022-03-28)
https://issues.apache.org/jira/browse/BEAM-14181: BQ: Storage API Sink reuses closed connections (created 2022-03-25)
https://issues.apache.org/jira/browse/BEAM-14171: CoGroupByKey loses values with large groups on Dataflow v1 (created 2022-03-24)
https://issues.apache.org/jira/browse/BEAM-14138: Python PostCommit BQ test failures due to NOT_FOUND for Dataset (created 2022-03-21)
https://issues.apache.org/jira/browse/BEAM-14135: BigQuery Storage API insert with writeResult retry and write to error table (created 2022-03-20)
https://issues.apache.org/jira/browse/BEAM-14126: Python 3.10 Support (created 2022-03-18)
https://issues.apache.org/jira/browse/BEAM-14064: ElasticSearchIO#Write buffering and outputting across windows (created 2022-03-07)
https://issues.apache.org/jira/browse/BEAM-14017: beam_PreCommit_CommunityMetrics_Cron is failing. (created 2022-03-01)
https://issues.apache.org/jira/browse/BEAM-13953: Document BigQueryIO Storage Write API methods (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13952: Dataflow streaming tests failing new AfterSynchronizedProcessingTime test (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13950: PVR_Spark2_Streaming perma-red (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13920: Beam x-lang Dataflow tests failing due to _InactiveRpcError (created 2022-02-10)
https://issues.apache.org/jira/browse/BEAM-13852: KafkaIO.read.withDynamicRead() doesn't pick up new TopicPartitions (created 2022-02-08)
https://issues.apache.org/jira/browse/BEAM-13850: beam_PostCommit_Python_Examples_Dataflow failing (created 2022-02-08)
https://issues.apache.org/jira/browse/BEAM-13830: XVR Direct/Spark/Flink tests are timing out (created 2022-02-04)
https://issues.apache.org/jira/browse/BEAM-13822: GBK and CoGBK streaming Java load tests failing (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13805: Simplify version override for Dev versions of the Go SDK. (created 2022-02-02)
https://issues.apache.org/jira/browse/BEAM-13798: Upgrade Kubernetes Clusters (created 2022-02-01)
https://issues.apache.org/jira/browse/BEAM-13747: Add integration testing for BQ Storage API write modes (created 2022-01-26)
https://issues.apache.org/jira/browse/BEAM-13741: :sdks:java:extensions:sql:hcatalog:compileJava failing in beam_Release_NightlySnapshot (created 2022-01-25)
https://issues.apache.org/jira/browse/BEAM-13715: Kafka commit offset drop data on failure for runners that have non-checkpointing shuffle (created 2022-01-21)
https://issues.apache.org/jira/browse/BEAM-13582: Beam website precommit mentions broken links, but passes. (created 2021-12-30)
https://issues.apache.org/jira/browse/BEAM-13487: WriteToBigQuery Dynamic table destinations returns wrong tableId (created 2021-12-17)
https://issues.apache.org/jira/browse/BEAM-13393: GroupIntoBatchesTest is failing (created 2021-12-07)
https://issues.apache.org/jira/browse/BEAM-13164: Race between member variable being accessed due to leaking uninitialized state via OutboundObserverFactory (created 2021-11-01)
https://issues.apache.org/jira/browse/BEAM-13132: WriteToBigQuery submits a duplicate BQ load job if a 503 error code is returned from googleapi (created 2021-10-27)
https://issues.apache.org/jira/browse/BEAM-13087: apache_beam.runners.portability.fn_api_runner.translations_test.TranslationsTest.test_run_packable_combine_globally 'apache_beam.coders.coder_impl._AbstractIterable' object is not reversible (created 2021-10-20)
https://issues.apache.org/jira/browse/BEAM-13078: Python DirectRunner does not emit data at GC time (created 2021-10-18)
https://issues.apache.org/jira/browse/BEAM-13076: Python AfterAny, AfterAll do not follow spec (created 2021-10-18)
https://issues.apache.org/jira/browse/BEAM-13010: Delete orphaned files (created 2021-10-06)
https://issues.apache.org/jira/browse/BEAM-12995: Consumer group with random prefix (created 2021-10-04)
https://issues.apache.org/jira/browse/BEAM-12959: Dataflow error in CombinePerKey operation (created 2021-09-26)
https://issues.apache.org/jira/browse/BEAM-12867: Either Create or DirectRunner fails to produce all elements to the following transform (created 2021-09-09)
Re: [DISCUSS] Migrate Jira to GitHub Issues?
Thank you for putting this together, Danny! I can help with the label creation task. Anyone else want to help?

On Thu, Mar 17, 2022 at 11:55 AM Danny McCormick wrote:
> Here's a spreadsheet to sign up if you'd like to help with the migration!
> https://docs.google.com/spreadsheets/d/1hqztI7ECf8NjcmfQ8ZfUj6OU0U-6orJzhG6OMufmTFE/edit?usp=sharing
>
> Thanks,
> Danny
>
> On Thu, Mar 17, 2022 at 1:59 PM Danny McCormick wrote:
>
>> Hey everyone,
>>
>> Aizhamal is currently out for a little bit and asked me to start to put together a more detailed plan for migrating from Jira to GitHub since we seem to have consensus here (or close to it). Here's my proposal on a plan to migrate - https://docs.google.com/document/d/1powrXGbjMLMYl9ibRzMda5o5HM_p44XvBy5MZu75Q5E/edit?usp=sharing - I'd really appreciate any feedback or recommendations you have! In particular, I imagine people will have thoughts on the plan to migrate Jiras to Issues - I included that as a section and think it's worth it, but others may disagree (or disagree on the fields we care about keeping).
>>
>> If anyone is interested in helping with the migration itself, please chime in as well! We will almost certainly need PMC help for some of the settings level work, but there's also a decent bit of parallelizable work available to update templates/documentation, update automation, and help build/design the issue migrator!
>>
>> Thanks,
>> Danny
>>
>> On Thu, Feb 17, 2022 at 5:28 PM Sachin Agarwal wrote:
>>
>>> Thank you! I believe the benefits to make it easier for folks to contribute to Beam will pay significant dividends quickly.
>>>
>>> On Thu, Feb 17, 2022 at 2:09 PM Aizhamal Nurmamat kyzy <aizha...@apache.org> wrote:
>>>
Awesome, thanks for the feedback everyone. Then I will go ahead, and start documenting the plan in detail and share it here afterwards.

On Tue, Feb 15, 2022 at 3:17 PM Alexey Romanenko <aromanenko@gmail.com> wrote:
> First of all, many thanks for putting the details into this design doc and sorry for delay with my response.
>
> I’m still quite neutral with this migration because of several concerns:
>
> - Imho, Github Issues is still not well enough mature as an issue tracker and it doesn’t provide the solutions for all needs as, for example, Jira and other trackers do (though, seems that there are many features upcoming). For example, many things in GH Issues still can be resolved only with “labels” and we can potentially end up with a huge bunch of them with a different naming policy, mixed purposes and so on.
>
> - If we won’t do a transfer of the issues/users/filters/etc from Jira to GH Issues then, it looks, that we will live with two trackers for some (unknown) amount of time which is not very convenient (I believe that we need to specify our workflows with having this).
>
> - If we do a transfer then what kind of tools are going to be used, how much time it will take - so, we’d need a detailed plan on this.
>
> On the other positive hand, for sure, GH Issues has, by design, a solid integration with other Github services which is, obviously, a huge advantage for the long term as well.
>
> In any case, adding (or substituting) a new tool should help us to make the development process, in general, easier and faster. So I hope we can achieve this with Github Issues.
>
> --
> Alexey
>
> On 15 Feb 2022, at 06:52, Aizhamal Nurmamat kyzy wrote:
>
> Very humbly, I think the benefits of moving to GitHub Issues outweigh the shortcomings.
>
> Jan, Kenn, Alexey, JB: adding you directly as you had some concerns. Please, let us know if they were addressed by the options that we described in the doc [1]?
>
> If no one objects, I can start working with some of you on Migration TODOs outlined in the doc I am referencing.
>
> [1] https://docs.google.com/document/d/1_n7gboVbSKPs-CVcHzADgg8qpNL9igiHqUPCmiOslf0/edit#bookmark=id.izn35w5gsjft
>
> On Thu, Feb 10, 2022 at 1:12 PM Danny McCormick <dannymccorm...@google.com> wrote:
>
>> I'm definitely +1 on moving to help make the bar for entry lower for new contributors (like myself!)
>>
>> Thanks,
>> Danny
>>
>> On Thu, Feb 10, 2022 at 2:32 PM Aizhamal Nurmamat kyzy <aizha...@apache.org> wrote:
>>
>>> Hi all,
>>>
>>> I think we've had a chance to discuss shortcomings and advantages. I think each person may have a different bias / preference. My bias is to move to Github, to have a more inclusive, approachable project despite the differences in workflow. So I'm +1 on moving.
>>>
>>> Could others share their bias? Don't think of this as a vote, but
Re: Updating output watermark on bundle boundaries
> Yes, outputWithTimestamp should likely be restricted to min(elements seen so far).

Am I understanding correctly that in terms of options for immediate fixes, given that some runners such as Flink have only ad hoc bundles, the most feasible way to enforce that "min(elements seen so far)" restriction would be to enforce that watermark updates may only happen at bundle boundaries? Are there other immediate term options?

- Evan

On Tue, Mar 29, 2022 at 4:11 AM Jan Lukavský wrote:

> > There's another interesting API, which is being discussed for the internal variant of Dataflow, which is that rather than allowing one to fabricate timestamps (or windows) ex nihilo one would instead need to ask for a "timestamped" or "windowed" element in the Process method, from which one could construct a new timestamped/windowed element (with a new value, but the same timestamp/window/paneinfo) that could then be safely emitted. I'm curious how constraining this would be.
>
> I'm not sure I follow. Do you suggest that - for the case of in-memory batching - one would store a TimestampedValue in the buffer and when flushing the buffer one would say "I'm emitting this value, that was created based on this input element"? That seems to work fine, though I suppose this is probably not the main motivation for such API. :)
>
> On 3/28/22 20:54, Robert Bradshaw wrote:
> > On Mon, Mar 28, 2022 at 11:45 AM Jan Lukavský wrote:
> >> On 3/28/22 20:17, Reuven Lax wrote:
> >>
> >> On Mon, Mar 28, 2022 at 11:08 AM Robert Bradshaw wrote:
> >>> On Mon, Mar 28, 2022 at 11:04 AM Reuven Lax wrote:
> On Mon, Mar 28, 2022 at 10:59 AM Evan Galpin wrote:
>
> I don't believe that the issue is Flink specific but rather that Flink is one example of many potential examples. Enforcing that watermark updates can only happen at bundle boundaries would ensure that any data buffered while processing a single bundle in a DoFn could be output ON_TIME, especially without any need for a TimerSpec to explicitly hold the watermark for that purpose. This is in reference to data buffered within a single bundle, and not cross-bundle buffering such as in the case of GroupIntoBatches.
>
> Any in-flight data (i.e. data being processed that is not yet committed back to the runner) must hold up the output watermark. Since in the Beam model all records in a bundle are somewhat atomic (e.g. if the bundle succeeds, none of them should be replayed in a proper exactly-once runner), I think this implicitly means that any elements in an in-flight bundle must hold up the watermark. This doesn't mean that the watermark can't advance while the bundle is in flight, just that it can't advance past any of the timestamps outstanding in the bundle.
> >>> Yes. The difficulty is that we don't have much visibility into "timestamps outstanding in the bundle" so we have to take min(timestamps of input elements in the bundle) which is not that different from only having watermark updates at bundle boundaries.
> >>
> >> Exactly.
> >>
> >> Agree, this works exactly the same. The requirement is not to not update the watermark, but not to update it past any on-time element in the bundle. Not updating the watermark at all is one solution, computing min(timestamps in bundle) works the same. Unfortunately, Flink does not construct bundles in advance, it is more an ad-hoc concept. Therefore the only way to hold the watermark is not to update it, because the timestamps of elements that will be part of the bundle are not known.
> >>
> >> Two more questions:
> >>
> >> a) it seems that we are missing some @ValidatesRunner tests for this, right?
> >>
> >> b) should we relax the restriction of not allowing outputWithTimestamp() output element before the current element? I think it should be "before lowest element in the current bundle" or "before output watermark, if not already late, or not droppable if late (uh, this gets a little complicated :))". Not allowing outputting element with timestamp lower than the current element seems to be just a "safety-first" solution to the problem discussed here and is too restrictive. It could be worked-around using getAllowedTimestampSkew(), but that can cause errors.
> > Yes, outputWithTimestamp should likely be restricted to min(elements seen so far).
> >
> > There's another interesting API, which is being discussed for the internal variant of Dataflow, which is that rather than allowing one to fabricate timestamps (or windows) ex nihilo one would instead need to ask for a "timestamped" or "windowed" element in the Process method, from which one could construct a new timestamped/windowed element (with a new value, but the same timestamp/window/paneinfo) that could then be safely emitted. I'm curious how constraining this would be.
> >
> > Take for example a PCollection with 1 second
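
The hold rule discussed in this thread can be sketched as a toy model. This is only an illustration under stated assumptions, not Beam's or any runner's implementation: timestamps are plain integers, and the names `InFlightBundle`, `output_watermark`, and `can_output_with_timestamp` are hypothetical. It shows the output watermark following the input watermark without passing the smallest timestamp in an uncommitted bundle, and an output timestamp being accepted only if it is not earlier than min(elements seen so far).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InFlightBundle:
    """Hypothetical toy model: timestamps of elements read but not yet committed."""
    seen_timestamps: List[int] = field(default_factory=list)

    def process(self, timestamp: int) -> None:
        # Record the timestamp of an input element handed to the DoFn.
        self.seen_timestamps.append(timestamp)

    def commit(self) -> None:
        # Bundle boundary: the outputs are committed, so the hold is released.
        self.seen_timestamps.clear()


def output_watermark(input_watermark: int, bundle: InFlightBundle) -> int:
    """Advance with the input watermark, but never past an in-flight element."""
    if not bundle.seen_timestamps:
        return input_watermark
    return min(input_watermark, min(bundle.seen_timestamps))


def can_output_with_timestamp(timestamp: int, bundle: InFlightBundle) -> bool:
    """Assumed restriction: allow timestamps >= min(elements seen so far)."""
    if not bundle.seen_timestamps:
        return False
    return timestamp >= min(bundle.seen_timestamps)


bundle = InFlightBundle()
for ts in (12, 7, 9):
    bundle.process(ts)

held = output_watermark(20, bundle)      # held back by the element at 7
bundle.commit()
released = output_watermark(20, bundle)  # nothing in flight any more
```

A runner that cannot know in advance which elements will join a bundle, as described for Flink above, has no `seen_timestamps` to take the minimum over ahead of time, which is why not advancing the watermark until the bundle boundary is the fallback in that case.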
How to contribute to BEAM doco
Hi guys,

I found an error in the documentation for the withExtendedErrorInfo method. The docs say it is only compatible with STREAMING_INSERTS, but it also works with STORAGE_WRITE_API.

Also, is this email address the best way to contribute to the documentation?

Thanks,
Andres Walsh