This is your daily summary of Beam's current high priority issues that may need attention.
See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/23306 [Bug]: BigQueryBatchFileLoads in python loses data when using WRITE_TRUNCATE https://github.com/apache/beam/issues/23179 [Bug]: Parquet size exploded for no apparent reason https://github.com/apache/beam/issues/22913 [Bug]: beam_PostCommit_Java_ValidatesRunner_Flink is flakes in org.apache.beam.sdk.transforms.GroupByKeyTest$BasicTests.testAfterProcessingTimeContinuationTriggerUsingState https://github.com/apache/beam/issues/22303 [Task]: Add tests to Kafka SDF and fix known and discovered issues https://github.com/apache/beam/issues/22299 [Bug]: JDBCIO Write freeze at getConnection() in WriteFn https://github.com/apache/beam/issues/21794 Dataflow runner creates a new timer whenever the output timestamp is change https://github.com/apache/beam/issues/21713 404s in BigQueryIO don't get output to Failed Inserts PCollection https://github.com/apache/beam/issues/21704 beam_PostCommit_Java_DataflowV2 failures parent bug https://github.com/apache/beam/issues/21701 beam_PostCommit_Java_DataflowV1 failing with a variety of flakes and errors https://github.com/apache/beam/issues/21700 --dataflowServiceOptions=use_runner_v2 is broken https://github.com/apache/beam/issues/21696 Flink Tests failure : java.lang.NoClassDefFoundError: Could not initialize class org.apache.beam.runners.core.construction.SerializablePipelineOptions https://github.com/apache/beam/issues/21695 DataflowPipelineResult does not raise exception for unsuccessful states. https://github.com/apache/beam/issues/21480 flake: FlinkRunnerTest.testEnsureStdoutStdErrIsRestored https://github.com/apache/beam/issues/21472 Dataflow streaming tests failing new AfterSynchronizedProcessingTime test https://github.com/apache/beam/issues/21471 Flakes: Failed to load cache entry https://github.com/apache/beam/issues/21470 Test flake: test_split_half_sdf https://github.com/apache/beam/issues/21469 beam_PostCommit_XVR_Flink flaky: Connection refused https://github.com/apache/beam/issues/21468 beam_PostCommit_Python_Examples_Dataflow failing https://github.com/apache/beam/issues/21467 GBK and CoGBK streaming Java load tests failing https://github.com/apache/beam/issues/21463 NPE in Flink Portable ValidatesRunner streaming suite https://github.com/apache/beam/issues/21462 Flake in org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in use https://github.com/apache/beam/issues/21271 pubsublite.ReadWriteIT flaky in beam_PostCommit_Java_DataflowV2 https://github.com/apache/beam/issues/21270 org.apache.beam.sdk.transforms.CombineTest$WindowingTests.testWindowedCombineGloballyAsSingletonView flaky on Dataflow Runner V2 https://github.com/apache/beam/issues/21266 org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful is flaky in Java ValidatesRunner Flink suite. https://github.com/apache/beam/issues/21262 Python AfterAny, AfterAll do not follow spec https://github.com/apache/beam/issues/21261 org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer is flaky https://github.com/apache/beam/issues/21260 Python DirectRunner does not emit data at GC time https://github.com/apache/beam/issues/21257 Either Create or DirectRunner fails to produce all elements to the following transform https://github.com/apache/beam/issues/21123 Multiple jobs running on Flink session cluster reuse the persistent Python environment. https://github.com/apache/beam/issues/21121 apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT.test_streaming_wordcount_it flakey https://github.com/apache/beam/issues/21118 PortableRunnerTestWithExternalEnv.test_pardo_timers flaky https://github.com/apache/beam/issues/21114 Already Exists: Dataset apache-beam-testing:python_bq_file_loads_NNN https://github.com/apache/beam/issues/21113 testTwoTimersSettingEachOtherWithCreateAsInputBounded flaky https://github.com/apache/beam/issues/21111 Java creates an incorrect pipeline proto when core-construction-java jar is not in the CLASSPATH https://github.com/apache/beam/issues/20981 Python precommit flaky: Failed to read inputs in the data plane https://github.com/apache/beam/issues/20977 SamzaStoreStateInternalsTest is flaky https://github.com/apache/beam/issues/20976 apache_beam.runners.portability.flink_runner_test.FlinkRunnerTestOptimized.test_flink_metrics is flaky https://github.com/apache/beam/issues/20975 org.apache.beam.runners.flink.ReadSourcePortableTest.testExecution[streaming: false] is flaky https://github.com/apache/beam/issues/20974 Python GHA PreCommits flake with grpc.FutureTimeoutError on SDK harness startup https://github.com/apache/beam/issues/20817 Bigquery Read tests are flaky on Flink runner in Python PostCommit suites https://github.com/apache/beam/issues/20815 testTeardownCalledAfterExceptionInProcessElement flakes on direct runner. https://github.com/apache/beam/issues/20689 Kafka commitOffsetsInFinalize OOM on Flink https://github.com/apache/beam/issues/20528 python CombineGlobally().with_fanout() cause duplicate combine results for sliding windows https://github.com/apache/beam/issues/20109 SortValues should fail if SecondaryKey coder is not deterministic https://github.com/apache/beam/issues/20108 Python direct runner doesn't emit empty pane when it should https://github.com/apache/beam/issues/19816 MetricsTest$AttemptedMetricTests.testAllAttemptedMetrics is flaky on DirectRunner https://github.com/apache/beam/issues/19814 Flakes in ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundleStateful for Direct, Spark, Flink P1 Issues with no update in the last week: https://github.com/apache/beam/issues/23022 [Bug]: PubsubIO does not consider attributes as part of the limit https://github.com/apache/beam/issues/22969 Discrepancy in behavior of `DoFn.process()` when `yield` is combined with `return` statement, or vice versa https://github.com/apache/beam/issues/22605 [Bug]: Beam Python failure for dataflow_exercise_metrics_pipeline_test.ExerciseMetricsPipelineTest.test_metrics_it https://github.com/apache/beam/issues/22449 [Bug]: Home page. The button "Link to GitHub Repo" is disappeared when the screen size less than 1024px x 721.6px https://github.com/apache/beam/issues/22288 [Bug]: BigQueryIO throws IllegalStateException if dynamicDestionations.getSchema returns null https://github.com/apache/beam/issues/22192 [Bug]: NullPointerException when copying files using FileSystems.copy and setting StandardMoveOptions.SKIP_IF_DESTINATION_EXISTS https://github.com/apache/beam/issues/22011 [Bug]: org.apache.beam.sdk.io.aws2.kinesis.KinesisIOWriteTest.testWriteFailure flaky https://github.com/apache/beam/issues/22010 [Bug]: org.apache.beam.runners.flink.FlinkRunnerTest.testEnsureStdoutStdErrIsRestored flaky https://github.com/apache/beam/issues/22009 [Bug]: org.apache.beam.examples.subprocess.ExampleEchoPipelineTest.testExampleEchoPipeline flaky https://github.com/apache/beam/issues/21893 [Bug]: BigQuery Storage Write API implementation does not support table partitioning https://github.com/apache/beam/issues/21714 PulsarIOTest.testReadFromSimpleTopic is very flaky https://github.com/apache/beam/issues/21711 Python Streaming job failing to drain with BigQueryIO write errors https://github.com/apache/beam/issues/21709 beam_PostCommit_Java_ValidatesRunner_Samza Failing https://github.com/apache/beam/issues/21708 beam_PostCommit_Java_DataflowV2, testBigQueryStorageWrite30MProto failing consistently https://github.com/apache/beam/issues/21707 GroupByKeyTest BasicTests testLargeKeys100MB flake (on ULR) https://github.com/apache/beam/issues/21706 Flaky timeout in github Python unit test action StatefulDoFnOnDirectRunnerTest.test_dynamic_timer_clear_then_set_timer https://github.com/apache/beam/issues/21476 WriteToBigQuery Dynamic table destinations returns wrong tableId https://github.com/apache/beam/issues/21475 Beam x-lang Dataflow tests failing due to _InactiveRpcError https://github.com/apache/beam/issues/21474 Flaky tests: Gradle build daemon disappeared unexpectedly https://github.com/apache/beam/issues/21473 PVR_Spark2_Streaming perma-red https://github.com/apache/beam/issues/21465 Kafka commit offset drop data on failure for runners that have non-checkpointing shuffle https://github.com/apache/beam/issues/20814 JmsIO is not acknowledging messages correctly https://github.com/apache/beam/issues/20812 Cross-language consistency (RequiresStableInputs) is quietly broken (at least on portable flink runner) https://github.com/apache/beam/issues/20333 beam_PerformanceTests_Kafka_IO failing due to " provided port is already allocated"