This is your daily summary of Beam's current high priority issues that may need
attention.
See https://beam.apache.org/contribute/issue-priorities for the meaning and
expectations around issue priorities.
Unassigned P0 Issues:
https://github.com/apache/beam/issues/22606 [Bug]: The pipeline fails when
trying to write to kafka
Unassigned P1 Issues:
https://github.com/apache/beam/issues/22676 [Bug]: Seed job failing to access
LDAP
https://github.com/apache/beam/issues/22642 [Bug]: Dataflow fails to drain a
job when using BigQuery (java sdk v.2.38)
https://github.com/apache/beam/issues/22440 [Bug]: Python Batch Dataflow
SideInput LoadTests failing
https://github.com/apache/beam/issues/22321
PortableRunnerTestWithExternalEnv.test_pardo_large_input is regularly failing
on jenkins
https://github.com/apache/beam/issues/22303 [Task]: Add tests to Kafka SDF and
fix known and discovered issues
https://github.com/apache/beam/issues/22299 [Bug]: JDBCIO Write freeze at
getConnection() in WriteFn
https://github.com/apache/beam/issues/22283 [Bug]: Python Lots of fn runner
test items cost exactly 5 seconds to run
https://github.com/apache/beam/issues/21794 Dataflow runner creates a new timer
whenever the output timestamp is change
https://github.com/apache/beam/issues/21713 404s in BigQueryIO don't get output
to Failed Inserts PCollection
https://github.com/apache/beam/issues/21704 beam_PostCommit_Java_DataflowV2
failures parent bug
https://github.com/apache/beam/issues/21703 pubsublite.ReadWriteIT failing in
beam_PostCommit_Java_DataflowV1 and V2
https://github.com/apache/beam/issues/21702 SpannerWriteIT failing in beam
PostCommit Java V1
https://github.com/apache/beam/issues/21701 beam_PostCommit_Java_DataflowV1
failing with a variety of flakes and errors
https://github.com/apache/beam/issues/21700
--dataflowServiceOptions=use_runner_v2 is broken
https://github.com/apache/beam/issues/21696 Flink Tests failure :
java.lang.NoClassDefFoundError: Could not initialize class
org.apache.beam.runners.core.construction.SerializablePipelineOptions
https://github.com/apache/beam/issues/21695 DataflowPipelineResult does not
raise exception for unsuccessful states.
https://github.com/apache/beam/issues/21694 BigQuery Storage API insert with
writeResult retry and write to error table
https://github.com/apache/beam/issues/21480 flake:
FlinkRunnerTest.testEnsureStdoutStdErrIsRestored
https://github.com/apache/beam/issues/21472 Dataflow streaming tests failing
new AfterSynchronizedProcessingTime test
https://github.com/apache/beam/issues/21471 Flakes: Failed to load cache entry
https://github.com/apache/beam/issues/21470 Test flake: test_split_half_sdf
https://github.com/apache/beam/issues/21469 beam_PostCommit_XVR_Flink flaky:
Connection refused
https://github.com/apache/beam/issues/21468
beam_PostCommit_Python_Examples_Dataflow failing
https://github.com/apache/beam/issues/21467 GBK and CoGBK streaming Java load
tests failing
https://github.com/apache/beam/issues/21465 Kafka commit offset drop data on
failure for runners that have non-checkpointing shuffle
https://github.com/apache/beam/issues/21463 NPE in Flink Portable
ValidatesRunner streaming suite
https://github.com/apache/beam/issues/21462 Flake in
org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in use
https://github.com/apache/beam/issues/21271 pubsublite.ReadWriteIT flaky in
beam_PostCommit_Java_DataflowV2
https://github.com/apache/beam/issues/21270
org.apache.beam.sdk.transforms.CombineTest$WindowingTests.testWindowedCombineGloballyAsSingletonView
flaky on Dataflow Runner V2
https://github.com/apache/beam/issues/21268 Race between member variable being
accessed due to leaking uninitialized state via OutboundObserverFactory
https://github.com/apache/beam/issues/21267 WriteToBigQuery submits a duplicate
BQ load job if a 503 error code is returned from googleapi
https://github.com/apache/beam/issues/21266
org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful
is flaky in Java ValidatesRunner Flink suite.
https://github.com/apache/beam/issues/21265
apache_beam.runners.portability.fn_api_runner.translations_test.TranslationsTest.test_run_packable_combine_globally
'apache_beam.coders.coder_impl._AbstractIterable' object is not reversible
https://github.com/apache/beam/issues/21263 (Broken Pipe induced) Bricked
Dataflow Pipeline
https://github.com/apache/beam/issues/21262 Python AfterAny, AfterAll do not
follow spec
https://github.com/apache/beam/issues/21261
org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer
is flaky
https://github.com/apache/beam/issues/21260 Python DirectRunner does not emit
data at GC time
https://github.com/apache/beam/issues/21257 Either Create or DirectRunner fails
to produce all elements to the following transform
https://github.com/apache/beam/issues/21123 Multiple jobs running on Flink
session cluster reuse the persistent Python environment.
https://github.com/apache/beam/issues/21121
apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT.test_streaming_wordcount_it
flakey
https://github.com/apache/beam/issues/21118
PortableRunnerTestWithExternalEnv.test_pardo_timers flaky
https://github.com/apache/beam/issues/21117 "Java IO IT Tests" - missing data
in grafana
https://github.com/apache/beam/issues/21115 JdbcIO date conversion is sensitive
to OS
https://github.com/apache/beam/issues/21114 Already Exists: Dataset
apache-beam-testing:python_bq_file_loads_NNN
https://github.com/apache/beam/issues/21113
testTwoTimersSettingEachOtherWithCreateAsInputBounded flaky
https://github.com/apache/beam/issues/21111 Java creates an incorrect pipeline
proto when core-construction-java jar is not in the CLASSPATH
https://github.com/apache/beam/issues/21109 SDF BoundedSource seems to execute
significantly slower than 'normal' BoundedSource
https://github.com/apache/beam/issues/21108 java.io.InvalidClassException With
Flink Kafka
https://github.com/apache/beam/issues/20981 Python precommit flaky: Failed to
read inputs in the data plane
https://github.com/apache/beam/issues/20979 Portable runners should be able to
issue checkpoints to Splittable DoFn
https://github.com/apache/beam/issues/20978 PubsubIO.readAvroGenericRecord
creates SchemaCoder that fails to decode some Avro logical types
https://github.com/apache/beam/issues/20977 SamzaStoreStateInternalsTest is
flaky
https://github.com/apache/beam/issues/20976
apache_beam.runners.portability.flink_runner_test.FlinkRunnerTestOptimized.test_flink_metrics
is flaky
https://github.com/apache/beam/issues/20975
org.apache.beam.runners.flink.ReadSourcePortableTest.testExecution[streaming:
false] is flaky
https://github.com/apache/beam/issues/20974 Python GHA PreCommits flake with
grpc.FutureTimeoutError on SDK harness startup
https://github.com/apache/beam/issues/20818 XmlIO.Read does not handle XML
encoding per spec
https://github.com/apache/beam/issues/20817 Bigquery Read tests are flaky on
Flink runner in Python PostCommit suites
https://github.com/apache/beam/issues/20815
testTeardownCalledAfterExceptionInProcessElement flakes on direct runner.
https://github.com/apache/beam/issues/20814 JmsIO is not acknowledging messages
correctly
https://github.com/apache/beam/issues/20813 No trigger early repeatedly for
session windows
https://github.com/apache/beam/issues/20692 Timer with dataflow runner can be
set multiple times (dataflow runner)
https://github.com/apache/beam/issues/20689 Kafka commitOffsetsInFinalize OOM
on Flink
https://github.com/apache/beam/issues/20528 python
CombineGlobally().with_fanout() cause duplicate combine results for sliding
windows
https://github.com/apache/beam/issues/20332 FileIO writeDynamic with
AvroIO.sink not writing all data
https://github.com/apache/beam/issues/20331
org.apache.beam.sdk.io.mongodb.MongoDbIOTest.testReadWithAggregate is flaky
https://github.com/apache/beam/issues/20109 SortValues should fail if
SecondaryKey coder is not deterministic
https://github.com/apache/beam/issues/20108 Python direct runner doesn't emit
empty pane when it should
https://github.com/apache/beam/issues/19816
MetricsTest$AttemptedMetricTests.testAllAttemptedMetrics is flaky on
DirectRunner
https://github.com/apache/beam/issues/19814 Flakes in
ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundleStateful for
Direct, Spark, Flink