Re: P1 issues report (70)

2022-06-22 Thread Manu Zhang
Hi all,

What is this daily summary intended for? Not all of the issues look like P1.
And would a weekly summary be less noisy?

On Wed, Jun 22, 2022 at 23:45, wrote:

> This is your daily summary of Beam's current P1 issues, not including
> flaky tests.
>
> See https://beam.apache.org/contribute/issue-priorities/#p1-critical
> for the meaning and expectations around P1 issues.
>
>
>
> https://api.github.com/repos/apache/beam/issues/21978: [Playground]
> Implement Share Any Code feature on the frontend
> https://api.github.com/repos/apache/beam/issues/21946: [Bug]: No way to
> read or write to file when running Beam in Flink
> https://api.github.com/repos/apache/beam/issues/21935: [Bug]: Reject
> illformed GBK Coders
> https://api.github.com/repos/apache/beam/issues/21897: [Feature Request]:
> Flink runner savepoint backward compatibility
> https://api.github.com/repos/apache/beam/issues/21893: [Bug]: BigQuery
> Storage Write API implementation does not support table partitioning
> https://api.github.com/repos/apache/beam/issues/21794: Dataflow runner
> creates a new timer whenever the output timestamp is change
> https://api.github.com/repos/apache/beam/issues/21763: [Playground Task]:
> Migrate from Google Analytics to Matomo Cloud
> https://api.github.com/repos/apache/beam/issues/21715: Data missing when
> using CassandraIO.Read
> https://api.github.com/repos/apache/beam/issues/21713: 404s in BigQueryIO
> don't get output to Failed Inserts PCollection
> https://api.github.com/repos/apache/beam/issues/21711: Python Streaming
> job failing to drain with BigQueryIO write errors
> https://api.github.com/repos/apache/beam/issues/21703:
> pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1 and V2
> https://api.github.com/repos/apache/beam/issues/21702: SpannerWriteIT
> failing in beam PostCommit Java V1
> https://api.github.com/repos/apache/beam/issues/21700:
> --dataflowServiceOptions=use_runner_v2 is broken
> https://api.github.com/repos/apache/beam/issues/21695:
> DataflowPipelineResult does not raise exception for unsuccessful states.
> https://api.github.com/repos/apache/beam/issues/21694: BigQuery Storage
> API insert with writeResult retry and write to error table
> https://api.github.com/repos/apache/beam/issues/21479: Install Python
> wheel and dependencies to local venv in SDK harness
> https://api.github.com/repos/apache/beam/issues/21478:
> KafkaIO.read.withDynamicRead() doesn't pick up new TopicPartitions
> https://api.github.com/repos/apache/beam/issues/21477: Add integration
> testing for BQ Storage API  write modes
> https://api.github.com/repos/apache/beam/issues/21476: WriteToBigQuery
> Dynamic table destinations returns wrong tableId
> https://api.github.com/repos/apache/beam/issues/21475: Beam x-lang
> Dataflow tests failing due to _InactiveRpcError
> https://api.github.com/repos/apache/beam/issues/21473:
> PVR_Spark2_Streaming perma-red
> https://api.github.com/repos/apache/beam/issues/21466: Simplify version
> override for Dev versions of the Go SDK.
> https://api.github.com/repos/apache/beam/issues/21465: Kafka commit
> offset drop data on failure for runners that have non-checkpointing shuffle
> https://api.github.com/repos/apache/beam/issues/21269: Delete orphaned
> files
> https://api.github.com/repos/apache/beam/issues/21268: Race between
> member variable being accessed due to leaking uninitialized state via
> OutboundObserverFactory
> https://api.github.com/repos/apache/beam/issues/21267: WriteToBigQuery
> submits a duplicate BQ load job if a 503 error code is returned from
> googleapi
> https://api.github.com/repos/apache/beam/issues/21265:
> apache_beam.runners.portability.fn_api_runner.translations_test.TranslationsTest.test_run_packable_combine_globally
> 'apache_beam.coders.coder_impl._AbstractIterable' object is not reversible
> https://api.github.com/repos/apache/beam/issues/21263: (Broken Pipe
> induced) Bricked Dataflow Pipeline
> https://api.github.com/repos/apache/beam/issues/21262: Python AfterAny,
> AfterAll do not follow spec
> https://api.github.com/repos/apache/beam/issues/21260: Python
> DirectRunner does not emit data at GC time
> https://api.github.com/repos/apache/beam/issues/21259: Consumer group
> with random prefix
> https://api.github.com/repos/apache/beam/issues/21258: Dataflow error in
> CombinePerKey operation
> https://api.github.com/repos/apache/beam/issues/21257: Either Create or
> DirectRunner fails to produce all elements to the following transform
> https://api.github.com/repos/apache/beam/issues/21123: Multiple jobs
> running on Flink session cluster reuse the persistent Python environment.
> https://api.github.com/repos/apache/beam/issues/21119: Migrate to the
> next version of Python `requests` when released
> https://api.github.com/repos/apache/beam/issues/21117: "Java IO IT Tests"
> - missing data in grafana
> https://api.github.com/repos/apache/beam/issues/21115: JdbcIO date
> conversion is sensitive to OS
> 

P0 issues report (2)

2022-06-22 Thread beamactions
This is your daily summary of Beam's current P0 issues, not including flaky 
tests.

See https://beam.apache.org/contribute/issue-priorities/#p0-outage for the 
meaning and expectations around P0 issues.



https://api.github.com/repos/apache/beam/issues/21948: [Bug]: KinesisIO javadoc 
is no longer up-to-date
https://api.github.com/repos/apache/beam/issues/21824: [Bug]: Disable PR 
comment trigger


P1 issues report (70)

2022-06-22 Thread beamactions
This is your daily summary of Beam's current P1 issues, not including flaky 
tests.

See https://beam.apache.org/contribute/issue-priorities/#p1-critical for 
the meaning and expectations around P1 issues.



https://api.github.com/repos/apache/beam/issues/21978: [Playground] Implement 
Share Any Code feature on the frontend
https://api.github.com/repos/apache/beam/issues/21946: [Bug]: No way to read or 
write to file when running Beam in Flink
https://api.github.com/repos/apache/beam/issues/21935: [Bug]: Reject illformed 
GBK Coders
https://api.github.com/repos/apache/beam/issues/21897: [Feature Request]: Flink 
runner savepoint backward compatibility 
https://api.github.com/repos/apache/beam/issues/21893: [Bug]: BigQuery Storage 
Write API implementation does not support table partitioning
https://api.github.com/repos/apache/beam/issues/21794: Dataflow runner creates 
a new timer whenever the output timestamp is change
https://api.github.com/repos/apache/beam/issues/21763: [Playground Task]: 
Migrate from Google Analytics to Matomo Cloud
https://api.github.com/repos/apache/beam/issues/21715: Data missing when using 
CassandraIO.Read
https://api.github.com/repos/apache/beam/issues/21713: 404s in BigQueryIO don't 
get output to Failed Inserts PCollection
https://api.github.com/repos/apache/beam/issues/21711: Python Streaming job 
failing to drain with BigQueryIO write errors
https://api.github.com/repos/apache/beam/issues/21703: pubsublite.ReadWriteIT 
failing in beam_PostCommit_Java_DataflowV1 and V2
https://api.github.com/repos/apache/beam/issues/21702: SpannerWriteIT failing 
in beam PostCommit Java V1
https://api.github.com/repos/apache/beam/issues/21700: 
--dataflowServiceOptions=use_runner_v2 is broken
https://api.github.com/repos/apache/beam/issues/21695: DataflowPipelineResult 
does not raise exception for unsuccessful states.
https://api.github.com/repos/apache/beam/issues/21694: BigQuery Storage API 
insert with writeResult retry and write to error table
https://api.github.com/repos/apache/beam/issues/21479: Install Python wheel and 
dependencies to local venv in SDK harness
https://api.github.com/repos/apache/beam/issues/21478: 
KafkaIO.read.withDynamicRead() doesn't pick up new TopicPartitions
https://api.github.com/repos/apache/beam/issues/21477: Add integration testing 
for BQ Storage API  write modes
https://api.github.com/repos/apache/beam/issues/21476: WriteToBigQuery Dynamic 
table destinations returns wrong tableId
https://api.github.com/repos/apache/beam/issues/21475: Beam x-lang Dataflow 
tests failing due to _InactiveRpcError
https://api.github.com/repos/apache/beam/issues/21473: PVR_Spark2_Streaming 
perma-red
https://api.github.com/repos/apache/beam/issues/21466: Simplify version 
override for Dev versions of the Go SDK.
https://api.github.com/repos/apache/beam/issues/21465: Kafka commit offset drop 
data on failure for runners that have non-checkpointing shuffle
https://api.github.com/repos/apache/beam/issues/21269: Delete orphaned files
https://api.github.com/repos/apache/beam/issues/21268: Race between member 
variable being accessed due to leaking uninitialized state via 
OutboundObserverFactory
https://api.github.com/repos/apache/beam/issues/21267: WriteToBigQuery submits 
a duplicate BQ load job if a 503 error code is returned from googleapi
https://api.github.com/repos/apache/beam/issues/21265: 
apache_beam.runners.portability.fn_api_runner.translations_test.TranslationsTest.test_run_packable_combine_globally
 'apache_beam.coders.coder_impl._AbstractIterable' object is not reversible
https://api.github.com/repos/apache/beam/issues/21263: (Broken Pipe induced) 
Bricked Dataflow Pipeline 
https://api.github.com/repos/apache/beam/issues/21262: Python AfterAny, 
AfterAll do not follow spec
https://api.github.com/repos/apache/beam/issues/21260: Python DirectRunner does 
not emit data at GC time
https://api.github.com/repos/apache/beam/issues/21259: Consumer group with 
random prefix
https://api.github.com/repos/apache/beam/issues/21258: Dataflow error in 
CombinePerKey operation
https://api.github.com/repos/apache/beam/issues/21257: Either Create or 
DirectRunner fails to produce all elements to the following transform
https://api.github.com/repos/apache/beam/issues/21123: Multiple jobs running on 
Flink session cluster reuse the persistent Python environment.
https://api.github.com/repos/apache/beam/issues/21119: Migrate to the next 
version of Python `requests` when released
https://api.github.com/repos/apache/beam/issues/21117: "Java IO IT Tests" - 
missing data in grafana
https://api.github.com/repos/apache/beam/issues/21115: JdbcIO date conversion 
is sensitive to OS
https://api.github.com/repos/apache/beam/issues/21112: Dataflow SocketException 
(SSLException) error while trying to send message from Cloud Pub/Sub to BigQuery
https://api.github.com/repos/apache/beam/issues/2: Java creates an 
incorrect pipeline proto when core-construction-java jar is not in the 

Flaky test issue report (56)

2022-06-22 Thread beamactions
This is your daily summary of Beam's current flaky tests.

These are P1 issues because they have a major negative impact on the 
community and make it hard to determine the quality of the software.



https://api.github.com/repos/apache/beam/issues/21714: 
PulsarIOTest.testReadFromSimpleTopic is very flaky
https://api.github.com/repos/apache/beam/issues/21709: 
beam_PostCommit_Java_ValidatesRunner_Samza Failing
https://api.github.com/repos/apache/beam/issues/21708: 
beam_PostCommit_Java_DataflowV2, testBigQueryStorageWrite30MProto failing 
consistently
https://api.github.com/repos/apache/beam/issues/21707: GroupByKeyTest 
BasicTests testLargeKeys100MB flake (on ULR)
https://api.github.com/repos/apache/beam/issues/21706: Flaky timeout in github 
Python unit test action 
StatefulDoFnOnDirectRunnerTest.test_dynamic_timer_clear_then_set_timer
https://api.github.com/repos/apache/beam/issues/21704: 
beam_PostCommit_Java_DataflowV2 failures parent bug
https://api.github.com/repos/apache/beam/issues/21701: 
beam_PostCommit_Java_DataflowV1 failing with a variety of flakes and errors
https://api.github.com/repos/apache/beam/issues/21698: Docker Snapshots failing 
to be published since April 14th
https://api.github.com/repos/apache/beam/issues/21696: Flink Tests failure :  
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.beam.runners.core.construction.SerializablePipelineOptions 
https://api.github.com/repos/apache/beam/issues/21643: FnRunnerTest with 
non-trivial (order 1000 elements) numpy input flakes in non-cython environment
https://api.github.com/repos/apache/beam/issues/21629: Multiple XVR Suites 
having similar flakes simultaneously
https://api.github.com/repos/apache/beam/issues/21587: 
beam_PreCommit_PythonDocs failing (jinja2)
https://api.github.com/repos/apache/beam/issues/21540: Jenkins worker sometimes 
crashes while running Python Flink pipeline
https://api.github.com/repos/apache/beam/issues/21480: flake: 
FlinkRunnerTest.testEnsureStdoutStdErrIsRestored
https://api.github.com/repos/apache/beam/issues/21474: Flaky tests: Gradle 
build daemon disappeared unexpectedly
https://api.github.com/repos/apache/beam/issues/21472: Dataflow streaming tests 
failing new AfterSynchronizedProcessingTime test
https://api.github.com/repos/apache/beam/issues/21471: Flakes: Failed to load 
cache entry
https://api.github.com/repos/apache/beam/issues/21470: Test flake: 
test_split_half_sdf
https://api.github.com/repos/apache/beam/issues/21469: 
beam_PostCommit_XVR_Flink flaky: Connection refused
https://api.github.com/repos/apache/beam/issues/21468: 
beam_PostCommit_Python_Examples_Dataflow failing
https://api.github.com/repos/apache/beam/issues/21467: GBK and CoGBK streaming 
Java load tests failing
https://api.github.com/repos/apache/beam/issues/21464: GroupIntoBatchesTest is 
failing
https://api.github.com/repos/apache/beam/issues/21463: NPE in Flink Portable 
ValidatesRunner streaming suite
https://api.github.com/repos/apache/beam/issues/21462: Flake in 
org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in use
https://api.github.com/repos/apache/beam/issues/21333: Flink 
testParDoRequiresStableInput flaky
https://api.github.com/repos/apache/beam/issues/21271: pubsublite.ReadWriteIT 
flaky in beam_PostCommit_Java_DataflowV2  
https://api.github.com/repos/apache/beam/issues/21270: 
org.apache.beam.sdk.transforms.CombineTest$WindowingTests.testWindowedCombineGloballyAsSingletonView
 flaky on Dataflow Runner V2
https://api.github.com/repos/apache/beam/issues/21266: 
org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful
 is flaky in Java ValidatesRunner Flink suite.
https://api.github.com/repos/apache/beam/issues/21264: beam_PostCommit_Python36 
- CrossLanguageSpannerIOTest - flakey failing
https://api.github.com/repos/apache/beam/issues/21261: 
org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer
 is flaky
https://api.github.com/repos/apache/beam/issues/21242: 
org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundle
 is flaky in Java Spark ValidatesRunner suite 
https://api.github.com/repos/apache/beam/issues/21121: 
apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT.test_streaming_wordcount_it
 flakey
https://api.github.com/repos/apache/beam/issues/21120: 
beam_PostRelease_NightlySnapshot failed
https://api.github.com/repos/apache/beam/issues/21118: 
PortableRunnerTestWithExternalEnv.test_pardo_timers flaky
https://api.github.com/repos/apache/beam/issues/21116: Python PreCommit flaking 
in PipelineOptionsTest.test_display_data
https://api.github.com/repos/apache/beam/issues/21114: Already Exists: Dataset 
apache-beam-testing:python_bq_file_loads_NNN
https://api.github.com/repos/apache/beam/issues/21113: 
testTwoTimersSettingEachOtherWithCreateAsInputBounded flaky
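
The links in the reports above point at the GitHub REST API (api.github.com), which returns JSON rather than the issue page. A minimal sketch of rewriting them to the browsable github.com form, assuming the URLs always follow the /repos/OWNER/REPO/issues/NUMBER pattern seen in these reports (the helper name to_browser_url is hypothetical and not part of any Beam tooling):

```python
import re

# Matches GitHub REST API issue URLs such as
# https://api.github.com/repos/apache/beam/issues/21978
_API_ISSUE_RE = re.compile(
    r"https://api\.github\.com/repos/([^/\s]+)/([^/\s]+)/issues/(\d+)"
)


def to_browser_url(text: str) -> str:
    """Rewrite every API issue URL found in `text` to its github.com equivalent.

    Hypothetical helper for illustration only; the rest of the text
    (for example the issue title after the colon) is left untouched.
    """
    return _API_ISSUE_RE.sub(r"https://github.com/\1/\2/issues/\3", text)
```

Applied to a whole report body, this leaves the issue titles as they are and only changes the link targets.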

Re: [VOTE] Release 2.40.0, candidate #1

2022-06-22 Thread Alexey Romanenko
+1 (binding)

I tested it with https://github.com/Talend/beam-samples/
(Java 8&11 SDK, Spark 3 runner).

---
Alexey

> On 21 Jun 2022, at 04:28, Ahmet Altay wrote:
> 
> 
> 
> On Mon, Jun 20, 2022 at 6:35 PM Pablo Estrada wrote:
> I have not yet pushed Dataflow containers. I'll push those tomorrow.
> 
> Ack. I missed the note at the end of your first email.
>  
> 
> On Mon, Jun 20, 2022 at 6:34 PM Ahmet Altay wrote:
> I ran into issues with running python jobs on Dataflow. They failed with an 
> "Failed to fetch \"2.40.0\" from request 
> \"/v2/cloud-dataflow/v1beta3/python38/manifests/2.40.0\"." error. It might 
> also be an issue on my end because I was using a different setup than my 
> usual. It would be good for someone else to verify running Python on Dataflow.
> 
> Ahmet
> 
> On Mon, Jun 20, 2022 at 4:10 PM Pablo Estrada wrote:
> Hi everyone,
> 
> Please review and vote on the release candidate #1 for the version 2.40.0, as 
> follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>  
>  
> Reviewers are encouraged to test their own use cases with the release 
> candidate, and vote +1 if no issues are found.
>  
> The complete staging area is available for your review, which includes:
> * Release notes [1],
> * the official Apache source release to be deployed to dist.apache.org [2],
> which is signed with the key with fingerprint
> C79DDD47DAF3808F0B9DDFAC02B2D9F742008494 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "v2.40.0-RC1" [5],
> * website pull request listing the release [6], the blog post [6], and 
> publishing the API reference manual [7].
> * Java artifacts were built with Gradle 7.4 and openjdk version "1.8.0_232".
> * Python artifacts are deployed along with the source release to the 
> dist.apache.org [2] and PyPI [8].
> * Validation sheet with a tab for 2.40.0 release to help with validation [9].
> * Docker images published to Docker Hub [10].
>  
> The vote will be open for at least 72 hours. It is adopted by majority 
> approval, with at least 3 PMC affirmative votes.
>  
> For guidelines on how to try the release in your projects, check out our blog 
> post at https://beam.apache.org/blog/validate-beam-release/.
>  
> Thanks,
> -P.
> 
> P.S.: Dataflow containers have not yet been pushed, so please hold off before 
> testing that. I'll push them by tomorrow.
>  
> [1] https://github.com/apache/beam/releases/tag/untagged-c5c3f847bb360d87ac15
> [2] https://dist.apache.org/repos/dist/dev/beam/2.40.0/
> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> [4] https://repository.apache.org/content/repositories/orgapachebeam-1274/
> [5] https://github.com/apache/beam/tree/v2.40.0-RC1
> [6] https://github.com/apache/beam/pull/21947
> [7] https://github.com/apache/beam-site/pull/632
> [8] https://pypi.org/project/apache-beam/2.40.0rc1/
> [9] https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1844197258
> [10] https://hub.docker.com/search?q=apache%2Fbeam=image
>
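
The adoption rule quoted in the vote email (open for at least 72 hours, majority approval, at least 3 PMC affirmative votes) can be sketched as a small check. This is an illustration only, not project tooling: the Vote type and vote_passes function are hypothetical, and it assumes that only binding (PMC) votes count toward the majority, which the email itself does not spell out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable


@dataclass
class Vote:
    voter: str
    value: int     # +1 to approve, -1 to reject
    binding: bool  # True when cast by a PMC member


def vote_passes(
    votes: Iterable[Vote],
    opened: datetime,
    now: datetime,
    min_hours: int = 72,
    min_binding: int = 3,
) -> bool:
    """Return True if the vote can be closed as adopted.

    Assumption: non-binding votes are advisory and are ignored in the tally.
    """
    # The vote has to stay open for the minimum period before it can close.
    if now - opened < timedelta(hours=min_hours):
        return False
    votes = list(votes)
    plus = sum(1 for v in votes if v.binding and v.value > 0)
    minus = sum(1 for v in votes if v.binding and v.value < 0)
    # Majority approval, with at least `min_binding` affirmative PMC votes.
    return plus >= min_binding and plus > minus
```

Under these assumptions, a "+1 (binding)" reply such as the one at the top of this thread counts as one affirmative PMC vote.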