[GSOC] Build out Beam Use Cases Project Proposal

2024-04-01 Thread Ayush Pandey
Hi Everyone,

I am Ayush Pandey, and I am interested in the "Build out Beam Use Cases"
project (project link: [GSOC] Build out Beam Use Cases) for GSoC 2024.

I worked as a GSoC contributor for Apache CloudStack in 2023 and am very
interested in working on Apache Beam in GSoC 2024.

I am new to Apache Beam and have been experimenting with it for the better
part of the last month. Given my focus on machine learning and my past
experience in engineering and infrastructure, I would like to work on the
Build out Beam Use Cases project. I find Apache Beam's architecture and its
MLTransform library highly flexible and well suited to many of my
curriculum topics in ML.

Please find my proposal draft at this link.
Any input or suggestion will be immensely valuable as I strive to refine my
proposal and make a meaningful contribution to the Apache Beam community.

I also apologise for reaching out late. I wasn’t aware of the broader
feedback options until recently and had been reaching out to individuals
instead. I will be very grateful for any feedback.

Thanks and Regards,
Ayush


Beam High Priority Issue Report (62)

2024-04-01 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need 
attention.

See https://beam.apache.org/contribute/issue-priorities for the meaning and 
expectations around issue priorities.

Unassigned P1 Issues:

https://github.com/apache/beam/issues/30820 [Bug]: Install fails with 
Python3.12 due to old grpcio-tools version
https://github.com/apache/beam/issues/30813 The PreCommit Python Coverage job 
is flaky
https://github.com/apache/beam/issues/30799 The PostCommit Python Dependency 
job is flaky
https://github.com/apache/beam/issues/30767 [Bug]: org.junit.rules.TestRule 
class referenced in TestPipeline deprecated in newest version of junit
https://github.com/apache/beam/issues/30760 The PostCommit Python Arm job is 
flaky
https://github.com/apache/beam/issues/30757 [Bug]: Beam Playground scio 
examples cannot run
https://github.com/apache/beam/issues/30737 [Failing Test]: Playground 
PreCommit failing goLint
https://github.com/apache/beam/issues/30644 The Inference Python Benchmarks 
Dataflow job is flaky
https://github.com/apache/beam/issues/30612 The Playground CI Nightly job is 
flaky
https://github.com/apache/beam/issues/30606 The PostCommit Java Nexmark 
Dataflow job is flaky
https://github.com/apache/beam/issues/30530 The LoadTests Java GBK Smoke job is 
flaky
https://github.com/apache/beam/issues/30529 The PostCommit Java Sickbay job is 
flaky
https://github.com/apache/beam/issues/30527 The PostCommit Java IO Performance 
Tests job is flaky
https://github.com/apache/beam/issues/30526 The PerformanceTests xlang KafkaIO 
Python job is flaky
https://github.com/apache/beam/issues/30525 The PostCommit Python 
ValidatesContainer Dataflow With RC job is flaky
https://github.com/apache/beam/issues/30521 The LoadTests Go Combine Flink 
Batch job is flaky
https://github.com/apache/beam/issues/30520 The LoadTests Python Combine Flink 
Streaming job is flaky
https://github.com/apache/beam/issues/30519 The PostCommit XVR GoUsingJava 
Dataflow job is flaky
https://github.com/apache/beam/issues/30517 The PostCommit XVR Direct job is 
flaky
https://github.com/apache/beam/issues/30513 The PostCommit Python job is flaky
https://github.com/apache/beam/issues/30511 The LoadTests Python Smoke job is 
flaky
https://github.com/apache/beam/issues/30507 The LoadTests Go GBK Flink Batch 
job is flaky
https://github.com/apache/beam/issues/30506 The TypeScript Tests job is flaky
https://github.com/apache/beam/issues/30505 The PostRelease Nightly Snapshot 
job is flaky
https://github.com/apache/beam/issues/30504 The LoadTests Python Combine 
Dataflow Streaming job is flaky
https://github.com/apache/beam/issues/30503 The PostCommit Java ValidatesRunner 
Flink Java11 job is flaky
https://github.com/apache/beam/issues/30502 The LoadTests Go CoGBK Flink Batch 
job is flaky
https://github.com/apache/beam/issues/30498 [Bug]: Beam Sql is ignoring aliases 
fields in some situations which causes to huge data loss
https://github.com/apache/beam/issues/29971 [Bug]: FixedWindows not working for 
large Kafka topic
https://github.com/apache/beam/issues/29926 [Bug]: FileIO: lack of timeouts may 
cause the pipeline to get stuck indefinitely
https://github.com/apache/beam/issues/29902 [Bug]: Messages are not ACK on 
Pubsub starting Beam 2.52.0 on Flink Runner in detached mode
https://github.com/apache/beam/issues/29099 [Bug]: FnAPI Java SDK Harness 
doesn't update user counters in OnTimer callback functions
https://github.com/apache/beam/issues/28760 [Bug]: EFO Kinesis IO reader 
provided by apache beam does not pick the event time for watermarking
https://github.com/apache/beam/issues/28383 [Failing Test]: 
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorkerTest.testMaxThreadMetric
https://github.com/apache/beam/issues/28326 Bug: 
apache_beam.io.gcp.pubsublite.ReadFromPubSubLite not working
https://github.com/apache/beam/issues/27892 [Bug]: ignoreUnknownValues not 
working when using CreateDisposition.CREATE_IF_NEEDED 
https://github.com/apache/beam/issues/27616 [Bug]: Unable to use 
applyRowMutations() in bigquery IO apache beam java
https://github.com/apache/beam/issues/27486 [Bug]: Read from datastore with 
inequality filters
https://github.com/apache/beam/issues/27314 [Failing Test]: 
bigquery.StorageApiSinkCreateIfNeededIT.testCreateManyTables[1]
https://github.com/apache/beam/issues/27238 [Bug]: Window trigger has lag when 
using Kafka and GroupByKey on Dataflow Runner
https://github.com/apache/beam/issues/26911 [Bug]: UNNEST ARRAY with a nested 
ROW (described below)
https://github.com/apache/beam/issues/26343 [Bug]: 
apache_beam.io.gcp.bigquery_read_it_test.ReadAllBQTests.test_read_queries is 
flaky
https://github.com/apache/beam/issues/26329 [Bug]: BigQuerySourceBase does not 
propagate a Coder to AvroSource
https://github.com/apache/beam/issues/26041 [Bug]: Unable to create 
exactly-once Flink pipeline with stream source and file sink
https://github.com/apache/beam/issues/24776 [Bug]: Race