Hi Beam community,

I recently opened a pull request to add parallel reading support to SparkReceiverIO (Java SDK).
Currently, SparkReceiverIO is limited to a single worker because it initiates reading from a single `Impulse`, which creates a bottleneck in high-throughput scenarios. The fix lets users configure `withNumReaders(int)`, which distributes the reading work across multiple workers using a `Create.of(shards)` + `Reshuffle` pattern (a rough sketch follows below).

Key details:

- PR: https://github.com/apache/beam/pull/[YOUR_PR_NUMBER]
- Issue: https://github.com/apache/beam/issues/37410
- Impact: enables horizontal scalability for SparkReceiverIO while maintaining strict backward compatibility.

I would appreciate any feedback or review on this change.
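For anyone skimming, here is a minimal, self-contained sketch of the fan-out idea. It is not the PR's actual code: the class and transform names, the shard count, and the downstream reader step are illustrative, and in the PR this wiring lives inside SparkReceiverIO's `expand()` rather than in user code.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Reshuffle;
import org.apache.beam.sdk.values.PCollection;

public class ShardedReadSketch {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create();

    // Hypothetical shard count; in the PR this comes from withNumReaders(int).
    int numReaders = 4;
    List<Integer> shards = new ArrayList<>();
    for (int i = 0; i < numReaders; i++) {
      shards.add(i);
    }

    PCollection<Integer> shardIds =
        p.apply("CreateShards", Create.of(shards))
            // Reshuffle breaks fusion, so each shard element can land on a
            // different worker instead of all reads fusing onto one.
            .apply("DistributeShards", Reshuffle.viaRandomKey());

    // Each shard id would then seed an independent reader, e.g. a DoFn or
    // splittable DoFn that starts a Spark Receiver for its shard.
    p.run().waitUntilFinish();
  }
}
```

With the change applied, users would opt in via `.withNumReaders(n)` on the read transform; leaving it unset should keep the existing single-reader path, consistent with the backward-compatibility note above.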
Thanks,
Atharva Ralegankar
https://www.linkedin.com/in/atharvaralegankar/