mosche commented on PR #22620: URL: https://github.com/apache/beam/pull/22620#issuecomment-1246776582
I took a bit of a turn here after validating my initial approach (replacing bounded sources with `UnboundedReadFromBoundedSource`) against the VR tests in Flink:

- Tests that failed with the Spark runner, likely due to watermark issues (#23129, see [test results](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_PR/494/)), ran fine with Flink, suggesting there really is a major problem with the Spark runner in streaming mode.
- Nevertheless, it also showed that the approach itself is flawed. Some bounded test cases simply cannot be forced into streaming execution, e.g. any GroupByKey on the GlobalWindow will fail if no trigger is set.

The initial reason for this approach was to prevent the Spark runner from failing when streaming was forced via pipeline options in VR tests for bounded test cases: Spark refuses to start if no streaming workload is scheduled.

Instead, `TestSparkRunner` now just detects the translation mode and acts accordingly. Unfortunately, this hides the watermark issues uncovered above, because the VR tests now succeed.
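To illustrate the GroupByKey remark above, here is a minimal sketch of the rule in plain Java. It is a simplified model and not Beam code: the class, enums, and method are hypothetical names made up for this example. The assumption it encodes is that a grouping over unbounded input in the global window with only the default trigger would never emit output, so it is rejected, while the same grouping over bounded input is fine.

```java
// Simplified model of the rule, NOT Beam code. All names are hypothetical.
final class GroupByKeySketch {

    enum Boundedness { BOUNDED, UNBOUNDED }

    enum Windowing { GLOBAL, NON_GLOBAL }

    enum Trigger { DEFAULT, CUSTOM }

    // Returns false only for the combination that can never fire:
    // unbounded input, global window, and no explicitly set trigger.
    static boolean groupByKeyAllowed(Boundedness input, Windowing windowing, Trigger trigger) {
        boolean neverFires = input == Boundedness.UNBOUNDED
                && windowing == Windowing.GLOBAL
                && trigger == Trigger.DEFAULT;
        return !neverFires;
    }

    public static void main(String[] args) {
        // A bounded test case as originally written.
        System.out.println(groupByKeyAllowed(
                Boundedness.BOUNDED, Windowing.GLOBAL, Trigger.DEFAULT));
        // The same test case after its source is forced to be unbounded.
        System.out.println(groupByKeyAllowed(
                Boundedness.UNBOUNDED, Windowing.GLOBAL, Trigger.DEFAULT));
        // Unbounded input with an explicitly set trigger.
        System.out.println(groupByKeyAllowed(
                Boundedness.UNBOUNDED, Windowing.GLOBAL, Trigger.CUSTOM));
    }
}
```

Under this model, swapping the source of a bounded test for an unbounded one flips the second case to rejected without any change to the test itself, which is why such tests cannot simply be forced into streaming execution.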
