pabloem opened a new pull request #10291: [BEAM-7516][BEAM-8823] FnApiRunner works with work queues, and a primitive watermark manager URL: https://github.com/apache/beam/pull/10291 These changes to the FnApiRunner prepare it to work on the streaming case. Changes, in general are: - Execution now is driven by executing available early work. - Before this change, execution was driven by fully executing each stage one by one. - There is a primitive watermark manager that tracks the advancement of the watermark across the pipeline. Next steps: - Support for TestStream, to enable streaming-based testing of the FnApiRunner. - Creating a first streaming source. **Worth noting:** - Note that commit https://github.com/apache/beam/pull/10291/commits/da37922 changes the behavior of a particular corner case. - A number of SDF-related tests are skipped on multiworkers. [BEAM-8833](https://jira.apache.org/jira/browse/BEAM-8833) was filed to track what is the potential issue here. **Benchmarking:** Ran the `fn_api_runner_microbenchmark`. For this benchmark, the most important metric is Fixed Cost, as it generally calculates the cost of overhead of running all of the stages (fixed number of stages). The overhead is about 6% (`avg(with change)/avg(without) - 1`). Without changes: Run | Fixed Cost | Per Element --- | --- | --- Run 1 | 4.740000717798869 | 0.006113449501268791 Run 2 | 4.835741568883261 | 0.005539953520803741 With changes: Run | Fixed Cost | Per Element --- | --- | --- Run 1 | 5.128020217259724 | 0.006195688189882221 Run 2 | 5.0422313202222195 | 0.0062748139843796236 r: @robertwb cc: @boyuanzz @aaltay As a tip for reviewing, I'd say that the main function to look into is in this set of changes: https://github.com/apache/beam/pull/10291/files#diff-adf22e1eb39b451e73d4541052a33686R740-R811 Everything else should follow from that. Post-Commit Tests Status (on master branch) ------------------------------------------------------------------------------------------------ Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/) | --- | --- | [](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/) XLang | --- | --- | --- | [](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/) | --- | --- | --- Pre-Commit Tests Status (on master branch) ------------------------------------------------------------------------------------------------ --- |Java | Python | Go | Website --- | --- | --- | --- | --- Non-portable | [](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/) Portable | --- | [](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/) | --- | --- See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
