[
https://issues.apache.org/jira/browse/BEAM-11076?focusedWorklogId=543637&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543637
]
ASF GitHub Bot logged work on BEAM-11076:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 28/Jan/21 13:29
Start Date: 28/Jan/21 13:29
Worklog Time Spent: 10m
Work Description: scwhittle opened a new pull request #13831:
URL: https://github.com/apache/beam/pull/13831
…ndowViaWindowSetFn
It shows up as 1% cpu on a nexmark Query11 benchmark.
------------------------
Thank you for your contribution! Follow this checklist to help us
incorporate your contribution quickly and easily:
- [ ] [**Choose
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA
issue, if applicable. This will automatically link the pull request to the
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
See the [Contributor Guide](https://beam.apache.org/contribute) for more
tips on [how to make review process
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
Post-Commit Tests Status (on master branch)
------------------------------------------------------------------------------------------------
Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2
--- | --- | --- | --- | --- | --- | ---
Go | [](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
| --- | [](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
| --- | [](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
| ---
Java | [](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/)
Python | [](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/)
| --- | [](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/)
| ---
XLang | [](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Dataflow/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/)
| --- | [](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/)
| ---
Pre-Commit Tests Status (on master branch)
------------------------------------------------------------------------------------------------
--- |Java | Python | Go | Website | Whitespace | Typescript
--- | --- | --- | --- | --- | --- | ---
Non-portable | [](https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/)<br>[](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/lastCompletedBuild/)
<br>[](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/)
| [](https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/)
Portable | --- | [](https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/)
| --- | --- | --- | ---
See
[.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md)
for trigger phrase, status and link of all Jenkins jobs.
GitHub Actions Tests Status (on master branch)
------------------------------------------------------------------------------------------------
[](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
[](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
[](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more
information about GitHub Actions CI.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 543637)
Remaining Estimate: 0h
Time Spent: 10m
> Go Direct Runner Improvements
> -----------------------------
>
> Key: BEAM-11076
> URL: https://issues.apache.org/jira/browse/BEAM-11076
> Project: Beam
> Issue Type: Improvement
> Components: sdk-go
> Reporter: Robert Burke
> Priority: P3
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The Go SDK has a simple direct runner intended for basic batch framework
> testing. That is, it's only suitable for the barest tests, and not that it
> ensures that the basics work for arbitrary pipelines.
> The runner has the following features:
> * Operates on the direct pipeline graph without marshalling through the beam
> protos.
> ** This prevents it from validating that the pipeline is valid for portable
> runners.
> * Executes the whole pipeline as a single bundle, on a single worker thread.
> "in process"
> ** This renders it only suitable for very small data sets, that likely
> operate in memory.
> * Doesn't marshal elements.
> ** While this avoids notionally unnecessary work, it's another reason why
> users will run into errors after using the direct runner to "validate" their
> pipeline before moving to Spark or Flink.
> Further, the runner hasn't been validated for beam semantics, nor have more
> complex features of the Beam Model been implemented or validated. This makes
> it unsuitable for more than it's current use for demoing the SDK in basic
> batch operation, and the light use it has testing the SDK itself.
> However, implementing full beam semantics for a runner, even without the
> distributed portion is a project in itself. It's part of the beam design that
> implementing the semantics for a beam runner to be more complicated on the
> runner side vs the SDK side.
> But there's no reason why we can't improve the Go Direct Runner to match all
> semantics required of beam for single machine contexts.
> In particular the various improvements below could be made (and should
> probably be sharded into separate sub task JIRAs as required):
> * Convert the Go Direct Runner to a "Go Portable Runner" instead, which
> means implementing the Job Management and FnApi protocols. This would
> ensure that all runners are operating the Go SDK workers in the same way, via
> the harness.
> ** This doesn't preclude "go awareness" for operating everthing in a single
> binary, or later re-optimizing to avoid serialization.
> * Allow the runner to execute "headless" (as a job submission server).
> * Allow the runner to execute more than a single bundle at once.
> ** Enabling better use of CPU cores in single execution mode.
> * Add loopback and docker execution mode support, in addition to the Go "in
> process" support it has.
> * Once the runner can execute portable pipelines done, it becomes possible
> to run the Python and Java Runner Validation Tests against the runner to
> validate all the features of the [Beam Programming
> Model|https://beam.apache.org/documentation/programming-guide]
> ** Each feature / TestSuite of which should be handled in separate JIRAs.
> ** Adding jenkins runs of those passing tests to ensure ongoing validation
> of the runner against the model.
>
> A good place to start is being able to run and execute pipelines on the
> Python Portable runner, which implements all beam semantics correctly.
> Instructions for doing so are on [Go Tips page in the Dev
> Wiki|https://cwiki.apache.org/confluence/display/BEAM/Go+Tips].
>
> Direct Runner Code:
> [https://github.com/apache/beam/tree/master/sdks/go/pkg/beam/runners/direct]
> SDK Harness Code:
> [https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/harness/harness.go]
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)