[
https://issues.apache.org/jira/browse/BEAM-3304?focusedWorklogId=643547&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-643547
]
ASF GitHub Bot logged work on BEAM-3304:
----------------------------------------
Author: ASF GitHub Bot
Created on: 30/Aug/21 16:31
Start Date: 30/Aug/21 16:31
Worklog Time Spent: 10m
Work Description: lostluck commented on a change in pull request #15409:
URL: https://github.com/apache/beam/pull/15409#discussion_r698613685
##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -4578,6 +4579,7 @@ firings:
{{< code_sample "sdks/python/apache_beam/examples/snippets/snippets_test.py"
model_early_late_triggers >}}
{{< /highlight >}}
Review comment:
Since you've added snippets, I assume the intent is to include the
appropriate block so they get displayed:
```
{{< highlight go >}}
{{< code_sample "sdks/go/examples/snippets/09triggers.go"
model_early_late_triggers >}}
{{< /highlight >}}
```
And similar throughout.
You can see how things render using `./gradlew :website:serveWebsite ` to
stand up a localhost version of the website. Note, for new snippet files you
need to add at least one code_sample block for them to be known. I end up
iterating between using the gradle command above, stopping that and then
clearing the running docker containers `docker stop $(docker ps -q -f
ancestor=beam-website)` to see fresh changes on the next run of the gradle
command.
##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -4594,7 +4596,8 @@ modifying this behavior.
The `AfterProcessingTime` trigger operates on *processing time*. For example,
the <span
class="language-java">`AfterProcessingTime.pastFirstElementInPane()`</span>
-<span class="language-py">`AfterProcessingTime`</span> trigger emits a window
+<span class="language-py">`AfterProcessingTime`</span>
+<span class="language-go">`Trigger{Kind: AfterProcessingTimeTrigger, Delay:
5000}`</span> trigger emits a window
Review comment:
can probably simplify this down to `window.AfterProcessingTimeTrigger`
which matches how users will see it. We don't need to show the wrapping, that's
what the code sample blocks are for.
##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -4607,14 +4610,16 @@ window.
Beam provides one data-driven trigger,
<span class="language-java">`AfterPane.elementCountAtLeast()`</span>
-<span class="language-py">`AfterCount`</span>. This trigger works on an element
+<span class="language-py">`AfterCount`</span>
+<span class="language-go">`Trigger{Kind: ElementCountTrigger, ElementCount:
2}`</span>. This trigger works on an element
count; it fires after the current pane has collected at least *N* elements.
This
allows a window to emit early results (before all the data has accumulated),
which can be particularly useful if you are using a single global window.
It is important to note that if, for example, you specify
<span class="language-java">`.elementCountAtLeast(50)`</span>
-<span class="language-py">AfterCount(50)</span> and only 32 elements arrive,
+<span class="language-py">AfterCount(50)</span>
+<span class="language-go">`Trigger{Kind: ElementCountTrigger, ElementCount:
50}`</span> and only 32 elements arrive,
Review comment:
For contrast, this one indicates a specific number.
Note how verbose this is compared to Java and Python. Perhaps we should have
helper functions like `window.TriggerAfterCount(50)` to make the user
experience user easier? (it avoids users accidentally constructing bad triggers
too)
##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -4679,6 +4695,13 @@ trigger. To set a window to discard fired panes, set
`accumulation_mode` to
`DISCARDING`.
{{< /paragraph >}}
+{{< paragraph class="language-go" >}}
+To set a window to accumulate the panes that are produced when the trigger
+fires, set the `AccumulationMode{Mode:}` parameter to
`AccumulationMode_ACCUMULATING` when you set the
Review comment:
The phrasing here is awkward. Don't try to present it as setting a
parameter in these instances, demonstrate the code directly instead.
```
To set a window to accumulate the panes that are produced when the trigger
fires, use `beam.AccumulationMode{Mode: window.Accumulating}`.
```
etc.
##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -4679,6 +4695,13 @@ trigger. To set a window to discard fired panes, set
`accumulation_mode` to
`DISCARDING`.
{{< /paragraph >}}
+{{< paragraph class="language-go" >}}
+To set a window to accumulate the panes that are produced when the trigger
+fires, set the `AccumulationMode{Mode:}` parameter to
`AccumulationMode_ACCUMULATING` when you set the
Review comment:
It's not 100% clear to me whether we can make this a bit shorter for
users by having convenience methods, other than using separate convenience
method names: `beam.PanesAccumulate`, `beam.PanesDiscard`, (ignoring
`beam.PanesRetract` for now since nothing supports it IIRC...)
##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -4623,7 +4628,7 @@ either when I receive 50 elements, or every 1 second".
### 9.4. Setting a trigger {#setting-a-trigger}
When you set a windowing function for a `PCollection` by using the
-<span class="language-java">`Window`</span><span
class="language-py">`WindowInto`</span>
+<span class="language-java">`Window`</span><span
class="language-py">`WindowInto`</span><span
class="language-go">`WindowInto`</span>
Review comment:
In this case, to make it look more Go, clarify directly that it's
`beam.WindowInto` since that's how users will always see it.
Note: Fun feature of CSS: If two blockers *would* have the same content, you
can just put two classes in the span class argument. eg. `<span
class="language-py language-go">.
##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -4643,6 +4648,13 @@ element in that window has been processed. The
`accumulation_mode` parameter
sets the window's **accumulation mode**.
{{< /paragraph >}}
+{{< paragraph class="language-go" >}}
+You set the trigger(s) for a `PCollection` by passing in the `WindowTrigger`
parameter
Review comment:
When directly referring to code constructs for Go, prefer the way it
shows up in the code, including the usual package short name. eg.
`beam.WindowTrigger`, `beam.WindowInto`, `beam.AccumulationMode` etc. (Since
the PCollection type isn't being referred to in code, it doesn't get the
package prefix.)
##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -4607,14 +4610,16 @@ window.
Beam provides one data-driven trigger,
<span class="language-java">`AfterPane.elementCountAtLeast()`</span>
-<span class="language-py">`AfterCount`</span>. This trigger works on an element
+<span class="language-py">`AfterCount`</span>
+<span class="language-go">`Trigger{Kind: ElementCountTrigger, ElementCount:
2}`</span>. This trigger works on an element
Review comment:
Similar to the above we can probably simplify this down to
`window.ElementCountTrigger` which matches how users will see it. We don't need
to show the wrapping, that's what the code sample blocks are for.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 643547)
Time Spent: 7h 10m (was: 7h)
> Go triggering support
> ---------------------
>
> Key: BEAM-3304
> URL: https://issues.apache.org/jira/browse/BEAM-3304
> Project: Beam
> Issue Type: Improvement
> Components: sdk-go
> Reporter: Henning Rohde
> Assignee: Ritesh Ghorse
> Priority: P3
> Time Spent: 7h 10m
> Remaining Estimate: 0h
>
> `Add support for triggers.
> [https://beam.apache.org/documentation/programming-guide/#triggers]
> Triggers are special runner side behavior indicating how to handle data WRT
> the watermark and window. Commonly configuring the processing for “late data”
> and similar.
> These are not currently implemented for user use in the Go SDK. Reshuffle
> configures triggers, but it’s not accessible. A correct trigger
> implementation can at least re-implement Reshuffle in a user pipeline, rather
> than handled specially within the framework.
> * Requires extending the window package to be able to configure the various
> triggers.
> * Specifically being able to compose triggers as also permitted by the proto.
> **
> [https://github.com/apache/beam/blob/6e7b1c44bc7275ee047afc059fd610cd3f4e5bee/model/pipeline/src/main/proto/beam_runner_api.proto#L1111]
>
> * Requires updating the graphx package translate.go to marshal (and
> unmarshal?) the triggers to and from Beam PipelineProto Windowing strategies.
> * Requires supporting triggers with the beam.WindowInto transform for user
> pipeline use as well as complete documentation on its use from the user side.
> **
> [https://github.com/apache/beam/blob/6e7b1c44bc7275ee047afc059fd610cd3f4e5bee/sdks/go/pkg/beam/windowing.go]
>
> * Panes need to be decoded, otherwise triggering will cause runtime errors:
> [https://lists.apache.org/thread.html/r94c42d2d116f6464cd6b689543e5e578edf8310bf7c6e48a0958a56c%40%3Cdev.beam.apache.org%3E]
>
> * Handle pane propagation and observation in the exec package, and in user
> dofns.
> ** Panes indicate whether data was on time or not, and similar facets which
> may be relevant for processing.
> ** Might simply extend the existing window interface.
>
> Similar to windowing, many of the same places as
> https://issues.apache.org/jira/browse/BEAM-11100 need to be modified.
> At simplest though, it's mostly a runner side construction, with less concern
> on the exec side, and generally much simpler.
> Appropriate integration tests against portable runners must be implemented:
> [https://github.com/apache/beam/tree/master/sdks/go/test/integration/primitives]
>
> And optionally add support for the configurable triggers to the the Go Direct
> Runner. However, the results must be compared and validated against a
> semantically correct runner like the python portable runner first. At
> minimum, the Go Direct Runner should be made aware of triggers and produce a
> coherent error whenever there's a trigger it can't deal with.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)