[ 
https://issues.apache.org/jira/browse/BEAM-14484?focusedWorklogId=772929&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772929
 ]

ASF GitHub Bot logged work on BEAM-14484:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/May/22 17:16
            Start Date: 20/May/22 17:16
    Worklog Time Spent: 10m 
      Work Description: jrmccluskey commented on code in PR #17724:
URL: https://github.com/apache/beam/pull/17724#discussion_r878372580


##########
sdks/go/pkg/beam/core/sdf/sdf.go:
##########
@@ -73,8 +73,8 @@ type RTracker interface {
        // reason), then this function returns nil as the residual.
        //
        // If the split fraction is 0 (e.g. a self-checkpointing split) TrySplit() should return either
-       // a nil primary or an RTracker that is both bounded and has size 0. This ensures that there is
-       // no data that is lost by not being rescheduled for execution later.
+       // a nil primary or a restriction that represents no remaining work. This will ensure that there
+       // is not data loss.

Review Comment:
   In a typical runner-side split, the returned primary restriction is supposed to match the restriction the RTracker holds, so in general we do care that the returns are configured correctly.
   
   We do fail the pipeline to prevent data loss, but I prefer calling out why we fail the pipeline.
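
   To make the contract concrete, here is a minimal, self-contained sketch of what a well-behaved self-checkpointing split could look like. It does not use the Beam packages: `Restriction`, `Tracker`, `NewTracker`, and `checkSelfCheckpoint` are hypothetical names invented for illustration, and the error text is not the message used in `datasource.go`. The only thing taken from the discussion above is the rule itself: at fraction 0 the primary must be nil or represent no remaining work, and it should match the restriction the tracker holds.

```go
package main

import (
	"errors"
	"fmt"
)

// Restriction is a hypothetical half-open offset range [Start, End).
type Restriction struct {
	Start, End int64
}

// Tracker is a toy restriction tracker for illustration only.
// It is not the Beam offsetrange tracker.
type Tracker struct {
	rest Restriction
	next int64 // first position that has not been claimed yet
}

// NewTracker returns a tracker with nothing claimed.
func NewTracker(r Restriction) *Tracker {
	return &Tracker{rest: r, next: r.Start}
}

// TryClaim claims pos if it is unclaimed and inside the restriction.
func (t *Tracker) TryClaim(pos int64) bool {
	if pos < t.next || pos >= t.rest.End {
		return false
	}
	t.next = pos + 1
	return true
}

// IsDone reports whether the tracker's restriction has no unclaimed work.
func (t *Tracker) IsDone() bool { return t.next >= t.rest.End }

// TrySplit splits the unclaimed work at the given fraction. The primary it
// returns is the same restriction the tracker keeps holding afterwards.
func (t *Tracker) TrySplit(fraction float64) (primary, residual *Restriction, err error) {
	if fraction < 0 || fraction > 1 {
		return nil, nil, errors.New("fraction must be in [0, 1]")
	}
	remaining := t.rest.End - t.next
	if remaining <= 0 {
		p := t.rest
		return &p, nil, nil // nothing left to hand off
	}
	splitAt := t.next + int64(fraction*float64(remaining))
	res := Restriction{Start: splitAt, End: t.rest.End}
	t.rest.End = splitAt // shrink the held restriction to the primary
	p := t.rest
	return &p, &res, nil
}

// checkSelfCheckpoint is a hypothetical harness-side check: after a
// fraction-0 split, a non-nil primary must have no unclaimed work left.
func checkSelfCheckpoint(t *Tracker) (*Restriction, error) {
	primary, residual, err := t.TrySplit(0)
	if err != nil {
		return nil, err
	}
	if primary != nil && !t.IsDone() {
		return nil, fmt.Errorf("self-checkpoint primary %+v still has unclaimed work; "+
			"failing to avoid data loss", *primary)
	}
	return residual, nil
}

func main() {
	tr := NewTracker(Restriction{Start: 0, End: 10})
	for pos := int64(0); pos < 3; pos++ {
		tr.TryClaim(pos)
	}
	residual, err := checkSelfCheckpoint(tr)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("residual to reschedule: %+v\n", *residual)
}
```

   In this sketch the unclaimed work always moves into the residual, so the check passes. A tracker that left unclaimed positions in its primary at fraction 0 would hit the error branch, which is the case where the comment should explain why the pipeline fails.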





Issue Time Tracking
-------------------

    Worklog Id:     (was: 772929)
    Time Spent: 6h 50m  (was: 6h 40m)

> Improve error message surrounding primary returns in the self-checkpointing 
> code
> --------------------------------------------------------------------------------
>
>                 Key: BEAM-14484
>                 URL: https://issues.apache.org/jira/browse/BEAM-14484
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-go
>            Reporter: Jack McCluskey
>            Assignee: Jack McCluskey
>            Priority: P1
>          Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> The error message in the Go SDK harness around returned primaries in the 
> self-checkpointing code 
> (https://github.com/apache/beam/blob/ea1f292e9cf31fc8c4803b10d811f0d3ee184ae7/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L375)
>  is unclear and should be made more explicit. It should also guide the user 
> towards making sure that the restriction behaves properly in the 
> self-checkpointing case. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
