[ 
https://issues.apache.org/jira/browse/BEAM-10166?focusedWorklogId=455079&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-455079
 ]

ASF GitHub Bot logged work on BEAM-10166:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Jul/20 20:45
            Start Date: 06/Jul/20 20:45
    Worklog Time Spent: 10m 
      Work Description: lostluck commented on a change in pull request #12170:
URL: https://github.com/apache/beam/pull/12170#discussion_r450470297



##########
File path: sdks/go/pkg/beam/core/runtime/exec/util.go
##########
@@ -27,6 +28,28 @@ type GenID struct {
        last int
 }
 
+// doFnError is a helpful error where DoFns with emitters have failed.
+//
+// Golang SDK uses errors returned by DoFns to signal failures to process
+// bundles and terminate bundle processing. However, in the case of  emitters,
+// a panic is used as a means of bundle process termination, which  in turn is
+// wrapped by callNoPanic to recover the panicking code and return golang error
+// instead. However, formatting this error yields the stack trace a message
+// which is both long, and unhelpful the user.
+//
+// This error enables to create a more helpful user error where a panic is
+// caught where the origin is due to a failed DoFn execution in process bundle
+// execution.
+type doFnError struct {
+       doFn string
+       err  error
+       uid  UnitID
+       pid  string
+}
+func (e *doFnError) Error() string {
+       return fmt.Sprintf("%v caused error %v, uid is %v pid is %v", e.doFn, 
e.err, e.uid, e.pid)

Review comment:
       The wiki syntax in JIRA really butchered my intended suggestion of 
`DoFn[<uid>;<pid>]<doFn> returned error:<error>`
   
   Which places the  short/known length things before the potentially long and 
rife with newlines error. It puts the context first, and avoids it getting lost 
or truncated behind a very long error string.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 455079)
    Time Spent: 1h 10m  (was: 1h)

> Improve execution time errors
> -----------------------------
>
>                 Key: BEAM-10166
>                 URL: https://issues.apache.org/jira/browse/BEAM-10166
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-go
>            Reporter: Robert Burke
>            Assignee: Aaron Tillekeratne
>            Priority: P3
>              Labels: beginner, n00b, starter
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The Go SDK uses errors returned by DoFns to signal failures to process 
> bundles, and terminate bundle processing. However, if the preceding DoFn uses 
> emitters, rather than error returns, the code has no choice to panic to avoid 
> user code handling or ignoring the cross DoFn error (which could cause 
> dataloss or other correctness problems). 
> All bundle executions are wrapped in 
> `[callNoPanic|https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/util.go#L37]`
>  to prevent worker termination on such panics, and orderly terminate just the 
> affected bundle instead.`callNoPanic` uses Go's built in recover mechanism to 
> get the error and provide a stack trace.
> We can do better.
> The value returned by recover is just an interface{} which means we could 
> detect the specific type of error it is. In particular, we could have the 
> exec package have an error that we can detect. If the recovered value is that 
> error, then we could use that to provide a clearer error message  than a 
> panic stack trace.
> Such an error wrapper would contain: the error in question, the user DoFn 
> that caused it, the debug id of the DoFn node (To be able to relate it back 
> to the plan.)
> See https://gobyexample.com/errors and [other articles on creating custom 
> errors in 
> Go|https://www.digitalocean.com/community/tutorials/creating-custom-errors-in-go].
>  It doesn't need to be complicated.
> Then in `callNoPanic` we could detect this error wrapper and produce a 
> clearer error message based on the existing plan. If not, we can maintain the 
> current behavior. This latter part is necessary to handle panics originating 
> in user code. 
> To avoid mistaken user use which would breach this protocol, we're best off 
> keeping the wrapper unexported from the exec package.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to