[ 
https://issues.apache.org/jira/browse/BEAM-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17142378#comment-17142378
 ] 

Robert Burke edited comment on BEAM-10166 at 7/6/20, 8:37 PM:
--------------------------------------------------------------

I like it. The UID and PID thing will make more sense in context (since the 
error will be bolstered with the plan that it failed in IIRC). You're right 
that "Additional Info" isn't common in Go style. It's usually clear from 
context that additional info is such. In this case the UID and PID are here to 
help associate it with other information, like the execution plan (the uid), 
and what the runner using to refer to the DoFn (the pid)

An alternative worth consideration: {{DoFn[\{uid};\{pid}]\{doFn} returned 
error:\{error}}}

A bit more concise. Provides clarity about what doFn actually was far as the 
framework is concerned, and puts the fixed/small length information before the 
variable length information.


was (Author: lostluck):
I like it. The UID and PID thing will make more sense in context (since the 
error will be bolstered with the plan that it failed in IIRC). You're right 
that "Additional Info" isn't common in Go style. It's usually clear from 
context that additional info is such. In this case the UID and PID are here to 
help associate it with other information, like the execution plan (the uid), 
and what the runner using to refer to the DoFn (the pid)

An alternative worth consideration: `DoFn[{uid};{pid}]  {doFn} returned error: 
{error}`

A bit more concise. Provides clarity about what doFn actually was far as the 
framework is concerned, and puts the fixed/small length information before the 
variable length information.

> Improve execution time errors
> -----------------------------
>
>                 Key: BEAM-10166
>                 URL: https://issues.apache.org/jira/browse/BEAM-10166
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-go
>            Reporter: Robert Burke
>            Assignee: Aaron Tillekeratne
>            Priority: P3
>              Labels: beginner, n00b, starter
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The Go SDK uses errors returned by DoFns to signal failures to process 
> bundles, and terminate bundle processing. However, if the preceding DoFn uses 
> emitters, rather than error returns, the code has no choice to panic to avoid 
> user code handling or ignoring the cross DoFn error (which could cause 
> dataloss or other correctness problems). 
> All bundle executions are wrapped in 
> `[callNoPanic|https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/util.go#L37]`
>  to prevent worker termination on such panics, and orderly terminate just the 
> affected bundle instead.`callNoPanic` uses Go's built in recover mechanism to 
> get the error and provide a stack trace.
> We can do better.
> The value returned by recover is just an interface{} which means we could 
> detect the specific type of error it is. In particular, we could have the 
> exec package have an error that we can detect. If the recovered value is that 
> error, then we could use that to provide a clearer error message  than a 
> panic stack trace.
> Such an error wrapper would contain: the error in question, the user DoFn 
> that caused it, the debug id of the DoFn node (To be able to relate it back 
> to the plan.)
> See https://gobyexample.com/errors and [other articles on creating custom 
> errors in 
> Go|https://www.digitalocean.com/community/tutorials/creating-custom-errors-in-go].
>  It doesn't need to be complicated.
> Then in `callNoPanic` we could detect this error wrapper and produce a 
> clearer error message based on the existing plan. If not, we can maintain the 
> current behavior. This latter part is necessary to handle panics originating 
> in user code. 
> To avoid mistaken user use which would breach this protocol, we're best off 
> keeping the wrapper unexported from the exec package.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to