Robert Burke created BEAM-10166:
-----------------------------------

             Summary: Improve execution time errors
                 Key: BEAM-10166
                 URL: https://issues.apache.org/jira/browse/BEAM-10166
             Project: Beam
          Issue Type: Task
          Components: sdk-go
            Reporter: Robert Burke


The Go SDK uses errors returned by DoFns to signal failures to process bundles, 
and to terminate bundle processing. However, if the preceding DoFn uses 
emitters rather than error returns, the code has no choice but to panic, so 
that user code cannot handle or ignore the cross-DoFn error (which could cause 
data loss or other correctness problems).

All bundle executions are wrapped in `callNoPanic` to prevent worker 
termination on such panics, and to terminate just the affected bundle in an 
orderly way instead. `callNoPanic` uses Go's built-in recover mechanism to get 
the error and provide a stack trace.

We can do better.

The value returned by recover is just an interface{}, which means we can 
inspect its concrete type. In particular, the exec package could define an 
error type of its own. If the recovered value is that error, we could use it 
to provide a clearer error message than a panic stack trace.
Such an error wrapper would contain: the error in question, the user DoFn that 
caused it, and the debug id of the DoFn node (to be related back to the plan).

Then in `callNoPanic` we could detect this error wrapper and produce a clearer 
error message based on the existing plan. If the recovered value is anything 
else, we can maintain the current behavior. This latter part is necessary to 
handle panics originating in user code. 
To keep users from mistakenly using the wrapper and breaching this protocol, 
we're best off keeping it unexported in the exec package.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
