[ 
https://issues.apache.org/jira/browse/BEAM-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250033#comment-17250033
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-4244:
-----------------------------------------------------

This is complicated by the fact that runners may use coders at arbitrary 
points, not just in pipeline steps. For example, Dataflow adds steps to 
serialize intermediate data, where it might use coders defined for the 
surrounding steps.

 

I think a better solution for your specific case might be to validate data in a 
ParDo step before that data is encoded by coders in the following steps. This 
might allow you to handle non-conforming messages programmatically from your 
pipeline.
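
A minimal sketch of that validation step, written as plain Python so it runs 
without Beam installed. The names `trial_encode` and `route` are hypothetical, 
and JSON stands in for whatever coder the following steps actually use. Note 
that this can only catch problems in the data your pipeline produces, not 
corruption introduced later by the runner.

```python
import json


def trial_encode(element):
    """Hypothetical stand-in for the downstream coder's encode step.

    JSON is used here only for illustration; substitute the encoding
    that the coder in the following step would really perform.
    """
    return json.dumps(element).encode("utf-8")


def route(element, encode=trial_encode):
    """Try to encode an element and tag it as 'valid' or 'invalid'.

    Returns ('valid', element) when encoding succeeds, otherwise
    ('invalid', (repr of the element, error message)) so that the
    failure can be handled by the pipeline instead of by the runner.
    """
    try:
        encode(element)
    except (TypeError, ValueError) as err:
        return ("invalid", (repr(element), str(err)))
    return ("valid", element)


if __name__ == "__main__":
    for item in [{"id": 1}, {"id": {2, 3}}, b"raw bytes"]:
        tag, payload = route(item)
        print(tag, payload)
```

In a Beam Python pipeline this logic would live in the `process` method of a 
`DoFn`, emitting failures with `beam.pvalue.TaggedOutput` and splitting the 
results with `ParDo(...).with_outputs(...)`, so that non-conforming messages 
go to a separate output instead of failing inside a coder.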

> Provide a better way for programmatically handling errors raised while 
> encoding/decoding data
> ---------------------------------------------------------------------------------------------
>
>                 Key: BEAM-4244
>                 URL: https://issues.apache.org/jira/browse/BEAM-4244
>             Project: Beam
>          Issue Type: New Feature
>          Components: beam-model, runner-core
>            Reporter: Chamikara Madhusanka Jayalath
>            Priority: P3
>
> Beam runners use coders in various stages of a pipeline to encode/decode 
> data. Coders are executed directly by the runner of a pipeline, and users do 
> not have control over exceptions raised during encoding/decoding (these could 
> be due either to malformed/corrupted data provided by users or to 
> intermediate malformed/corrupted data generated during system execution).
> Currently users can rely on runner-specific worker logging to detect the 
> error and update the pipeline, but it would be better if we could provide a 
> way to handle these errors programmatically.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
