[ 
https://issues.apache.org/jira/browse/BEAM-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162415#comment-17162415
 ] 

Brian Hulette commented on BEAM-10275:
--------------------------------------

Closing this as obsolete since BEAM-10274 is fixed.

I do think we need to be stricter about the types of pipeline options we 
allow. If there's a problem with a user-defined option, we should be able to 
fail at pipeline construction time. We should never get to the point where an 
unparsable option makes it to a worker and causes it to crash. This is a 
particularly egregious problem on Dataflow because of the delay in starting up 
workers.
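
As a rough sketch of what failing at construction time could look like: the 
snippet below uses plain argparse rather than Beam's actual pipeline options 
machinery, and StrictParser, validate_options, and the --config option are all 
hypothetical names chosen for illustration. The idea is only that every 
user-defined option gets parsed once, up front, and a bad value raises before 
anything is submitted to a runner.

```python
import argparse
import json


class StrictParser(argparse.ArgumentParser):
    """Hypothetical parser that raises instead of exiting the process."""

    def error(self, message):
        # argparse calls error() when an option value cannot be converted.
        raise ValueError(f"invalid pipeline option: {message}")


def validate_options(argv):
    """Parse every known option eagerly so bad values fail before submission."""
    parser = StrictParser(add_help=False)
    # Hypothetical user-defined option whose value must be valid JSON,
    # similar in spirit to the json.loads arguments from BEAM-10274.
    parser.add_argument("--config", type=json.loads, default=None)
    known, _unknown = parser.parse_known_args(argv)
    return vars(known)


if __name__ == "__main__":
    print(validate_options(["--config", '{"retries": 3}']))
    try:
        validate_options(["--config", "not json"])
    except ValueError as exc:
        print("rejected at construction time:", exc)
```

Raising from error() instead of letting argparse exit the process is a design 
choice for the sketch: it lets the caller decide how to report the failure.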

> sdk_worker_main.py eagerly parses pipeline options
> --------------------------------------------------
>
>                 Key: BEAM-10275
>                 URL: https://issues.apache.org/jira/browse/BEAM-10275
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-harness
>            Reporter: Brian Hulette
>            Assignee: Brian Hulette
>            Priority: P2
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> sdk_worker_main.py eagerly parses pipeline options because of the call to 
> get_all_options here: 
> https://github.com/apache/beam/blob/61b665640d6c0f91751bba59782c0ac6aceacba6/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L138
> This can cause the worker to crash if any option that can't be read at 
> execution time is used, even if we don't need to access it at execution 
> time (e.g. json.loads arguments, described in BEAM-10274).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
