[ 
https://issues.apache.org/jira/browse/BEAM-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856186#comment-16856186
 ] 

Ahmet Altay commented on BEAM-7395:
-----------------------------------

Hi [~evindj], if you are planning to work on Beam's direct runner, that work is 
tracked here: https://issues.apache.org/jira/browse/BEAM-529

There is no DirectPipelineRunner because that name was changed in earlier 
refactorings. The code is here: 
https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/direct

The idea here is that elements passed to the process method should not be 
mutated by process(). The direct runner can enforce this by serializing each 
element before the process call and comparing it against a fresh serialization 
after the call. Doing this in distributed runners like Dataflow would have a 
significant cost, but we could potentially do it based on sampling (e.g. check 
1 in every N elements).
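
To make the serialize-and-compare idea concrete, here is a minimal sketch in 
plain Python. It is not Beam code and does not reflect how any runner does 
this: SampledMutationChecker, MutationDetectedError, check_every_n, and the 
two example process functions are hypothetical names invented for 
illustration. It also assumes that pickle is an acceptable stand-in for the 
element's coder and that an unmodified element pickles to the same bytes 
twice.

```python
import pickle


class MutationDetectedError(RuntimeError):
    """Hypothetical error raised when the wrapped function mutates its input."""


class SampledMutationChecker:
    """Wraps a process-like callable and checks 1 in every N inputs.

    Assumption: pickle stands in for the element's real serialization, and
    pickling an unmodified element twice yields identical bytes.
    """

    def __init__(self, process_fn, check_every_n=1):
        if check_every_n < 1:
            raise ValueError("check_every_n must be >= 1")
        self._process_fn = process_fn
        self._check_every_n = check_every_n
        self._seen = 0

    def __call__(self, element):
        self._seen += 1
        # Deterministic sampling: check every N-th element seen.
        should_check = self._seen % self._check_every_n == 0
        snapshot = pickle.dumps(element) if should_check else None

        outputs = self._process_fn(element)
        # Materialize generators so lazy mutations run before the comparison.
        outputs = [] if outputs is None else list(outputs)

        if should_check and pickle.dumps(element) != snapshot:
            raise MutationDetectedError(
                "input element was mutated by %r" % (self._process_fn,))
        return outputs


def good_process(element):
    # Builds a new dict instead of touching the input.
    yield {**element, "count": element["count"] + 1}


def bad_process(element):
    # Mutates the input in place, which the checker should flag.
    element["count"] += 1
    yield element
```

With check_every_n=1 every element pays for two serializations, which is the 
cost that makes this reasonable for a local runner and expensive for a 
distributed one. Raising check_every_n trades detection probability for lower 
overhead, so a mutation on an unsampled element goes unnoticed.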

> Check immutability violations in Dataflow Runner (as an option)
> ---------------------------------------------------------------
>
>                 Key: BEAM-7395
>                 URL: https://issues.apache.org/jira/browse/BEAM-7395
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Ahmet Altay
>            Assignee: Innocent
>            Priority: Minor
>              Labels: newbie, starter
>
> Users are going to mutate inputs and outputs of DoFn inappropriately. We 
> should help their tests fail to catch such mistakes. (Similar to the 
> DirectPipelineRunner in Java SDK) This should be offered as an option and 
> should work based on sampling because of the cost of these checks.



