[
https://issues.apache.org/jira/browse/BEAM-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856186#comment-16856186
]
Ahmet Altay commented on BEAM-7395:
-----------------------------------
Hi [~evindj], if you are planning to work on Beam's direct runner, that issue
is tracked here: https://issues.apache.org/jira/browse/BEAM-529
There is no DirectPipelineRunner anymore because the name was changed in an
earlier refactoring. The direct runner code is here:
https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/direct
The idea here is that elements passed to the process method should not be
mutated by process(). The direct runner can enforce this by serializing each
element before the process() call and comparing it against its serialized form
after the call. Doing this in distributed runners like Dataflow would have a
significant cost, but we could potentially do it based on sampling (e.g. check
1 in every N elements).
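To make the idea concrete, here is a rough sketch of the serialize-before,
compare-after check with sampling. It is plain Python, not Beam code: the names
SampledImmutabilityChecker, MutationError and sample_every are made up for
illustration, pickle stands in for whatever coder a runner would really use,
and the wrapped function is assumed to do its work eagerly (a process() that
yields lazily would need its output consumed before the comparison).

```python
import pickle


class MutationError(Exception):
    """Raised when the wrapped function mutates its input element."""


class SampledImmutabilityChecker:
    """Hypothetical wrapper: checks 1 in every `sample_every` elements."""

    def __init__(self, fn, sample_every=1):
        if sample_every < 1:
            raise ValueError("sample_every must be >= 1")
        self._fn = fn
        self._sample_every = sample_every
        self._seen = 0

    def __call__(self, element):
        # Check the 1st, (N+1)th, (2N+1)th, ... element; skip the rest.
        checked = self._seen % self._sample_every == 0
        self._seen += 1
        if not checked:
            return self._fn(element)

        # Snapshot the element before the call (assumption: it is picklable).
        before = pickle.dumps(element)
        result = self._fn(element)
        # Any difference in the serialized form is treated as a mutation.
        if pickle.dumps(element) != before:
            raise MutationError(
                "input element was mutated by %r" % (self._fn,))
        return result


def add_one(element):
    # Well behaved: reads the input, returns a new value.
    return element["x"] + 1


def add_one_in_place(element):
    # Badly behaved: modifies the input element.
    element["x"] += 1
    return element["x"]
```

With sample_every=1 every element pays the serialization cost twice, which is
acceptable in a local test run; a larger N trades detection probability for
lower overhead. Comparing pickled bytes is only an approximation of equality
and could misreport for some types, so treat this as a starting point.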
> Check immutability violations in Dataflow Runner (as an option)
> ---------------------------------------------------------------
>
> Key: BEAM-7395
> URL: https://issues.apache.org/jira/browse/BEAM-7395
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Ahmet Altay
> Assignee: Innocent
> Priority: Minor
> Labels: newbie, starter
>
> Users are going to mutate inputs and outputs of DoFn inappropriately. We
> should make their tests fail so that such mistakes are caught. (Similar to
> the DirectPipelineRunner in Java SDK.) This should be offered as an option
> and work based on sampling because of the cost of these checks.