[ 
https://issues.apache.org/jira/browse/BEAM-8425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16954083#comment-16954083
 ] 

Ning Kang commented on BEAM-8425:
---------------------------------

1. We record all the cells that change the pipeline graph in a list.
2. Every time we change the pipeline graph, we go through the cells in that 
list and see if any of them is missing from the notebook file. If so, the 
user has either deleted or re-executed that cell, so we print a warning and 
remove the cell from the list.
3. Continue the operation.
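
A minimal sketch of step (2), assuming the notebook has been saved in the 
nbformat 4 JSON layout, where each code cell carries an "execution_count" 
field. The function names and the warning text are hypothetical and are not 
part of the InteractiveRunner.

```python
import json


def execution_counts_in_notebook(notebook_json):
    """Return the prompt numbers of all executed code cells in the notebook."""
    notebook = json.loads(notebook_json)
    return {
        cell["execution_count"]
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
        and cell.get("execution_count") is not None
    }


def prune_missing_cells(recorded_cells, notebook_json):
    """Warn about recorded cells no longer in the notebook and drop them.

    recorded_cells is the list from step (1); it is modified in place.
    Returns the prompt numbers that were missing.
    """
    present = execution_counts_in_notebook(notebook_json)
    missing = [n for n in recorded_cells if n not in present]
    for n in missing:
        # Hypothetical wording; the real message is up to the implementation.
        print(
            "Warning: cell [%d] changed the pipeline graph but has been "
            "deleted or re-executed; its effects are still in the pipeline." % n
        )
    recorded_cells[:] = [n for n in recorded_cells if n in present]
    return missing
```

A re-executed cell gets a new prompt number, so its old number disappears 
from the file just as it would after a deletion, which is why both cases 
are reported by the same check.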

The way to do (1) is to have the interactive runner record the current cell 
number upon each .apply call. The current cell number is accessible through 
len(In)-1, where In is IPython's input history list.
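
As a rough illustration of (1), the recording could look like the sketch 
below. CellRecorder and record are hypothetical names; the input history is 
passed in as a plain list so the sketch runs outside a notebook, whereas in 
IPython it would be the In variable from the user namespace.

```python
class CellRecorder(object):
    """Keeps the prompt numbers of cells that changed the pipeline graph."""

    def __init__(self):
        self.cells = []

    def record(self, input_history):
        """Record the cell that is currently executing.

        input_history plays the role of IPython's In list: index 0 is an
        empty placeholder, so the current prompt number is len(In) - 1.
        """
        cell_number = len(input_history) - 1
        # Several .apply calls in one cell should be recorded only once.
        if cell_number not in self.cells:
            self.cells.append(cell_number)
        return cell_number
```

The runner would call record once per .apply invocation; deduplicating 
inside record keeps the list to one entry per cell.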

Note that we are not changing the behavior of the Beam programming model, 
except for printing a warning when the user has deleted or re-executed a cell 
that had a side effect on the job graph, which is when we think the confusion 
is likely to arise.

> Notifying Interactive Beam user about Beam related cell deletion or 
> re-execution
> --------------------------------------------------------------------------------
>
>                 Key: BEAM-8425
>                 URL: https://issues.apache.org/jira/browse/BEAM-8425
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-py-interactive
>            Reporter: Ning Kang
>            Assignee: Ning Kang
>            Priority: Major
>
> There is a general problem with interactive notebooks: when an end user 
> deletes a cell that has been executed, or re-executes a cell, the previous 
> executions are hidden from the end user.
> However, the hidden state still has side effects in the notebook.
> This kind of problem bothers Beam even more because Beam's pipeline 
> construction statements are not idempotent and pipeline execution is 
> decoupled and deferred from pipeline construction.
> Re-executing a cell with Beam statements that build a pipeline would cause 
> unexpected pipeline state, and the user wouldn't notice it because the 
> notebook hides the earlier executions.
> We'll intercept each transform application invocation from the 
> InteractiveRunner and record the ipython/notebook prompt number. Then each 
> time a user executes a cell that applies a PTransform, we'll compare the 
> recorded list of prompt numbers with the current notebook file's content and 
> figure out if any number is missing. If so, we know for sure that a 
> deletion or re-execution has happened, and we use the display manager to 
> notify the end user of potential side effects caused by hidden states of 
> the notebook/ipython.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
