[
https://issues.apache.org/jira/browse/BEAM-10708?focusedWorklogId=650210&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-650210
]
ASF GitHub Bot logged work on BEAM-10708:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 13/Sep/21 19:35
Start Date: 13/Sep/21 19:35
Worklog Time Spent: 10m
Work Description: KevinGG commented on a change in pull request #15490:
URL: https://github.com/apache/beam/pull/15490#discussion_r707624430
##########
File path:
sdks/python/apache_beam/runners/interactive/interactive_environment.py
##########
@@ -235,9 +239,16 @@ def is_in_notebook(self):
   @property
   def inspector(self):
     """Gets the singleton InteractiveEnvironmentInspector to retrieve
-    information consumable by other applications."""
+    information consumable by other applications such as a notebook
+    extension."""
     return self._inspector
+  @property
+  def inspector_with_synthetic(self):
+    """Gets the singleton InteractiveEnvironmentInspector with additional
+    synthetic variables generated by Interactive Beam. Internally used."""
+    return self._inspector_with_synthetic
+
Review comment:
The synthetic PCollections are the ones created when caching unbounded
source outputs.
The inspector was originally created to list all the PCollections and
pipelines defined in the notebook and then display them in Jupyter. Since the
synthetic ones are not defined by the user, they should be hidden from the
user in that use case.
However, internally, when looking for the cacheables, the synthetic ones are
needed to correctly create WriteCache transforms. Thus, we need a separate
inspector, or at least an inspector that produces two views of PCollections:
one with the synthetics and one without.
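
A minimal sketch of the "two views" option, for illustration only. It does not
use the real InteractiveEnvironmentInspector; `InspectorSketch`, `watch`,
`user_defined`, and `all_pcollections` are hypothetical names, and PCollections
are represented by plain variable names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InspectorSketch:
    """Hypothetical stand-in for an inspector; not the Beam class."""

    # Maps a variable name to True when Interactive Beam generated it
    # (synthetic) and to False when the user defined it in the notebook.
    _pcolls: Dict[str, bool] = field(default_factory=dict)

    def watch(self, name: str, synthetic: bool = False) -> None:
        """Registers a PCollection name and whether it is synthetic."""
        self._pcolls[name] = synthetic

    def user_defined(self) -> List[str]:
        """View for display in Jupyter: synthetic entries are hidden."""
        return [n for n, is_synth in self._pcolls.items() if not is_synth]

    def all_pcollections(self) -> List[str]:
        """Internal view: includes synthetic entries, e.g. for cacheables."""
        return list(self._pcolls)


inspector = InspectorSketch()
inspector.watch("words")
inspector.watch("synthetic_cache_1", synthetic=True)
print(inspector.user_defined())
print(inspector.all_pcollections())
```

A single object with two accessors avoids keeping two inspectors in sync; the
PR as shown instead exposes a second property, `inspector_with_synthetic`.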
Issue Time Tracking
-------------------
Worklog Id: (was: 650210)
Time Spent: 32.5h (was: 32h 20m)
> InteractiveRunner cannot execute pipeline with cross-language transform
> -----------------------------------------------------------------------
>
> Key: BEAM-10708
> URL: https://issues.apache.org/jira/browse/BEAM-10708
> Project: Beam
> Issue Type: Bug
> Components: cross-language
> Reporter: Brian Hulette
> Assignee: Ning
> Priority: P2
> Time Spent: 32.5h
> Remaining Estimate: 0h
>
> The InteractiveRunner crashes when given a pipeline that includes a
> cross-language transform.
> Here's the example I tried to run in a jupyter notebook:
> {code:python}
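> import apache_beam as beam
> from apache_beam.runners.interactive import interactive_beam
> from apache_beam.runners.interactive.interactive_runner import InteractiveRunner
> from apache_beam.transforms.sql import SqlTransform
>
> p = beam.Pipeline(InteractiveRunner())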
> pc = (p | SqlTransform("""SELECT
> CAST(1 AS INT) AS `id`,
> CAST('foo' AS VARCHAR) AS `str`,
> CAST(3.14 AS DOUBLE) AS `flt`"""))
> df = interactive_beam.collect(pc)
> {code}
> The problem occurs when
> [pipeline_fragment.py|https://github.com/apache/beam/blob/dce1eb83b8d5137c56ac58568820c24bd8fda526/sdks/python/apache_beam/runners/interactive/pipeline_fragment.py#L66]
> creates a copy of the pipeline by [writing it to proto and reading it
> back|https://github.com/apache/beam/blob/dce1eb83b8d5137c56ac58568820c24bd8fda526/sdks/python/apache_beam/runners/interactive/pipeline_fragment.py#L120].
> Reading it back fails because some of the pipeline is not written in Python.
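
The general hazard described above can be illustrated with a toy model. This is
not Beam's code and makes no claim about how `from_runner_api` behaves
internally; `TransformSpec`, `PYTHON_CONSTRUCTORS`, `read_back`, and the
`toy:` URNs are all hypothetical. It only shows why a write-then-read copy can
fail when one serialized step has no constructor in the reading language.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TransformSpec:
    """Toy serialized form of a transform: an identifier plus a payload."""
    urn: str
    payload: bytes


# Constructors that this (Python) process knows how to rebuild from a spec.
PYTHON_CONSTRUCTORS: Dict[str, Callable[[bytes], str]] = {
    "toy:python:map": lambda payload: "Map(" + payload.decode() + ")",
}


def read_back(specs: List[TransformSpec]) -> List[str]:
    """Rebuilds every step, raising if a step has no Python constructor."""
    rebuilt = []
    for spec in specs:
        constructor = PYTHON_CONSTRUCTORS.get(spec.urn)
        if constructor is None:
            raise ValueError("no Python constructor for " + spec.urn)
        rebuilt.append(constructor(spec.payload))
    return rebuilt


# A copy made of Python-only steps round-trips without error.
print(read_back([TransformSpec("toy:python:map", b"upper")]))

# A step defined in another language cannot be rebuilt in this toy model.
try:
    read_back([TransformSpec("toy:java:sql", b"SELECT 1")])
except ValueError as err:
    print("copy failed:", err)
```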