[
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=353804&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-353804
]
ASF GitHub Bot logged work on BEAM-7926:
----------------------------------------
Author: ASF GitHub Bot
Created on: 04/Dec/19 21:32
Start Date: 04/Dec/19 21:32
Worklog Time Spent: 10m
Work Description: KevinGG commented on pull request #10276: [BEAM-7926]
Data-centric Interactive Part1
URL: https://github.com/apache/beam/pull/10276#discussion_r353993939
##########
File path: sdks/python/apache_beam/utils/interactive_utils.py
##########
@@ -0,0 +1,98 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Common interactive utility module.
+
+For experimental usage only; no backwards-compatibility guarantees.
+"""
+from __future__ import absolute_import
+
+import logging
+
+_LOGGER = logging.getLogger(__name__)
+
+
+def is_in_ipython():
+  """Determines if current code is executed within an ipython session."""
+  is_in_ipython = False
+  # Check if the runtime is within an interactive environment, i.e., ipython.
+  try:
+    from IPython import get_ipython  # pylint: disable=import-error
+    if get_ipython():
+      is_in_ipython = True
+  except ImportError:
+    pass  # If dependencies are not available, then not interactive for sure.
+  return is_in_ipython
+
+
+def is_in_notebook():
+  """Determines if current code is executed from an ipython notebook.
+
+  If is_in_notebook() is True, then is_in_ipython() must also be True.
+  """
+  is_in_notebook = False
+  if is_in_ipython():
+    # The import and usage must be valid under the execution path.
+    from IPython import get_ipython
+    if 'IPKernelApp' in get_ipython().config:
+      is_in_notebook = True
+  return is_in_notebook
+
+
+def alter_label_if_interactive(transform, pvalueish):
+  """Alters the label to an interactive label with ipython prompt metadata
+  prefixed for the given transform if the given pvalueish belongs to a
+  user-defined pipeline. Otherwise, noop.
+
+  A label is either a user-defined or auto-generated str name of a PTransform
+  that is unique within a pipeline. If current environment is_in_ipython(), Beam
+  can implicitly create interactive labels to replace labels of root PTransforms
+  to be applied. The label is formatted as `Cell {prompt}: {original_label}`.
+  """
+  if is_in_ipython():
+    from apache_beam.runners.interactive import interactive_environment as ie
+    # Tracks user defined pipeline instances in watched scopes so that we only
+    # alter labels for any transform to pvalueish belonging to those pipeline
+    # instances, excluding any transform to be applied in other pipeline
+    # instances the Beam SDK creates implicitly.
+    ie.current_env().track_user_pipelines()
+    from IPython import get_ipython
+    prompt = get_ipython().execution_count
+    pipeline = _extract_pipeline_of_pvalueish(pvalueish)
+    if not pipeline:
+      _LOGGER.warning('Failed to alter the label of a transform with the '
+                      'ipython prompt metadata. Cannot figure out the pipeline '
+                      'that the given pvalueish %s belongs to. Thus noop.'
+                      % pvalueish)
+    if (pipeline
+        # We only alter for transforms to be applied to user-defined pipelines
+        # at pipeline construction time.
+        and pipeline in ie.current_env().tracked_user_pipelines):
+      transform.label = 'Cell {}: {}'.format(prompt, transform.label)
+
+
+def _extract_pipeline_of_pvalueish(pvalueish):
+  """Extracts the pipeline that the given pvalueish belongs to."""
Review comment:
Extracts the pipeline.
When constructing a pipeline, it's unlikely to run into the first few
conditional paths.
When running a pipeline, under the hood, Beam SDK and runners do a lot of
magic including implicitly building the real pipelines for execution. I see
pvalueish being tuples and dicts from time to time.
So the first step is to get a concrete pvalue from the pvalueish. Any pvalue
would be fine.
Then fetch the pipeline the pvalue belongs to.
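
For illustration only, the two steps described in the comment above (find any
concrete pvalue inside a tuple or dict, then read its pipeline) might be
sketched as below. This is not the code from the pull request: FakePipeline,
FakePValue and extract_pipeline are hypothetical stand-ins, and the sketch
assumes only that a concrete pvalue exposes a `pipeline` attribute.

```python
class FakePipeline(object):
  """Hypothetical stand-in for a Beam pipeline object."""


class FakePValue(object):
  """Hypothetical stand-in for a pvalue; models only the `pipeline` field."""

  def __init__(self, pipeline):
    self.pipeline = pipeline


def extract_pipeline(pvalueish):
  """Returns the pipeline of the first concrete pvalue found, else None."""
  if isinstance(pvalueish, dict):
    children = list(pvalueish.values())
  elif isinstance(pvalueish, (tuple, list)):
    children = list(pvalueish)
  else:
    # A leaf: assumed to be a concrete pvalue if it has a `pipeline` field.
    return getattr(pvalueish, 'pipeline', None)
  # A container: any pvalue inside it will do, so stop at the first hit.
  for child in children:
    pipeline = extract_pipeline(child)
    if pipeline is not None:
      return pipeline
  return None
```

Under these assumptions, a call such as
extract_pipeline({'main': (FakePValue(p),)}) returns p, and an empty tuple or
an object without a `pipeline` attribute yields None, which corresponds to the
"Cannot figure out the pipeline" warning path in the diff.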
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 353804)
Time Spent: 25h 10m (was: 25h)
> Show PCollection with Interactive Beam in a data-centric user flow
> ------------------------------------------------------------------
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
> Issue Type: New Feature
> Components: runner-py-interactive
> Reporter: Ning Kang
> Assignee: Ning Kang
> Priority: Major
> Time Spent: 25h 10m
> Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection
> with Interactive Beam.
> Say an Interactive Beam pipeline defined as
>
> {code:java}
> p = beam.Pipeline(InteractiveRunner())
> pcoll = p | 'Transform' >> transform()
> pcoll2 = ...
> pcoll3 = ...{code}
> The user can call a single function and get auto-magical charting of the data.
> e.g.,
> {code:java}
> show(pcoll, pcoll2)
> {code}
> Throughout the process, a pipeline fragment is built to include only
> transforms necessary to produce the desired pcolls (pcoll and pcoll2) and
> execute that fragment.
> This makes the Interactive Beam user flow data-centric.
>
> Detailed
> [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz].
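
As a rough illustration of the fragment-building idea in the description above
(keep only the transforms needed to produce the requested pcolls), the
selection could be sketched as a backward walk over the pipeline graph. This
is an assumption-laden sketch, not Interactive Beam's implementation: the
function necessary_transforms and the two dict-based graph inputs are
hypothetical, and the example wiring of pcoll2 and pcoll3 is invented because
the description elides it.

```python
def necessary_transforms(producers, inputs, targets):
  """Collects the transforms needed to produce the target pcolls.

  Args:
    producers: dict mapping each pcoll name to the transform producing it.
    inputs: dict mapping each transform name to its input pcoll names.
    targets: iterable of pcoll names the user asked to show.
  """
  needed = set()
  visited = set()
  stack = list(targets)
  while stack:
    pcoll = stack.pop()
    if pcoll in visited:
      continue
    visited.add(pcoll)
    transform = producers[pcoll]
    needed.add(transform)
    # Walk upstream: everything this transform consumes is needed too.
    stack.extend(inputs.get(transform, []))
  return needed


# Hypothetical wiring: pcoll2 is derived from pcoll, pcoll3 from pcoll2.
producers = {'pcoll': 'Transform', 'pcoll2': 'T2', 'pcoll3': 'T3'}
inputs = {'Transform': [], 'T2': ['pcoll'], 'T3': ['pcoll2']}
fragment = necessary_transforms(producers, inputs, ['pcoll', 'pcoll2'])
```

With this made-up wiring, the fragment for show(pcoll, pcoll2) contains
'Transform' and 'T2' but not 'T3', since nothing requested depends on pcoll3.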
--
This message was sent by Atlassian Jira
(v8.3.4#803005)