[ https://issues.apache.org/jira/browse/BEAM-3042?focusedWorklogId=82952&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-82952 ]
ASF GitHub Bot logged work on BEAM-3042:
----------------------------------------
Author: ASF GitHub Bot
Created on: 21/Mar/18 22:03
Start Date: 21/Mar/18 22:03
Worklog Time Spent: 10m
Work Description: chamikaramj closed pull request #4912: [BEAM-3042]
Fixing check for sideinput_io_metrics experiment flag.
URL: https://github.com/apache/beam/pull/4912
This is a PR merged from a forked repository.
As GitHub hides the original diff of a foreign pull request (one from a
fork) on merge, it is displayed below for the sake of provenance:
diff --git a/sdks/python/apache_beam/runners/worker/operations.py b/sdks/python/apache_beam/runners/worker/operations.py
index 11ff909f3e9..0fa32e3c997 100644
--- a/sdks/python/apache_beam/runners/worker/operations.py
+++ b/sdks/python/apache_beam/runners/worker/operations.py
@@ -289,8 +289,7 @@ def _read_side_inputs(self, tags_and_types):
     assert self.side_input_maps is None
 
     # Get experiments active in the worker to check for side input metrics exp.
-    experiments = set(
-        RuntimeValueProvider.get_value('experiments', str, '').split(','))
+    experiments = RuntimeValueProvider.get_value('experiments', list, [])
 
     # We will read the side inputs in the order prescribed by the
     # tags_and_types argument because this is exactly the order needed to
diff --git a/sdks/python/apache_beam/runners/worker/sideinputs.py b/sdks/python/apache_beam/runners/worker/sideinputs.py
index cc405e0e477..77157857b05 100644
--- a/sdks/python/apache_beam/runners/worker/sideinputs.py
+++ b/sdks/python/apache_beam/runners/worker/sideinputs.py
@@ -105,8 +105,7 @@ def _start_reader_threads(self):
 
   def _reader_thread(self):
     # pylint: disable=too-many-nested-blocks
-    experiments = set(
-        RuntimeValueProvider.get_value('experiments', str, '').split(','))
+    experiments = RuntimeValueProvider.get_value('experiments', list, [])
     try:
       while True:
         try:
diff --git a/sdks/python/apache_beam/runners/worker/sideinputs_test.py b/sdks/python/apache_beam/runners/worker/sideinputs_test.py
index 212dc19fde9..050ecdc5003 100644
--- a/sdks/python/apache_beam/runners/worker/sideinputs_test.py
+++ b/sdks/python/apache_beam/runners/worker/sideinputs_test.py
@@ -92,7 +92,7 @@ def test_bytes_read_behind_experiment(self):
 
   def test_bytes_read_are_reported(self):
     RuntimeValueProvider.set_runtime_options(
-        {'experiments': 'sideinput_io_metrics,other'})
+        {'experiments': ['sideinput_io_metrics', 'other']})
     mock_read_counter = mock.MagicMock()
     source_records = ['a', 'b', 'c', 'd']
     sources = [
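
The diff above replaces a comma-separated string with a list for the
'experiments' option. The following is a minimal Python sketch of why the
old check could miss the flag. It assumes the option value is already a
list and that it gets coerced with str() before splitting; that coercion
is an assumption about the provider, not something shown in the diff.
parse_as_string and is_experiment_enabled are hypothetical helpers used
only for illustration, not Beam APIs.

```python
def parse_as_string(value):
    """Old-style parsing: coerce to str, then split on commas.

    Assumption: the provider applies the requested type (str) to the raw
    option value before the caller splits it.
    """
    return set(str(value).split(','))


def is_experiment_enabled(runtime_options, name):
    """New-style check: treat 'experiments' as a list of strings.

    A missing option falls back to an empty list, mirroring the default
    passed in the updated call.
    """
    experiments = runtime_options.get('experiments', [])
    return name in experiments


options = {'experiments': ['sideinput_io_metrics', 'other']}

# With a list value, str() yields the list's repr, so the tokens carry
# brackets and quotes and the plain flag name is not among them.
old_tokens = parse_as_string(options['experiments'])
flag_found_by_old_check = 'sideinput_io_metrics' in old_tokens

# The list-based check finds the flag directly.
flag_found_by_new_check = is_experiment_enabled(
    options, 'sideinput_io_metrics')
```

Under these assumptions the old check reports the flag as absent while the
list-based check finds it, which matches the intent stated in the PR title.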
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 82952)
Time Spent: 40m (was: 0.5h)
> Add tracking of bytes read / time spent when reading side inputs
> ----------------------------------------------------------------
>
> Key: BEAM-3042
> URL: https://issues.apache.org/jira/browse/BEAM-3042
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Pablo Estrada
> Assignee: Pablo Estrada
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>
> It is difficult for Dataflow users to understand how modifying a pipeline or
> data set can affect how much inter-transform IO is used in their job. The
> intent of this feature request is to help users understand how side inputs
> behave when they are consumed.
> This will allow users to understand how much time and how much data their
> pipeline uses to read/write to inter-transform IO. Users will also be able to
> modify their pipelines and understand how their changes affect these IO
> metrics.
> For further information, please review the internal Google doc
> go/insights-transform-io-design-doc.