[
https://issues.apache.org/jira/browse/BEAM-11056?focusedWorklogId=499754&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-499754
]
ASF GitHub Bot logged work on BEAM-11056:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 13/Oct/20 00:52
Start Date: 13/Oct/20 00:52
Worklog Time Spent: 10m
Work Description: davidyan74 commented on a change in pull request #13080:
URL: https://github.com/apache/beam/pull/13080#discussion_r503607479
##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -19,21 +19,21 @@
For internal use only; no backwards-compatibility guarantees.
-A background caching job is a job that captures events for all capturable
+A background caching job is a job that records events for all recordable
Review comment:
"Background caching job" -> "Background source recording job". Please
check all occurrences.
##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -56,7 +56,7 @@ class BackgroundCachingJob(object):
"""A simple abstraction that controls necessary components of a timed and
space limited background caching job.
- A background caching job successfully completes source data capture in 2
+ A background caching job successfully completes source data record in 2
Review comment:
"record" -> "recording" (i.e. "source data recording").
##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -165,9 +165,9 @@ def is_background_caching_job_needed(user_pipeline):
# If this is True, we can invalidate a previous done/running job if there is
# one.
cache_changed = is_source_to_cache_changed(user_pipeline)
- # When capture replay is disabled, cache is always needed for capturable
+ # When record replay is disabled, cache is always needed for recordable
Review comment:
"record replay" -> "recording replay".
##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -165,9 +165,9 @@ def is_background_caching_job_needed(user_pipeline):
# If this is True, we can invalidate a previous done/running job if there is
# one.
cache_changed = is_source_to_cache_changed(user_pipeline)
- # When capture replay is disabled, cache is always needed for capturable
+ # When record replay is disabled, cache is always needed for recordable
# sources (if any).
- if need_cache and not ie.current_env().options.enable_capture_replay:
+ if need_cache and not ie.current_env().options.enable_record_replay:
Review comment:
enable_recording_replay
##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -19,21 +19,21 @@
For internal use only; no backwards-compatibility guarantees.
-A background caching job is a job that captures events for all capturable
+A background caching job is a job that records events for all recordable
sources of a given pipeline. With Interactive Beam, one such job is started when
a pipeline run happens (which produces a main job in contrast to the background
caching job) and meets the following conditions:
- #. The pipeline contains capturable sources, configured through
- interactive_beam.options.capturable_sources.
+ #. The pipeline contains recordable sources, configured through
+ interactive_beam.options.recordable_sources.
#. No such background job is running.
#. No such background job has completed successfully and the cached events are
- still valid (invalidated when capturable sources change in the pipeline).
+ still valid (invalidated when recordable sources change in the pipeline).
Once started, the background caching job runs asynchronously until it hits some
-capture limit configured in interactive_beam.options. Meanwhile, the main job
+record limit configured in interactive_beam.options. Meanwhile, the main job
Review comment:
"record limit" -> "recording limit".
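The three start conditions listed in the docstring above can be sketched as a single predicate. This is an illustration only: the function name and its four boolean arguments are hypothetical stand-ins for state that the real module derives from the pipeline and the current environment, and the sketch follows the docstring literally rather than the full logic of is_background_caching_job_needed.

```python
def should_start_recording_job(
    has_recordable_sources, job_running, job_done, sources_changed):
  """Returns True when a new background job should be started.

  All arguments are hypothetical booleans used for illustration.
  """
  # Condition 1: the pipeline must contain recordable sources.
  if not has_recordable_sources:
    return False
  # Condition 2: no such background job is currently running.
  if job_running:
    return False
  # Condition 3: a completed job only counts while its cached events are
  # still valid, i.e. the recordable sources have not changed since.
  cache_still_valid = job_done and not sources_changed
  return not cache_still_valid
```

With these assumptions, a pipeline whose sources changed after a completed job gets a fresh job, while an unchanged pipeline reuses the cached events.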
##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -301,13 +301,13 @@ def sizeof_fmt(num, suffix='B'):
'In order to have a deterministic replay, a segment of data will '
'be recorded from all sources for %s seconds or until a total of '
'%s have been written to disk.',
- options.capture_duration.total_seconds(),
- sizeof_fmt(options.capture_size_limit))
+ options.record_duration.total_seconds(),
Review comment:
recording_duration
##########
File path: sdks/python/apache_beam/runners/interactive/interactive_beam.py
##########
@@ -52,98 +53,116 @@
class Options(interactive_options.InteractiveOptions):
"""Options that guide how Interactive Beam works."""
@property
- def enable_capture_replay(self):
- """Whether replayable source data capture should be replayed for multiple
- PCollection evaluations and pipeline runs as long as the data captured is
+ def enable_record_replay(self):
Review comment:
enable_recording_replay. Basically, if "capture" is used as a noun,
change it to "recording" instead of "record", since "record" might have a
notion of an individual record. Please check all occurrences.
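One way to apply this rename without breaking existing callers is to keep the old name as a deprecated alias for a while. The sketch below is only an illustration of that pattern: the stand-in Options class, its private attribute, and the warning text are assumptions, not the code in this pull request.

```python
import warnings


class Options(object):
  """Hypothetical stand-in for the interactive options class."""
  def __init__(self):
    self._enable_recording_replay = True

  @property
  def enable_recording_replay(self):
    """New name: "recording" is the noun form of the old "capture"."""
    return self._enable_recording_replay

  @enable_recording_replay.setter
  def enable_recording_replay(self, value):
    self._enable_recording_replay = value

  @property
  def enable_capture_replay(self):
    """Deprecated alias that forwards to the new property."""
    warnings.warn(
        'enable_capture_replay is deprecated; '
        'use enable_recording_replay instead.',
        DeprecationWarning,
        stacklevel=2)
    return self.enable_recording_replay

  @enable_capture_replay.setter
  def enable_capture_replay(self, value):
    warnings.warn(
        'enable_capture_replay is deprecated; '
        'use enable_recording_replay instead.',
        DeprecationWarning,
        stacklevel=2)
    self.enable_recording_replay = value
```

Reads and writes through the old name still work but emit a DeprecationWarning, so callers can migrate at their own pace.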
##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -301,13 +301,13 @@ def sizeof_fmt(num, suffix='B'):
'In order to have a deterministic replay, a segment of data will '
'be recorded from all sources for %s seconds or until a total of '
'%s have been written to disk.',
- options.capture_duration.total_seconds(),
- sizeof_fmt(options.capture_size_limit))
+ options.record_duration.total_seconds(),
+ sizeof_fmt(options.record_size_limit))
Review comment:
recording_size_limit
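To make the two suggested option names concrete, here is a rough sketch of how the log line would read after the rename. The body of sizeof_fmt is not shown in this hunk, so the version below assumes a conventional decimal-unit formatter; FakeOptions and its values (60 seconds, 1e9 bytes) are hypothetical and chosen only for the example.

```python
import datetime


def sizeof_fmt(num, suffix='B'):
  """Formats a byte count with decimal unit prefixes (assumed behavior)."""
  for unit in ['', 'K', 'M', 'G', 'T', 'P', 'E', 'Z']:
    if abs(num) < 1000.0:
      return '%3.1f%s%s' % (num, unit, suffix)
    num /= 1000.0
  return '%.1f%s%s' % (num, 'Y', suffix)


class FakeOptions(object):
  """Hypothetical options object using the names suggested in the review."""
  recording_duration = datetime.timedelta(seconds=60)
  recording_size_limit = 1e9


options = FakeOptions()
message = (
    'In order to have a deterministic replay, a segment of data will '
    'be recorded from all sources for %s seconds or until a total of '
    '%s have been written to disk.' % (
        options.recording_duration.total_seconds(),
        sizeof_fmt(options.recording_size_limit)))
```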
Issue Time Tracking
-------------------
Worklog Id: (was: 499754)
Time Spent: 1h (was: 50m)
> Fix warning message and rename old APIs
> ---------------------------------------
>
> Key: BEAM-11056
> URL: https://issues.apache.org/jira/browse/BEAM-11056
> Project: Beam
> Issue Type: Bug
> Components: runner-py-interactive
> Reporter: Ning Kang
> Assignee: Ning Kang
> Priority: P2
> Time Spent: 1h
> Remaining Estimate: 0h
>
> When invoking `ib.evict_captured_data()`, the log message contains the typo
> `recordeddata`:
> `You have requested Interactive Beam to evict all recordeddata that could be
> deterministically replayed among multiple pipeline runs.`
> Also, `capture_control` should be renamed to `record_control`, and all
> occurrences of `capture` should be changed to `record` for consistency
> with the large source recording improvements.
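The issue does not show the code that produces the message. One common cause of a fused word like `recordeddata` is Python's implicit concatenation of adjacent string literals with a missing trailing space. The sketch below reproduces that pattern as an assumption about the cause; the variable names and the line split are hypothetical.

```python
# Adjacent string literals are joined with no separator, so a missing
# trailing space on one fragment fuses two words together.
buggy = (
    'You have requested Interactive Beam to evict all recorded'
    'data that could be deterministically replayed among multiple '
    'pipeline runs.')

# Ending the first fragment with a space restores the intended wording.
fixed = (
    'You have requested Interactive Beam to evict all recorded '
    'data that could be deterministically replayed among multiple '
    'pipeline runs.')
```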