[ https://issues.apache.org/jira/browse/BEAM-10420?focusedWorklogId=460547&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460547 ]
ASF GitHub Bot logged work on BEAM-10420:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 17/Jul/20 21:19
Start Date: 17/Jul/20 21:19
Worklog Time Spent: 10m
Work Description: lukecwik commented on a change in pull request #12275:
URL: https://github.com/apache/beam/pull/12275#discussion_r456679167
##########
File path: sdks/python/apache_beam/runners/common.py
##########
@@ -685,46 +689,40 @@ def invoke_process(self,
# or if the process accesses the window parameter. We can just call it once
# otherwise as none of the arguments are changing
+ residuals = []
if self.is_splittable:
- restriction_tracker = self.invoke_create_tracker(restriction)
- watermark_estimator = self.invoke_create_watermark_estimator(
- watermark_estimator_state)
-
- if len(windowed_value.windows) > 1 and self.has_windowed_inputs:
- # Should never get here due to window explosion in
- # the upstream pair-with-restriction.
- raise NotImplementedError(
- 'SDFs in multiply-windowed values with windowed arguments.')
with self.splitting_lock:
- self.threadsafe_restriction_tracker = ThreadsafeRestrictionTracker(
- restriction_tracker)
self.current_windowed_value = windowed_value
- self.threadsafe_watermark_estimator = (
- ThreadsafeWatermarkEstimator(watermark_estimator))
-
- restriction_tracker_param = (
- self.signature.process_method.restriction_provider_arg_name)
- if not restriction_tracker_param:
- raise ValueError(
- 'DoFn is splittable but DoFn does not have a '
- 'RestrictionTrackerParam defined')
- additional_kwargs[restriction_tracker_param] = (
- RestrictionTrackerView(self.threadsafe_restriction_tracker))
- watermark_param = (
- self.signature.process_method.watermark_estimator_provider_arg_name)
- # When the watermark_estimator is a NoOpWatermarkEstimator, the system
- # will not add watermark_param into the DoFn param list.
- if watermark_param is not None:
- additional_kwargs[watermark_param] = self.threadsafe_watermark_estimator
+ self.restriction = restriction
+ self.watermark_estimator_state = watermark_estimator_state
try:
- return self._invoke_process_per_window(
- windowed_value, additional_args, additional_kwargs)
+ if self.has_windowed_inputs and len(windowed_value.windows) != 1:
Review comment:
It seems like we shouldn't produce any outputs if we are in 0 windows,
but your recommendation makes sense.
Fixed.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 460547)
Time Spent: 8h 40m (was: 8.5h)
> PerWindowInvoker to handle window observing SplittableDoFns
> -----------------------------------------------------------
>
> Key: BEAM-10420
> URL: https://issues.apache.org/jira/browse/BEAM-10420
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-harness
> Reporter: Luke Cwik
> Assignee: Luke Cwik
> Priority: P2
> Labels: portability
> Time Spent: 8h 40m
> Remaining Estimate: 0h
>
> Currently the FnApiDoFnRunner processes each element within its own window.
> There is an easy optimization where we process the element once if and only
> if the function doesn't observe the window (either directly or indirectly via
> side inputs/state/...).
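
The optimization described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in sdks/python/apache_beam/runners/common.py: the WindowedValue class, the invoke function, and the observes_window flag are hypothetical stand-ins, and returning nothing for an element in zero windows is an assumption taken from the review comment.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class WindowedValue:
    """Hypothetical stand-in: a value assigned to zero or more windows."""
    value: object
    windows: Tuple[str, ...]


def invoke(
    fn: Callable[..., Iterable[object]],
    windowed_value: WindowedValue,
    observes_window: bool,
) -> List[Tuple[object, Tuple[str, ...]]]:
    """Returns (output, windows) pairs produced by fn for one element."""
    # Assumption: an element in zero windows produces no outputs.
    if not windowed_value.windows:
        return []

    outputs = []
    if observes_window:
        # The function can see the window, so call it once per window
        # and emit each output into that single window.
        for window in windowed_value.windows:
            for out in fn(windowed_value.value, window=window):
                outputs.append((out, (window,)))
    else:
        # The function cannot observe the window, so one call is enough;
        # its outputs belong to every window of the input element.
        for out in fn(windowed_value.value):
            outputs.append((out, windowed_value.windows))
    return outputs
```

With an element in two windows, the non-observing path calls the function once and the observing path calls it twice, which is the saving the issue is after.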
--
This message was sent by Atlassian Jira
(v8.3.4#803005)