[
https://issues.apache.org/jira/browse/BEAM-10303?focusedWorklogId=466917&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-466917
]
ASF GitHub Bot logged work on BEAM-10303:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 05/Aug/20 17:53
Start Date: 05/Aug/20 17:53
Worklog Time Spent: 10m
Work Description: lukecwik commented on a change in pull request #12430:
URL: https://github.com/apache/beam/pull/12430#discussion_r465902065
##########
File path:
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java
##########
@@ -1029,7 +1040,27 @@ public double getProgress() {
private Progress getProgress() {
synchronized (splitLock) {
if (currentTracker instanceof RestrictionTracker.HasProgress) {
- return ((HasProgress) currentTracker).getProgress();
+ Progress progress = ((HasProgress) currentTracker).getProgress();
+ double totalWork = progress.getWorkCompleted() + progress.getWorkRemaining();
+ double completed =
+ totalWork * currentWindowIterator.previousIndex() + progress.getWorkCompleted();
+ double remaining =
+ totalWork * (currentElement.getWindows().size() - currentWindowIterator.nextIndex())
+ + progress.getWorkRemaining();
+ return Progress.from(completed, remaining);
+ }
+ }
+ return null;
+ }
+
+ private Progress getProgressFromWindowObservingTruncate(double elementCompleted) {
+ synchronized (splitLock) {
+ if (currentWindowIterator != null) {
Review comment:
Originally the idea was that we didn't want the SDK to have to perform
these calculations, which is why each operator was going to report
work_completed/work_remaining if it had them. It now seems that accurate
splitting by fraction needs to take it into account.
Using the graph to compute the progress shouldn't be any more or less
difficult than the work that is being put into the SDK.
Is there still value in reporting the work_completed/work_remaining metrics
then?
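For reference, the per-window scaling in the hunk above can be sketched in isolation. This is only a rough illustration, not the harness code: the `WindowScaledProgress` class, its nested `Progress` holder, and the `scale` method are hypothetical names, and the sketch assumes (as the diff appears to) that every window of the element carries the same total work as the window currently being processed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

public class WindowScaledProgress {
    /** Hypothetical stand-in for a (work completed, work remaining) pair. */
    static final class Progress {
        final double completed;
        final double remaining;

        Progress(double completed, double remaining) {
            this.completed = completed;
            this.remaining = remaining;
        }
    }

    /**
     * Scales the progress of the current window across all windows of an element.
     * Assumption: each window has the same total work as the current one.
     *
     * @param current progress within the window currently being processed
     * @param currentWindowIndex zero-based index of the current window
     * @param windowCount total number of windows for the element
     */
    static Progress scale(Progress current, int currentWindowIndex, int windowCount) {
        double totalWork = current.completed + current.remaining;
        // Windows before the current one count as fully completed.
        double completed = totalWork * currentWindowIndex + current.completed;
        // Windows after the current one count as fully remaining.
        double remaining =
            totalWork * (windowCount - currentWindowIndex - 1) + current.remaining;
        return new Progress(completed, remaining);
    }

    public static void main(String[] args) {
        List<String> windows = Arrays.asList("w0", "w1", "w2", "w3");
        ListIterator<String> it = windows.listIterator();
        it.next();
        it.next(); // now processing w1: previousIndex() == 1, nextIndex() == 2

        Progress p = scale(new Progress(3.0, 7.0), it.previousIndex(), windows.size());
        System.out.println(p.completed + " / " + p.remaining); // prints "13.0 / 27.0"
    }
}
```

With 4 windows, 10 units of work per window, and 3 units done in the second window, the sketch reports 13 completed and 27 remaining, which sums to the expected 40.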
Issue Time Tracking
-------------------
Worklog Id: (was: 466917)
Time Spent: 4h 40m (was: 4.5h)
> FnApiDoFnRunner window observing optimization
> ---------------------------------------------
>
> Key: BEAM-10303
> URL: https://issues.apache.org/jira/browse/BEAM-10303
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-harness
> Reporter: Luke Cwik
> Assignee: Luke Cwik
> Priority: P2
> Labels: portability
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> Currently the FnApiDoFnRunner processes each element within its own window.
> There is an easy optimization where we process the element once if and only
> if the function doesn't observe the window (either directly or indirectly via
> side inputs/state/...).