[ https://issues.apache.org/jira/browse/GOBBLIN-1493?focusedWorklogId=631402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631402 ]
ASF GitHub Bot logged work on GOBBLIN-1493:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 30/Jul/21 00:14
Start Date: 30/Jul/21 00:14
Worklog Time Spent: 10m
Work Description: umustafi commented on a change in pull request #3336:
URL: https://github.com/apache/gobblin/pull/3336#discussion_r679563392
##########
File path:
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/TaskStateCollectorService.java
##########
@@ -235,6 +255,49 @@ public Void call() throws Exception {
     this.eventBus.post(new NewTaskCompletionEvent(ImmutableList.copyOf(taskStateQueue)));
   }
+  /**
+   * Uses the size of work units to determine a job's progress and reports the progress as a percentage via
+   * GobblinTrackingEvents
+   * @param taskState of job launched
+   */
+  private void reportJobProgress(TaskState taskState) {
+    String stringSize = taskState.getProp(ServiceConfigKeys.WORK_UNIT_BYTE_SIZE);
+    if (stringSize == null) {
+      LOGGER.warn("Expected to report job progress but work unit byte size property null");
+      return;
+    }
+
+    Long taskByteSize = Long.parseLong(stringSize);
+
+    // if progress reporting is enabled, value should be present
+    if (!this.jobState.contains(AbstractJobLauncher.TOTAL_BYTES_TO_COPY)) {
+      LOGGER.warn("Expected to report job progress but total bytes to copy property null");
+      return;
+    }
+    this.totalSizeToCopy = this.jobState.getPropAsLong(AbstractJobLauncher.TOTAL_BYTES_TO_COPY);
+
+    // avoid flooding Kafka message queue by sending GobblinTrackingEvents only when threshold is passed
+    this.bytesCopiedSoFar += taskByteSize;
Review comment:
Yes, it will be 0 until the first mapper completes, but the progress percentage
will also be 0: neither value is set without the other, and we hope this is
clear enough to the user. In the usual case, we expect the first mapper to
complete within seconds to 1-2 minutes.
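For readers following the thread, here is a minimal sketch of the behavior
discussed above: bytes are accumulated as tasks complete, and a progress value
is reported only when it has moved by at least a threshold since the last
report. This is not the code from the pull request. The class name, the
threshold parameter, and the in-memory list standing in for event emission are
all hypothetical, chosen only to illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; names and structure are assumptions, not Gobblin APIs.
class ProgressTrackerSketch {
  private final long totalBytesToCopy;
  private final int reportThresholdPercent; // assumed tunable, not taken from the PR
  private long bytesCopiedSoFar = 0L;
  private int lastReportedPercent = 0;
  // Stand-in for emitting a tracking event: we only record what would be sent.
  private final List<Integer> reportedPercents = new ArrayList<>();

  ProgressTrackerSketch(long totalBytesToCopy, int reportThresholdPercent) {
    this.totalBytesToCopy = totalBytesToCopy;
    this.reportThresholdPercent = reportThresholdPercent;
  }

  // Called once per completed task with that task's work unit size in bytes.
  synchronized void onTaskCompleted(long taskByteSize) {
    if (totalBytesToCopy <= 0) {
      return; // no total known, so no percentage can be computed
    }
    bytesCopiedSoFar += taskByteSize;
    // Floor to a whole percent and cap at 100.
    int percent = (int) Math.min(100.0, 100.0 * bytesCopiedSoFar / totalBytesToCopy);
    boolean crossedThreshold = percent - lastReportedPercent >= reportThresholdPercent;
    boolean justFinished = percent == 100 && lastReportedPercent < 100;
    if (crossedThreshold || justFinished) {
      lastReportedPercent = percent;
      reportedPercents.add(percent);
    }
  }

  synchronized long getBytesCopiedSoFar() {
    return bytesCopiedSoFar;
  }

  synchronized List<Integer> getReportedPercents() {
    return Collections.unmodifiableList(new ArrayList<>(reportedPercents));
  }
}
```

Until the first call to onTaskCompleted, both the byte count and the last
reported percentage stay at 0, which matches the point made in the comment:
one is never set without the other.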
Issue Time Tracking
-------------------
Worklog Id: (was: 631402)
Time Spent: 7h (was: 6h 50m)
> Data Copy Progress Reporting
> -----------------------------
>
> Key: GOBBLIN-1493
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1493
> Project: Apache Gobblin
> Issue Type: New Feature
> Components: gobblin-core, gobblin-service
> Reporter: Urmi Mustafi
> Assignee: Abhishek Tiwari
> Priority: Major
> Time Spent: 7h
> Remaining Estimate: 0h
>
> Progress reporting for a data copy will provide users with quantitative
> feedback on the progress of a data copy job as a percentage as well as an
> estimate of the time remaining for completion. This will update the existing
> job status endpoint to include the progress percentage and estimate of time
> left.
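The issue description does not say how the time-remaining estimate is derived.
One simple possibility, sketched below purely as an assumption, is linear
extrapolation from the bytes copied so far and the elapsed time, which presumes
roughly constant throughput. The class and method names are hypothetical.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical illustration; linear extrapolation is an assumed technique,
// not something confirmed by the issue or the pull request.
class CopyEtaSketch {
  // Returns an estimate of the time left, or empty when no estimate is possible
  // (for example, before any bytes have been copied).
  static Optional<Duration> estimateRemaining(long bytesCopied, long totalBytes, Duration elapsed) {
    if (bytesCopied <= 0 || totalBytes <= 0) {
      return Optional.empty();
    }
    if (bytesCopied >= totalBytes) {
      return Optional.of(Duration.ZERO);
    }
    // remaining / elapsed == bytesLeft / bytesCopied under constant throughput
    double remainingOverCopied = (double) (totalBytes - bytesCopied) / bytesCopied;
    long remainingMillis = Math.round(elapsed.toMillis() * remainingOverCopied);
    return Optional.of(Duration.ofMillis(remainingMillis));
  }
}
```

For example, a job that has copied 250 of 1000 bytes in one minute would be
estimated to need three more minutes under this assumption.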