gargvishesh commented on code in PR #15180:
URL: https://github.com/apache/druid/pull/15180#discussion_r1367324097
##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/MSQWorkerTaskLauncher.java:
##########
@@ -348,6 +351,68 @@ public boolean isTaskLatest(String taskId)
}
}
+ public static class WorkerStats
+ {
+ String workerId;
+ TaskState state;
+ long duration;
+
+ /**
+ * For JSON deserialization only
+ */
+ public WorkerStats()
+ {
+ }
+
+ public WorkerStats(String workerId, TaskState state, long duration)
+ {
+ this.workerId = workerId;
+ this.state = state;
+ this.duration = duration;
+ }
+
+ @JsonProperty()
+ public String getWorkerId()
+ {
+ return workerId;
+ }
+
+ @JsonProperty()
+ public TaskState getState()
+ {
+ return state;
+ }
+
+ @JsonProperty("durationMs")
+ public long getDuration()
+ {
+ return duration;
+ }
+ }
+
+ public Map<Integer, List<WorkerStats>> getWorkerStats()
+ {
+ final Map<Integer, List<WorkerStats>> workerStats = new TreeMap<>();
+
+ for (Map.Entry<String, TaskTracker> taskEntry : taskTrackers.entrySet()) {
+
+ TaskTracker taskTracker = taskEntry.getValue();
+
+ long duration = (taskTracker.status.getDuration() == -1
Review Comment:
Did the check both from a run and in code and found it to be correct: the
start time in MSQ is recorded when the task is submitted whereas in overlord is
upon start of the run.
Currently `TaskStatus` doesn't have any field to record a startTime, so for
MSQ to get a worker's startTime from the Overlord, this field needs to be added
and [persisted in the
database](https://github.com/apache/druid/blob/fffb2e4fe72c9c8a0e59d4a191a1de8f6763df88/server/src/main/java/org/apache/druid/metadata/SQLMetadataStorageActionHandler.java#L246)
upon the worker task's start.
There is no such issue with reporting of query's duration periodically in
live reports since it's a single hop from (from overlord to indexer) and the
query's start time is maintained inside the controller -- so the duration can
be calculated
[on-the-fly](https://github.com/apache/druid/blob/50a0eb7251f9c0943c810f2fd57f88327d75cec2/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java#L922).
Since the worker's metrics are currently targeted more for the billing
usecase which will have only finished workers, I think for now it is fine to
just report the duration as `-1` for worker tasks instead of adding a new field
in `TaskStatus` which is used at multiple places. But do let me know if you
think we should take the route of adding a new field.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]