gianm commented on code in PR #15174:
URL: https://github.com/apache/druid/pull/15174#discussion_r1372797681
##########
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java:
##########
@@ -2113,7 +2113,13 @@ public Boolean
apply(Pair<SeekableStreamIndexTaskRunner.Status, Map<PartitionIdT
for (int i = 0; i < results.size(); i++) {
String taskId = futureTaskIds.get(i);
if (results.get(i).isError() || results.get(i).valueOrThrow() == null) {
- killTask(taskId, "Task [%s] failed to return status, killing task",
taskId);
+ log.info("Task [%s] failed to respond, might need to kill it.",
taskId);
+ Optional<TaskStatus> killTaskStatus = taskStorage.getStatus(taskId);
+ if (killTaskStatus.isPresent() && !killTaskStatus.get().isComplete()) {
+ // If the task completed while we were failing to chat with it,
don't do anything.
+ log.info("Task [%s] failed to return status, killing task", taskId);
+ killTask(taskId, "Task [%s] failed to return status, killing task",
taskId);
Review Comment:
Seems like it would be better to fix things by making `TaskQueue#shutdown`
(which `killTask` calls) safe to call if the task has already completed. Then
this block can revert to being a simple `killTask` call, and all other callers
of `TaskQueue#shutdown` will also benefit.
So, what goes wrong when `TaskQueue#shutdown` is called on a task that is
already complete? Is that fixable?
##########
server/src/main/java/org/apache/druid/segment/realtime/firehose/ChatHandlerResource.java:
##########
@@ -73,6 +74,6 @@ public Object doTaskChat(@PathParam("id") String handlerId,
@Context HttpHeaders
return handler.get();
}
- throw new BadRequestException(StringUtils.format("Can't find chatHandler
for handler[%s]", handlerId));
+ throw new ServiceUnavailableException(StringUtils.format("Can't find
chatHandler for handler[%s]", handlerId));
Review Comment:
Deserves a comment: something like "Return HTTP 503 so
SpecificTaskRetryPolicy retries in case the handler is not registered yet."
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]