georgew5656 commented on code in PR #14909:
URL: https://github.com/apache/druid/pull/14909#discussion_r1324661379


##########
extensions-contrib/kubernetes-overlord-extensions/src/main/java/org/apache/druid/k8s/overlord/KubernetesTaskRunner.java:
##########
@@ -309,23 +309,23 @@ public Optional<InputStream> streamTaskReports(String 
taskid) throws IOException
   @Override
   public List<Pair<Task, ListenableFuture<TaskStatus>>> restore()
   {
-    List<Pair<Task, ListenableFuture<TaskStatus>>> restoredTasks = new 
ArrayList<>();
+    return ImmutableList.of();
+  }
+
+  @Override
+  @LifecycleStart
+  public void start()
+  {
     for (Job job : client.getPeonJobs()) {
       try {
-        Task task = adapter.toTask(job);
-        restoredTasks.add(Pair.of(task, joinAsync(task)));
+        joinAsync(adapter.toTask(job));

Review Comment:
   @gianm
   
   I tried to add some code in the start logic of KubernetesTaskRunner to check 
the tasks that have completed before the overlord came up and wait on their 
futures to complete, but this didn't actually solve the problem.
   
   I think this is because the callbacks that are responsible for listening to 
the taskRunner future and updating the task's status in TaskStorage are added 
by the TaskQueue manageInternalCritical block, which doesn't get run during 
LIfeycleStart.
   
   I think this may actually be a general race condition-y thing with 
supervisors that mm-less ingestion may be running into more often. Will need 
more time to figure out what's happening but for now I think this PR can be 
merged because the logic of list all jobs in k8s -> add them to the tasks map 
definitely belongs in the start method and not the restore method. (the restore 
method appears to be called in every manage() run in TaskQueue).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to