churromorales commented on PR #13156:
URL: https://github.com/apache/druid/pull/13156#issuecomment-1278289541
I noticed a slight issue when testing on heavily loaded k8s clusters, i.e. clusters that take a while to spin pods up and down.
Currently the way things work:
1. The overlord launches a task and waits for the pod to come up. It not
only waits for a pod IP to be available, but also opens a socket connection to
the peon process's web server. Once the socket connection is established, the
task is considered started.
2. It then monitors the job until completion, which is a blocking call.
After the job completes, it pushes task reports and logs, and then deletes the
peon k8s job.
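To make the ordering concrete, here is a minimal, self-contained Java sketch of that flow. The class and method names are made up for illustration (they are not the actual Druid classes); the point is that the status only updates in the finally block, after all of the k8s teardown:

```java
// Illustrative sketch of the current flow: the status update (and therefore
// the interval-lock release) only happens in the finally block, after the
// report push and job deletion, which are slow on a loaded cluster.
class CurrentLifecycleSketch {
    static final StringBuilder log = new StringBuilder();

    static void waitForPodIpAndSocket() { log.append("started;"); }     // task considered started
    static void monitorJobBlocking()    { log.append("task-done;"); }   // blocks until the peon exits
    static void pushReportsAndLogs()    { log.append("reports;"); }     // slow k8s operation
    static void deletePeonJob()         { log.append("job-deleted;"); } // slow k8s operation

    static void runTask() {
        try {
            waitForPodIpAndSocket();
            monitorJobBlocking();
            pushReportsAndLogs();
            deletePeonJob();
        } finally {
            // only here does the task status update, releasing the interval lock
            log.append("lock-released");
        }
    }

    public static void main(String[] args) {
        runTask();
        System.out.println(log); // lock release trails the entire k8s teardown
    }
}
```

Even when the peon finishes in under a minute, "lock-released" can't happen until the slow k8s calls return.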
What was problematic here is that the process was complete and the task itself
was finished, but the k8s operations were slow on my cluster. The task itself
took less than a minute to complete, but the entire k8s lifecycle took much
longer, around 10 minutes (a very heavily loaded cluster). Because the task
status only updates in the finally block, the interval lock is not released
until the status updates. I have another patch I will test on our clusters; a
brief description of it:
1. The overlord launches a task and waits for the pod to come up. There is
no more opening a socket to the webserver. Instead, I added 2 TaskActions: one
to update the `TaskStatus` and one to update the `TaskLocation`.
2. Now, in `AbstractTask`, the setup method will update the task's own
location before the `run` method is called. If the overlord goes down for
whatever reason, that is fine; we don't really lose anything, the call fails
and the task itself dies.
3. In the cleanup method of `AbstractTask` we also update the
`TaskLocation` to `unknown` and update the status to `success` or `failure`,
so when the process exits we can give up the lock and things won't be blocked.
If the pod goes away during cleanup, that is okay; the overlord will still
monitor the job and report the correct results, albeit a bit slower.
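The patched flow can be sketched the same way. Again, the names below are hypothetical stand-ins (not the real TaskAction or `AbstractTask` code); what matters is that the peon reports its own location in setup and its own status in cleanup, so the lock can release before the overlord's slow k8s teardown:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the patched flow: the peon side reports its own
// location and final status via TaskAction-style calls, so the interval lock
// is released before the overlord's slow teardown (log push, job deletion).
class PatchedLifecycleSketch {
    static final List<String> events = new ArrayList<>();

    // Stand-ins for the two new TaskActions.
    static void updateTaskLocation(String loc) { events.add("location=" + loc); }
    static void updateTaskStatus(String st)    { events.add("status=" + st); }

    // Simplified AbstractTask-style template: setup -> run -> cleanup.
    static void runPeonSide() {
        updateTaskLocation("pod-ip:8100"); // setup(): report our own location
        events.add("run");                 // run(): the actual work
        updateTaskLocation("unknown");     // cleanup(): clear the location...
        updateTaskStatus("SUCCESS");       // ...and report status; lock can release now
    }

    // The overlord still monitors the job, but only for the slow teardown.
    static void overlordTeardown() {
        events.add("push-logs");
        events.add("delete-k8s-job");
    }

    public static void main(String[] args) {
        runPeonSide();
        overlordTeardown();
        System.out.println(String.join(";", events));
    }
}
```

The status update now precedes the k8s job deletion instead of trailing it, which is exactly the lock-holding fix described above.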
I thought about this approach originally but didn't want to make too many
changes to `core`. After running this patch on our clusters, though, I do
think this is the best approach. I will call it out in a separate commit so
you can review this work. It doesn't seem too invasive to me, but we should
focus on correctness here and not hold the locks longer than necessary, as we
were doing in the forking task runner.
I will test this out over the next day or so and then add the commit to this
PR.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]