AdheipSingh commented on issue #8801:
URL: https://github.com/apache/druid/issues/8801#issuecomment-664020630


   Talking only from a Kubernetes and druid-operator perspective, in order to solve the problem of scaling MiddleManagers (MMs) dynamically.
   
   ### Scenario: 
   - Scale up MMs when a task is in the pending state; scale down an MM when there is no task associated with it.
   
   ### Problems:
   - When the druid operator scales down an MM, we need to make sure we don't remove an MM pod that is currently running a task. How can we protect an MM during scale-down?
   - Using StatefulSets for MMs here is another blocker. Say I have two MMs running as StatefulSet pods, MM-0 and MM-1, and a scenario where MM-0 is not running any task while MM-1 is. If I scale down, Kubernetes will delete MM-1, since the StatefulSet controller always removes the pod with the highest ordinal, i.e. the most recently created MM.
   - With this change we can support Deployments for MMs: https://github.com/druid-io/druid-operator/pull/52 .
   - Another blocker is that we cannot have a single Service encapsulating all MM pods; we need a separate Service for each MM, so that every MM pod is individually addressable. That way, if I hit the ```/druid/worker/v1/tasks``` endpoint for MM-0 and MM-1, I get an exact count for each; with a single Service encapsulating both, requests would be load-balanced between the two pods and return the count for only one of them. So the operator spec for MMs should not have ```replicas``` but a ```count```: if I specify a count of 2, the operator shall deploy 2 Deployments of MM with 2 separate Services.
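   As a rough sketch (not operator code) of why per-MM Services give exact counts: one HTTP call per MM against the ```/druid/worker/v1/tasks``` endpoint quoted above. The Service DNS names below are hypothetical.

```python
import json
from urllib.request import urlopen

# Hypothetical per-MM Service DNS names: the operator creates one Service
# per MM Deployment, so each MM pod is individually addressable instead of
# being load-balanced behind a single shared Service.
MM_SERVICES = {
    "mm-0": "http://druid-mm-0.druid.svc:8091",
    "mm-1": "http://druid-mm-1.druid.svc:8091",
}

def parse_task_count(body: str) -> int:
    """/druid/worker/v1/tasks returns a JSON array of the tasks assigned
    to that one MM; the running-task count is just the array length."""
    return len(json.loads(body))

def task_counts(services=MM_SERVICES) -> dict:
    """Exact per-MM running-task counts: one call per dedicated Service."""
    counts = {}
    for name, base in services.items():
        with urlopen(f"{base}/druid/worker/v1/tasks") as resp:
            counts[name] = parse_task_count(resp.read().decode())
    return counts
```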
   
   - On each reconcile, the operator can hit the ```/druid/worker/v1/tasks``` endpoint for each **MM** and scale down that particular MM Deployment (meaning scale its ReplicaSet to 0, not delete the MM) when it reports no tasks. Scale-up can be driven by the Overlord's ```/druid/indexer/v1/pendingTasks``` endpoint to increase the count of MMs.
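   A minimal sketch of that reconcile decision, assuming the per-MM counts and the Overlord's pending-task count have already been fetched; the function and parameter names here are my own, not the operator's API.

```python
def reconcile(task_counts: dict, pending_tasks: int, current: dict) -> dict:
    """Decide desired replicas (0 or 1) per MM Deployment.

    task_counts: MM name -> running tasks, from /druid/worker/v1/tasks.
    pending_tasks: length of the Overlord's pending-tasks response.
    current: MM name -> current replicas. Scaling a Deployment to 0
    parks the MM (its Deployment, Service, and PVC survive) instead of
    deleting it outright.
    """
    desired = dict(current)
    for mm, tasks in task_counts.items():
        if tasks == 0 and pending_tasks == 0:
            desired[mm] = 0          # idle and nothing queued: park it
    if pending_tasks > 0:
        for mm, replicas in desired.items():
            if replicas == 0:
                desired[mm] = 1      # wake a parked MM for pending work
                pending_tasks -= 1
                if pending_tasks == 0:
                    break
    return desired
```

   The key property is that an MM with running tasks is never scaled to 0, which answers the "how do we protect an MM during scale-down" question above.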
   
   - On volume management, addressed in the comment above: if your StorageClass has ```reclaimPolicy: Retain```, the underlying PersistentVolume will not be deleted when its claim is released. As per my understanding it is fine to have ```n``` MMs associated with a common PVC volumeMount. In the current state of the operator, I guess Deployments need to be enhanced to support PVCs.
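   For reference, the setting being described is ```reclaimPolicy: Retain``` on the StorageClass; the name and provisioner in this sketch are hypothetical.

```yaml
# Sketch: a StorageClass whose PersistentVolumes are retained rather than
# deleted when their claims are released. Name/provisioner are examples.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: druid-mm-storage
provisioner: kubernetes.io/gce-pd   # example provisioner
reclaimPolicy: Retain               # PV survives PVC deletion
```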
   
   These problems have come up with Kafka operators too, and some of those operators have addressed these blockers when using StatefulSets on k8s.
   
   HPA can't solve this problem even with external metrics; I have tried those approaches. HPA could only work if it supported scaling up on one set of metrics and scaling down on a different set.
   
   Again, just to be clear, I am assuming nothing changes in Druid :) I am commenting from a k8s-controllers perspective.
   
   @himanshug @nishantmonu51 would like to know your perspective on this approach.
   

