[GitHub] [airflow] potiuk edited a comment on issue #17191: Healthcheck endpoint for workers

GitBox Mon, 02 Aug 2021 03:10:17 -0700


potiuk edited a comment on issue #17191:
URL: https://github.com/apache/airflow/issues/17191#issuecomment-890898149



   > @potiuk Only in our case we do not have an ideal situation that is in line 
with the philosophy of K8s. In our case, one component performs two roles: 
webserver and task-handling service. For a webserver, it is more common to use 
an HTTP-based probe, but for the second type of service (which does not provide 
an HTTP endpoint), it is more natural to use an exec probe. Kubernetes also does 
not allow us to define two liveness probes for one container, so we have to 
decide which service we want to monitor directly and which we will monitor only 
as a child process of the main process.
   
   But since our workers already have a "log HTTP webserver" built in, and 
there is one such webserver per worker which is involved in running the worker 
itself, what is the problem with each worker providing a health check over HTTP?
   
   I do not see any reason why we could not use it. There are many services 
that provide a health check over HTTP, even if their primary task is doing 
something else. Such a health-check endpoint could internally perform more 
complex checks than just responding with success: for example, it could 
communicate with the Celery master process and query it for status. I think 
this would make the worker a nice self-contained service with its own dedicated 
health-check endpoint.
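   
   To make the idea concrete, here is a minimal sketch of such an endpoint 
using only the Python standard library. It is not how the Airflow worker or its 
log server is implemented. The names `make_health_handler`, 
`start_health_server`, `probe` and the `/health` path are hypothetical, and the 
`check` callable is a placeholder for whatever a real worker would do to query 
the Celery master process.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable


def make_health_handler(check: Callable[[], bool]):
    """Build a handler whose /health route reports the result of `check`.

    `check` is a hypothetical hook: a real worker might use it to ask the
    Celery master process for its status.
    """

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/health":
                self.send_error(404)
                return
            try:
                healthy = bool(check())
            except Exception:
                # A check that raises is reported as unhealthy.
                healthy = False
            body = json.dumps({"healthy": healthy}).encode()
            # 200 lets an HTTP probe pass, 503 makes it fail.
            self.send_response(200 if healthy else 503)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            # Keep periodic probe requests out of stderr.
            pass

    return HealthHandler


def start_health_server(check, host="127.0.0.1", port=0):
    """Serve the health endpoint in a daemon thread (port 0 = any free port)."""
    server = HTTPServer((host, port), make_health_handler(check))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(server):
    """Call /health the way an HTTP probe would and return the status code."""
    host, port = server.server_address[:2]
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/health")
        return conn.getresponse().status
    finally:
        conn.close()
```

   With something like this in place, an HTTP-based liveness probe pointed at 
the worker would pass or fail according to the result of the check, rather than 
only according to whether the HTTP server is up.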
   
   I think the basic premise of our Celery architecture is that we can simply 
start up as many of those self-contained workers as we want, and we can manage 
the number of those workers in a way that is completely independent of other 
Airflow components, delegating deployment, scaling etc. to external deployment.
   
   This is very similar to what we do with schedulers and webservers: we can 
manage the number of schedulers and webservers independently, and scaling works 
by adding "just another scheduler" or "just another webserver". It is the same 
with Celery: we should be able to add "just another worker". I think it also 
fits the K8s philosophy very well, where you can define Deployments for the 
different components of an application and put them together like "lego 
blocks", in such a way that the components do not have to know about each other 
(specifically, how many instances of those other components we have) and act 
independently.
   
   Or maybe we are talking about a different thing altogether? Maybe there is 
something I do not understand? I think we do not need to move the webserver to 
a different container or anything like that, so maybe we have a 
misunderstanding here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
