potiuk commented on issue #14198:
URL: https://github.com/apache/airflow/issues/14198#issuecomment-779233679


   First of all, there are very few cases where the worker machine should 
have a different image/env. You need pretty much the same configuration on 
all types of nodes (webserver, scheduler, worker), otherwise you risk 
strange problems.
   
   But even if you do, the whole complexity of this feature is that you need 
exactly the worker information, not the scheduler one. So in the webserver 
you should actually display the configuration of all the different worker 
types configured in the system. You can have different workers with 
different capabilities: there can be different "queues" configured, and some 
workers might have different libraries, for example to access 
GPU-accelerated nodes. That's why a DAG is a simple solution that will work 
as expected. A plugin will also work, but its implementation will be far 
from trivial and will require some querying of the Celery infrastructure to 
work well. The plugin will also be executed in the webserver context, which 
might be a bit different from either the scheduler or the workers.
   
   In general, you are not really interested in what is on the 
scheduler/webserver. I believe only the workers' configuration is of any use 
to you as a user. The worker is the most important one to know about because 
it might have additional configuration (libraries used to connect to 
external systems, authentication configuration, GPUs, etc.) which the other 
parts will not have. And this is where the "execute" of your DAGs' tasks 
actually runs.
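
   As a rough illustration of what a worker-side report could gather, here is 
a minimal sketch that uses only the Python standard library. The function 
name `collect_worker_environment` and the `nvidia-smi` check are assumptions 
made for this example, not something Airflow provides.

```python
import platform
import shutil
import socket
import sys
from importlib import metadata


def collect_worker_environment():
    """Return a small report about the machine this function runs on."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that carry no name
            packages[name] = dist.version
    return {
        "hostname": socket.gethostname(),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        # Finding the NVIDIA CLI is only a rough hint that GPUs are usable.
        "has_nvidia_smi": shutil.which("nvidia-smi") is not None,
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    report = collect_worker_environment()
    print(report["hostname"], report["python"], len(report["packages"]), "packages")
```

   Called from a task, this reports on whichever worker picked the task up, 
which is exactly the information the other components cannot give you.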
   
   User code is practically never executed on the scheduler (except when 
parsing the DAG structure) and never on the webserver (with DAG 
serialization), so there is really not much interest in what is installed on 
the scheduler and webserver. All you really need to know about are the 
different worker types. That's why the DAG solution will always work and you 
can make it work for your deployment. The plugin solution that @ashb 
mentioned is possible, but implementing it in a generic way that works for 
various deployments (Celery, Kubernetes, Local Executor) might be tricky.
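
   The DAG approach could look roughly like the sketch below: one reporting 
task per worker type, each routed with the operator's `queue` argument. This 
assumes Airflow 2.x with an executor that honours queues (for example 
Celery). The DAG id, the helper names `describe_worker` and `build_dag`, and 
the queue names are hypothetical.

```python
import socket
import sys


def describe_worker(queue_name):
    """Runs on whichever worker picks the task up; returns a short report."""
    report = {
        "queue": queue_name,
        "hostname": socket.gethostname(),
        "python": sys.version.split()[0],
    }
    print(report)  # ends up in the task log
    return report


def build_dag(queues):
    """Create one reporting task per queue, i.e. per worker type."""
    # Imported here so describe_worker can be tried without Airflow installed.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    dag = DAG(
        dag_id="worker_environment_report",  # hypothetical name
        start_date=datetime(2021, 1, 1),
        schedule_interval=None,  # manual trigger; newer releases use `schedule`
        catchup=False,
    )
    for queue_name in queues:
        PythonOperator(
            task_id=f"report_{queue_name}",
            python_callable=describe_worker,
            op_kwargs={"queue_name": queue_name},
            queue=queue_name,  # route to workers listening on this queue
            dag=dag,
        )
    return dag
```

   In a real DAG file you would assign the result at module level, for 
example `dag = build_dag(["default", "gpu"])`, with queue names that match 
the queues your workers actually listen on.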


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
