potiuk edited a comment on issue #14198: URL: https://github.com/apache/airflow/issues/14198#issuecomment-779233679
First of all, there are very few cases where the worker machines should have a different image/env. You need pretty much the same configuration on all node types (webserver, scheduler, worker), otherwise you risk strange problems (for basic Python packages). But even if you do have differences, the whole complexity of this feature is that you need exactly the worker information, not the scheduler's. So the webserver should actually display the configuration of all the different worker types that you have configured in the system. You can have workers with different capabilities: there can be different "queues" configured, and some workers might have different libraries, for example to access GPU-accelerated nodes. That's why a DAG is a simple solution that will work as expected. A plugin will also work, but its implementation will be far from trivial and requires some querying of the Celery / Kubernetes infrastructure to work well. The plugin will also be executed in the webserver context, which might itself be a bit different from either the scheduler or the workers.

You are not really interested in what is on the scheduler/webserver in general. I believe only the workers' configuration should be of any use to you as a user. The worker is the most important one you need to know about because it might have additional configuration (libraries used to connect to external systems, authentication configuration, GPUs, etc.) which the other components will not have. And this is where the "execute" of your DAGs actually runs. User code is practically never executed on the scheduler (except for parsing the DAG structure) and never on the webserver (with DAG serialization), so there is pretty much no interest in what is installed for the scheduler and webserver. All you really need to know about are the different worker types. That's why the DAG solution will always work, and you can make it work for your deployment.
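To illustrate the DAG approach described above, here is a minimal sketch, not a definitive implementation. It assumes Airflow 2.x (the `schedule_interval` argument and the `airflow.operators.python` import path) and a Celery-style deployment where the `queue` operator argument routes a task to a worker type. The queue names in `WORKER_QUEUES`, the DAG id, and the helper function names are hypothetical and chosen only for illustration.

```python
import json
import platform
import socket
from datetime import datetime
from importlib import metadata

# Hypothetical queue names; replace with the queues configured in your deployment.
WORKER_QUEUES = ["default", "gpu"]


def collect_worker_environment():
    """Describe the environment of the process that runs this function."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken distributions that have no name
            packages[name] = dist.version
    return {
        "hostname": socket.gethostname(),
        "python_version": platform.python_version(),
        "packages": dict(sorted(packages.items())),
    }


def report_worker_environment():
    """Task callable: print the environment so it shows up in the task log."""
    print(json.dumps(collect_worker_environment(), indent=2))


try:
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:
    # Airflow is not installed here; the helper functions above still work.
    DAG = None

if DAG is not None:
    with DAG(
        dag_id="report_worker_environment",  # hypothetical DAG id
        start_date=datetime(2021, 1, 1),
        schedule_interval=None,  # assumption: Airflow 2.x, triggered manually
        catchup=False,
    ) as dag:
        # One task per queue, so each worker type reports its own environment.
        for queue_name in WORKER_QUEUES:
            PythonOperator(
                task_id=f"report_{queue_name}",
                python_callable=report_worker_environment,
                queue=queue_name,
            )
```

Triggering this DAG manually and reading each task's log would show the packages installed on whichever worker picked up that queue. Deployments that do not route by `queue` would need their own way of targeting each worker type.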
The Plugin solution that @ashb mentioned is possible, but implementing it in a generic way that works for various deployments (Celery, Kubernetes, Local Executor) might be tricky.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
