Hello Everyone,

A small follow-up to the thread about moving the K8S/Celery executors:
https://lists.apache.org/thread/7gyw7ty9vm0pokjxq7y3b1zw6mrlxfm8

We are in the process of moving the Celery and Kubernetes executors (Celery
is almost complete, and I am working on K8S next, plus moving some common
discovery and config).

But there is one more "questionable" executor, the Dask executor, still
living in Airflow core.

When it comes to Celery/Kubernetes, we decided to make the two providers
preinstalled, because that makes the most sense. We are also going to keep
their basic documentation in the "core" Airflow documentation, so that it is
easier to discover and prominently visible, and also because of
vendor-neutrality.

However, when it comes to Dask, I am not sure about its status and whether
we should make it preinstalled.

I guess there is no doubt that we should move it to a provider: this has
only benefits, the same as the Celery/K8S move. But whether it should be
preinstalled with Airflow, I am not sure. I do not know how frequently the
Dask executor (and Dask) is used by people using Airflow, but I personally
do not think it should be as "closely" connected with Airflow as the
Celery/Kubernetes ones.

If we do not make it preinstalled, it is a somewhat (but not too much,
really) breaking change. We still might choose to install the dask provider
in the PROD reference image, so it will continue to work if you use the
image. When you are installing Airflow in a venv, you will only have to
specify `pip install apache-airflow[dask]` or manually install
`apache-airflow-providers-daskexecutor` (for now at least, this is the name
I could reserve in PyPI). So this is not really breaking, it just requires
another dependency to be installed. But some pipelines that install Airflow
might break because the provider won't be preinstalled, so this is
borderline breaking.
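
For pipelines that would rather fail early than at executor load time, a
check along these lines might help. This is only a sketch: the helper name
`provider_installed` is hypothetical, and the distribution name is the one
reserved on PyPI as mentioned above, which may still change.

```python
from importlib import metadata

# Assumption: this is the name reserved on PyPI for now; it may change.
DASK_PROVIDER = "apache-airflow-providers-daskexecutor"


def provider_installed(dist_name: str) -> bool:
    """Return True if a distribution with this name is installed."""
    try:
        metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return False
    return True


if __name__ == "__main__":
    if not provider_installed(DASK_PROVIDER):
        # Point the user at the extra mentioned above.
        print("Dask provider missing: pip install 'apache-airflow[dask]'")
```

It only checks the installed distribution metadata and does not import
Airflow, so it can run as an early step of an installation pipeline.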

WDYT? Should we make the dask executor pre-installed or not?

J.
