The last two surveys asked "What executor type do you use?" Dask was included in the "Other" choice and, as expected, few users use it. While we cannot really rely on this survey, I think it does give some information about usage.
Do we really want to maintain core functionality for such a small number of users? What is the value in it?
And also, can we remove it in a feature release? I'm not 100% sure about that.

On Tue, Mar 8, 2022 at 6:09 PM Jarek Potiuk <[email protected]> wrote:

> FYI. Thanks to Kanthi, the Dask executor is back (with all tests):
> https://github.com/apache/airflow/pull/22027
>
> On Sat, Mar 5, 2022 at 10:03 PM Jarek Potiuk <[email protected]> wrote:
>
>> FYI. I asked the question at Dask's discourse:
>> https://dask.discourse.group/t/potential-removal-of-dask-executor-support-in-airflow/433
>>
>> But I personally think we can take our "tactical" approach of
>> merging the "disabling" of the Dask tests via
>> https://github.com/apache/airflow/pull/22017 - it should not hold us
>> back, I think.
>>
>> On Sat, Mar 5, 2022 at 9:42 PM Jarek Potiuk <[email protected]> wrote:
>>
>>> Hello everyone,
>>>
>>> This is the second time [1] I am raising the question on the devlist
>>> (last time the Dask team helped and I am going to reach out to them as
>>> well).
>>>
>>> We have quite a problem with DaskExecutor in Airflow.
>>>
>>> Previously when I raised it, all tests in Dask Executor had been marked
>>> as "skipped" and I asked whether to remove the Dask Executor altogether.
>>> The Dask team responded and helped to enable the tests, however since then
>>> there has been no activity in this area. We have this code in our "dask"
>>> extra - and it limits us. For example - we cannot merge the new looker
>>> library from Google and (what's even more important) we cannot update
>>> Airflow to Python 3.10 and MacOS ARM (due to a cloudpickle limitation that
>>> prevents us from upgrading apache-beam and numpy).
>>>
>>> Unfortunately, the Dask Executor is part of the "core" of Airflow, not a
>>> provider. So we cannot really treat it as an "optional" provider.
>>>
>>> Because of that, we are using a very old cloudpickle version and Dask's
>>> distributed library.
>>>
>>>     # Dask support is limited, we need Dask team to upgrade support for dask if we were to continue
>>>     # Supporting it in the future
>>>     # TODO: upgrade libraries used or maybe deprecate and drop DASK support
>>>     'cloudpickle>=1.4.1, <1.5.0',
>>>     'dask>=2.9.0, <2021.6.1',  # dask 2021.6.1 does not work with `distributed`
>>>     'distributed>=2.11.1, <2.20',
>>>
>>> I tried to fix the tests, but there are many changes in the Dask
>>> `distributed` library - including removal of parts of the test harness that
>>> are used by some tests.
>>>
>>> My proposal (and I also created a PR
>>> https://github.com/apache/airflow/pull/22017 for that):
>>>
>>> * remove the limitations from Dask libraries
>>> * "skip" all the tests of Dask until they are fixed
>>> * ask the Dask team to help with fixing those before we release 2.3.0 -
>>>   if they won't fix them we will drop support for dask executor (or at least
>>>   we will not run tests for it and mark it as "untested")
>>> * in the latter case we might actually bring back the dependencies that
>>>   "worked" for "dask" extra in Airflow 2.3.0 - they will not be tested in our
>>>   unit tests but if someone installs "dask" extra it will work (but this will
>>>   also mean that some older providers will need to be installed - because
>>>   they will conflict with dask extra)
>>>
>>> Another possibility might be to simply remove Dask support altogether or
>>> move it to a new provider.
>>>
>>> Let me know what you think. This one pretty much blocks the release of
>>> new providers (we are almost ready to add Looker) but more importantly it
>>> blocks the effort of supporting Python 3.10 and ARM M1.
>>>
>>> I hope we can quickly make a tactical decision to merge the PR and work
>>> with the Dask team on the next steps and make the final decision later.
>>>
>>> J.
>>>
>>> [1] https://lists.apache.org/thread/875fpgb7vfpmtxrmt19jmo8d3p6mgqnh
