RNHTTR commented on issue #14163: URL: https://github.com/apache/airflow/issues/14163#issuecomment-794914976
I think it _probably_ would be sufficient to just cast the map object to a list like above, but I'm not sure how that would affect the performance of `_get_many_from_kv_backend` or `_get_many_from_db_backend` which are called in lieu of `_get_many_using_multiprocessing` if app.backend is an instance of DatabaseBackend or BaseKeyValueStoreBackend, respectively (see [get_many](https://github.com/apache/airflow/blob/90ab60bba877c65cb93871b97db13a179820d28b/airflow/executors/celery_executor.py#L546-L556)). In these methods, the map is converted to a set via [_tasks_list_to_task_ids](https://github.com/apache/airflow/blob/90ab60bba877c65cb93871b97db13a179820d28b/airflow/executors/celery_executor.py#L529-L530), so it doesn't really make sense to convert the map to a list only for it to get converted to a set immediately after. So, I think it's best to convert the map to a list or set in the method [_get_many_using_multiprocessing](https://github.com/apache/airflow/blob/90ab60bba877c65cb93871b97db13a179820d28b/airflow/executors/celery_executor.py#L594-L595) before it attempts to get the length of the map, and thus would instead get the length of a list or set. I think this bug snuck by the [test case](https://github.com/apache/airflow/blob/master/tests/executors/test_celery_executor.py#L353-L377) because the test assumes app.backend is set to `DatabaseBackend`, so `_get_many_using_multiprocessing` is never called in the test case. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
