potiuk commented on PR #35473: URL: https://github.com/apache/airflow/pull/35473#issuecomment-1798360252
FYI: @bolkedebruin - so far so good. I have not seen any "self-hosted" test run fail on this flaky test since yesterday (and I am looking quite regularly). Looks like my hypothesis about parallel jobs contending for some resources (I/O most likely) was right (and this test is particularly vulnerable).

Usually this test takes 8-19 seconds: https://github.com/apache/airflow/actions/runs/6783603921/job/18438411043?pr=35492#step:6:2660

```
9.97s call tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutorQueue::test_dask_queues_no_queue_specified
8.46s call tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutorQueue::test_dask_queues
```

But sometimes longer: https://github.com/apache/airflow/actions/runs/6783603921/job/18438410767?pr=35492#step:6:2939

```
14.14s call tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutor::test_dask_executor_functions
12.40s call tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutorQueue::test_dask_queues
10.83s call tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutorQueue::test_dask_queues_no_queue_specified
```

And there are cases where they take way longer: https://github.com/apache/airflow/actions/runs/6783603921/job/18438412609?pr=35492#step:6:3399

```
23.37s call tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutorQueue::test_dask_queues
22.87s call tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutorQueue::test_dask_queues_no_queue_specified
22.14s call tests/providers/daskexecutor/test_dask_executor.py::TestDaskExecutor::test_dask_executor_functions
```

So I guess increasing the timeout in this case was the right call to decrease the probability of flakiness. Of course a better solution would be to make the test less "fragile" - but this is an exercise for someone who understands the Dask integration better and can spend time assessing whether the test can be improved.
Or maybe follow the Dask provider removal, which would possibly be the ultimate improvement in stability.
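
In case someone wants to compare these timings across more runs, here is a rough sketch (not part of the PR) that parses the "slowest durations" lines quoted above and reports the fastest and slowest time seen per test. The helper names `parse_durations` and `spread` are made up for illustration, and the regex assumes the lines look exactly like the ones pasted in this comment.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Matches one line of the durations report quoted above, e.g.
# "9.97s call tests/...::test_dask_queues" (format assumed from this comment).
DURATION_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)s\s+(setup|call|teardown)\s+(\S+)\s*$")


def parse_durations(report: str) -> dict[str, float]:
    """Return {test_id: seconds} for the "call" phase lines of one report."""
    durations: dict[str, float] = {}
    for line in report.splitlines():
        match = DURATION_RE.match(line)
        if match and match.group(2) == "call":
            durations[match.group(3)] = float(match.group(1))
    return durations


def spread(reports: list[str]) -> dict[str, tuple[float, float]]:
    """Return {test_id: (fastest, slowest)} across several reports."""
    seen: dict[str, list[float]] = defaultdict(list)
    for report in reports:
        for test_id, seconds in parse_durations(report).items():
            seen[test_id].append(seconds)
    return {test_id: (min(times), max(times)) for test_id, times in seen.items()}


PREFIX = "tests/providers/daskexecutor/test_dask_executor.py::"

# The three runs quoted in this comment.
RUNS = [
    f"9.97s call {PREFIX}TestDaskExecutorQueue::test_dask_queues_no_queue_specified\n"
    f"8.46s call {PREFIX}TestDaskExecutorQueue::test_dask_queues\n",
    f"14.14s call {PREFIX}TestDaskExecutor::test_dask_executor_functions\n"
    f"12.40s call {PREFIX}TestDaskExecutorQueue::test_dask_queues\n"
    f"10.83s call {PREFIX}TestDaskExecutorQueue::test_dask_queues_no_queue_specified\n",
    f"23.37s call {PREFIX}TestDaskExecutorQueue::test_dask_queues\n"
    f"22.87s call {PREFIX}TestDaskExecutorQueue::test_dask_queues_no_queue_specified\n"
    f"22.14s call {PREFIX}TestDaskExecutor::test_dask_executor_functions\n",
]

if __name__ == "__main__":
    for test_id, (fastest, slowest) in sorted(spread(RUNS).items()):
        print(f"{fastest:6.2f}s .. {slowest:6.2f}s  {test_id}")
```

For the three runs above this gives 8.46s to 23.37s for `test_dask_queues`, i.e. the same test can be almost 3x slower depending on what else is running on the machine.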
