BasPH edited a comment on issue #5159: [AIRFLOW-XXX] Attempt to remove flakeyness of LocalExecutor URL: https://github.com/apache/airflow/pull/5159#issuecomment-485977458 I have very limited time this week but tried digging into it. At least I managed to reproduce locally: I set an optional breakpoint before `self.assertEqual(len(executor.running), 0)`: ```python if len(executor.running) != 0: import ipdb ipdb.set_trace() self.assertEqual(len(executor.running), 0) ``` Let this run a few times and after just a few attempts I hit the breakpoint: ```bash for n in {1..100}; do nosetests tests/executors/test_local_executor.py:LocalExecutorTest.test_execution_limited_parallelism -s; done ``` For the record, at this point: - `len(executor.running)` is 1 at this point in time and `executor.running` contains `{'fail': True}` - `executor.queue.empty()` returns `True` - `executor.result_queue.get()` returns `('fail', 'failed')` (I would expect this thing to be empty because `sync()` is called which loops over the result_queue, clearing out all items...) Thinking out loud: my understanding is there's a race condition where the asserts are already being called before `executor.end()` finishes. I imagine there's something iffy about multiple workers and passing around the JoinableQueue to all workers, but unsure exactly what's causing the issue. Maybe `change_state()` might not finish fast enough, and the result_queue is already empty by then, and as a result the `end()` function is finished? Hope this helps so far :-)
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
