[ https://issues.apache.org/jira/browse/AIRFLOW-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365272#comment-16365272 ]
Sébastien Brochet commented on AIRFLOW-2065: -------------------------------------------- I've opened a new PR with my fix: https://github.com/apache/incubator-airflow/pull/3040 Cheers, > Worker logging can raise FileExistsError when more than one process execute > concurrently > ---------------------------------------------------------------------------------------- > > Key: AIRFLOW-2065 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2065 > Project: Apache Airflow > Issue Type: Bug > Components: executor, logging > Affects Versions: 1.9.0 > Reporter: Sébastien Brochet > Priority: Critical > > Hello, > > We started observing random failing during the execution of our dags after > upgrading to 1.9.0. After careful debugging, we noticing the following > exception in the worker logs: > {noformat} > Traceback (most recent call last): > File "/projects/airflow-hadoop/anaconda3/lib/python3.6/logging/config.py", > line 558, in configure > handler = self.configure_handler(handlers[name]) > File "/projects/airflow-hadoop/anaconda3/lib/python3.6/logging/config.py", > line 731, in configure_handler > result = factory(**kwargs) > File > "/projects/airflow-hadoop/anaconda3/lib/python3.6/site-packages/airflow/utils/log/file_processor_handler.py", > line 48, in __init__ > os.makedirs(self._get_log_directory()) > File "/projects/airflow-hadoop/anaconda3/lib/python3.6/os.py", line 220, > in makedirs > mkdir(name, mode) > FileExistsError: [Errno 17] File exists: > '/projects/airflow-hadoop/airflow/logs/scheduler/2018-02-05' > During handling of the above exception, another exception occurred: > Traceback (most recent call last): > File "/projects/airflow-hadoop/anaconda3/bin/airflow", line 16, in <module> > from airflow import configuration > File > "/projects/airflow-hadoop/anaconda3/lib/python3.6/site-packages/airflow/__init__.py", > line 31, in <module> > from airflow import settings > File > "/projects/airflow-hadoop/anaconda3/lib/python3.6/site-packages/airflow/settings.py", > line 148, in <module> > configure_logging() > File > "/projects/airflow-hadoop/anaconda3/lib/python3.6/site-packages/airflow/logging_config.py", > line 75, in configure_logging > raise e > File > "/projects/airflow-hadoop/anaconda3/lib/python3.6/site-packages/airflow/logging_config.py", > line 70, in configure_logging > dictConfig(logging_config) > File "/projects/airflow-hadoop/anaconda3/lib/python3.6/logging/config.py", > line 795, in dictConfig > dictConfigClass(config).configure() > File "/projects/airflow-hadoop/anaconda3/lib/python3.6/logging/config.py", > line 566, in configure > '%r: %s' % (name, e)) > ValueError: Unable to configure handler 'file.processor': [Errno 17] File > exists: '/projects/airflow-hadoop/airflow/logs/scheduler/2018-02-05 > {noformat} > > As you can see, an exception is raised when trying to create the directory > where to store the executor logs. This can happen if two tasks are scheduled > are the exact same time on the same worker. It appears to be the case here : > > {noformat} > [2018-02-05 02:10:07,886] \{celery_executor.py:50} INFO - Executing command > in Celery: airflow run xxxx pairing_sensor_check 2018-02-04T02:10:00 --local > --pool sensor -sd /projects/airflow-hadoop/airflow/dags/flow.py > [2018-02-05 02:10:07,908] \{celery_executor.py:50} INFO - Executing command > in Celery: airflow run yyy pairing_sensor_check 2018-02-04T02:10:00 --local > --pool sensor -sd /projects/airflow-hadoop/airflow/dags/flow.py > {noformat} > Culprits is here: > [https://github.com/apache/incubator-airflow/blob/v1-9-stable/airflow/utils/log/file_processor_handler.py#L47-L48] > (not fixed in master) > A simple fix would be to wrap the {{makedirs}} command into a {{try}} / > {{catch}} block. > > Thanks, > > Sébastien -- This message was sent by Atlassian JIRA (v7.6.3#76005)