[ 
https://issues.apache.org/jira/browse/AIRFLOW-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16353579#comment-16353579
 ] 

Sébastien Brochet commented on AIRFLOW-2065:
--------------------------------------------

Fixing in our production instance the above issue, we noticed another exception 
later in the code, around here: 
[https://github.com/apache/incubator-airflow/blob/master/airflow/utils/log/file_processor_handler.py#L105-L108]

 

```

Traceback (most recent call last):
  File "/projects/airflow-hadoop/anaconda3/lib/python3.6/logging/config.py", 
line 558, in configure
    handler = self.configure_handler(handlers[name])
  File "/projects/airflow-hadoop/anaconda3/lib/python3.6/logging/config.py", 
line 731, in configure_handler
    result = factory(**kwargs)
  File 
"/projects/airflow-hadoop/anaconda3/lib/python3.6/site-packages/airflow/utils/log/file_processor_handler.py",
 line 50, in __init__
    self._symlink_latest_log_directory()
  File 
"/projects/airflow-hadoop/anaconda3/lib/python3.6/site-packages/airflow/utils/log/file_processor_handler.py",
 line 106, in _symlink_latest_log_directory
    os.unlink(latest_log_directory_path)
FileNotFoundError: [Errno 2] No such file or directory: 
'/projects/airflow-hadoop/airflow/logs/scheduler/latest'

```

 

Same issue here, two processes are trying to delete the same directory.

> Worker logging can raise FileExistsError when more than one process execute 
> concurrently
> ----------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-2065
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2065
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: executor, logging
>    Affects Versions: 1.9.0
>            Reporter: Sébastien Brochet
>            Priority: Critical
>
> Hello,
>  
> We started observing random failing during the execution of our dags after 
> upgrading to 1.9.0. After careful debugging, we noticing the following 
> exception in the worker logs:
> {noformat}
> Traceback (most recent call last):
>    File "/projects/airflow-hadoop/anaconda3/lib/python3.6/logging/config.py", 
> line 558, in configure
>      handler = self.configure_handler(handlers[name])
>    File "/projects/airflow-hadoop/anaconda3/lib/python3.6/logging/config.py", 
> line 731, in configure_handler
>      result = factory(**kwargs)
>    File 
> "/projects/airflow-hadoop/anaconda3/lib/python3.6/site-packages/airflow/utils/log/file_processor_handler.py",
>  line 48, in __init__
>      os.makedirs(self._get_log_directory())
>    File "/projects/airflow-hadoop/anaconda3/lib/python3.6/os.py", line 220, 
> in makedirs
>      mkdir(name, mode)
>  FileExistsError: [Errno 17] File exists: 
> '/projects/airflow-hadoop/airflow/logs/scheduler/2018-02-05'
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>    File "/projects/airflow-hadoop/anaconda3/bin/airflow", line 16, in <module>
>      from airflow import configuration
>    File 
> "/projects/airflow-hadoop/anaconda3/lib/python3.6/site-packages/airflow/__init__.py",
>  line 31, in <module>
>      from airflow import settings
>    File 
> "/projects/airflow-hadoop/anaconda3/lib/python3.6/site-packages/airflow/settings.py",
>  line 148, in <module>
>      configure_logging()
>    File 
> "/projects/airflow-hadoop/anaconda3/lib/python3.6/site-packages/airflow/logging_config.py",
>  line 75, in configure_logging
>      raise e
>    File 
> "/projects/airflow-hadoop/anaconda3/lib/python3.6/site-packages/airflow/logging_config.py",
>  line 70, in configure_logging
>      dictConfig(logging_config)
>    File "/projects/airflow-hadoop/anaconda3/lib/python3.6/logging/config.py", 
> line 795, in dictConfig
>      dictConfigClass(config).configure()
>    File "/projects/airflow-hadoop/anaconda3/lib/python3.6/logging/config.py", 
> line 566, in configure
>      '%r: %s' % (name, e))
>  ValueError: Unable to configure handler 'file.processor': [Errno 17] File 
> exists: '/projects/airflow-hadoop/airflow/logs/scheduler/2018-02-05
> {noformat}
>  
> As you can see, an exception is raised when trying to create the directory 
> where to store the executor logs. This can happen if two tasks are scheduled 
> are the exact same time on the same worker. It appears to be the case here :
>  
> {noformat}
>  [2018-02-05 02:10:07,886] \{celery_executor.py:50} INFO - Executing command 
> in Celery: airflow run xxxx pairing_sensor_check 2018-02-04T02:10:00 --local 
> --pool sensor -sd /projects/airflow-hadoop/airflow/dags/flow.py
>  [2018-02-05 02:10:07,908] \{celery_executor.py:50} INFO - Executing command 
> in Celery: airflow run yyy pairing_sensor_check 2018-02-04T02:10:00 --local 
> --pool sensor -sd /projects/airflow-hadoop/airflow/dags/flow.py
> {noformat}
> Culprits is here: 
> [https://github.com/apache/incubator-airflow/blob/v1-9-stable/airflow/utils/log/file_processor_handler.py#L47-L48]
>  (not fixed in master)
> A simple fix would be to wrap the {{makedirs}} command into a {{try}} / 
> {{catch}} block.
>  
> Thanks,
>  
> Sébastien



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to