[ 
https://issues.apache.org/jira/browse/AIRFLOW-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remi Baar updated AIRFLOW-6235:
-------------------------------
           Description: 
*How to reproduce*
 # Install airflow through pip (or conda).
 # Run `airflow initdb`
 # Run `airflow scheduler` 
 # Error occurs.

*Explanation*

Running `initdb` loads the example dags into the database. These live inside 
the module folder (and thus outside the `AIRFLOW_HOME` directory, or the dag_dir 
if another is specified). 

The scheduler creates log files based on the name of the dag file, using the 
path relative to the dag_dir. So if the dag dir is 
`/var/lib/airflow/dags` and the dag is located at 
`/var/lib/airflow/dags/foo/bar.py`, the log file created is 
`/var/lib/airflow/logs/scheduler/2019-12-12/foo/bar.py.log`. This is perfect.

But now the dag is outside the dag directory. For example:
 dag: 
`'/opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py'`
 
 the relative path: 
`'../../../../opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py'`
 the log file will be: 
`'/var/lib/airflow/logs/scheduler/2019-12-12/../../../../opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py.log'`

As you can see, the log file is placed somewhere it is never 
supposed to be! 
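
The path arithmetic above can be reproduced in isolation. The following is a minimal sketch, not the Airflow handler itself: it only assumes that the handler joins the dated log directory with the dag file's path relative to the dag_dir, as described above. The variable names are illustrative, and `posixpath` is used so the result does not depend on the platform.

```python
import posixpath

# Values taken from the example in this report.
dag_dir = "/var/lib/airflow/dags"
log_dir = "/var/lib/airflow/logs/scheduler/2019-12-12"
dag_file = (
    "/opt/miniconda3/envs/airflow/lib/python3.7/site-packages/"
    "airflow/example_dags/test_utils.py"
)

# Path of the dag file relative to the dag directory. Because the file is
# outside dag_dir, the result starts with a run of ".." components.
relative_path = posixpath.relpath(dag_file, start=dag_dir)

# Joining that onto the log directory yields a path that climbs back out.
log_file = posixpath.join(log_dir, relative_path + ".log")

# Collapsing the ".." components shows where the file really ends up.
resolved = posixpath.normpath(log_file)

print(relative_path)
print(log_file)
print(resolved)
```

Each of the four `..` components cancels one directory of the log path, so the resolved location is `/var/lib/opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py.log`, which is outside `/var/lib/airflow/logs` entirely.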

*Lines of code causing this issue:*

[https://github.com/apache/airflow/blob/1.10.6/airflow/utils/log/file_processor_handler.py#L86:L94]

*Proposed solution:*

If the dag is outside the dag_dir, use the full path instead of the relative 
path.
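
One way this proposal could look is sketched below. The function name `scheduler_log_path` is hypothetical and is not part of Airflow. It also makes one assumption beyond the proposal: the leading `/` of the full path is stripped, because joining an absolute path onto the log directory would otherwise discard the log directory.

```python
import posixpath


def scheduler_log_path(log_dir, dag_dir, dag_file):
    """Return a log path for dag_file that always stays below log_dir.

    Hypothetical helper for illustration only, not the Airflow API.
    """
    relative_path = posixpath.relpath(dag_file, start=dag_dir)

    # A relative path that is ".." or starts with "../" points outside dag_dir.
    if relative_path == ".." or relative_path.startswith("../"):
        # Fall back to the full path, minus the leading "/", so that the
        # join below nests it under log_dir instead of replacing log_dir.
        relative_path = posixpath.normpath(dag_file).lstrip("/")

    return posixpath.join(log_dir, relative_path + ".log")


log_dir = "/var/lib/airflow/logs/scheduler/2019-12-12"
dag_dir = "/var/lib/airflow/dags"

inside = scheduler_log_path(log_dir, dag_dir, "/var/lib/airflow/dags/foo/bar.py")
outside = scheduler_log_path(
    log_dir,
    dag_dir,
    "/opt/miniconda3/envs/airflow/lib/python3.7/site-packages/"
    "airflow/example_dags/test_utils.py",
)

print(inside)
print(outside)
```

Dags inside the dag_dir keep their current log location, while dags outside it get a log file under the dated scheduler log directory that mirrors their full path.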

  was:
*How to reproduce*
 # Install airflow through pip (or conda).
 # Run `airflow initdb`
 # Run `airflow scheduler` 
 # Error occurs.

*Explanation*

By running `initdb` example dags are loaded in the databases. These life inside 
the module folder (and thus outside the `AIRFLOW_HOME` directory or the dag_dir 
if another is specified). 

The scheduler will create log files based on the name of the dag file. The 
relative path is being used form the dag_dir. So if the dag dir is 
'/var/lib/airflow/dags' and the dag is located is at 
'/var/lib/airflow/dags/foo/bar.py` the log file created is 
`/var/lib/airflow/logs/scheduler/2019-12-12/foor/bar.py.log`. This is perfect.

But now the dag is outside the dag directory. For example:
dag: 
`'/opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py'`
 
the relative path: 
`'../../../../opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py'
the log file will be: 
`'/var/lib/airflow/logs/scheduler/2019-12-12/../../../../opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py.log'`

As you can see the log files is being placed at a place where it never is 
supposed to be! 

*Lines of code causing this issue:*

    Remaining Estimate:     (was: 1h)
     Original Estimate:     (was: 1h)

> Logs of dags outside `dag_dir` are placed outside log directory
> ---------------------------------------------------------------
>
>                 Key: AIRFLOW-6235
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6235
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: utils
>    Affects Versions: 1.10.6
>         Environment: Python 3.7, Airflow 1.10.6
>            Reporter: Remi Baar
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
