[
https://issues.apache.org/jira/browse/AIRFLOW-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Remi Baar updated AIRFLOW-6235:
-------------------------------
Description:
*How to reproduce*
# Install airflow through pip (or conda).
# Run `airflow initdb`
# Run `airflow scheduler`
# Error occurs.
*Explanation*
Running `initdb` loads the example dags into the database. These live inside
the module folder (and thus outside the `AIRFLOW_HOME` directory, or outside the
dag_dir if another one is specified).
The scheduler creates log files based on the name of the dag file, using the
path relative to the dag_dir. So if the dag dir is
`/var/lib/airflow/dags` and the dag is located at
`/var/lib/airflow/dags/foo/bar.py`, the log file created is
`/var/lib/airflow/logs/scheduler/2019-12-12/foo/bar.py.log`. This is perfect.
But now the dag is outside the dag directory. For example:
dag:
`'/opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py'`
the relative path:
`'../../../../opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py'`
the log file will be:
`'/var/lib/airflow/logs/scheduler/2019-12-12/../../../../opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py.log'`
As you can see, the log file is placed in a location where it is never
supposed to be!
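The path arithmetic above can be reproduced with a minimal sketch. It assumes the handler takes the dag file path relative to the dag_dir and joins it onto the dated log directory; the function name `scheduler_log_path` and the constants are hypothetical and only mirror the example paths in this report, they are not Airflow's actual code. `posixpath` is used so the result does not depend on the platform.

```python
import posixpath

# Example values taken from the report above (illustrative only).
DAG_DIR = "/var/lib/airflow/dags"
LOG_DIR = "/var/lib/airflow/logs/scheduler/2019-12-12"


def scheduler_log_path(dag_file):
    """Hypothetical helper: build the log path the way the report describes."""
    # Path of the dag file relative to the dag_dir; this starts with '../'
    # whenever the file lives outside the dag_dir.
    relative = posixpath.relpath(dag_file, DAG_DIR)
    return posixpath.join(LOG_DIR, relative + ".log")


inside = scheduler_log_path("/var/lib/airflow/dags/foo/bar.py")
outside = scheduler_log_path(
    "/opt/miniconda3/envs/airflow/lib/python3.7/site-packages"
    "/airflow/example_dags/test_utils.py"
)

print(inside)
print(outside)
# Collapsing the '..' segments shows where the file would really end up.
print(posixpath.normpath(outside))
```

Once the `..` segments are collapsed, the second path no longer starts with the log directory, which is the misplacement described above.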
*Lines of code causing this issue:*
[https://github.com/apache/airflow/blob/1.10.6/airflow/utils/log/file_processor_handler.py#L86:L94]
*Proposed solution:*
If the dag is outside the dag_dir, use the full path instead of the relative
path
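A rough sketch of what the proposed solution could look like, under stated assumptions: `log_name_for` is a hypothetical helper, not an existing Airflow function, and stripping the leading slash from the full path is my own choice so that joining it onto the log directory keeps the file below that directory (joining an absolute path would otherwise discard the log directory entirely).

```python
import posixpath


def log_name_for(dag_file, dag_dir):
    """Hypothetical helper: pick the path fragment used below the log directory."""
    relative = posixpath.relpath(dag_file, dag_dir)
    if relative == ".." or relative.startswith("../"):
        # The dag lives outside dag_dir: fall back to the full path.
        # Assumption: drop the leading '/' so a later join stays under the log dir.
        return dag_file.lstrip("/")
    return relative


log_dir = "/var/lib/airflow/logs/scheduler/2019-12-12"
dag_dir = "/var/lib/airflow/dags"

inside_name = log_name_for("/var/lib/airflow/dags/foo/bar.py", dag_dir)
outside_name = log_name_for(
    "/opt/miniconda3/envs/airflow/lib/python3.7/site-packages"
    "/airflow/example_dags/test_utils.py",
    dag_dir,
)

print(posixpath.join(log_dir, inside_name + ".log"))
print(posixpath.join(log_dir, outside_name + ".log"))
```

With this fallback, both log files end up below the dated scheduler log directory, and dags inside the dag_dir keep their current log names.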
was:
*How to reproduce*
# Install airflow through pip (or conda).
# Run `airflow initdb`
# Run `airflow scheduler`
# Error occurs.
*Explanation*
Running `initdb` loads the example dags into the database. These live inside
the module folder (and thus outside the `AIRFLOW_HOME` directory, or outside the
dag_dir if another one is specified).
The scheduler creates log files based on the name of the dag file, using the
path relative to the dag_dir. So if the dag dir is
`/var/lib/airflow/dags` and the dag is located at
`/var/lib/airflow/dags/foo/bar.py`, the log file created is
`/var/lib/airflow/logs/scheduler/2019-12-12/foo/bar.py.log`. This is perfect.
But now the dag is outside the dag directory. For example:
dag:
`'/opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py'`
the relative path:
`'../../../../opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py'`
the log file will be:
`'/var/lib/airflow/logs/scheduler/2019-12-12/../../../../opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py.log'`
As you can see, the log file is placed in a location where it is never
supposed to be!
*Lines of code causing this issue:*
Remaining Estimate: (was: 1h)
Original Estimate: (was: 1h)
> Logs of dags outside `dag_dir` are placed outside log directory
> ---------------------------------------------------------------
>
> Key: AIRFLOW-6235
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6235
> Project: Apache Airflow
> Issue Type: Bug
> Components: utils
> Affects Versions: 1.10.6
> Environment: Python 3.7, Airflow 1.10.6
> Reporter: Remi Baar
> Priority: Major
>
> *How to reproduce*
> # Install airflow through pip (or conda).
> # Run `airflow initdb`
> # Run `airflow scheduler`
> # Error occurs.
> *Explanation*
> Running `initdb` loads the example dags into the database. These live
> inside the module folder (and thus outside the `AIRFLOW_HOME` directory, or
> outside the dag_dir if another one is specified).
> The scheduler creates log files based on the name of the dag file, using the
> path relative to the dag_dir. So if the dag dir is
> `/var/lib/airflow/dags` and the dag is located at
> `/var/lib/airflow/dags/foo/bar.py`, the log file created is
> `/var/lib/airflow/logs/scheduler/2019-12-12/foo/bar.py.log`. This is perfect.
> But now the dag is outside the dag directory. For example:
> dag:
> `'/opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py'`
>
> the relative path:
> `'../../../../opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py'`
> the log file will be:
> `'/var/lib/airflow/logs/scheduler/2019-12-12/../../../../opt/miniconda3/envs/airflow/lib/python3.7/site-packages/airflow/example_dags/test_utils.py.log'`
> As you can see, the log file is placed in a location where it is never
> supposed to be!
> *Lines of code causing this issue:*
> [https://github.com/apache/airflow/blob/1.10.6/airflow/utils/log/file_processor_handler.py#L86:L94]
> *Proposed solution:*
> If the dag is outside the dag_dir, use the full path instead of the relative
> path