george-zubrienko opened a new issue #21387:
URL: https://github.com/apache/airflow/issues/21387


   ### Apache Airflow version
   
   2.2.3 (latest released)
   
   ### What happened
   
   This problem popped up when we switched from using a persistent volume provisioned by the Helm chart via a PVC (`log.persistence.enabled`) to WASB remote logging.
   
   While a task is running, the only logs we see are these:
   
   ```
    *** Trying to get logs (last 100 lines) from worker pod anaplancopyanaplantestsamplestrino.cfbd54145b32466bb6bedef603ed89a1 ***

    [2022-02-07, 11:24:15 UTC] {dagbag.py:500} INFO - Filling up the DagBag from /opt/airflow/dags/ecco_airflow/dags/bronze/anaplan.py
   ```
   Then, when the task completes and its log is moved by the WASB provider, the UI shows the data from the remote log file as expected.
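
   For reference, this is roughly what the provider does at task completion - a paraphrased sketch of `WasbTaskHandler.close()` from apache-airflow-providers-microsoft-azure (simplified, not the exact source). The point is that it reads the *local* file and uploads it:

    ```python
    # Paraphrased sketch of WasbTaskHandler.close() (simplified):
    # the handler reads the LOCAL log file and uploads its contents,
    # so the local file demonstrably exists when the task finishes.
    import os

    def close_sketch(handler, local_loc: str, remote_loc: str) -> None:
        if os.path.exists(local_loc):
            with open(local_loc) as logfile:
                log = logfile.read()
            # wasb_write() is the provider's upload method; append=True is its default
            handler.wasb_write(log, remote_loc, append=True)
    ```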
   
   The problem here is that, while the task is running, the log is read from the **pod** (as you can see in the sample above) instead of from the **local log file**.
   
   Since the WASB provider can find the local log file in order to move it, the file clearly exists. I also created a simple DAG to prove it:
   
   ```python
   import os
   import time
   
   from airflow import DAG
   from airflow.operators.python import PythonOperator
   from airflow.utils.dates import days_ago
   from datetime import timedelta
   from airflow.configuration import conf
   from airflow.utils.log.file_task_handler import FileTaskHandler
   
   from ecco_airflow.utils.k8s import executor_config
   
   default_args = {
       'owner': 'test',
       'depends_on_past': False,
       'email': ['[email protected]'],
       'email_on_failure': False,
       'email_on_retry': False,
       'retries': 1,
       'retry_delay': timedelta(days=1),
   }
   
   with DAG(
           dag_id='test_dag',
           default_args=default_args,
           description='Test logs fetch',
           schedule_interval="0 0 * * *",
           start_date=days_ago(1),
           tags=[]
   ) as dag:
   
        def test_read(**context):
            # Sleep so the UI has time to poll logs while the task is still running
            time.sleep(30)

            # Build the same local log path that the FileTaskHandler uses
            BASE_LOG_FOLDER: str = conf.get('logging', 'BASE_LOG_FOLDER')
            FILENAME_TEMPLATE: str = conf.get('logging', 'LOG_FILENAME_TEMPLATE')
            h = FileTaskHandler(BASE_LOG_FOLDER, FILENAME_TEMPLATE)

            log_relative_path = h._render_filename(context['ti'], context['ti'].try_number)
            print(log_relative_path)

            location = os.path.join(h.local_base, log_relative_path)
            print(location)
           log = ""
           if os.path.exists(location):
               try:
                    with open(location, encoding="utf-8", errors="surrogateescape") as file:
                       log += f"*** Reading local file: {location}\n"
                       log += "".join(file.readlines())
               except Exception as e:
                   log = f"*** Failed to load local log file: {location}\n"
                   log += f"*** {str(e)}\n"
               print(log)
           else:
               print(-1)
   
           time.sleep(30)
   
       t = PythonOperator(
           task_id='test_logs',
           python_callable=test_read,
           op_kwargs={},
           executor_config=executor_config(
               nodepool_names=['general'],
               secret_env_vars={},
               cpu_memory_limit={
                   'cpu': '100m',
                   'memory': '500Mi'
               }),
       )
   
   ```
   
   What happens here is that, while we sleep, the **local log file** is present on the machine, but Airflow still tries to read the log from the pod.
   Assuming I'm looking in the [correct place](https://github.com/apache/airflow/blob/2.2.3/airflow/utils/log/file_task_handler.py), what happens is that the local log cannot be found, according to this code: https://github.com/apache/airflow/blob/2.2.3/airflow/utils/log/file_task_handler.py#L96-L143
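
   To make the failure mode explicit, here is a paraphrased sketch of that branch of `FileTaskHandler._read` (simplified from the linked 2.2.3 source, not verbatim). With `emptyDir`, the local file exists only inside the worker pod, so the component rendering the UI never passes the `os.path.exists` check and falls through to the KubernetesExecutor branch - which is exactly the "last 100 lines" message shown above:

    ```python
    # Simplified paraphrase of FileTaskHandler._read (Airflow 2.2.3).
    import os
    from airflow.configuration import conf

    def _read_sketch(ti, location: str) -> str:
        log = ""
        if os.path.exists(location):
            # Only taken if the file exists on the component doing the reading
            with open(location, errors="surrogateescape") as file:
                log += f"*** Reading local file: {location}\n" + file.read()
        elif conf.get("core", "executor") == "KubernetesExecutor":
            # Fetches the tail of the pod's stdout via the Kubernetes API
            log += f"*** Trying to get logs (last 100 lines) from worker pod {ti.hostname} ***\n\n"
        else:
            ...  # fall back to fetching the log from the worker over HTTP
        return log
    ```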
   
   However, the final log for the task in the DAG above will print
   ```
   [2022-02-07, 11:05:07 UTC] {taskinstance.py:1259} INFO - Executing 
<Task(PythonOperator): test_logs> on 2022-02-06 00:00:00+00:00
   [2022-02-07, 11:05:07 UTC] {standard_task_runner.py:52} INFO - Started 
process 11 to run task
   [2022-02-07, 11:05:07 UTC] {standard_task_runner.py:76} INFO - Running: 
['airflow', 'tasks', 'run', 'test_dag', 'test_logs', 
'scheduled__2022-02-06T00:00:00+00:00', '--job-id', '15306', '--raw', 
'--subdir', 'DAGS_FOLDER/ecco_airflow/dags/test_dag.py', '--cfg-path', 
'/tmp/tmp_iv6qjpv', '--error-file', '/tmp/tmp_cc1dftp']
   [2022-02-07, 11:05:07 UTC] {standard_task_runner.py:77} INFO - Job 15306: 
Subtask test_logs
   [2022-02-07, 11:05:07 UTC] {logging_mixin.py:109} INFO - Running 
<TaskInstance: test_dag.test_logs scheduled__2022-02-06T00:00:00+00:00 
[running]> on host testdagtestlogs.2466ee8d3bf244d9933d502894766d3c
   [2022-02-07, 11:05:08 UTC] {taskinstance.py:1424} INFO - Exporting the 
following env vars:
   [email protected]
   AIRFLOW_CTX_DAG_OWNER=test
   AIRFLOW_CTX_DAG_ID=test_dag
   AIRFLOW_CTX_TASK_ID=test_logs
   AIRFLOW_CTX_EXECUTION_DATE=2022-02-06T00:00:00+00:00
   AIRFLOW_CTX_DAG_RUN_ID=scheduled__2022-02-06T00:00:00+00:00
   [2022-02-07, 11:05:08 UTC] {logging_mixin.py:109} WARNING - 
/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/context.py:152 
AirflowContextDeprecationWarning: Accessing 'execution_date' from the template 
is deprecated and will be removed in a future version. Please use 
'data_interval_start' or 'logical_date' instead.
   [2022-02-07, 11:05:08 UTC] {logging_mixin.py:109} WARNING - 
/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/context.py:152 
AirflowContextDeprecationWarning: Accessing 'next_ds' from the template is 
deprecated and will be removed in a future version. Please use '{{ 
data_interval_end | ds }}' instead.
   [2022-02-07, 11:05:08 UTC] {logging_mixin.py:109} WARNING - 
/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/context.py:152 
AirflowContextDeprecationWarning: Accessing 'next_ds_nodash' from the template 
is deprecated and will be removed in a future version. Please use '{{ 
data_interval_end | ds_nodash }}' instead.
   [2022-02-07, 11:05:08 UTC] {logging_mixin.py:109} WARNING - 
/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/context.py:152 
AirflowContextDeprecationWarning: Accessing 'next_execution_date' from the 
template is deprecated and will be removed in a future version. Please use 
'data_interval_end' instead.
   [2022-02-07, 11:05:08 UTC] {logging_mixin.py:109} WARNING - 
/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/context.py:152 
AirflowContextDeprecationWarning: Accessing 'prev_ds' from the template is 
deprecated and will be removed in a future version.
   [2022-02-07, 11:05:08 UTC] {logging_mixin.py:109} WARNING - 
/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/context.py:152 
AirflowContextDeprecationWarning: Accessing 'prev_ds_nodash' from the template 
is deprecated and will be removed in a future version.
   [2022-02-07, 11:05:08 UTC] {logging_mixin.py:109} WARNING - 
/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/context.py:152 
AirflowContextDeprecationWarning: Accessing 'prev_execution_date' from the 
template is deprecated and will be removed in a future version.
   [2022-02-07, 11:05:08 UTC] {logging_mixin.py:109} WARNING - 
/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/context.py:152 
AirflowContextDeprecationWarning: Accessing 'prev_execution_date_success' from 
the template is deprecated and will be removed in a future version. Please use 
'prev_data_interval_start_success' instead.
   [2022-02-07, 11:05:08 UTC] {logging_mixin.py:109} WARNING - 
/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/context.py:152 
AirflowContextDeprecationWarning: Accessing 'tomorrow_ds' from the template is 
deprecated and will be removed in a future version.
   [2022-02-07, 11:05:08 UTC] {logging_mixin.py:109} WARNING - 
/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/context.py:152 
AirflowContextDeprecationWarning: Accessing 'tomorrow_ds_nodash' from the 
template is deprecated and will be removed in a future version.
   [2022-02-07, 11:05:08 UTC] {logging_mixin.py:109} WARNING - 
/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/context.py:152 
AirflowContextDeprecationWarning: Accessing 'yesterday_ds' from the template is 
deprecated and will be removed in a future version.
   [2022-02-07, 11:05:08 UTC] {logging_mixin.py:109} WARNING - 
/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/context.py:152 
AirflowContextDeprecationWarning: Accessing 'yesterday_ds_nodash' from the 
template is deprecated and will be removed in a future version.
   [2022-02-07, 11:05:38 UTC] {logging_mixin.py:109} INFO - 
test_dag/test_logs/2022-02-06T00:00:00+00:00/5.log
   [2022-02-07, 11:05:38 UTC] {logging_mixin.py:109} INFO - 
/opt/airflow/logs/test_dag/test_logs/2022-02-06T00:00:00+00:00/5.log
   [2022-02-07, 11:05:38 UTC] {logging_mixin.py:109} INFO - *** Reading local 
file: /opt/airflow/logs/test_dag/test_logs/2022-02-06T00:00:00+00:00/5.log
   [2022-02-07, 11:05:06 UTC] {taskinstance.py:1032} INFO - Dependencies all 
met for <TaskInstance: test_dag.test_logs scheduled__2022-02-06T00:00:00+00:00 
[queued]>
   [2022-02-07, 11:05:06 UTC] {taskinstance.py:1032} INFO - Dependencies all 
met for <TaskInstance: test_dag.test_logs scheduled__2022-02-06T00:00:00+00:00 
[queued]>
   [2022-02-07, 11:05:06 UTC] {taskinstance.py:1238} INFO - 
   ```
   
   twice, as expected.
   
   This doesn't happen when a persistent volume is used to mount the logs instead of `emptyDir`, presumably because with a shared volume the component serving the UI sees the same files the worker writes, so the local-file check succeeds.
   
   
   
   ### What you expected to happen
   
   Logs from long-running tasks run with the KubernetesExecutor should be read from the local log file located under `$AIRFLOW_HOME/logs/...`, regardless of how that path is mounted into the container - PV or `emptyDir`.
   
   ### How to reproduce
   
   Change the logs mount to `emptyDir`, enable remote logging to WASB, and execute a long-running task. While the task is running, Airflow will always resort to reading logs from the pod, rather than from the `$AIRFLOW_HOME/logs/...` location where the actual log file is.
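
   For the remote-logging half of the setup, the standard `AIRFLOW__LOGGING__*` settings are enough; a minimal sketch (the section and key names are real Airflow config options, the values below are placeholders):

    ```python
    # Minimal WASB remote-logging configuration, expressed as the
    # equivalent environment variables (values are placeholders).
    import os

    os.environ["AIRFLOW__LOGGING__REMOTE_LOGGING"] = "True"
    os.environ["AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID"] = "wasb_default"   # placeholder connection id
    os.environ["AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER"] = "wasb-logs"  # placeholder container path
    ```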
   
   ### Operating System
   
   Debian GNU/Linux 10 (buster)
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-cncf-kubernetes==2.2.0
   apache-airflow-providers-microsoft-azure==3.4.0
   
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   Airflow is deployed via the `helm_release` Terraform resource, chart version `1.3.0`, with a custom image. Its Dockerfile:
   ```
   FROM apache/airflow:2.2.3-python3.9
   
   COPY requirements.txt /tmp/requirements.txt
   
   RUN pip3 install --user -r /tmp/requirements.txt
   ```
   
   Requirements:
   
   ```
   apache-airflow-providers-databricks==2.2.0
   authlib==0.15.5
   ```
   
   ### Anything else
   
   The problem occurs at least with PythonOperator when using the KubernetesExecutor.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

