mhaalme opened a new issue #14950:
URL: https://github.com/apache/airflow/issues/14950
**Apache Airflow version**: 2.0.1
**Environment**:
- **OS**: Oracle Linux Server 7.9
**What happened**:
One of our new BashOperator tasks failed and no useful information could be
found about the cause of failure in the task logs. This seemed odd, so I
temporarily turned off JSON formatting, and the non-JSON logs contained a lot
more information. With JSON logging turned on, only the first line was logged
and the entire traceback was discarded.
I've replaced the values of dag_id, task_id and the bash_command path in the
following logs with similar placeholders; their actual values are not relevant
to reproducing the error.
ERROR level logs if json_format=True:
```
{"asctime": "2021-03-19 15:33:05,620", "filename": "taskinstance.py",
"lineno": 1455, "levelname": "ERROR", "message": "sh /path/to/my/script.sh",
"created": 1616160785.620597, "dag_id": "my_dag", "task_id": "my_task",
"execution_date": "2021_03_19T13_33_04_356460", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-19T13:33:04.356460+00:00-1", "offset":
1616160785567723008}
```
ERROR level logs if json_format=False:
```
[2021-03-19 15:37:19,468] {{taskinstance.py:1455}} ERROR - sh
/path/to/my/script.sh
Traceback (most recent call last):
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py",
line 1112, in _run_raw_task
self._prepare_and_execute_task_with_callbacks(context, task)
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py",
line 1249, in _prepare_and_execute_task_with_callbacks
self.render_templates(context=context)
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py",
line 1750, in render_templates
self.task.render_template_fields(context)
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/baseoperator.py",
line 859, in render_template_fields
self._do_render_template_fields(self, self.template_fields, context,
jinja_env, set())
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/baseoperator.py",
line 872, in _do_render_template_fields
rendered_content = self.render_template(content, context, jinja_env,
seen_oids)
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/baseoperator.py",
line 907, in render_template
return jinja_env.get_template(content).render(**context)
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/jinja2/environment.py",
line 883, in get_template
return self._load_template(name, self.make_globals(globals))
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/jinja2/environment.py",
line 857, in _load_template
template = self.loader.load(self, name, globals)
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/jinja2/loaders.py", line
115, in load
source, filename, uptodate = self.get_source(environment, name)
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/jinja2/loaders.py", line
197, in get_source
raise TemplateNotFound(template)
jinja2.exceptions.TemplateNotFound: sh /path/to/my/script.sh
```
The error itself is not the point: I understand that it happened because the
command was interpreted as a Jinja template, and I've since fixed the command.
The problem is that with JSON logging turned on the cause of the error was not
logged at all, which made it much more difficult to diagnose.
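For context, the TemplateNotFound traceback above can be reproduced with Jinja alone: because the command string ends in ".sh" (one of BashOperator's template_ext values), Airflow asks the Jinja environment to load the entire string as a template *filename* instead of rendering it inline. A minimal sketch (the loader path and command are placeholders, not our real values):

```python
import jinja2

# Stand-in for the task's Jinja environment; the search path is arbitrary here.
env = jinja2.Environment(loader=jinja2.FileSystemLoader("/tmp"))
command = "sh /path/to/my/script.sh"  # placeholder bash_command, as in the logs

try:
    # This mimics what happens when a templated field ends in a template_ext
    # suffix: the whole string is treated as a template filename.
    env.get_template(command).render()
    error = None
except jinja2.TemplateNotFound as exc:
    error = str(exc)  # the "template name" is the entire command string

print(error)
```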
**What you expected to happen**:
Log messages should contain the same information regardless of log format.
The entire traceback should be included in the JSON logs, preferably within a
single JSON record with escaped line breaks.
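To illustrate the expected behavior, here is a minimal, stdlib-only sketch (the formatter classes and logger name are hypothetical, not Airflow's actual code) contrasting a JSON formatter that serializes only selected record attributes, silently dropping exc_info, with one that folds the formatted traceback into the message, so json.dumps emits a single record with escaped line breaks:

```python
import io
import json
import logging


class NaiveJsonFormatter(logging.Formatter):
    """Serializes only the message; record.exc_info is silently dropped."""
    def format(self, record):
        return json.dumps({"levelname": record.levelname,
                           "message": record.getMessage()})


class TracebackJsonFormatter(logging.Formatter):
    """Appends the formatted traceback; json.dumps escapes the newlines."""
    def format(self, record):
        message = record.getMessage()
        if record.exc_info:
            message += "\n" + self.formatException(record.exc_info)
        return json.dumps({"levelname": record.levelname, "message": message})


def capture(formatter):
    """Log one exception through the given formatter, return the raw output."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)
    logger = logging.getLogger("demo")
    logger.handlers = [handler]
    logger.propagate = False
    try:
        raise RuntimeError("Bash command failed.")
    except RuntimeError:
        logger.exception("sh /path/to/my/script.sh")
    return stream.getvalue()


naive = capture(NaiveJsonFormatter())   # traceback lost
full = capture(TracebackJsonFormatter())  # traceback kept, one JSON record
```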
**How to reproduce it**:
1. Set json_format=True in airflow.cfg
2. Create a DAG with one BashOperator task. Set bash_command to something
that cannot be run successfully, e.g. a wrong path to a template.
3. Run the task and check the task logs
4. Repeat with json_format=False and compare
**Anything else we need to know**:
I don't know if this problem is specific to BashOperator or happens with
other operators as well.
The same behaviour can be observed if another kind of invalid command is
supplied, e.g. a bash_command that executes a file that does not exist. Please
see the ERROR level log line in the following examples.
json_format=False
```
[2021-03-23 09:57:29,977] {{taskinstance.py:851}} INFO - Dependencies all
met for <TaskInstance: my_dag.my_task 2021-03-23T07:56:28.535261+00:00 [queued]>
[2021-03-23 09:57:29,997] {{taskinstance.py:851}} INFO - Dependencies all
met for <TaskInstance: my_dag.my_task 2021-03-23T07:56:28.535261+00:00 [queued]>
[2021-03-23 09:57:29,997] {{taskinstance.py:1042}} INFO -
--------------------------------------------------------------------------------
[2021-03-23 09:57:29,997] {{taskinstance.py:1043}} INFO - Starting attempt 1
of 1
[2021-03-23 09:57:29,997] {{taskinstance.py:1044}} INFO -
--------------------------------------------------------------------------------
[2021-03-23 09:57:30,012] {{taskinstance.py:1063}} INFO - Executing
<Task(BashOperator): my_task> on 2021-03-23T07:56:28.535261+00:00
[2021-03-23 09:57:30,015] {{standard_task_runner.py:52}} INFO - Started
process 3461 to run task
[2021-03-23 09:57:30,021] {{standard_task_runner.py:76}} INFO - Running:
['airflow', 'tasks', 'run', 'my_dag', 'my_task',
'2021-03-23T07:56:28.535261+00:00', '--job-id', '169438', '--pool',
'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/dags/my_dag.py',
'--cfg-path', '/tmp/tmpcgczuh72', '--error-file', '/tmp/tmpj7gc2ajl']
[2021-03-23 09:57:30,023] {{standard_task_runner.py:77}} INFO - Job 169438:
Subtask my_task
[2021-03-23 09:57:30,092] {{logging_mixin.py:104}} INFO - Running
<TaskInstance: my_dag.my_task 2021-03-23T07:56:28.535261+00:00 [running]> on
host airflow.host
[2021-03-23 09:57:30,183] {{taskinstance.py:1257}} INFO - Exporting the
following env vars:
AIRFLOW_CTX_DAG_OWNER=Teamplatform
AIRFLOW_CTX_DAG_ID=my_dag
AIRFLOW_CTX_TASK_ID=my_task
AIRFLOW_CTX_EXECUTION_DATE=2021-03-23T07:56:28.535261+00:00
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2021-03-23T07:56:28.535261+00:00
[2021-03-23 09:57:30,185] {{bash.py:135}} INFO - Tmp dir root location:
/tmp
[2021-03-23 09:57:30,185] {{bash.py:158}} INFO - Running command:
/wrong/path/to/zabbix_sender -z zabbix.host -s airflow.host -k
airflow.scheduler.delay -o 61
[2021-03-23 09:57:30,192] {{bash.py:169}} INFO - Output:
[2021-03-23 09:57:30,193] {{bash.py:173}} INFO - bash:
/wrong/path/to/zabbix_sender: No such file or directory
[2021-03-23 09:57:30,194] {{bash.py:177}} INFO - Command exited with return
code 127
[2021-03-23 09:57:30,209] {{taskinstance.py:1455}} ERROR - Bash command
failed. The command returned a non-zero exit code.
Traceback (most recent call last):
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py",
line 1112, in _run_raw_task
self._prepare_and_execute_task_with_callbacks(context, task)
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py",
line 1285, in _prepare_and_execute_task_with_callbacks
result = self._execute_task(context, task_copy)
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/taskinstance.py",
line 1315, in _execute_task
result = task_copy.execute(context=context)
File
"/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/operators/bash.py",
line 180, in execute
raise AirflowException('Bash command failed. The command returned a
non-zero exit code.')
airflow.exceptions.AirflowException: Bash command failed. The command
returned a non-zero exit code.
[2021-03-23 09:57:30,212] {{taskinstance.py:1503}} INFO - Marking task as
FAILED. dag_id=my_dag, task_id=my_task, execution_date=20210323T075628,
start_date=20210323T075729, end_date=20210323T075730
[2021-03-23 09:57:30,275] {{local_task_job.py:146}} INFO - Task exited with
return code 1
end_of_log
```
json_format=True
```
{"asctime": "2021-03-23 09:54:29,572", "filename": "taskinstance.py",
"lineno": 851, "levelname": "INFO", "message": "Dependencies all met for
<TaskInstance: my_dag.my_task 2021-03-23T07:53:28.535261+00:00 [queued]>",
"created": 1616486069.5725589, "dag_id": "my_dag", "task_id": "my_task",
"execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069531250688}
{"asctime": "2021-03-23 09:54:29,597", "filename": "taskinstance.py",
"lineno": 851, "levelname": "INFO", "message": "Dependencies all met for
<TaskInstance: my_dag.my_task 2021-03-23T07:53:28.535261+00:00 [queued]>",
"created": 1616486069.5971286, "dag_id": "my_dag", "task_id": "my_task",
"execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069531250688}
{"asctime": "2021-03-23 09:54:29,597", "filename": "taskinstance.py",
"lineno": 1042, "levelname": "INFO", "message":
"\n--------------------------------------------------------------------------------",
"created": 1616486069.5976064, "dag_id": "my_dag", "task_id": "my_task",
"execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069531250688}
{"asctime": "2021-03-23 09:54:29,597", "filename": "taskinstance.py",
"lineno": 1043, "levelname": "INFO", "message": "Starting attempt 1 of 1",
"created": 1616486069.5978124, "dag_id": "my_dag", "task_id": "my_task",
"execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069531250688}
{"asctime": "2021-03-23 09:54:29,597", "filename": "taskinstance.py",
"lineno": 1044, "levelname": "INFO", "message":
"\n--------------------------------------------------------------------------------",
"created": 1616486069.5979714, "dag_id": "my_dag", "task_id": "my_task",
"execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069531250688}
{"asctime": "2021-03-23 09:54:29,611", "filename": "taskinstance.py",
"lineno": 1063, "levelname": "INFO", "message": "Executing <Task(BashOperator):
my_task> on 2021-03-23T07:53:28.535261+00:00", "created": 1616486069.611475,
"dag_id": "my_dag", "task_id": "my_task", "execution_date":
"2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069531250688}
{"asctime": "2021-03-23 09:54:29,616", "filename":
"standard_task_runner.py", "lineno": 52, "levelname": "INFO", "message":
"Started process 2610 to run task", "created": 1616486069.6161618, "dag_id":
"my_dag", "task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261",
"try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069531250688}
{"asctime": "2021-03-23 09:54:29,622", "filename":
"standard_task_runner.py", "lineno": 76, "levelname": "INFO", "message":
"Running: ['airflow', 'tasks', 'run', 'my_dag', 'my_task',
'2021-03-23T07:53:28.535261+00:00', '--job-id', '169433', '--pool',
'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/dags/my_dag.py',
'--cfg-path', '/tmp/tmp09133rp4', '--error-file', '/tmp/tmpyr5c5hkv']",
"created": 1616486069.622188, "dag_id": "my_dag", "task_id": "my_task",
"execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069531250688}
{"asctime": "2021-03-23 09:54:29,623", "filename":
"standard_task_runner.py", "lineno": 77, "levelname": "INFO", "message": "Job
169433: Subtask my_task", "created": 1616486069.6236992, "dag_id": "my_dag",
"task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261",
"try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069531250688}
{"asctime": "2021-03-23 09:54:29,769", "filename": "logging_mixin.py",
"lineno": 104, "levelname": "INFO", "message": "Running <TaskInstance:
my_dag.my_task 2021-03-23T07:53:28.535261+00:00 [running]> on host
airflow.host", "created": 1616486069.7698724, "dag_id": "my_dag", "task_id":
"my_task", "execution_date": "2021_03_23T07_53_28_535261", "try_number": "1",
"log_id": "my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069723135744}
{"asctime": "2021-03-23 09:54:29,869", "filename": "taskinstance.py",
"lineno": 1257, "levelname": "INFO", "message": "Exporting the following env
vars:\nAIRFLOW_CTX_DAG_OWNER=Teamplatform\nAIRFLOW_CTX_DAG_ID=my_dag\nAIRFLOW_CTX_TASK_ID=my_task\nAIRFLOW_CTX_EXECUTION_DATE=2021-03-23T07:53:28.535261+00:00\nAIRFLOW_CTX_DAG_RUN_ID=scheduled__2021-03-23T07:53:28.535261+00:00",
"created": 1616486069.8690863, "dag_id": "my_dag", "task_id": "my_task",
"execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069723135744}
{"asctime": "2021-03-23 09:54:29,870", "filename": "bash.py", "lineno": 135,
"levelname": "INFO", "message": "Tmp dir root location: \n /tmp", "created":
1616486069.8703568, "dag_id": "my_dag", "task_id": "my_task", "execution_date":
"2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069723135744}
{"asctime": "2021-03-23 09:54:29,871", "filename": "bash.py", "lineno": 158,
"levelname": "INFO", "message": "Running command: /wrong/path/to/zabbix_sender
-z zabbix.host -s airflow.host -k airflow.scheduler.delay -o 61", "created":
1616486069.871164, "dag_id": "my_dag", "task_id": "my_task", "execution_date":
"2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069723135744}
{"asctime": "2021-03-23 09:54:29,880", "filename": "bash.py", "lineno": 169,
"levelname": "INFO", "message": "Output:", "created": 1616486069.8801942,
"dag_id": "my_dag", "task_id": "my_task", "execution_date":
"2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069723135744}
{"asctime": "2021-03-23 09:54:29,882", "filename": "bash.py", "lineno": 173,
"levelname": "INFO", "message": "bash: /wrong/path/to/zabbix_sender: No such
file or directory", "created": 1616486069.8820558, "dag_id": "my_dag",
"task_id": "my_task", "execution_date": "2021_03_23T07_53_28_535261",
"try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069723135744}
{"asctime": "2021-03-23 09:54:29,882", "filename": "bash.py", "lineno": 177,
"levelname": "INFO", "message": "Command exited with return code 127",
"created": 1616486069.882363, "dag_id": "my_dag", "task_id": "my_task",
"execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069723135744}
{"asctime": "2021-03-23 09:54:29,899", "filename": "taskinstance.py",
"lineno": 1455, "levelname": "ERROR", "message": "Bash command failed. The
command returned a non-zero exit code.", "created": 1616486069.8990078,
"dag_id": "my_dag", "task_id": "my_task", "execution_date":
"2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069723135744}
{"asctime": "2021-03-23 09:54:29,901", "filename": "taskinstance.py",
"lineno": 1503, "levelname": "INFO", "message": "Marking task as FAILED.
dag_id=my_dag, task_id=my_task, execution_date=20210323T075328,
start_date=20210323T075429, end_date=20210323T075429", "created":
1616486069.9011128, "dag_id": "my_dag", "task_id": "my_task", "execution_date":
"2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069723135744}
{"asctime": "2021-03-23 09:54:29,961", "filename": "local_task_job.py",
"lineno": 146, "levelname": "INFO", "message": "Task exited with return code
1", "created": 1616486069.9613628, "dag_id": "my_dag", "task_id": "my_task",
"execution_date": "2021_03_23T07_53_28_535261", "try_number": "1", "log_id":
"my_dag-my_task-2021-03-23T07:53:28.535261+00:00-1", "offset":
1616486069531250688}
end_of_log
```
In that case the bash command output is still logged at INFO level, so it is
less of a problem than the case described above. Either way, it would be good
to have this information in the JSON logs as well.
Our logging configuration:
```
[logging]
base_log_folder = <path to log folder>
remote_logging = True
remote_log_conn_id =
google_key_path =
remote_base_log_folder =
encrypt_s3_logs = False
logging_level = INFO
fab_logging_level = WARN
logging_config_class =
colored_console_log = True
colored_log_format = [%(blue)s%(asctime)s%(reset)s]
{{%(blue)s%(filename)s:%(reset)s%(lineno)d}}
%(log_color)s%(levelname)s%(reset)s - %(log_color)s%(message)s%(reset)s
colored_formatter_class =
airflow.utils.log.colored_log.CustomTTYColoredFormatter
log_format = [%(asctime)s] {{%(filename)s:%(lineno)d}} %(levelname)s -
%(message)s
simple_log_format = %(asctime)s %(levelname)s - %(message)s
task_log_prefix_template =
log_filename_template = run-{{ ti.dag_id }}.log
log_processor_filename_template = {{ filename }}.log
dag_processor_manager_log_location = <path to log file>
task_log_reader = task
extra_loggers =
[elasticsearch]
host = <elasticsearch connection string>
log_id_template =
{{ti.dag_id}}-{{ti.task_id}}-{{execution_date}}-{{try_number}}
end_of_log_mark = end_of_log
frontend =
write_stdout = False
json_format = True
json_fields = asctime, filename, lineno, levelname, message, created
```