hterik opened a new issue #19983:
URL: https://github.com/apache/airflow/issues/19983


   ### Description
   
   Given a simple dag like this:
   ```py
   from airflow import DAG
   from airflow.utils.dates import days_ago
   from airflow.operators.bash import BashOperator
   
   with DAG("instantfail",
            description="instantfail",
            tags=["example"],
            catchup=False,
            schedule_interval=None,
            start_date=days_ago(2),
           ) as dag:
   
       BashOperator(bash_command="echo hello && false", task_id="b")
   ```
   
   The error log from the failing bash command, `false`, becomes incredibly big 
with lots of irrelevant information for the writer of the DAG, if that's not 
enough, the error is printed twice:
   ```
   [2021-12-02, 17:07:01 UTC] {subprocess.py:85} INFO - Output:
   [2021-12-02, 17:07:01 UTC] {subprocess.py:89} INFO - hello
   [2021-12-02, 17:07:01 UTC] {subprocess.py:93} INFO - Command exited with 
return code 1
   [2021-12-02, 17:07:01 UTC] {taskinstance.py:1703} ERROR - Task failed with 
exception
   Traceback (most recent call last):
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/models/taskinstance.py",
 line 1332, in _run_raw_task
       self._execute_task_with_callbacks(context)
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/models/taskinstance.py",
 line 1458, in _execute_task_with_callbacks
       result = self._execute_task(context, self.task)
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/models/taskinstance.py",
 line 1514, in _execute_task
       result = execute_callable(context=context)
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/operators/bash.py",
 line 187, in execute
       raise AirflowException(
   airflow.exceptions.AirflowException: Bash command failed. The command 
returned a non-zero exit code 1.
   [2021-12-02, 17:07:01 UTC] {taskinstance.py:1270} INFO - Marking task as 
FAILED. dag_id=instantfail, task_id=b, execution_date=20211202T160659, 
start_date=20211202T160701, end_date=20211202T160701
   [2021-12-02, 17:07:01 UTC] {standard_task_runner.py:88} ERROR - Failed to 
execute job 292 for task b
   Traceback (most recent call last):
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/task/task_runner/standard_task_runner.py",
 line 85, in _start_by_fork
       args.func(args, dag=self.dag)
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/cli/cli_parser.py",
 line 48, in command
       return func(*args, **kwargs)
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/utils/cli.py",
 line 92, in wrapper
       return f(*args, **kwargs)
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/cli/commands/task_command.py",
 line 292, in task_run
       _run_task_by_selected_method(args, dag, ti)
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/cli/commands/task_command.py",
 line 107, in _run_task_by_selected_method
       _run_raw_task(args, ti)
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/cli/commands/task_command.py",
 line 180, in _run_raw_task
       ti._run_raw_task(
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/utils/session.py",
 line 70, in wrapper
       return func(*args, session=session, **kwargs)
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/models/taskinstance.py",
 line 1332, in _run_raw_task
       self._execute_task_with_callbacks(context)
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/models/taskinstance.py",
 line 1458, in _execute_task_with_callbacks
       result = self._execute_task(context, self.task)
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/models/taskinstance.py",
 line 1514, in _execute_task
       result = execute_callable(context=context)
     File 
"/homedir/.local/share/virtualenvs/airflow-uWHa5xnK/lib/python3.9/site-packages/airflow/operators/bash.py",
 line 187, in execute
       raise AirflowException(
   airflow.exceptions.AirflowException: Bash command failed. The command 
returned a non-zero exit code 1.
   [2021-12-02, 17:07:01 UTC] {local_task_job.py:154} INFO - Task exited with 
return code 1
   [2021-12-02, 17:07:01 UTC] {local_task_job.py:264} INFO - 0 downstream tasks 
scheduled from follow-on schedule check
   ```
   Often operators can be be more advanced, especially PythonOperator and 
taskflow which frequently creates equally deep stack traces themselves, which 
is usually what you are interested in, but together it accumulates into several 
pages of error.
   
   I'm usually the guy who thinks a stack trace is the best error message and 
like the context it provides, but here i think it goes a bit too far.
   
   ------------------------------------------------------------
   
   1. Why is the error printed twice?
   2. The writer of a dag should be able to assume the execution path up to 
execute_callable() was sane. If i'm debugging a DAG or an Operator i don't want 
to debug Airflow task runner internals. I don't know how true this is but this 
should be the goal at least.
   
   Expected log output should only contain 
   ```
     File "airflow/operators/bash.py", line 187, in execute
       raise AirflowException(
   airflow.exceptions.AirflowException: Bash command failed. The command 
returned a non-zero exit code 1.
   ```
   
   Can `TaskInstance._execute_task` be made to catch the operators exception 
and rethrow as something else, that can be ignored at the end by 
`standard_task_runner`?
   
   Or is this something that simply should be solved by logging configuration?
   
   
   
   ### Use case/motivation
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to