patgarz opened a new issue, #25964:
URL: https://github.com/apache/airflow/issues/25964

   ### Apache Airflow version
   
   Other Airflow 2 version
   
   ### What happened
   
   We use `on_failure_callback` to trigger alerts when a TaskInstance 
fails. The exception details, including the stack trace, were available in 
Airflow 1 (1.10.14), but after upgrading to Airflow 2 (2.2.5) this no longer 
works. For example, in Airflow 1 we were able to do something like this:
   
   ```py
   import traceback
   
   ...
   
   exception = context.get('exception')
   formatted_exception = ''.join(
      traceback.format_exception(etype=type(exception), 
        value=exception, tb=exception.__traceback__
      )
   ).strip()
   ```
   
   However, when we attempt to do this in Airflow 2, this produces an error:
   
   ```text
   {taskinstance.py:1607} ERROR - Error when executing on_failure_callback
   Traceback (most recent call last):
     File 
"/app/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 
1605, in _run_finished_callback
       task.on_failure_callback(context)
     File "[SCRUBBED]", line 381, in send_error_alert
       formatted_exception = 
''.join(traceback.format_exception(etype=type(exception), value=exception, 
tb=exception.__traceback__)).strip()
   AttributeError: 'str' object has no attribute '__traceback__'
   ```
   
   In this case `context['exception']` is a string, but that is not 
consistent across tests: this one comes from an `EmrCreateOperator`, whereas 
the tests in the sections below come from a `PythonOperator`. What is 
consistent, though, is that nothing besides the error description text seems 
to be accessible.
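
Since the callback may receive either a string or an exception object, a 
defensive formatter avoids the `AttributeError` shown above. This is only a 
minimal sketch: `format_failure` is a hypothetical helper, not part of 
Airflow, and it cannot recover a traceback that was never passed in.

```python
import traceback


def format_failure(exception):
    """Return printable text for whatever the callback received.

    Hypothetical helper for illustration; not an Airflow API.
    """
    if isinstance(exception, BaseException):
        # Positional arguments work on every Python 3 version; the `etype`
        # keyword stopped being accepted in Python 3.10.
        return "".join(
            traceback.format_exception(
                type(exception), exception, exception.__traceback__
            )
        ).strip()
    # Strings, None, or anything else: fall back to the plain text.
    return str(exception)
```

Inside the callback this would be called as 
`format_failure(context.get('exception'))`. When `__traceback__` is `None`, 
the result is just the error type and message.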
   
   ### What you think should happen instead
   
   Testing in base Python:
   
   ```python
   import logging
   
   try:
       print(undefined_var)
   except BaseException as e:
       logging.exception(e)
   ```
   produces output that includes not only the error message but also the full 
traceback and the error type:
   
   ```text
   ERROR:root:name 'undefined_var' is not defined
   Traceback (most recent call last):
     File "<stdin>", line 2, in <module>
   NameError: name 'undefined_var' is not defined
   ```
   
   ### How to reproduce
   
   DAG Code:
   
   ```python
   from datetime import datetime, timedelta
   
   from airflow import DAG
   from airflow.models import TaskInstance
   from airflow.operators.python import PythonOperator
   # or .python_operator for Airflow 1
   
   def _alarm(context):
       import logging
       import traceback
   
   
       logging.info("Task instance on_failure")
       traceback.print_exc()
       exception = context.get('exception')
       logging.info(f"exception type :: {type(exception)}")
       logging.info(f"exception :: {exception}")
       logging.exception(exception)
       logging.info(f"{exception.__traceback__}")
   
   
   default_args = {
       "owner": "xxxx",
       "email_on_failure": False,
       #"on_failure_callback": _alarm,
   }
   
   def bad_python():
   
       print(undefined_var)
   
   with DAG(
       dag_id="failure_callback_testing",
       start_date=datetime(2021, 9, 7),
       schedule_interval=None,
       default_args=default_args,
       catchup=False,
       dagrun_timeout=timedelta(seconds=15),
   ) as dag:
   
       will_fail = PythonOperator(
           task_id="will_fail",
           python_callable=bad_python,
           on_failure_callback=_alarm,
       )
   will_fail
   ```
   
   produces this output in Airflow 2:
   
   ```text
   [2022-08-25, 14:29:39 EDT] {taskinstance.py:1774} ERROR - Task failed with 
exception
   Traceback (most recent call last):
     File 
"/app/.local/lib/python3.8/site-packages/airflow/operators/python.py", line 
174, in execute
       return_value = self.execute_callable()
     File 
"/app/.local/lib/python3.8/site-packages/airflow/operators/python.py", line 
188, in execute_callable
       return self.python_callable(*self.op_args, **self.op_kwargs)
     File "[SCRUBBED]", line 28, in bad_python
       print(undefined_var)
   NameError: name 'undefined_var' is not defined
   [2022-08-25, 14:29:39 EDT] {taskinstance.py:1278} INFO - Marking task as 
FAILED. dag_id=failure_callback_testing, task_id=will_fail, 
execution_date=20220825T182935, start_date=20220825T182939, 
end_date=20220825T182939
   [2022-08-25, 14:29:39 EDT] {standard_task_runner.py:93} ERROR - Failed to 
execute job 79590 for task will_fail (name 'undefined_var' is not defined; 
29411)
   [2022-08-25, 14:29:39 EDT] {local_task_job.py:154} INFO - Task exited with 
return code 1
   [2022-08-25, 14:29:39 EDT] {test_failure_callback.py:12} INFO - Task 
instance on_failure
   [2022-08-25, 14:29:39 EDT] {logging_mixin.py:109} WARNING - NoneType: None
   [2022-08-25, 14:29:39 EDT] {test_failure_callback.py:15} INFO - exception 
type :: <class 'NameError'>
   [2022-08-25, 14:29:39 EDT] {test_failure_callback.py:16} ERROR - name 
'undefined_var' is not defined
   NoneType: None
   [2022-08-25, 14:29:39 EDT] {test_failure_callback.py:17} INFO - None
   [2022-08-25, 14:29:40 EDT] {local_task_job.py:264} INFO - 0 downstream tasks 
scheduled from follow-on schedule check
   ```
   
   but prints the expected output in Airflow 1:
   
   ```text
   [2022-08-25 18:29:30,313] {taskinstance.py:1150} ERROR - name 
'undefined_var' is not defined
   Traceback (most recent call last):
     File 
"/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line 
984, in _run_raw_task
       result = task_copy.execute(context=context)
     File 
"/usr/local/lib/python3.8/dist-packages/airflow/operators/python_operator.py", 
line 113, in execute
       return_value = self.execute_callable()
     File 
"/usr/local/lib/python3.8/dist-packages/airflow/operators/python_operator.py", 
line 118, in execute_callable
       return self.python_callable(*self.op_args, **self.op_kwargs)
     File "[SCRUBBED]", line 29, in bad_python
       print(undefined_var)
   NameError: name 'undefined_var' is not defined
   [2022-08-25 18:29:30,315] {taskinstance.py:1187} INFO - Marking task as 
FAILED. dag_id=failure_callback_testing, task_id=will_fail, 
execution_date=20220825T182922, start_date=20220825T182929, 
end_date=20220825T182930
   [2022-08-25 18:29:30,315] {test_failure_callback.py:12} INFO - Task instance 
on_failure
   [2022-08-25 18:29:30,315] {logging_mixin.py:112} WARNING - Traceback (most 
recent call last):
   [2022-08-25 18:29:30,315] {logging_mixin.py:112} WARNING -   File 
"/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line 
984, in _run_raw_task
       result = task_copy.execute(context=context)
   [2022-08-25 18:29:30,315] {logging_mixin.py:112} WARNING -   File 
"/usr/local/lib/python3.8/dist-packages/airflow/operators/python_operator.py", 
line 113, in execute
       return_value = self.execute_callable()
   [2022-08-25 18:29:30,316] {logging_mixin.py:112} WARNING -   File 
"/usr/local/lib/python3.8/dist-packages/airflow/operators/python_operator.py", 
line 118, in execute_callable
       return self.python_callable(*self.op_args, **self.op_kwargs)
   [2022-08-25 18:29:30,316] {logging_mixin.py:112} WARNING -   File 
"[SCRUBBED]", line 29, in bad_python
       print(undefined_var)
   [2022-08-25 18:29:30,316] {logging_mixin.py:112} WARNING - NameError: name 
'undefined_var' is not defined
   [2022-08-25 18:29:30,316] {test_failure_callback.py:15} INFO - exception 
type :: <class 'NameError'>
   [2022-08-25 18:29:30,316] {test_failure_callback.py:16} INFO - exception :: 
name 'undefined_var' is not defined
   [2022-08-25 18:29:30,316] {test_failure_callback.py:17} ERROR - name 
'undefined_var' is not defined
   Traceback (most recent call last):
     File 
"/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line 
984, in _run_raw_task
       result = task_copy.execute(context=context)
     File 
"/usr/local/lib/python3.8/dist-packages/airflow/operators/python_operator.py", 
line 113, in execute
       return_value = self.execute_callable()
     File 
"/usr/local/lib/python3.8/dist-packages/airflow/operators/python_operator.py", 
line 118, in execute_callable
       return self.python_callable(*self.op_args, **self.op_kwargs)
     File "[SCRUBBED]", line 29, in bad_python
       print(undefined_var)
   NameError: name 'undefined_var' is not defined
   [2022-08-25 18:29:30,316] {test_failure_callback.py:18} INFO - <traceback 
object at 0x7f18b69e1100>
   [2022-08-25 18:29:34,921] {local_task_job.py:102} INFO - Task exited with 
return code 1
   ```
   
   ### Operating System
   
   Ubuntu 18.04.6
   
   ### Versions of Apache Airflow Providers
   
   ```text
   apache-airflow-providers-amazon==3.2.0
   apache-airflow-providers-celery==2.1.3
   apache-airflow-providers-datadog==2.0.4
   apache-airflow-providers-ftp==2.1.2
   apache-airflow-providers-http==2.1.2
   apache-airflow-providers-imap==2.2.3
   apache-airflow-providers-postgres==4.1.0
   apache-airflow-providers-redis==2.0.4
   apache-airflow-providers-slack==4.2.3
   apache-airflow-providers-snowflake==2.6.0
   apache-airflow-providers-sqlite==2.1.3
   apache-airflow-providers-ssh==2.4.3
   ```
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   * Pickling is disabled (default setting)
   * PostgreSQL backend (12.8)
   * LocalExecutor
   
   ### Anything else
   
   This worked in Airflow 1. I strongly suspect it's related to pickling being 
disabled, but as I'm not a Python person I'm not entirely sure what that 
means, so I'm not confident that's the cause.
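
For context on the pickling suspicion: in plain Python, an exception that is 
pickled and unpickled keeps its type and message but loses `__traceback__`, 
which resembles the Airflow 2 output above. The sketch below only shows that 
standard-library behaviour. Whether Airflow 2 actually passes the exception 
through such a step is an assumption and is not verified here. `roundtrip` is 
a hypothetical helper.

```python
import pickle


def roundtrip(exc):
    """Serialize and deserialize an exception, as a process boundary might."""
    return pickle.loads(pickle.dumps(exc))


try:
    # Same error type and message as the failing task in the DAG above.
    raise NameError("name 'undefined_var' is not defined")
except NameError as e:
    original = e
    restored = roundtrip(e)

print(type(restored))             # type survives the round trip
print(restored)                   # message survives the round trip
print(original.__traceback__)     # a traceback object
print(restored.__traceback__)     # None: tracebacks are not pickled
```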
   
   I would be willing to submit a PR, but:
   
   1. I wouldn't really know where to start, and 
   2. My org has fairly strict OSS contribution guidelines, and I'm not sure 
how arduous the approval process would be.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

