patgarz opened a new issue, #25964:
URL: https://github.com/apache/airflow/issues/25964
### Apache Airflow version
Other Airflow 2 version
### What happened
We use `on_failure_callback` to trigger alerts when a TaskInstance fails.
The exception details, including the stack trace, were available in Airflow 1
(1.10.14), but after upgrading to Airflow 2 (2.2.5) this no longer works.
For example, in Airflow 1 we were able to do something like this:
```py
import traceback
...
exception = context.get('exception')
formatted_exception = ''.join(
    traceback.format_exception(
        etype=type(exception), value=exception, tb=exception.__traceback__
    )
).strip()
```
However, when we attempt to do this in Airflow 2, this produces an error:
```text
{taskinstance.py:1607} ERROR - Error when executing on_failure_callback
Traceback (most recent call last):
  File "/app/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1605, in _run_finished_callback
    task.on_failure_callback(context)
  File "[SCRUBBED]", line 381, in send_error_alert
    formatted_exception = ''.join(traceback.format_exception(etype=type(exception), value=exception, tb=exception.__traceback__)).strip()
AttributeError: 'str' object has no attribute '__traceback__'
```
In this case `context['exception']` is a string, but this is not consistent
across tests: this failure comes from an `EmrCreateOperator`, whereas the
tests in the sections below use a `PythonOperator`. What is consistent,
though, is that nothing besides the error description text seems to be
accessible.
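
As a stopgap, the callback can at least avoid the `AttributeError` by checking what it was handed before touching `__traceback__`. This is a minimal sketch, not a fix for the missing trace; `format_callback_exception` is a hypothetical helper name, and the three cases it handles (no value, a real exception, anything else such as a string) are assumptions based only on what is described above.

```python
import traceback


def format_callback_exception(exception):
    """Return printable text for whatever ``context['exception']`` holds."""
    if exception is None:
        return "<no exception in context>"
    if isinstance(exception, BaseException):
        # Positional arguments are accepted on every Python 3 release; the
        # ``etype`` keyword is no longer accepted from Python 3.10 on.
        lines = traceback.format_exception(
            type(exception), exception, exception.__traceback__
        )
        return "".join(lines).strip()
    # Anything else (for example a plain string) is returned as text.
    return str(exception)
```

In the callback this would be called as `format_callback_exception(context.get('exception'))`. When the exception carries no traceback, the result is only the final `ErrorType: message` line.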
### What you think should happen instead
Testing in base Python:
```python
import logging

try:
    print(undefined_var)
except BaseException as e:
    logging.exception(e)
```
produces output that includes not only the error message but also the full
traceback and the error type:
```text
ERROR:root:name 'undefined_var' is not defined
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
NameError: name 'undefined_var' is not defined
```
### How to reproduce
DAG Code:
```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.models import TaskInstance
from airflow.operators.python import PythonOperator
# or .python_operator for Airflow 1


def _alarm(context):
    import logging
    import traceback

    logging.info("Task instance on_failure")
    traceback.print_exc()
    exception = context.get('exception')
    logging.info(f"exception type :: {type(exception)}")
    logging.info(f"exception :: {exception}")
    logging.exception(exception)
    logging.info(f"{exception.__traceback__}")


default_args = {
    "owner": "xxxx",
    "email_on_failure": False,
    # "on_failure_callback": _alarm,
}


def bad_python():
    print(undefined_var)


with DAG(
    dag_id="failure_callback_testing",
    start_date=datetime(2021, 9, 7),
    schedule_interval=None,
    default_args=default_args,
    catchup=False,
    dagrun_timeout=timedelta(seconds=15),
) as dag:
    will_fail = PythonOperator(
        task_id="will_fail",
        python_callable=bad_python,
        on_failure_callback=_alarm,
    )
    will_fail
```
produces the following output in Airflow 2:
```text
[2022-08-25, 14:29:39 EDT] {taskinstance.py:1774} ERROR - Task failed with
exception
Traceback (most recent call last):
File
"/app/.local/lib/python3.8/site-packages/airflow/operators/python.py", line
174, in execute
return_value = self.execute_callable()
File
"/app/.local/lib/python3.8/site-packages/airflow/operators/python.py", line
188, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "[SCRUBBED]", line 28, in bad_python
print(undefined_var)
NameError: name 'undefined_var' is not defined
[2022-08-25, 14:29:39 EDT] {taskinstance.py:1278} INFO - Marking task as
FAILED. dag_id=failure_callback_testing, task_id=will_fail,
execution_date=20220825T182935, start_date=20220825T182939,
end_date=20220825T182939
[2022-08-25, 14:29:39 EDT] {standard_task_runner.py:93} ERROR - Failed to
execute job 79590 for task will_fail (name 'undefined_var' is not defined;
29411)
[2022-08-25, 14:29:39 EDT] {local_task_job.py:154} INFO - Task exited with
return code 1
[2022-08-25, 14:29:39 EDT] {test_failure_callback.py:12} INFO - Task
instance on_failure
[2022-08-25, 14:29:39 EDT] {logging_mixin.py:109} WARNING - NoneType: None
[2022-08-25, 14:29:39 EDT] {test_failure_callback.py:15} INFO - exception
type :: <class 'NameError'>
[2022-08-25, 14:29:39 EDT] {test_failure_callback.py:16} ERROR - name
'undefined_var' is not defined
NoneType: None
[2022-08-25, 14:29:39 EDT] {test_failure_callback.py:17} INFO - None
[2022-08-25, 14:29:40 EDT] {local_task_job.py:264} INFO - 0 downstream tasks
scheduled from follow-on schedule check
```
but prints the expected output in Airflow 1:
```text
[2022-08-25 18:29:30,313] {taskinstance.py:1150} ERROR - name
'undefined_var' is not defined
Traceback (most recent call last):
File
"/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line
984, in _run_raw_task
result = task_copy.execute(context=context)
File
"/usr/local/lib/python3.8/dist-packages/airflow/operators/python_operator.py",
line 113, in execute
return_value = self.execute_callable()
File
"/usr/local/lib/python3.8/dist-packages/airflow/operators/python_operator.py",
line 118, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "[SCRUBBED]", line 29, in bad_python
print(undefined_var)
NameError: name 'undefined_var' is not defined
[2022-08-25 18:29:30,315] {taskinstance.py:1187} INFO - Marking task as
FAILED. dag_id=failure_callback_testing, task_id=will_fail,
execution_date=20220825T182922, start_date=20220825T182929,
end_date=20220825T182930
[2022-08-25 18:29:30,315] {test_failure_callback.py:12} INFO - Task instance
on_failure
[2022-08-25 18:29:30,315] {logging_mixin.py:112} WARNING - Traceback (most
recent call last):
[2022-08-25 18:29:30,315] {logging_mixin.py:112} WARNING - File
"/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line
984, in _run_raw_task
result = task_copy.execute(context=context)
[2022-08-25 18:29:30,315] {logging_mixin.py:112} WARNING - File
"/usr/local/lib/python3.8/dist-packages/airflow/operators/python_operator.py",
line 113, in execute
return_value = self.execute_callable()
[2022-08-25 18:29:30,316] {logging_mixin.py:112} WARNING - File
"/usr/local/lib/python3.8/dist-packages/airflow/operators/python_operator.py",
line 118, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
[2022-08-25 18:29:30,316] {logging_mixin.py:112} WARNING - File
"[SCRUBBED]", line 29, in bad_python
print(undefined_var)
[2022-08-25 18:29:30,316] {logging_mixin.py:112} WARNING - NameError: name
'undefined_var' is not defined
[2022-08-25 18:29:30,316] {test_failure_callback.py:15} INFO - exception
type :: <class 'NameError'>
[2022-08-25 18:29:30,316] {test_failure_callback.py:16} INFO - exception ::
name 'undefined_var' is not defined
[2022-08-25 18:29:30,316] {test_failure_callback.py:17} ERROR - name
'undefined_var' is not defined
Traceback (most recent call last):
File
"/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line
984, in _run_raw_task
result = task_copy.execute(context=context)
File
"/usr/local/lib/python3.8/dist-packages/airflow/operators/python_operator.py",
line 113, in execute
return_value = self.execute_callable()
File
"/usr/local/lib/python3.8/dist-packages/airflow/operators/python_operator.py",
line 118, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "[SCRUBBED]", line 29, in bad_python
print(undefined_var)
NameError: name 'undefined_var' is not defined
[2022-08-25 18:29:30,316] {test_failure_callback.py:18} INFO - <traceback
object at 0x7f18b69e1100>
[2022-08-25 18:29:34,921] {local_task_job.py:102} INFO - Task exited with
return code 1
```
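
The `NoneType: None` lines in the Airflow 2 output are what the standard library emits when asked to format "the current exception" while none is being handled. The sketch below reproduces that in plain Python. It assumes nothing about Airflow internals, and the two function names are made up for illustration.

```python
import traceback


def format_outside_handler():
    # No exception is being handled here, so there is nothing to format.
    return traceback.format_exc()


def format_inside_handler():
    try:
        raise NameError("name 'undefined_var' is not defined")
    except NameError:
        # Inside the except block the active exception is available.
        return traceback.format_exc()
```

`format_outside_handler()` returns the `NoneType: None` text, while `format_inside_handler()` returns a full traceback. That would be consistent with the Airflow 2 callback running outside the `except` block that caught the task error, though the logs here do not confirm it.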
### Operating System
Ubuntu 18.04.6
### Versions of Apache Airflow Providers
```text
apache-airflow-providers-amazon==3.2.0
apache-airflow-providers-celery==2.1.3
apache-airflow-providers-datadog==2.0.4
apache-airflow-providers-ftp==2.1.2
apache-airflow-providers-http==2.1.2
apache-airflow-providers-imap==2.2.3
apache-airflow-providers-postgres==4.1.0
apache-airflow-providers-redis==2.0.4
apache-airflow-providers-slack==4.2.3
apache-airflow-providers-snowflake==2.6.0
apache-airflow-providers-sqlite==2.1.3
apache-airflow-providers-ssh==2.4.3
```
### Deployment
Other Docker-based deployment
### Deployment details
* Pickling is disabled (default setting)
* PostgreSQL backend (12.8)
* LocalExecutor
### Anything else
This worked in Airflow 1. I have a strong suspicion it's related to pickling
being disabled, but not being a Python person, I'm not entirely sure what that
means, so I'm not confident that's the issue.
Would be willing to submit a PR but unfortunately:
1. I wouldn't really know where to start, and
2. My org unfortunately has fairly strict OSS contribution guidelines and
I'm not sure how arduous the process to get something approved would be
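
On the pickling suspicion above: in plain Python, an exception that is pickled and unpickled keeps its type and message but loses `__traceback__`, which matches the `PythonOperator` symptoms. The sketch below only demonstrates that language behavior. Whether Airflow 2 passes the exception through such a round trip before calling the callback is an assumption, not something this report establishes, and `round_trip` is a hypothetical helper.

```python
import pickle


def round_trip(exc):
    """Pickle and unpickle an exception, as when it crosses a process boundary."""
    return pickle.loads(pickle.dumps(exc))


try:
    print(undefined_var)  # deliberately undefined, as in the repro above
except NameError as err:
    original = err  # keep a reference; ``err`` is unbound after the block

restored = round_trip(original)

print(type(restored).__name__, restored)   # type and message survive
print(original.__traceback__ is not None)  # the original carries a traceback
print(restored.__traceback__ is None)      # the restored copy does not
```

If this is the mechanism, a callback can only recover the full trace if it is captured as text before the exception leaves the process that raised it.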
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)