pankajkoti opened a new issue, #66416:
URL: https://github.com/apache/airflow/issues/66416
### Under which category would you file this issue?
Task SDK
### Apache Airflow version
3.2.1
### What happened and how to reproduce it?
When a task whose operator sets `overwrite_rtif_after_execution = True`
raises an exception during `execute()`, the task supervisor/finalize path
attempts to update the rendered template fields *after* the failure has already
been reported. The SDK then sends a request to the API server that is no longer
valid for the TI's state and gets back:
`AirflowRuntimeError: API_SERVER_ERROR: {'status_code': 404, 'message': 'Not Found', 'detail': {'detail': 'Not Found'}}`
This surfaces as a top-level error in the task log right after the original
`RuntimeError`, so the user sees two stacked tracebacks where they should see
only the original failure. It also appears to affect remote logging: on
retries, users do not see the remote logs for earlier failed attempts, possibly
because the upload to remote logging is aborted or never happens (unconfirmed).
This was originally reported by Cosmos users in
astronomer/astronomer-cosmos#2021 because the Cosmos local execution operator
opts in to `overwrite_rtif_after_execution = True` on Airflow 3.x to refresh
the rendered `compiled_sql` after the dbt invocation. However, the failure is
not Cosmos-specific: any operator that sets this flag and then raises will hit
the same path.
#62070 wrapped the `SetRenderedFields` call in `finalize()` with try/except
so the original task failure is not masked. #63705 then simplified the error
logging to avoid a `RecursionError` in the `structlog` JSON fallback when the
error context is logged. Even with both merged, on `3.2.1` we still see `Failed
to set rendered fields during finalization` followed by `AirflowRuntimeError:
API_SERVER_ERROR: 404 Not Found`. #63719 ("Only update RTIF for terminal task
states") is another attempted fix, but it is still in draft.
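To illustrate why the 404 is still visible after #62070, here is a minimal, self-contained sketch. It is not Airflow's actual code: `finalize`, `send_rendered_fields`, and `ApiServerError` are hypothetical stand-ins, and the rule that the server rejects the update once the TI is no longer running is an assumption made for the example. The point is only that a try/except keeps the second error from propagating, while the request is still sent and its failure is still logged.

```python
from __future__ import annotations


class ApiServerError(Exception):
    """Hypothetical stand-in for the SDK's AirflowRuntimeError."""


def send_rendered_fields(ti_state: str) -> None:
    # Assumption for this sketch: the server only accepts the update
    # while the task instance is still running.
    if ti_state != "running":
        raise ApiServerError("API_SERVER_ERROR: 404 Not Found")


def finalize(ti_state: str) -> list[str]:
    """Return the messages that would be logged during finalization."""
    messages: list[str] = []
    try:
        # The request is attempted regardless of the task's final state.
        send_rendered_fields(ti_state)
    except ApiServerError as err:
        # The error no longer propagates, but it still reaches the log.
        messages.append(
            f"Failed to set rendered fields during finalization: {err}"
        )
    return messages
```

In this sketch, `finalize("failed")` returns one logged message instead of raising, which mirrors the behavior reported on `3.2.1`.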
### Minimal reproduction (no Cosmos required)
```python
# dags/repro_rtif_finalize.py
from __future__ import annotations

import pendulum

from airflow.sdk import DAG
from airflow.sdk.bases.operator import BaseOperator


class FailingOverwriteRTIFOperator(BaseOperator):
    """Minimal operator that triggers the finalize-time RTIF update path."""

    template_fields = ("message",)
    overwrite_rtif_after_execution = True

    def __init__(self, *, message: str = "hello {{ ds }}", **kwargs):
        super().__init__(**kwargs)
        self.message = message

    def execute(self, context):
        # Simulate any runtime failure during execute (DB error, network, etc.)
        raise RuntimeError("Intentional failure to reproduce RTIF finalize bug")


with DAG(
    dag_id="repro_rtif_finalize",
    start_date=pendulum.datetime(2026, 1, 1, tz="UTC"),
    schedule=None,
    catchup=False,
):
    FailingOverwriteRTIFOperator(task_id="boom")
```
### How to reproduce
1. Drop the DAG above into a fresh Airflow 3.x environment (no special
   executor or logging configuration required).
2. Trigger `repro_rtif_finalize` once.
3. Look at the task log for the first attempt. You will see the intended
   `RuntimeError` from `execute()`, followed by:
   - on 3.1.0–3.1.7: a top-level error:
     `AirflowRuntimeError: API_SERVER_ERROR: {'status_code': 404, 'message': 'Not Found', 'detail': {'detail': 'Not Found'}}`
   - on 3.1.8+ (with #62070 + #63705 applied):
     `Failed to set rendered fields during finalization` ...
     `AirflowRuntimeError: API_SERVER_ERROR: 404 Not Found`
### What you think should happen instead?
A failing task whose operator declares `overwrite_rtif_after_execution =
True` should not produce a finalize-time `AirflowRuntimeError`.
Conceptually, rendered template fields should not be re-pushed to the API
server when the TI has already moved into a failure state for which that
endpoint is not valid. This matches the direction of #63719, though a better
solution may exist.
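As a rough illustration of the expected behavior, the decision could be expressed as a small guard evaluated before the update is sent. This is only a sketch under stated assumptions: `should_push_rendered_fields` and `FAILURE_STATES` are hypothetical names, and the exact set of states the endpoint rejects is not confirmed here.

```python
# Hypothetical state names; which states the endpoint actually rejects
# is an assumption for this sketch.
FAILURE_STATES = frozenset({"failed", "up_for_retry"})


def should_push_rendered_fields(
    overwrite_rtif_after_execution: bool, final_state: str
) -> bool:
    """Decide whether finalize should re-send rendered template fields."""
    if not overwrite_rtif_after_execution:
        # The operator never opted in, so there is nothing to refresh.
        return False
    # Skip the request when the task already ended in a failure state.
    return final_state not in FAILURE_STATES
```

With a guard like this, the failing operator from the reproduction would skip the request entirely, so the task log would contain only the original `RuntimeError`.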
### Operating System
_No response_
### Deployment
None
### Apache Airflow Provider(s)
_No response_
### Versions of Apache Airflow Providers
_No response_
### Official Helm Chart version
Not Applicable
### Kubernetes Version
_No response_
### Helm Chart configuration
_No response_
### Docker Image customizations
_No response_
### Anything else?
- Original Cosmos report with the full traceback:
astronomer/astronomer-cosmos#2021
- Related PRs: #62070 (merged, 3.1.8), #63705 (merged), #63719 (draft).
- Cosmos call site that opts in to the flag for context:
`cosmos/operators/local.py` `_override_rtif`
(`self.overwrite_rtif_after_execution = True` on Airflow 3.x) ->
https://github.com/astronomer/astronomer-cosmos/blob/d33115b69da5573b33123c310a5a7b6fbc02a364/cosmos/operators/local.py#L420.
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)