manipatnam opened a new issue, #67451:
URL: https://github.com/apache/airflow/issues/67451
### Under which category would you file this issue?
Airflow Core
### Apache Airflow version
3.2.1
### What happened and how to reproduce it?
When fetching task instances with `order_by=rendered_map_index` via:
```
GET /api/v2/dags/{dag_id}/dagRuns/{run_id}/taskInstances?order_by=rendered_map_index
```
the results appear to be in correct order, but **retried mapped task
instances are silently pushed off the first page** and cannot be seen without
manual pagination.
## How to reproduce
1. Create a DAG with a mapped task that does **not** set
`map_index_template` (the default)
2. Run it, then **retry some of the mapped task instances**
3. Call:
```
GET /api/v2/dags/{dag_id}/dagRuns/{run_id}/taskInstances
?task_id={mapped_task_id}
&order_by=rendered_map_index
&limit=50&offset=0
```
4. Observe: retried `map_index` values are missing from the response even
though `total_entries` shows they exist
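A minimal sketch of how the gap in step 4 could be checked programmatically. It assumes the response body carries the task instances under a `task_instances` key next to `total_entries`, each with a `map_index` field. The helper name `missing_from_first_page` and the sample payload are hypothetical and hand-written for illustration, not captured output.

```python
def missing_from_first_page(payload: dict, limit: int) -> list[int]:
    """Return map_index values expected on the first page but absent from it."""
    seen = {ti["map_index"] for ti in payload["task_instances"]}
    # With a correct numeric order, page one holds map_index 0..limit-1.
    expected = range(min(limit, payload["total_entries"]))
    return [i for i in expected if i not in seen]


# Hand-written sample shaped like the response (assumed keys, not real output).
sample = {
    "total_entries": 6,
    "task_instances": [
        {"map_index": 0},
        {"map_index": 1},
        {"map_index": 3},
        {"map_index": 4},
    ],
}

print(missing_from_first_page(sample, limit=4))  # [2]
```

A non-empty result on the first page, while `total_entries` still counts those instances, matches the symptom described above.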
**Confirm in the DB:**
```sql
-- Confirm all rendered_map_index are NULL
SELECT
    COUNT(*) AS total,
    COUNT(rendered_map_index) AS non_null,
    COUNT(*) - COUNT(rendered_map_index) AS null_count
FROM task_instance
WHERE dag_id = '...' AND run_id = '...' AND task_id = '...';

-- Retried TIs have larger UUIDs and sort to the end
SELECT
    map_index,
    row_number() OVER (ORDER BY rendered_map_index ASC, id ASC) AS actual_position,
    row_number() OVER (ORDER BY map_index ASC) AS expected_position
FROM task_instance
WHERE dag_id = '...' AND run_id = '...' AND task_id = '...';
```
Retried `map_index` values will show `actual_position` near the end (e.g.
78–86), while their `expected_position` is 3, 9, 10, etc.
### What you think should happen instead?
Sorting by `rendered_map_index` should produce a **stable, predictable
order** regardless of whether a task instance has been retried.
- When a DAG uses `map_index_template`, task instances should be ordered by
  their human-readable label (e.g. alphabetically by city name or region).
- When a DAG does **not** use `map_index_template` (the common case), task
  instances should be ordered numerically by their `map_index` (0, 1, 2, 3,
  4, ...), with no gaps caused by retries.
- The API response field `rendered_map_index` and the actual sort order
  should be consistent: if the response shows `"rendered_map_index": "2"`,
  that row must be sorted as if its value were `"2"`, not silently sorted by
  an internal UUID.
- Sorting by `rendered_map_index` over more than 10 mapped instances (that
  is, once `map_index` reaches 10) should return them in numeric order (0, 1,
  2, ..., 10, 11, ...), not lexicographic order (0, 1, 10, 11, ..., 2, 20, ...).
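To make the numeric-versus-lexicographic point concrete, here is a small Python sketch. The `natural_key` helper is hypothetical and only illustrates the expected ordering. It is not Airflow code, and a real fix would need the equivalent logic in SQL.

```python
def natural_key(label: str) -> tuple[int, int, str]:
    """Sort key: numeric labels by integer value, then other labels alphabetically."""
    if label.isdecimal():
        return (0, int(label), "")
    return (1, 0, label)


labels = [str(i) for i in range(12)]  # fallback labels for map_index 0..11

lexicographic = sorted(labels)             # plain string comparison
numeric = sorted(labels, key=natural_key)  # compare by integer value

print(lexicographic)  # ['0', '1', '10', '11', '2', '3', '4', '5', '6', '7', '8', '9']
print(numeric)        # ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11']
```

The two orders only diverge once a label with two digits exists, which is why the problem is invisible for small mapped tasks.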
### Operating System
Debian
### Deployment
Astronomer
### Apache Airflow Provider(s)
_No response_
### Versions of Apache Airflow Providers
_No response_
### Official Helm Chart version
Not Applicable
### Kubernetes Version
_No response_
### Helm Chart configuration
_No response_
### Docker Image customizations
_No response_
### Anything else?
### Root cause
`rendered_map_index` is defined as a `hybrid_property` on `TaskInstance` but
has **no `.expression` defined**:
```python
@hybrid_property
def rendered_map_index(self) -> str | None:
    if self._rendered_map_index is not None:
        return self._rendered_map_index
    if self.map_index >= 0:
        return str(self.map_index)  # Python only: SQL never sees this fallback
    return None
```
When `SortParam` resolves this at the class level for SQL, it gets the raw
DB column (`String(250)`). Since `rendered_map_index` is **only written to
the DB when the DAG author sets `map_index_template`** on the operator, it is
`NULL` for the vast majority of tasks.
With all values `NULL`, `ORDER BY rendered_map_index ASC` has no effect and
the actual sort falls through to the UUID tiebreaker (`id ASC`). UUID v7
encodes creation time, and retried task instances get new UUIDs at retry time
(later timestamps), so they sort to the **end of the result set** and fall
outside `LIMIT 50`.
The API response then calls the Python `hybrid_property` getter during
Pydantic serialization, which returns `str(map_index)` as a fallback. The
JSON therefore looks correct (showing `"rendered_map_index": "0"`, `"1"`,
`"3"`...) and completely hides the fact that `map_index=2` was skipped.
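The effect can be reproduced outside Airflow with a minimal sketch using Python's built-in `sqlite3`. The three-column table and the string ids are simplified stand-ins (assumptions) for the real `task_instance` schema and UUID v7 values. The second query, which adds `map_index` as a secondary sort key, is one possible direction for a fix and not a confirmed Airflow change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance ("
    "id TEXT PRIMARY KEY, map_index INTEGER, rendered_map_index TEXT)"
)
# ids stand in for time-ordered UUIDs: map_index 2 was retried,
# so it carries the newest (largest) id.
rows = [
    ("id-001", 0, None),
    ("id-002", 1, None),
    ("id-009", 2, None),
    ("id-004", 3, None),
]
conn.executemany("INSERT INTO task_instance VALUES (?, ?, ?)", rows)


def first_page(order_by: str, limit: int = 3) -> list[int]:
    """Return the map_index values on the first page for a given ORDER BY."""
    sql = f"SELECT map_index FROM task_instance ORDER BY {order_by} LIMIT ?"
    return [row[0] for row in conn.execute(sql, (limit,))]


# Ordering described in the root cause: all NULLs, so only id decides.
current = first_page("rendered_map_index ASC, id ASC")
# Hypothetical fallback: break ties on map_index before id.
fallback = first_page("rendered_map_index ASC, map_index ASC, id ASC")

print(current)   # [0, 1, 3]
print(fallback)  # [0, 1, 2]
```

This sketch only covers the case where every `rendered_map_index` is `NULL`. Mixed or templated labels, and the NULL placement rules of each database backend, would need separate handling.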
### Are you willing to submit PR?
- [x] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)