kaxil opened a new pull request, #67932:
URL: https://github.com/apache/airflow/pull/67932
The worker-side walk that registers an operator's structured-output classes
for XCom deserialization (`_register_deserialization_allowed_classes`, reading
each operator's `output_type`) only registers the top-level declared type.
`iter_pydantic_models` walks the *annotation shape* (`Optional` / `Union` /
`list[...]`) but never recurses into a model's own fields, so a model nested
inside the declared type is never added to the per-process deserialization
allow-list.
With a model that nests another model:
```python
class SubQuestion(BaseModel): ...

class DecomposedQuestion(BaseModel):
    sub_questions: list[SubQuestion]

@task.llm(output_type=DecomposedQuestion)
def decompose(...): ...
```
a downstream task that emits the nested model to XCom (`return
decomposed.sub_questions` -> `list[SubQuestion]`) produces a value that fails
to deserialize when the consuming task resolves its input:
```
ImportError: ...SubQuestion was not found in allow list for deserialization
imports.
```
`DecomposedQuestion` is registered (it is the declared `output_type`), but
`SubQuestion`, reachable only through its field, is not.
## Fix
After yielding a model, push its field annotations onto the walk stack so
every reachable model is yielded and registered. The existing `seen` set makes
self-referential and mutually recursive model graphs terminate.
Behavior on the example above, exercising the real walk:
- before: allow-list = `{DecomposedQuestion}`; deserializing
  `list[SubQuestion]` raises the ImportError
- after: allow-list = `{DecomposedQuestion, SubQuestion}`; both hops
  deserialize
New unit tests cover field recursion (including container-typed fields) and
self-reference termination; the rest of the serde suite passes unchanged.
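
The recursion described in this section can be illustrated with a minimal, dependency-free sketch. It is not the code in this PR. `Model` is a hypothetical stand-in for Pydantic's `BaseModel`, `iter_models` is a hypothetical name for the walk, and `typing.get_type_hints` stands in for reading a Pydantic model's fields. It shows only the shape of the idea: yield a model, push its field annotations onto the stack, and let the `seen` set stop cycles.

```python
from typing import Any, Iterator, Optional, get_args, get_origin, get_type_hints


class Model:
    """Hypothetical stand-in for pydantic.BaseModel (keeps the sketch stdlib-only)."""


def iter_models(annotation: Any) -> Iterator[type]:
    """Yield every Model subclass reachable from an annotation, including via fields."""
    seen: set = set()
    stack = [annotation]
    while stack:
        current = stack.pop()
        is_plain_class = get_origin(current) is None and isinstance(current, type)
        if is_plain_class and issubclass(current, Model):
            if current in seen:
                continue  # already yielded; this is what makes cycles terminate
            seen.add(current)
            yield current
            # Field recursion: walk the model's own field annotations too.
            # String annotations are resolved against this module's names,
            # which is sufficient for a single-file sketch.
            stack.extend(get_type_hints(current, globalns=globals()).values())
        else:
            # Unwrap the annotation shape: Optional / Union / list[...] and so on.
            stack.extend(get_args(current))


class SubQuestion(Model):
    text: str


class DecomposedQuestion(Model):
    sub_questions: list[SubQuestion]


class Node(Model):
    children: "list[Node]"  # self-referential model


print(sorted(m.__name__ for m in iter_models(Optional[DecomposedQuestion])))
print([m.__name__ for m in iter_models(Node)])
```

Without the `stack.extend(get_type_hints(...))` line, the first call would yield only `DecomposedQuestion`, which mirrors the "before" behavior listed above.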
## Context
Surfaced while fixing the common.ai 10-K example DAGs (#67930). Those
examples now side-step it by pushing dicts (`serialize_output=True`), but the
underlying gap affects any DAG that passes a nested Pydantic model between
tasks, so it is worth fixing in core.