kaxil commented on PR #62816:
URL: https://github.com/apache/airflow/pull/62816#issuecomment-4018124877
## Code Review (updated PR)
Big improvement from the earlier iterations. The separate connection types
with `_get_provider_kwargs()` as the extension point is the right pattern.
### [HIGH] `model_id` no longer forwarded from operator to hook
`LLMOperator.llm_hook` changed from:
```python
return PydanticAIHook.for_connection(self.llm_conn_id,
model_id=self.model_id)
```
to:
```python
return PydanticAIHook.get_hook(self.llm_conn_id)
```
`get_hook()` (inherited from `BaseHook`) doesn't accept `model_id`. So if a
user sets `model_id="openai:gpt-5"` on `LLMOperator`, it's silently dropped —
the hook never sees it. The hook falls through to `extra.get("model", "")` and
either uses the connection's model or raises "No model specified."
The test confirms the regression:
```python
# Before: model_id forwarded
mock_hook_cls.for_connection.assert_called_once_with("my_llm",
model_id="openai:gpt-5")
# After: model_id dropped
mock_hook_cls.get_hook.assert_called_once_with("my_llm")
```
**Fix:** Either keep `for_connection()` (it was a clean API), or set
`model_id` on the hook after `get_hook()` returns:
```python
@cached_property
def llm_hook(self) -> PydanticAIHook:
hook = PydanticAIHook.get_hook(self.llm_conn_id)
hook.model_id = self.model_id
return hook
```
---
### [MEDIUM] TypeError fallback removed — defensive safety net gone
The base hook's `_provider_factory` lost the `try/except TypeError` fallback:
```python
# Before (defensive):
try:
return provider_cls(**kwargs)
except TypeError:
return infer_provider(provider_name)
# After (brittle):
return infer_provider_class(pname)(**_kwargs)
```
With the subclass architecture, this matters less because the base hook
should only handle standard providers. But if someone uses a generic
`pydanticai` connection with an exotic provider that rejects
`api_key`/`base_url`, it crashes instead of falling back gracefully. The
`infer_provider` import was also removed, so restoring this later requires
adding it back.
**Suggestion:** Keep the fallback as a safety net. It costs nothing and
prevents surprises.
---
### [MEDIUM] `conn_type` readability — `pydanticaiazure` vs
`pydanticai_azure`
The conn_types changed from `pydanticai_azure` to `pydanticaiazure` (same
for bedrock, vertex). This is hard to read. While `gcpbigquery` exists as a
precedent, `google_cloud_platform` also uses underscores. There's no strict
convention.
`pydanticai_azure` is more readable and consistent with `default_conn_name =
"pydanticai_azure_default"`. The mismatch between conn_type (`pydanticaiazure`)
and default_conn_name (`pydanticai_azure_default`) is confusing.
---
### [LOW] Bedrock/Vertex `get_ui_field_behaviour` placeholder text still
shows raw JSON
The Bedrock and Vertex hooks have `conn-fields` defined in provider.yaml
(proper form widgets), but their `get_ui_field_behaviour().placeholders` still
show raw JSON in the `extra` placeholder. Since the fields now have dedicated
form widgets, the `extra` placeholder is misleading — users should fill in the
individual fields, not paste JSON into extra.
---
### [LOW] Bedrock `_get_provider_kwargs` has `api_key` and `base_url` in
both signature and extra keys
The method receives `api_key` (from `conn.password`) and `base_url` (from
`conn.host`), but ignores them. Instead, `api_key` and `base_url` are read from
`extra`. The comment explains this, but the same logical field has two possible
sources. Since the Bedrock connection hides `host` and `password` in the UI
this is consistent — just worth documenting.
---
**Summary**: 1 High (model_id dropped), 2 Medium (TypeError fallback,
conn_type naming), 2 Low. The architecture is solid — the `model_id` regression
is the main blocker.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]