kaxil commented on PR #67189:
URL: https://github.com/apache/airflow/pull/67189#issuecomment-4503995033

   Folded into #67121 as part of the comprehensive LlamaIndex rewrite -- 
closing in favor of that.
   
   The example_llamaindex_rag.py file landed in apache/airflow:aip99-llamaindex 
at 82fa97ba0d with the following adjustments to match the rewritten operator 
API:
   
   - Uses the renamed `LlamaIndexEmbeddingOperator` / 
`LlamaIndexRetrievalOperator` classes (per Kaxil's r3267387604 -- generic 
`*Operator` names risked collision with future framework operators).
   - Drops the `documents="{{ ti.xcom_pull(...) }}"` Jinja templating -- 
`template_fields` was removed because `list[dict]` doesn't survive Jinja 
stringification, and templating document text would also expand tokens such as 
`{{ var.value.api_key }}` inside user documents (a secret-leak vector). The 
example binds `loader.output` directly instead.
   - LlamaIndex operators use the new `llamaindex_default` conn (was 
`pydanticai_default`); the synthesis-step `LLMOperator` keeps 
`pydanticai_default` because it's pydantic-ai-backed (different framework, 
intentional split documented in the module docstring).
   - Adds explicit `embed_model="text-embedding-3-small"` to each 
embedding/retrieval call -- the rewritten operator validates that `embed_model` 
is set (either on the operator, or via `extra["embed_model"]` on the 
connection).
   - Fixes the string-reference task chains, which weren't valid task 
dependencies (`load >> "build_index"` -> `load >> build_index`).
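
   To illustrate the templating bullet above, here is a minimal, stdlib-only 
sketch. It uses neither Jinja nor Airflow: `str()` stands in for default 
string rendering, and `toy_render` is a hypothetical regex-based renderer that 
only mimics a templated field being walked recursively. The document text and 
the `sk-demo-not-a-real-key` value are made up for the demo.

```python
import re

# Hypothetical payload, shaped like the list[dict] a loader task might return.
docs = [{"text": "Never paste {{ var.value.api_key }} into a ticket."}]

# 1) Stand-in for default (non-native) template rendering: the result is a str.
stringified = str(docs)
print(type(stringified).__name__)  # the downstream operator would get text, not list[dict]

# 2) Toy recursive renderer (NOT Jinja, NOT Airflow's implementation).
TOKEN = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def toy_render(value, context):
    """Expand {{ name }} tokens in every string nested inside value."""
    if isinstance(value, str):
        return TOKEN.sub(lambda m: str(context.get(m.group(1), "")), value)
    if isinstance(value, list):
        return [toy_render(item, context) for item in value]
    if isinstance(value, dict):
        return {key: toy_render(item, context) for key, item in value.items()}
    return value


context = {"var.value.api_key": "sk-demo-not-a-real-key"}  # made-up value
leaked = toy_render(docs, context)
print(leaked[0]["text"])  # the token inside the user document was expanded

# 3) Passing the object itself keeps the type and leaves the token untouched.
passed_directly = docs
```

   Step 3 is the behaviour the example relies on when it binds 
`loader.output` directly: the documents arrive as data and are never treated 
as template source.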
   
   Thanks for the example DAGs -- they're the load-bearing demos for the 
LlamaIndex RAG story.

