Evals will be part of it, since this will be built on top of PydanticAI,
which supports them.
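
As a rough illustration of what an eval for a SQL-generating operator could check, here is a minimal library-free sketch. It does not use the PydanticAI API; `EvalCase`, `evaluate`, and `fake_llm` are hypothetical names, and the model call is replaced by a stub.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One evaluation case: a prompt plus simple checks on the generated SQL."""
    prompt: str
    must_contain: list[str]
    must_not_contain: list[str]


def evaluate(generate_sql: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose generated SQL passes all checks."""
    passed = 0
    for case in cases:
        sql = generate_sql(case.prompt).lower()
        has_required = all(token in sql for token in case.must_contain)
        has_forbidden = any(token in sql for token in case.must_not_contain)
        if has_required and not has_forbidden:
            passed += 1
    return passed / len(cases)


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for the real model call.
    return "SELECT id FROM customers WHERE email IS NULL"


cases = [
    EvalCase(
        prompt="find customers with missing emails",
        must_contain=["select", "customers", "email"],
        must_not_contain=["drop", "delete"],
    ),
]
score = evaluate(fake_llm, cases)
```

Substring checks are only a placeholder here; an LLM-as-judge or execution-based check could be plugged in behind the same `evaluate` shape.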

On Mon, 29 Dec 2025 at 19:03, Giorgio Zoppi <[email protected]> wrote:

> Hey Pavan.
> If you are going to introduce this, have you thought about the evaluation
> framework?
> How do you evaluate the LLM operator?
>
> On Mon, Dec 29, 2025, 09:40 Pavankumar Gopidesu <[email protected]>
> wrote:
>
> > Thanks Jens and Jarek, agree on both points raised in comments.
> >
> > I am happy to defer embedding HITL to a separate AIP.
> >
> > To Jens:
> > Yes, it is planned in phases; our plan starts with provider-only
> > changes.
> >
> > Regards
> > Pavan
> >
> > On Sun, Dec 28, 2025 at 2:03 PM Jarek Potiuk <[email protected]> wrote:
> > >
> > > I also looked at it and I love it as well. I think of it as a missing
> > > abstraction between current Airflow users and current LLM app
> > > developers. I also proposed something a little bit bolder there, which
> > > I think shows the true potential of that approach.
> > > I added a comment in the doc, but I will copy it here for better
> > > visibility.
> > >
> > > ---
> > >
> > > After thinking quite a bit about the proposal, I actually love it and I
> > > think that should be the next frontier of making Airflow abstractions
> > > more approachable and usable by those who want to implement various
> > > patterns of interacting with LLMs.
> > >
> > > And I have a slightly different opinion than Jens regarding HITL. I see
> > > those common LLM operators as slightly "higher" level operators that
> > > might implement a set of common LLM-related patterns that are currently
> > > either difficult or impossible to express by putting things together
> > > via Dag and individual tasks. In this sense, I see value in the
> > > capability of making a HITL call-out for approval or selection from
> > > within such an operator, without completing the operator, and even
> > > running those "call-outs" more than once, actually even an unbounded
> > > number of times during a single operator's execution.
> > >
> > > Actually it's a great way for us to implement some "cyclicness" without
> > > breaking the "acyclic" property of our Dags (for now at least). Making
> > > a Dag "cyclic" is quite a dramatic change, and possibly we do not even
> > > have to do it, because the "cyclic" part can likely be encompassed
> > > within the specialized LLM operators. I can imagine an operator that
> > > performs LLM querying and refines the result via additional
> > > interactions with LLMs "internally", during a single operator's
> > > execution. And some of those iterations might result in a HITL
> > > "call-out", even multiple times during one execution.
> > >
> > > Also one more proposal I have here is to use an API similar to HITL (or
> > > maybe repurpose HITL for that) to report PROGRESS of such a task. This
> > > is a typical property of a good LLM task: it provides some feedback to
> > > the user. It might be HITL when it asks for something, but it might
> > > also be HOOTL (Human Outside Of The Loop), where the task simply
> > > reports its progress and allows the user to perform asynchronous
> > > actions based on that progress: for example abort the execution (to
> > > stop the Dag), mark it as "skipped" (to trigger the skip processing
> > > path), or mark it as "success" to simulate things being completed when
> > > they are not. While we already have those three "async" operations, we
> > > do not currently have "progress" targeted at the kind of actor who is
> > > also a HITL "actor": someone who is not interested in detailed logs,
> > > but rather wants to monitor progress and assess the quality of the
> > > output, even if it is just a partial output in the iterative process.
> > >
> > > I think that it will be easier and much more "surgical" (and applied in
> > > the right place) to embed this "iterative" feedback / progress than to
> > > change the "acyclic" property of our Dags.
> > >
> > > Also - this kind of Progress interface can also be used to publish the
> > > progress of "async" tasks, as the next step of [WIP] AIP-98: Add async
> > > support for PythonOperator in Airflow 3:
> > > https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-98%3A+Add+async+support+for+PythonOperator+in+Airflow+3
> > > that we discussed with David.
> > >
> > > J.
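
To make the "iterate inside one operator, with progress reports and repeated HITL call-outs" idea quoted above concrete, here is a minimal plain-Python sketch. It is not Airflow code: `RefinementLoop` and its three callbacks are hypothetical stand-ins for the LLM call, the HITL call-out, and the progress channel.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RefinementLoop:
    """Bounded refine/approve loop that would live inside a single task run."""
    generate: Callable[[str, list], str]        # (prompt, feedback so far) -> draft
    ask_human: Callable[[str], Optional[str]]   # draft -> feedback, or None to approve
    report_progress: Callable[[str], None]      # one-way progress channel
    max_rounds: int = 5

    def run(self, prompt: str) -> str:
        feedback: list = []
        for round_no in range(1, self.max_rounds + 1):
            draft = self.generate(prompt, feedback)
            # Progress is published every round, whether or not a human reacts.
            self.report_progress(f"round {round_no}: {draft}")
            answer = self.ask_human(draft)
            if answer is None:
                return draft            # approved, the task can complete
            feedback.append(answer)     # the "cycle" stays inside the task
        raise RuntimeError("no approved result within max_rounds")


progress_log: list = []
replies = iter(["also exclude test accounts", None])
loop = RefinementLoop(
    generate=lambda prompt, fb: f"{prompt} [{len(fb)} revisions]",
    ask_human=lambda draft: next(replies),
    report_progress=progress_log.append,
)
result = loop.run("list inactive customers")
```

The `max_rounds` bound is an assumption added for the sketch; the quoted message also considers an unbounded number of call-outs.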
> > >
> > >
> > >
> > > On Sun, Dec 28, 2025 at 2:16 PM Jens Scheffler <[email protected]>
> > > wrote:
> > >
> > > > I like the AIP very much and in my view it can be made completely in a
> > > > Provider package... with some comments (I assume non-blocking), and I
> > > > would propose to really start in increments and then adjust by
> > > > learning along the path.
> > > >
> > > > On 12/27/25 22:00, Pavankumar Gopidesu wrote:
> > > > > Thanks Giorgio Zoppi for reviewing the AIP. Yes, it is already
> > > > > planned as part of this AIP; see the example in [1], where you can
> > > > > disable or enable the HITL step. So it is an integrated part of the
> > > > > Operator, with the help of the HITL operator.
> > > > >
> > > > > ```
> > > > > from datetime import timedelta
> > > > >
> > > > > # LLMDataQualityOperator and customer_s3 come from the AIP example [1]
> > > > > LLMDataQualityOperator(
> > > > >      task_id="customer_quality_analysis",
> > > > >      data_sources=[customer_s3],
> > > > >      prompt="Generate data quality validation queries",
> > > > >      require_approval=True,  # Built-in HITL
> > > > >      approval_timeout=timedelta(hours=2),
> > > > > )
> > > > > ```
> > > > >
> > > > > [1]:
> > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406618285
> > > > >
> > > > > Regards,
> > > > > Pavan
> > > > >
> > > > > On Sat, Dec 27, 2025 at 9:16 AM Giorgio Zoppi <[email protected]>
> > > > > wrote:
> > > > >> Hello,
> > > > >> Just my 1c from skimming the AIP:
> > > > >> You might want to explore how to avoid human approval for generated
> > > > >> queries by using an LLM as a judge to evaluate their quality. The
> > > > >> nice thing about data pipelines is automation.
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Wed, Dec 24, 2025, 10:23 Pavankumar Gopidesu <[email protected]>
> > > > >> wrote:
> > > > >>
> > > > >>> Hello everyone,
> > > > >>>
> > > > >>> The thread has been quiet for some time, and I would like to
> > > > >>> restart the discussion with the AIP.
> > > > >>>
> > > > >>> First, a sincere thank you to Kaxil for presenting the idea at
> > > > >>> Airflow Summit 2025. The session was very well received, and many
> > > > >>> attendees expressed strong interest in the proposal.
> > > > >>> Unfortunately, I was unable to attend the summit due to visa
> > > > >>> issues, but I am hopeful I will be able to join next year.
> > > > >>>
> > > > >>> The demo included well-structured prototypes. For those who were
> > > > >>> unable to attend the session, please refer to the recorded talk
> > > > >>> here [1].
> > > > >>>
> > > > >>> I have also drafted the complete AIP proposal, which is available
> > > > >>> here [2]. I would greatly appreciate your reviews and look forward
> > > > >>> to feedback and further discussion.
> > > > >>>
> > > > >>> Finally, to those celebrating Christmas, I wish you a very happy
> > > > >>> Christmas and a wonderful holiday season.
> > > > >>>
> > > > >>> Regards
> > > > >>> Pavan
> > > > >>>
> > > > >>> [1] https://www.youtube.com/watch?v=XSAzSDVUi2o
> > > > >>> [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406618285
> > > > >>>
> > > > >>> On Wed, Oct 15, 2025 at 6:13 AM Amogh Desai <[email protected]>
> > > > >>> wrote:
> > > > >>>> Thanks Pavan and Kaxil, seems like an interesting idea and a
> > > > >>>> pretty reasonable problem to solve.
> > > > >>>>
> > > > >>>> I also like the idea of starting with
> > > > >>>> `apache-airflow-providers-common-ai` and expanding as / when
> > > > >>>> needed.
> > > > >>>>
> > > > >>>> Looking forward to when the recording will be out, missed
> > > > >>>> attending this session at the Airflow Summit.
> > > > >>>>
> > > > >>>> Thanks & Regards,
> > > > >>>> Amogh Desai
> > > > >>>>
> > > > >>>>
> > > > >>>> On Thu, Oct 9, 2025 at 10:49 AM Kaxil Naik <[email protected]>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>> Yea I think it should be apache-airflow-providers-common-ai
> > > > >>>>>
> > > > >>>>> On Wed, 8 Oct 2025 at 02:04, Pavankumar Gopidesu
> > > > >>>>> <[email protected]> wrote:
> > > > >>>>>
> > > > >>>>>> Yes, it's a new provider, starting out completely experimental;
> > > > >>>>>> we don't want to break functionality in existing providers :)
> > > > >>>>>>
> > > > >>>>>> It's mostly SQL-based operators, so I named it sql-ai, but I
> > > > >>>>>> agree we can make it generic without specifying SQL in it :)
> > > > >>>>>>
> > > > >>>>>> Pavan
> > > > >>>>>>
> > > > >>>>>> On Tue, Oct 7, 2025 at 3:48 PM Ryan Hatter via dev
> > > > >>>>>> <[email protected]> wrote:
> > > > >>>>>>> Would this really necessitate a new provider? Should this just
> > > > >>>>>>> be baked into the common SQL provider?
> > > > >>>>>>>
> > > > >>>>>>> Alternatively, instead of a narrow `sql-ai` provider, why not
> > > > >>>>>>> have a generic common ai provider with a SQL package, which
> > > > >>>>>>> would allow us to build AI-based subpackages into the provider
> > > > >>>>>>> other than just SQL?
> > > > >>>>>>>
> > > > >>>>>>> On Mon, Oct 6, 2025 at 4:31 PM Pavankumar Gopidesu
> > > > >>>>>>> <[email protected]> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> @Giorgio Yes indeed, that's also a good thought to integrate.
> > > > >>>>>>>> I will keep it in mind when I draft the AIP, and will message
> > > > >>>>>>>> a bit more about this :)
> > > > >>>>>>>> Yes please join. We have great demos packed on this topic :)
> > > > >>>>>>>>
> > > > >>>>>>>> @kaxil, Yes, that's a great blog post from Wren AI on
> > > > >>>>>>>> leveraging Apache DataFusion as a query engine to connect to
> > > > >>>>>>>> different data sources.
> > > > >>>>>>>> Pavan
> > > > >>>>>>>>
> > > > >>>>>>>> On Tue, Sep 30, 2025 at 7:37 PM Giorgio Zoppi
> > > > >>>>>>>> <[email protected]> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hey Pavan,
> > > > >>>>>>>>> Some notes:
> > > > >>>>>>>>> 1. An LLM can also be very useful in detecting root causes
> > > > >>>>>>>>> of your errors while developing and designing a pipeline.
> > > > >>>>>>>>> Let me explain: in the past we had several Spark processes.
> > > > >>>>>>>>> When everything is green it is fine, but when one fails, it
> > > > >>>>>>>>> would be nice to have an integrated tool to ask why.
> > > > >>>>>>>>> 2. Ideally such an operator could be a
> > > > >>>>>>>>> ModelContextProtocolOperator, and you would need nothing
> > > > >>>>>>>>> more than to pass an LLM as a parameter to that operator and
> > > > >>>>>>>>> just call tools, execute queries, and so on. This would be
> > > > >>>>>>>>> more powerful, because you create an abstraction across
> > > > >>>>>>>>> devices, databases, servers and so on, so each source of
> > > > >>>>>>>>> data can be injected into the pipeline.
> > > > >>>>>>>>> 3. Good job! Looking forward to seeing the presentation.
> > > > >>>>>>>>> Best Regards,
> > > > >>>>>>>>> Giorgio
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Tue, 30 Sep 2025 at 14:51, Pavankumar Gopidesu
> > > > >>>>>>>>> <[email protected]> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Hi everyone,
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> We're exploring adding LLM-powered SQL operators to Airflow
> > > > >>>>>>>>>> and would love community input before writing an AIP.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> The idea: Let users write natural language prompts like
> > > > >>>>>>>>>> "find customers with missing emails" and have Airflow
> > > > >>>>>>>>>> generate safe SQL queries with full context about your
> > > > >>>>>>>>>> database schema, connections, and data sensitivity.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Why this matters:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Most of us spend too much time on schema drift detection
> > > > >>>>>>>>>> and manual data quality checks. Meanwhile, AI agents are
> > > > >>>>>>>>>> getting powerful but lack production-ready data
> > > > >>>>>>>>>> integrations. Airflow could bridge this gap.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Here's what we're dealing with at Tavant:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Our team works with multiple data domain teams producing
> > > > >>>>>>>>>> data in different formats and storage across S3,
> > > > >>>>>>>>>> PostgreSQL, Iceberg, and Aurora. When data assets become
> > > > >>>>>>>>>> available for consumption, we need:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> - Detection of breaking schema changes between systems
> > > > >>>>>>>>>> - Data quality assessments between snapshots
> > > > >>>>>>>>>> - Validation that assets meet mandatory metadata
> > > > >>>>>>>>>>   requirements
> > > > >>>>>>>>>> - Lookup validation against existing data (comparing file
> > > > >>>>>>>>>>   feeds with different formats to existing data in
> > > > >>>>>>>>>>   Iceberg/Aurora)
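
The first item in the list above, detecting breaking schema changes, can be sketched in a few lines. This is only an illustration under stated assumptions: schemas are represented as plain `{column: type}` dicts, `breaking_changes` is a hypothetical helper, and added columns are treated as non-breaking.

```python
def breaking_changes(old: dict, new: dict) -> list:
    """List removed columns and changed types between two {column: type} schemas.

    Columns that exist only in `new` are assumed to be non-breaking additions.
    """
    issues = []
    for column, old_type in old.items():
        if column not in new:
            issues.append(f"removed column: {column}")
        elif new[column] != old_type:
            issues.append(f"type changed: {column} {old_type} -> {new[column]}")
    return issues


old_schema = {"id": "int", "email": "text", "phone": "text"}
new_schema = {"id": "bigint", "email": "text", "signup_ts": "timestamp"}
issues = breaking_changes(old_schema, new_schema)
```

An LLM-backed operator could use a deterministic diff like this as context or as a cross-check, rather than asking the model to find the differences itself.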
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> This is exactly the type of work that LLMs could automate
> > > > >>>>>>>>>> while maintaining governance.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> What we're thinking:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ```python
> > > > >>>>>>>>>> # Instead of writing complex SQL by hand...
> > > > >>>>>>>>>> quality_check = LLMSQLQueryOperator(
> > > > >>>>>>>>>>     task_id="find_data_issues",
> > > > >>>>>>>>>>     prompt="Find customers with invalid email formats and missing phone numbers",
> > > > >>>>>>>>>>     data_sources=[customer_asset],  # Airflow knows the schema automatically
> > > > >>>>>>>>>>     # Built-in safety: won't generate DROP/DELETE statements
> > > > >>>>>>>>>> )
> > > > >>>>>>>>>> ```
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> The operator would:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> - Auto-inject database schema, sample data, and connection
> > > > >>>>>>>>>>   details
> > > > >>>>>>>>>> - Generate safe SQL (blocks dangerous operations)
> > > > >>>>>>>>>> - Work across PostgreSQL, Snowflake, BigQuery with dialect
> > > > >>>>>>>>>>   awareness
> > > > >>>>>>>>>> - Support schema drift detection between systems
> > > > >>>>>>>>>> - Handle multi-cloud data via Apache DataFusion [1] (did
> > > > >>>>>>>>>>   some experiments with 50M+ records and results are in
> > > > >>>>>>>>>>   10-15 seconds for common aggregations)
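
One possible shape for the "blocks dangerous operations" item above is a conservative read-only guard applied to generated SQL before it runs. The sketch below is an assumption, not the proposal's implementation: `is_read_only` and the `FORBIDDEN` keyword list are hypothetical, and a real guard would more likely rely on a proper SQL parser and database permissions.

```python
import re

# Hypothetical deny-list; deliberately conservative.
FORBIDDEN = ("drop", "delete", "truncate", "alter", "update", "insert")


def is_read_only(sql: str) -> bool:
    """Accept only a single SELECT/WITH statement with no forbidden keywords."""
    # Blank out string literals so a quoted value like 'delete' is not flagged.
    stripped = re.sub(r"'(?:[^']|'')*'", "''", sql)
    statements = [s.strip() for s in stripped.split(";") if s.strip()]
    if len(statements) != 1:
        return False  # reject empty input and stacked statements
    statement = statements[0].lower()
    if not statement.startswith(("select", "with")):
        return False
    return not any(re.search(rf"\b{kw}\b", statement) for kw in FORBIDDEN)
```

A keyword check like this can reject valid queries (for example a column named `update`), which is the safer failure mode for generated SQL.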
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> For more info on benchmarks, see [2].
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Key benefit: Assets become smarter with structured metadata
> > > > >>>>>>>>>> (schema, sensitivity, format) instead of just throwing
> > > > >>>>>>>>>> everything in `extra`.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Implementation plan:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Start with a separate provider
> > > > >>>>>>>>>> (`apache-airflow-providers-sql-ai`) so we can iterate
> > > > >>>>>>>>>> without touching the Airflow core. No breaking changes,
> > > > >>>>>>>>>> works with existing connections and hooks.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I am presenting this at Airflow Summit 2025 in Seattle with
> > > > >>>>>>>>>> Kaxil - come see the live demo!
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Next steps:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> If this resonates after the Summit, we'll write a proper
> > > > >>>>>>>>>> AIP with technical details and further build a working
> > > > >>>>>>>>>> prototype.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Thoughts? Concerns? Better ideas?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> [1]: https://datafusion.apache.org/
> > > > >>>>>>>>>> [2]: https://datafusion.apache.org/blog/2024/11/18/datafusion-fastest-single-node-parquet-clickbench/
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Thanks,
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Pavan
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> P.S. - Happy to share more technical details with anyone
> > > > >>>>>>>>>> interested.
> > > > >>>>>>>>>
> > > > >>>>>>>>> --
> > > > >>>>>>>>> Life is a chess game - Anonymous.
> > > > >>>>>>>>>
> > > > >>>>>>
> > > > ---------------------------------------------------------------------
> > > > >>>>>> To unsubscribe, e-mail: [email protected]
> > > > >>>>>> For additional commands, e-mail: [email protected]
