Hey Pavan.
If you are going to introduce this, have you thought about the evaluation
framework?
How do you evaluate the LLM operator?
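
To make the question concrete, here is a rough sketch of one possible shape for such an evaluation: run the generated SQL and a hand-written reference query against a small fixture database and compare the result sets. Everything in it is an assumption for illustration. `fake_generate_sql` is a stand-in for whatever the operator would call, and the table, prompts and cases are made up.

```python
import sqlite3
from typing import Callable

# Hypothetical evaluation cases: a prompt plus a trusted reference query.
CASES = [
    ("customers with missing emails",
     "SELECT id FROM customers WHERE email IS NULL"),
    ("count of all customers",
     "SELECT COUNT(*) FROM customers"),
]


def make_fixture() -> sqlite3.Connection:
    """Build a tiny in-memory database to run queries against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "a@example.com"), (2, None), (3, None)],
    )
    return conn


def fake_generate_sql(prompt: str) -> str:
    """Stand-in for the LLM call; a real harness would invoke the model."""
    canned = {
        "customers with missing emails":
            "SELECT id FROM customers WHERE email IS NULL",
        # Deliberately wrong answer, so the score is not trivially perfect.
        "count of all customers":
            "SELECT COUNT(id) FROM customers WHERE email IS NOT NULL",
    }
    return canned[prompt]


def execution_accuracy(generate: Callable[[str], str]) -> float:
    """Fraction of cases where generated and reference SQL return the same rows."""
    conn = make_fixture()
    passed = 0
    for prompt, reference_sql in CASES:
        expected = sorted(conn.execute(reference_sql).fetchall())
        try:
            actual = sorted(conn.execute(generate(prompt)).fetchall())
        except sqlite3.Error:
            actual = None  # invalid SQL counts as a failure
        passed += actual == expected
    return passed / len(CASES)


if __name__ == "__main__":
    print(execution_accuracy(fake_generate_sql))
```

A score like this only covers whether the query returns the right rows on a fixture; it says nothing about safety or cost, which would need their own checks.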

On Mon, Dec 29, 2025, 09:40 Pavankumar Gopidesu <[email protected]>
wrote:

> Thanks Jens and Jarek, agree on both points raised in comments.
>
> I am happy to defer the embedding of HITL to a separate AIP.
>
> To Jens:
> Yes, it's planned in phases; our plan starts with provider-only
> changes.
>
> Regards
> Pavan
>
> On Sun, Dec 28, 2025 at 2:03 PM Jarek Potiuk <[email protected]> wrote:
> >
> > I also looked at it and I love it as well. I think of it as a missing
> > abstraction between current Airflow users and current LLM app developers. I
> > also proposed something a little bit bolder there, which I think shows the
> > true potential of that approach.
> > I added a comment in the doc, but I will copy it here for better visibility.
> >
> > ---
> >
> > After thinking quite a bit about the proposal, I actually love it and I
> > think that should be the next frontier of making Airflow abstractions more
> > approachable and usable by those who want to implement various patterns of
> > interacting with LLMs.
> >
> > And I have a little different opinion than Jens regarding HITL. I see those
> > common LLM operators as slightly "higher" level operators that might
> > implement a set of common LLM-related patterns that are currently either
> > difficult or impossible to express by putting things together via Dag and
> > individual tasks. In this sense, such an operator should be capable of
> > making a HITL call-out for approval or selection from within - without
> > completing the operator, and even running those "call-outs" more than once,
> > possibly an unbounded number of times during a single operator's execution.
> >
> > Actually it's a great way for us to implement some "cyclicness" - without
> > breaking the "acyclic" property of our Dags (for now at least). Making a Dag
> > "cyclic" is quite a dramatic change, and possibly we do not even have to do
> > it, because the "cyclic" part can likely be encompassed within the
> > specialized LLM operators. I can imagine an operator that performs LLM
> > querying and refines the result via additional interactions with LLMs
> > "internally" - during a single operator's execution. And some of those
> > iterations might result in a HITL "call-out" - even multiple times during
> > one execution.
> >
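
A rough sketch of the "cyclic inside one operator" idea described above, in plain Python rather than the Airflow API. All names here (`refine_with_callouts`, `ask_llm`, `ask_human`, `report_progress`, `max_iterations`) are hypothetical; the callbacks stand in for the LLM call, the HITL call-out and the progress report.

```python
from typing import Callable


def refine_with_callouts(
    prompt: str,
    ask_llm: Callable[[str], str],
    ask_human: Callable[[str], bool],
    report_progress: Callable[[int, str], None],
    max_iterations: int = 5,
) -> str:
    """Iterate draft -> review inside a single task run (hypothetical API)."""
    draft = ask_llm(prompt)
    for iteration in range(1, max_iterations + 1):
        # Progress report: the observer is informed but not asked anything.
        report_progress(iteration, draft)
        # HITL call-out: a human approves or rejects the current draft.
        if ask_human(draft):
            return draft
        # Rejected: ask the model again, feeding back the rejected attempt.
        draft = ask_llm(f"{prompt}\nPrevious attempt was rejected:\n{draft}")
    raise RuntimeError("no approved result within max_iterations")


if __name__ == "__main__":
    # Toy stand-ins: the "LLM" numbers its drafts, the "human" approves draft 3.
    counter = {"n": 0}
    progress = []

    def toy_llm(text: str) -> str:
        counter["n"] += 1
        return f"draft-{counter['n']}"

    result = refine_with_callouts(
        "Generate data quality validation queries",
        ask_llm=toy_llm,
        ask_human=lambda draft: draft == "draft-3",
        report_progress=lambda i, d: progress.append(i),
    )
    print(result, progress)
```

The loop lives entirely inside one function call, so from the scheduler's point of view the Dag stays acyclic; the bound on iterations is an added assumption to keep the sketch from running forever.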
> > Also one more proposal I have here is to use an API similar to HITL (or
> > maybe repurpose HITL for that) - to report PROGRESS of such a task. It is
> > a typical property of a good LLM task that it provides some feedback to the
> > user - it might be HITL when it asks for something, but it might also be
> > HOOTL (Human Outside Of The Loop) - where the task simply reports its
> > progress and allows the user to perform asynchronous actions based on that
> > progress → for example abort the execution (to stop the Dag), or mark it as
> > "skipped" (to trigger the skip processing path), or mark it as "success" to
> > simulate things being completed when they are not. While we already have
> > those three "async" operations, we do not currently have "progress" targeted
> > at the kind of actor who is also a HITL "actor" - someone who is not
> > interested in detailed logs, but rather wants to monitor progress and assess
> > the quality of the output (even if it is just a partial output in the
> > iterative process).
> >
> > I think that it will be easier and much more "surgical" (and applied in the
> > right place) to embed this "iterative" feedback / progress than to change
> > the "acyclic" property of our Dags.
> >
> > Also - this kind of Progress interface can also be used to publish the
> > "async" tasks progress as the next step of [WIP] AIP-98: Add async support
> > for PythonOperator in Airflow 3:
> > https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-98%3A+Add+async+support+for+PythonOperator+in+Airflow+3
> > that we discussed with David.
> >
> > J.
> >
> >
> >
> > On Sun, Dec 28, 2025 at 2:16 PM Jens Scheffler <[email protected]> wrote:
> >
> > > I like the AIP very much and in my view it can be made completely in a
> > > Provider package... with some comments (I assume non-blocking), and I would
> > > propose to really start in increments and then adjust by learning along the
> > > way.
> > >
> > > On 12/27/25 22:00, Pavankumar Gopidesu wrote:
> > > > Thanks Giorgio Zoppi for reviewing the AIP. Yes, it's already planned as
> > > > part of this AIP; see the example in [1], where you can disable or enable
> > > > the HITL step. So it's an integrated part of the operator, with the
> > > > help of the HITL operator.
> > > >
> > > > ```
> > > > from datetime import timedelta
> > > >
> > > > # LLMDataQualityOperator is the operator proposed in the AIP [1]
> > > > LLMDataQualityOperator(
> > > >      task_id="customer_quality_analysis",
> > > >      data_sources=[customer_s3],
> > > >      prompt="Generate data quality validation queries",
> > > >      require_approval=True,  # Built-in HITL
> > > >      approval_timeout=timedelta(hours=2),
> > > > )
> > > > ```
> > > >
> > > > [1]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406618285
> > > >
> > > > Regards,
> > > > Pavan
> > > >
> > > > On Sat, Dec 27, 2025 at 9:16 AM Giorgio Zoppi <[email protected]> wrote:
> > > >> Hello,
> > > >> Just my 1c after skimming the AIP:
> > > >> You might want to explore how to avoid human approval for generated
> > > >> queries by using an LLM as a judge to evaluate their quality. The nice
> > > >> thing about data pipelines is automation.
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On Wed, Dec 24, 2025, 10:23 Pavankumar Gopidesu <[email protected]> wrote:
> > > >>
> > > >>> Hello everyone,
> > > >>>
> > > >>> The thread has been quiet for some time, and I would like to restart
> > > >>> the discussion with the AIP.
> > > >>>
> > > >>> First, a sincere thank you to Kaxil for presenting the idea at Airflow
> > > >>> Summit 2025. The session was very well received, and many attendees
> > > >>> expressed strong interest in the proposal. Unfortunately, I was unable
> > > >>> to attend the summit due to visa issues, but I am hopeful I will be
> > > >>> able to join next year.
> > > >>>
> > > >>> The demo included well-structured prototypes. For those who were
> > > >>> unable to attend the session, please refer to the recorded talk here
> > > >>> [1].
> > > >>>
> > > >>> I have also drafted the complete AIP proposal, which is available here
> > > >>> [2]. I would greatly appreciate your reviews and look forward to
> > > >>> feedback and further discussion.
> > > >>>
> > > >>> Finally, to those celebrating Christmas, I wish you a very happy
> > > >>> Christmas and a wonderful holiday season.
> > > >>>
> > > >>> Regards
> > > >>> Pavan
> > > >>>
> > > >>> [1] https://www.youtube.com/watch?v=XSAzSDVUi2o
> > > >>> [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406618285
> > > >>>
> > > >>> On Wed, Oct 15, 2025 at 6:13 AM Amogh Desai <[email protected]> wrote:
> > > >>>> Thanks Pavan and Kaxil, seems like an interesting idea and a pretty
> > > >>>> reasonable problem to solve.
> > > >>>>
> > > >>>> I also like the idea of starting with `apache-airflow-providers-common-ai`
> > > >>>> and expanding as / when needed.
> > > >>>>
> > > >>>> Looking forward to when the recording will be out, missed attending this
> > > >>>> session at the Airflow Summit.
> > > >>>>
> > > >>>> Thanks & Regards,
> > > >>>> Amogh Desai
> > > >>>>
> > > >>>>
> > > >>>> On Thu, Oct 9, 2025 at 10:49 AM Kaxil Naik <[email protected]> wrote:
> > > >>>>
> > > >>>>> Yea I think it should be apache-airflow-providers-common-ai
> > > >>>>>
> > > >>>>> On Wed, 8 Oct 2025 at 02:04, Pavankumar Gopidesu <[email protected]> wrote:
> > > >>>>>
> > > >>>>>> Yes, it's a new provider, starting out completely experimental; we don't
> > > >>>>>> want to break functionality in existing providers :)
> > > >>>>>>
> > > >>>>>> Mostly it's SQL-based operators, so we named it sql-ai, but agreed, we can
> > > >>>>>> make it generic without specifying SQL in it :)
> > > >>>>>>
> > > >>>>>> Pavan
> > > >>>>>>
> > > >>>>>> On Tue, Oct 7, 2025 at 3:48 PM Ryan Hatter via dev
> > > >>>>>> <[email protected]> wrote:
> > > >>>>>>> Would this really necessitate a new provider? Should this just be baked
> > > >>>>>>> into the common SQL provider?
> > > >>>>>>>
> > > >>>>>>> Alternatively, instead of a narrow `sql-ai` provider, why not have a
> > > >>>>>>> generic common ai provider with a SQL package, which would allow us to
> > > >>>>>>> build AI-based subpackages into the provider other than just SQL?
> > > >>>>>>>
> > > >>>>>>> On Mon, Oct 6, 2025 at 4:31 PM Pavankumar Gopidesu <[email protected]> wrote:
> > > >>>>>>>
> > > >>>>>>>> @Giorgio Yes indeed, that's also a good thought to integrate. I will keep
> > > >>>>>>>> it in mind when I draft the AIP and will message about this a bit more :)
> > > >>>>>>>> Yes, please join. We have great demos packed on this topic :)
> > > >>>>>>>>
> > > >>>>>>>> @kaxil, Yes, that's a great blog post from Wren AI on leveraging
> > > >>>>>>>> Apache DataFusion as a query engine to connect to different data sources.
> > > >>>>>>>> Pavan
> > > >>>>>>>>
> > > >>>>>>>> On Tue, Sep 30, 2025 at 7:37 PM Giorgio Zoppi <[email protected]> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hey Pavan,
> > > >>>>>>>>> Some notes:
> > > >>>>>>>>> 1. An LLM can also be very useful in detecting root causes of errors while
> > > >>>>>>>>> developing and designing a pipeline. Let me explain: in the past we had
> > > >>>>>>>>> several Spark processes; when it is all green it is fine, but when one
> > > >>>>>>>>> fails, it would be nice to have an integrated tool to ask why.
> > > >>>>>>>>> 2. Ideally such an operator could be a ModelContextProtocolOperator, and you
> > > >>>>>>>>> would need nothing else than to pass an LLM as a parameter to that
> > > >>>>>>>>> operator and just call tools, execute queries, and so on. This would be more
> > > >>>>>>>>> powerful, because you create an abstraction between devices, databases,
> > > >>>>>>>>> servers and so on, so each source of data can be injected into the pipeline.
> > > >>>>>>>>> 3. Good job! Looking forward to seeing the presentation.
> > > >>>>>>>>> Best Regards,
> > > >>>>>>>>> Giorgio
> > > >>>>>>>>>
> > > >>>>>>>>> On Tue, Sep 30, 2025 at 14:51 Pavankumar Gopidesu <[email protected]> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Hi everyone,
> > > >>>>>>>>>>
> > > >>>>>>>>>> We're exploring adding LLM-powered SQL operators to Airflow and would love
> > > >>>>>>>>>> community input before writing an AIP.
> > > >>>>>>>>>>
> > > >>>>>>>>>> The idea: Let users write natural language prompts like "find customers
> > > >>>>>>>>>> with missing emails" and have Airflow generate safe SQL queries with full
> > > >>>>>>>>>> context about your database schema, connections, and data sensitivity.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Why this matters:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Most of us spend too much time on schema drift detection and manual data
> > > >>>>>>>>>> quality checks. Meanwhile, AI agents are getting powerful but lack
> > > >>>>>>>>>> production-ready data integrations. Airflow could bridge this gap.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Here's what we're dealing with at Tavant:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Our team works with multiple data domain teams producing data in different
> > > >>>>>>>>>> formats and storage across S3, PostgreSQL, Iceberg, and Aurora. When data
> > > >>>>>>>>>> assets become available for consumption, we need:
> > > >>>>>>>>>>
> > > >>>>>>>>>> - Detection of breaking schema changes between systems
> > > >>>>>>>>>> - Data quality assessments between snapshots
> > > >>>>>>>>>> - Validation that assets meet mandatory metadata requirements
> > > >>>>>>>>>> - Lookup validation against existing data (comparing file feeds with
> > > >>>>>>>>>>   different formats to existing data in Iceberg/Aurora)
> > > >>>>>>>>>>
> > > >>>>>>>>>> This is exactly the type of work that LLMs could automate while
> > > >>>>>>>>>> maintaining governance.
> > > >>>>>>>>>>
> > > >>>>>>>>>> What we're thinking:
> > > >>>>>>>>>>
> > > >>>>>>>>>> ```python
> > > >>>>>>>>>> # Instead of writing complex SQL by hand...
> > > >>>>>>>>>> quality_check = LLMSQLQueryOperator(
> > > >>>>>>>>>>      task_id="find_data_issues",
> > > >>>>>>>>>>      prompt="Find customers with invalid email formats and missing phone numbers",
> > > >>>>>>>>>>      data_sources=[customer_asset],  # Airflow knows the schema automatically
> > > >>>>>>>>>>      # Built-in safety: won't generate DROP/DELETE statements
> > > >>>>>>>>>> )
> > > >>>>>>>>>> ```
> > > >>>>>>>>>>
> > > >>>>>>>>>> The operator would:
> > > >>>>>>>>>>
> > > >>>>>>>>>> - Auto-inject database schema, sample data, and connection details
> > > >>>>>>>>>> - Generate safe SQL (blocks dangerous operations)
> > > >>>>>>>>>> - Work across PostgreSQL, Snowflake, BigQuery with dialect awareness
> > > >>>>>>>>>> - Support schema drift detection between systems
> > > >>>>>>>>>> - Handle multi-cloud data via Apache DataFusion [1] (did some experiments
> > > >>>>>>>>>>   with 50M+ records and results are in 10-15 seconds for common
> > > >>>>>>>>>>   aggregations)
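
On the "safe SQL" bullet above, a minimal sketch of what a first-pass guard could look like. This is not how the proposed operator does it; `is_read_only` and the `FORBIDDEN` keyword list are assumptions for illustration, and a real implementation would more likely rely on a proper SQL parser plus read-only database credentials.

```python
import re

# Keywords that change data or schema; deliberately conservative and incomplete.
FORBIDDEN = {"insert", "update", "delete", "drop", "alter", "truncate", "create", "grant"}


def is_read_only(sql: str) -> bool:
    """Naive check: one statement, starts with SELECT/WITH, no forbidden keyword."""
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False  # empty input, or more than one statement
    # Tokenise into identifier-like words, so "updated_at" is not read as "update".
    words = re.findall(r"[a-z_]+", statement.lower())
    if not words or words[0] not in {"select", "with"}:
        return False
    return FORBIDDEN.isdisjoint(words)


if __name__ == "__main__":
    for query in (
        "SELECT id FROM customers WHERE email IS NULL;",
        "DROP TABLE customers",
        "SELECT 1; DELETE FROM customers",
    ):
        print(query, "->", is_read_only(query))
```

The check errs on the side of rejecting: a harmless query that merely contains the word "delete" inside a string literal would also be refused.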
> > > >>>>>>>>>>
> > > >>>>>>>>>> For more info on benchmarks, see [2].
> > > >>>>>>>>>>
> > > >>>>>>>>>> Key benefit: Assets become smarter with structured metadata (schema,
> > > >>>>>>>>>> sensitivity, format) instead of just throwing everything in `extra`.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Implementation plan:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Start with a separate provider (`apache-airflow-providers-sql-ai`) so we
> > > >>>>>>>>>> can iterate without touching the Airflow core. No breaking changes, works
> > > >>>>>>>>>> with existing connections and hooks.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I am presenting this at Airflow Summit 2025 in Seattle with Kaxil - come
> > > >>>>>>>>>> see the live demo!
> > > >>>>>>>>>>
> > > >>>>>>>>>> Next steps:
> > > >>>>>>>>>>
> > > >>>>>>>>>> If this resonates after the Summit, we'll write a proper AIP with technical
> > > >>>>>>>>>> details and further build a working prototype.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thoughts? Concerns? Better ideas?
> > > >>>>>>>>>>
> > > >>>>>>>>>> [1]: https://datafusion.apache.org/
> > > >>>>>>>>>> [2]: https://datafusion.apache.org/blog/2024/11/18/datafusion-fastest-single-node-parquet-clickbench/
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thanks,
> > > >>>>>>>>>> Pavan
> > > >>>>>>>>>>
> > > >>>>>>>>>> P.S. - Happy to share more technical details with anyone interested.
> > > >>>>>>>>>
> > > >>>>>>>>> --
> > > >>>>>>>>> Life is a chess game - Anonymous.
> > > >>>>>>>>>
> > > >>>>>>
> > > ---------------------------------------------------------------------
> > > >>>>>> To unsubscribe, e-mail: [email protected]
> > > >>>>>> For additional commands, e-mail: [email protected]
> > > >>>>>>
> > > >>>>>>