Hi Ash,

Regarding secrets backends: since they don't currently support listing
in Airflow, I've scoped this feature to the metadata database only and
will document that explicitly. As for AIP-103, Task State is designed
for tasks to persist their own state across retries, whereas this PR
focuses on DAGs fetching externally managed config at runtime, so the
two stay cleanly separated.
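
To make the metastore-only scope concrete, here is a minimal sketch of the
prefix-discovery pattern. It is not the code in the PR: the METASTORE dict
is a hypothetical in-memory stand-in for the metadata database, and keys(),
get(), and load_config() are illustrative helpers that only mimic the shape
of the proposed Variable.keys(prefix=None) and of Variable.get(key).

```python
from typing import Optional

# Hypothetical in-memory stand-in for the metadata database (not Airflow code).
METASTORE = {
    "team_a_config_retries": "3",
    "team_a_config_timeout": "120",
    "team_b_config_retries": "5",
}


def keys(prefix: Optional[str] = None) -> list:
    """Return the sorted metastore keys that start with `prefix` (all keys if None)."""
    return sorted(k for k in METASTORE if prefix is None or k.startswith(prefix))


def get(key: str) -> str:
    """Fetch a single value, standing in for Variable.get(key)."""
    return METASTORE[key]


def load_config(prefix: str) -> dict:
    """Discover keys by prefix, then fetch only those values, stripping the prefix."""
    return {key[len(prefix):]: get(key) for key in keys(prefix)}


team_a = load_config("team_a_config_")
```

A key that lives only in a secrets backend would never show up in keys()
in this sketch, which is the limitation I will document.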

Best regards,
Jun Yeong

On Tue, May 5, 2026 at 10:48 PM, Ash Berlin-Taylor <[email protected]> wrote:
>
> Some of these use cases might be better suited to AIP-103 (the State
> storage AIP).
>
> One possible issue with exposing Variable.list() (and connections): do
> the secrets backends support listing variables?
>
> -ash
>
> > On 5 May 2026, at 11:41, 김준영 <[email protected]> wrote:
> >
> > Hi Amogh,
> >
> > Thanks for the feedback! I've updated the PR (#66022) to implement the
> > lazy iterator pattern you suggested:
> >
> > for key in Variable.keys(prefix="team_a_config_"):
> >    val = Variable.get(key)
> >
> > Now, Variable.keys(prefix=None) returns a list[str] of matching keys,
> > supported by a new GET /variables/keys?prefix= endpoint in the
> > Execution API. This allows callers to fetch only the specific values
> > they need via Variable.get(key).
> >
> > Regarding your point about other Airflow resources (Connections,
> > XComs, etc.) having the same gap: once this PR is merged and the
> > pattern is established, I'd be happy to extend this keys() approach to
> > those resources in follow-up PRs. Would that be a direction you'd
> > encourage?
> >
> > The PR is ready for another review whenever you have a moment.
> >
> > Best regards,
> > Jun Yeong
> >
> > On Tue, May 5, 2026 at 5:51 PM, Amogh Desai <[email protected]> wrote:
> >>
> >> Hi Jun Yeong,
> >>
> >> Valid gap. There is no GET /variables list endpoint at all, but the same
> >> is true for the other Airflow artefacts. Variables aren't the only ones
> >> missing it; it's all of the Airflow resources: connections, XComs, etc.
> >>
> >> One suggestion worth considering: rather than Variable.list() returning a
> >> full list, a lazy iterator might be a better fit. Something like:
> >>
> >> for key in Variable.keys(prefix="team_a_config_"):
> >>    val = Variable.get(key)
> >>
> >> Happy to review when a PR is ready.
> >>
> >> Thanks & Regards,
> >> Amogh Desai
> >>
> >>
> >> On Fri, May 1, 2026 at 3:02 AM 김준영 <[email protected]> wrote:
> >>
> >>> Hi Jens,
> >>>
> >>> That is a valid alternative, and using a single JSON variable is
> >>> indeed a great pattern for centralized or static configurations to
> >>> reduce I/O.
> >>>
> >>> However, for the use cases I'm targeting, the "Single JSON Variable"
> >>> approach has a few significant drawbacks:
> >>>
> >>> 1. Concurrency & Atomic Updates: When multiple external systems or
> >>> independent CI/CD pipelines need to update their own configurations, a
> >>> single JSON variable creates a race condition. They would have to
> >>> "Read-Modify-Write" the entire blob, which risks overwriting each
> >>> other's changes. Separate variables allow for atomic, independent
> >>> updates.
> >>>
> >>> 2. Integration Complexity: Many users integrate Airflow with external
> >>> tools that independently push values to the Airflow Metadata DB or
> >>> Secret Backend. Forcing these decoupled systems to coordinate and
> >>> maintain a single shared JSON structure adds significant integration
> >>> overhead.
> >>>
> >>> 3. Data Modeling Flexibility: While a JSON blob works for some,
> >>> Variable.list(prefix=) allows Airflow to be unopinionated about how
> >>> users model their data. It provides a standard "Key-Value store"
> >>> experience (similar to AWS SSM or Redis) where prefix-based discovery
> >>> is a first-class citizen.
> >>>
> >>> In short, while the JSON approach is a good workaround for specific
> >>> cases, Variable.list() provides the necessary flexibility for highly
> >>> dynamic and decoupled environments without forcing a specific data
> >>> structure on the user.
> >>>
> >>> What do you think?
> >>>
> >>> Best regards, Jun Yeong
> >>>
> >>> On Fri, May 1, 2026 at 6:19 AM, Jens Scheffler <[email protected]> wrote:
> >>>>
> >>>> Hi Jun Yeong,
> >>>>
> >>>> one thought on this: we had a similar case.
> >>>>
> >>>> Our use case: we implemented a custom archiving Dag that needed to
> >>>> apply different retention times to different Dags (maybe a bad example,
> >>>> because the archival itself accesses the database for archiving...
> >>>> haha).
> >>>>
> >>>> What we did was put all the parameters into a JSON structure in a
> >>>> single Variable. That way we did not need many I/O operations or
> >>>> Variables but could store everything in a single Variable.
> >>>>
> >>>> Would this be a viable solution for your case as well?
> >>>>
> >>>> Jens
> >>>>
> >>>> On 30.04.26 22:47, 김준영 wrote:
> >>>>> Hi Jens,
> >>>>>
> >>>>> Thank you for the thoughtful feedback!
> >>>>>
> >>>>> The primary demand for this feature comes from workflows that require
> >>>>> dynamic configuration discovery. A common pattern is grouping related
> >>>>> variables under a shared prefix (e.g., team_a_config_ or
> >>>>> pipeline_x_param_). In many cases, these keys are generated or updated
> >>>>> dynamically by external systems, meaning the exact set of keys isn't
> >>>>> known at DAG authoring time.
> >>>>>
> >>>>> While users in Airflow 2.x relied on session.query(Variable).all() as
> >>>>> a workaround, Airflow 3’s move toward the Task SDK aims to abstract
> >>>>> away direct ORM/DB access for better security and stability.
> >>>>> Variable.list(prefix=) provides a supported, clean way to achieve this
> >>>>> discovery within that new architecture.
> >>>>>
> >>>>> Regarding the secrets backend limitation, I completely agree. It’s
> >>>>> important to manage expectations, so I will update the PyDoc to
> >>>>> explicitly state that this method only lists variables stored in the
> >>>>> metadata database.
> >>>>>
> >>>>> Best regards, Jun Yeong
> >>>>>
> >>>>> On Fri, May 1, 2026 at 5:23 AM, Jens Scheffler <[email protected]> wrote:
> >>>>>> Hi!
> >>>>>>
> >>>>>> thanks for the discussion. While I am not against this, I would say
> >>>>>> that in Airflow 2 this was also not a "public API"; the DB connection
> >>>>>> was "just used" to list Variables and compensate for a missing API.
> >>>>>>
> >>>>>> Can you express what the demand for the missing feature is? What
> >>>>>> business function did you implement based on listing all Variables?
> >>>>>>
> >>>>>> As you already stated and also highlighted in the PR, list() might
> >>>>>> not report all Variables, as secrets managers do not provide a list.
> >>>>>> So it might (small risk though) lead to some confusion. This should
> >>>>>> be explicitly documented in the PyDoc. But this is a nit.
> >>>>>>
> >>>>>> Jens
> >>>>>>
> >>>>>> On 30.04.26 11:47, 김준영 wrote:
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> I'd like to propose adding Variable.list() to the Task SDK to address
> >>>>>>> the gap left by the removal of direct ORM access in Airflow 3.
> >>>>>>>
> >>>>>>>    Background:
> >>>>>>>    In Airflow 2.x, users could list all variables via:
> >>>>>>>
> >>>>>>>        from airflow.models import Variable
> >>>>>>>        from airflow.utils.session import create_session
> >>>>>>>
> >>>>>>>        with create_session() as session:
> >>>>>>>            variables = session.query(Variable).all()
> >>>>>>>
> >>>>>>>    In Airflow 3.x, this pattern raises:
> >>>>>>>        RuntimeError: Direct database access via the ORM is not allowed in Airflow 3.0
> >>>>>>>
> >>>>>>>    There is currently no supported way to discover variable keys
> >>>>>>>    dynamically when they are not known at DAG authoring time.
> >>>>>>>
> >>>>>>>    Proposal:
> >>>>>>>    - Add Variable.list(prefix=None) to the Task SDK
> >>>>>>>    - Scope is limited to the metadata database only (same as the
> >>>>>>>      old ORM pattern)
> >>>>>>>    - Secrets backend support is intentionally out of scope, as it
> >>>>>>>      would require a broader interface contract change and separate
> >>>>>>>      community discussion
> >>>>>>>    Related issue: https://github.com/apache/airflow/issues/61166
> >>>>>>>    Draft PR: https://github.com/apache/airflow/pull/66022
> >>>>>>>
> >>>>>>>    I would appreciate any feedback or concerns from the community
> >>>>>>>    before this moves forward.
> >>>>>>>
> >>>>>>>    Best Regards,
> >>>>>>>    Jun Yeong Kim
> >>>>>>>
> >>>>>>> ---------------------------------------------------------------------
> >>>>>>> To unsubscribe, e-mail: [email protected]
> >>>>>>> For additional commands, e-mail: [email protected]
> >>>>>>>
