Hi Jens,

That is a valid alternative, and using a single JSON variable is
indeed a great pattern for centralized or static configurations to
reduce I/O.

However, for the use cases I'm targeting, the "Single JSON Variable"
approach has a few significant drawbacks:

1. Concurrency & Atomic Updates: When multiple external systems or
independent CI/CD pipelines need to update their own configurations, a
single JSON variable creates a race condition. They would have to
"Read-Modify-Write" the entire blob, which risks overwriting each
other's changes. Separate variables allow for atomic, independent
updates.

2. Integration Complexity: Many users integrate Airflow with external
tools that independently push values to the Airflow Metadata DB or
Secret Backend. Forcing these decoupled systems to coordinate and
maintain a single shared JSON structure adds significant integration
overhead.

3. Data Modeling Flexibility: While a JSON blob works for some,
Variable.list(prefix=) allows Airflow to be unopinionated about how
users model their data. It provides a standard "Key-Value store"
experience (similar to AWS SSM or Redis) where prefix-based discovery
is a first-class citizen.
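
To make point 1 concrete, here is a minimal sketch of the lost-update
problem. It uses a plain Python dict as a stand-in for the variable
store (it is not the Airflow API), and the key names such as
all_config and team_a_config_retention are made up for illustration.

```python
import json

# Plain dict standing in for the variable store; NOT the Airflow API.
# All key names here are hypothetical.
store = {
    "all_config": json.dumps(
        {"team_a": {"retention": 7}, "team_b": {"retention": 7}}
    )
}

# Two independent writers read the same snapshot of the blob.
snapshot_a = json.loads(store["all_config"])
snapshot_b = json.loads(store["all_config"])

# Each one modifies only its own section ...
snapshot_a["team_a"]["retention"] = 30
snapshot_b["team_b"]["retention"] = 90

# ... and writes the whole blob back. The second write replaces the first.
store["all_config"] = json.dumps(snapshot_a)
store["all_config"] = json.dumps(snapshot_b)

# Writer A's change is gone: team_a is back at its old value.
blob_result = json.loads(store["all_config"])

# With one key per setting, each writer touches only its own key,
# so neither update can clobber the other.
store["team_a_config_retention"] = "30"
store["team_b_config_retention"] = "90"
```

With the blob, whichever system writes last silently discards the
other's update unless both sides agree on some locking scheme. With
separate keys there is nothing to coordinate.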

In short, while the JSON approach is a good workaround for specific
cases, Variable.list() provides the necessary flexibility for highly
dynamic and decoupled environments without forcing a specific data
structure on the user.
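
As a rough illustration of the prefix-based discovery I have in mind,
the sketch below again uses a plain dict instead of the real store.
The helper list_by_prefix is hypothetical, and returning a sorted list
of keys is my assumption for this example only; the actual return type
of Variable.list() is still up for discussion in the PR.

```python
from typing import Optional


def list_by_prefix(store: dict, prefix: Optional[str] = None) -> list:
    """Return the sorted keys of `store`, optionally filtered by prefix.

    Hypothetical helper for illustration; not part of Airflow.
    """
    if prefix is None:
        return sorted(store)
    return sorted(key for key in store if key.startswith(prefix))


# Stand-in data; in practice these keys would be pushed by external systems.
store = {
    "team_a_config_retention": "30",
    "team_a_config_owner": "alice",
    "pipeline_x_param_batch": "500",
}

prefix = "team_a_config_"
team_a_keys = list_by_prefix(store, prefix=prefix)

# Strip the prefix to get a per-team settings mapping.
team_a_settings = {key[len(prefix):]: store[key] for key in team_a_keys}
```

The point is that the Dag only needs to know the prefix, not the exact
set of keys that exist at run time.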

What do you think?

Best regards, Jun Yeong

On Fri, May 1, 2026 at 6:19 AM, Jens Scheffler <[email protected]> wrote:
>
> Hi Jun Yeong,
>
> one thought on this: we had a similar need.
>
> Our use case: we implemented a custom archiving Dag that needed to
> apply different retention times to different Dags (maybe a bad example,
> because this archival itself accesses the database for archiving...
> haha).
>
> What we did was put all parameters into a JSON structure stored in a
> single Variable. That way we did not need many I/O operations or many
> Variables and could keep everything in one place.
>
> Would this be a viable solution for your case as well?
>
> Jens
>
> On 30.04.26 22:47, 김준영 wrote:
> > Hi Jens,
> >
> > Thank you for the thoughtful feedback!
> >
> > The primary demand for this feature comes from workflows that require
> > dynamic configuration discovery. A common pattern is grouping related
> > variables under a shared prefix (e.g., team_a_config_ or
> > pipeline_x_param_). In many cases, these keys are generated or updated
> > dynamically by external systems, meaning the exact set of keys isn't
> > known at DAG authoring time.
> >
> > While users in Airflow 2.x relied on session.query(Variable).all() as
> > a workaround, Airflow 3’s move toward the Task SDK aims to abstract
> > away direct ORM/DB access for better security and stability.
> > Variable.list(prefix=) provides a supported, clean way to achieve this
> > discovery within that new architecture.
> >
> > Regarding the secrets backend limitation, I completely agree. It’s
> > important to manage expectations, so I will update the PyDoc to
> > explicitly state that this method only lists variables stored in the
> > metadata database.
> >
> > Best regards, Jun Yeong
> >
> > On Fri, May 1, 2026 at 5:23 AM, Jens Scheffler <[email protected]> wrote:
> >> Hi!
> >>
> >> thanks for the discussion. While I am not against this, I would say
> >> that in Airflow 2 it was also not a "public API"; the DB connection was
> >> "just used" to list Variables and compensate for a missing API.
> >>
> >> Can you express what the demand for the missing feature is? What
> >> business function did you implement based on listing all Variables?
> >>
> >> As you already stated and also highlighted in the PR, list() might
> >> not report all Variables, as the list is not provided by secret
> >> managers. So it might (small risk though) lead to some confusion. This
> >> should be explicitly documented in the PyDoc. But this is a nit.
> >>
> >> Jens
> >>
> >> On 30.04.26 11:47, 김준영 wrote:
> >>> Hi all,
> >>>
> >>> I'd like to propose adding Variable.list() to the Task SDK to address
> >>> the gap left by the removal of direct ORM access in Airflow 3.
> >>>
> >>>     Background:
> >>>     In Airflow 2.x, users could list all variables via:
> >>>
> >>>         from airflow.models import Variable
> >>>         from airflow.utils.session import create_session
> >>>
> >>>         with create_session() as session:
> >>>             variables = session.query(Variable).all()
> >>>
> >>>     In Airflow 3.x, this pattern raises:
> >>>         RuntimeError: Direct database access via the ORM is not allowed in Airflow 3.0
> >>>
> >>>     There is currently no supported way to discover variable keys dynamically
> >>>     when they are not known at DAG authoring time.
> >>>
> >>>     Proposal:
> >>>     - Add Variable.list(prefix=None) to the Task SDK
> >>>     - Scope is limited to the metadata database only (same as the old ORM pattern)
> >>>     - Secrets backend support is intentionally out of scope, as it would
> >>>       require a broader interface contract change and separate community
> >>>       discussion
> >>>
> >>>     Related issue: https://github.com/apache/airflow/issues/61166
> >>>     Draft PR: https://github.com/apache/airflow/pull/66022
> >>>
> >>>     I would appreciate any feedback or concerns from the community before
> >>>     this moves forward.
> >>>
> >>>     Best Regards,
> >>>     Jun Yeong Kim
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: [email protected]
> >>> For additional commands, e-mail: [email protected]
> >>>
