Hi Jun Yeong,
one thought on this: we had a similar need.
Our use case: we implemented a custom archiving Dag that had to apply
a different retention time to each Dag (maybe a bad example, because
the archival itself accesses the database for archiving... haha).
What we did was put all parameters into a JSON structure stored in a
single Variable. That way we did not need many Variables or many IO
operations; everything is stored in one Variable.
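A minimal sketch of that pattern, assuming a hypothetical Variable key
"archive_config" and made-up Dag IDs and field names. In a real task the
JSON would be read from the Variable (for example with
Variable.get("archive_config", deserialize_json=True)); a literal string
stands in for it here so the snippet runs without Airflow:

```python
import json

# Stand-in for the value stored in one Airflow Variable (hypothetical key
# "archive_config"). The Dag IDs and retention values are invented.
RAW_VARIABLE_VALUE = json.dumps({
    "default": {"retention_days": 30},
    "dags": {
        "sales_report": {"retention_days": 90},
        "sensor_ingest": {"retention_days": 7},
    },
})


def retention_days(raw_value: str, dag_id: str) -> int:
    """Return the retention for dag_id, falling back to the default entry."""
    config = json.loads(raw_value)
    per_dag = config["dags"].get(dag_id, {})
    return per_dag.get("retention_days", config["default"]["retention_days"])


print(retention_days(RAW_VARIABLE_VALUE, "sales_report"))  # 90
print(retention_days(RAW_VARIABLE_VALUE, "unknown_dag"))   # 30
```

One read of the Variable covers every Dag, and a new Dag only needs a new
entry in the JSON rather than a new Variable.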
Would this be a viable solution for your case as well?
Jens
On 30.04.26 22:47, 김준영 wrote:
Hi Jens,
Thank you for the thoughtful feedback!
The primary demand for this feature comes from workflows that require
dynamic configuration discovery. A common pattern is grouping related
variables under a shared prefix (e.g., team_a_config_ or
pipeline_x_param_). In many cases, these keys are generated or updated
dynamically by external systems, meaning the exact set of keys isn't
known at DAG authoring time.
While users in Airflow 2.x relied on session.query(Variable).all() as
a workaround, Airflow 3’s move toward the Task SDK aims to abstract
away direct ORM/DB access for better security and stability.
Variable.list(prefix=) provides a supported, clean way to achieve this
discovery within that new architecture.
Regarding the secrets backend limitation, I completely agree. It’s
important to manage expectations, so I will update the PyDoc to
explicitly state that this method only lists variables stored in the
metadata database.
Best regards, Jun Yeong
On Fri, May 1, 2026 at 5:23 AM, Jens Scheffler <[email protected]> wrote:
Hi!
thanks for the discussion. While I am not against this, I would say that
in Airflow 2 this was not a "public API" either; the DB connection was
"just used" to list Variables and compensate for the missing API.
Can you explain what the demand for the missing feature is? What
business function did you implement based on listing all Variables?
As you already stated, and as also highlighted in the PR, list() might
not report all Variables, because secret managers do not provide a
listing. So it might (small risk, though) lead to some confusion. This
should be explicitly documented in the PyDoc. But this is a nit.
Jens
On 30.04.26 11:47, 김준영 wrote:
Hi all,
I'd like to propose adding Variable.list() to the Task SDK to address
the gap left by the removal of direct ORM access in Airflow 3.
Background:
In Airflow 2.x, users could list all variables via:
from airflow.models import Variable
from airflow.utils.session import create_session
with create_session() as session:
    variables = session.query(Variable).all()
In Airflow 3.x, this pattern raises:
RuntimeError: Direct database access via the ORM is not allowed in Airflow 3.0
There is currently no supported way to discover variable keys dynamically
when they are not known at DAG authoring time.
Proposal:
- Add Variable.list(prefix=None) to the Task SDK
- Scope is limited to the metadata database only (same as the old ORM
pattern)
- Secrets backend support is intentionally out of scope, as it would
require a broader interface contract change and separate community
discussion
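To make the proposed semantics concrete, here is a rough sketch that uses
an in-memory dict as a stand-in for the metadata database.
list_variable_keys is a hypothetical helper, not the Task SDK API; the
real signature and return type are whatever the PR settles on. The
assumptions are that prefix=None returns every key and that a string
prefix filters with str.startswith:

```python
from typing import Optional

# In-memory stand-in for Variables in the metadata database (made-up keys).
FAKE_METADATA_DB = {
    "team_a_config_region": "eu-west-1",
    "team_a_config_batch_size": "500",
    "pipeline_x_param_mode": "full",
}


def list_variable_keys(prefix: Optional[str] = None) -> list[str]:
    """Hypothetical model of the proposed Variable.list(prefix=None)."""
    keys = sorted(FAKE_METADATA_DB)
    if prefix is None:
        # No prefix given: return every key in the stand-in store.
        return keys
    return [key for key in keys if key.startswith(prefix)]


print(list_variable_keys("team_a_config_"))
```

Variables held only in a secrets backend would never show up in such a
listing, which is the limitation the scope note above refers to.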
Related issue: https://github.com/apache/airflow/issues/61166
Draft PR: https://github.com/apache/airflow/pull/66022
I would appreciate any feedback or concerns from the community before
this moves forward.
Best Regards,
Jun Yeong Kim
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]