+100. We have been meaning to do this for a long time.

Two things that should also be considered:

1) We need to consider compatibility with existing 3rd-party providers, not
only with ours. So when we discover providers we should fall back to the
old way - at least for a while, and with a deprecation notice.
2) While we are doing it, I think we should consider a scenario where we
do not have to add providers as packages to the api-server. As far as I
know, there are two things that the api-server currently needs from providers:
   a) operator extra links
   b) connection definition
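
A rough sketch of what the fall-back in 1) might look like. It assumes a
single parsed `connection-types` entry from provider.yaml is available as a
dict; `resolve_ui_field_behaviour` and `LegacyHook` are hypothetical names
used only for illustration, not existing Airflow APIs.

```python
import warnings
from typing import Any


def resolve_ui_field_behaviour(conn_type_entry: dict[str, Any], hook_class: type) -> dict[str, Any]:
    """Return UI field behaviour, preferring declarative metadata over hook code."""
    declared = conn_type_entry.get("ui-field-behaviour")
    if declared is not None:
        # New style: metadata comes straight from provider.yaml, no hook call.
        return declared

    legacy = getattr(hook_class, "get_ui_field_behaviour", None)
    if legacy is None:
        # Neither style is present: nothing to customize.
        return {}

    warnings.warn(
        f"{hook_class.__name__}.get_ui_field_behaviour() is deprecated; "
        "declare 'ui-field-behaviour' in provider.yaml instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    # Old style: returned as-is, so keys keep the legacy spelling (e.g. "hidden_fields").
    return legacy()


class LegacyHook:
    """Hypothetical stand-in for a 3rd-party hook that only has the old classmethod."""

    @classmethod
    def get_ui_field_behaviour(cls) -> dict[str, Any]:
        return {"hidden_fields": ["extra"], "relabeling": {}}
```

Providers that already ship the declarative section never trigger the
warning; only the ones still relying on the classmethod do.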

For a) I believe we had the idea (and it is already implemented?) that the
extra links are generated and sent via XComs - so that no code from
providers is executed in the api-server context. I am not 100% sure whether
that is fully implemented yet, but I **think** it can be done fully (if it
is not already).
So if we also get rid of b), we could have an api-server without any
providers installed. That might be a longer-term goal - and we do not have
to do it now, but I think it's worth discussing how this change might be
used for that.

I think we might employ the Triggerer (which has to have providers
installed) to act as a bridge between the providers' connection definitions
and the api-server that uses them. Simply put - if we use the declarative
connection definitions from the proposal, the Triggerer could, during
provider discovery, retrieve the connection definitions and store them in
the database - and the api-server could read them from there. The table
would need a "team" key in multi-team Airflow (different teams might have
different providers and connection types).
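
To make the bridge idea a bit more concrete, here is a minimal sketch. It
uses `sqlite3` purely as a stand-in for the Airflow metadata database, and
the table name, column names and function names are all assumptions made
up for this example.

```python
import json
import sqlite3
from typing import Any


def create_table(conn: sqlite3.Connection) -> None:
    # Hypothetical table: one row per (team, connection type).
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS connection_type_definition (
            team TEXT NOT NULL,
            connection_type TEXT NOT NULL,
            definition TEXT NOT NULL,
            PRIMARY KEY (team, connection_type)
        )
        """
    )


def store_definitions(conn: sqlite3.Connection, team: str, definitions: dict[str, dict[str, Any]]) -> None:
    """Triggerer side: persist what provider discovery found for one team."""
    for connection_type, definition in definitions.items():
        conn.execute(
            "INSERT OR REPLACE INTO connection_type_definition VALUES (?, ?, ?)",
            (team, connection_type, json.dumps(definition)),
        )
    conn.commit()


def load_definitions(conn: sqlite3.Connection, team: str) -> dict[str, dict[str, Any]]:
    """api-server side: read definitions back without importing any provider."""
    rows = conn.execute(
        "SELECT connection_type, definition FROM connection_type_definition WHERE team = ?",
        (team,),
    ).fetchall()
    return {connection_type: json.loads(definition) for connection_type, definition in rows}
```

The (team, connection type) primary key is what lets two teams carry
different definitions for the same connection type.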

Now - this can of course be done later as a long-term change, but since we
will have to handle backward compatibility now anyway, maybe it's worth
doing it as a single change. That would make it a bit cleaner in terms of
backward compatibility: the triggerer would either read the declarative
configuration, or dynamically convert the "old" definition into the
declarative one, and store it in the database - and the api-server could
simply always read it from the database. This way we could immediately get
rid of the connection-definition provider code in the api-server. That
also means "true" isolation between airflow (api-server) and task-sdk -
because (provided the extra links are also not needed) the api-server
would not need to instantiate the providers manager.
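
The "dynamically convert" step could be sketched like this. It only covers
the dict returned by `get_ui_field_behaviour()` and assumes the declarative
keys are simply the hyphenated form of the legacy ones, as in the proposal's
example; `legacy_to_declarative` is a hypothetical helper, and converting
`get_connection_form_widgets()` output is not attempted here.

```python
from typing import Any


def legacy_to_declarative(
    connection_type: str,
    hook_class_name: str,
    legacy_behaviour: dict[str, Any],
) -> dict[str, Any]:
    """Build a declarative connection-type entry from a legacy behaviour dict."""
    # Only top-level keys are renamed ("hidden_fields" -> "hidden-fields");
    # values such as the relabeling map are carried over untouched.
    behaviour = {key.replace("_", "-"): value for key, value in legacy_behaviour.items()}
    return {
        "connection-type": connection_type,
        "hook-class-name": hook_class_name,
        "ui-field-behaviour": behaviour,
    }
```

With something like this, whatever gets stored in the database always has
the declarative shape, regardless of which style the provider ships.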

Again - this can be done in a separate step later, but I would consider
doing it together.

J,

On Thu, Jan 15, 2026 at 3:08 PM Amogh Desai <[email protected]> wrote:

> Hi All,
>
> I wanted to get feedback on something I have been twiddling with. For
> context, the API server has to import every single hook class from all
> providers just to render connection forms in the UI. This is because the
> UI metadata (what fields to show, labels, validators, etc.) lives in
> python functions like `get_connection_form_widgets()` and
> `get_ui_field_behaviour()` which are defined on the hook classes.
>
> This means:
> - API server startup imports 100+ hook classes it might not actually need
> - Slower startup due to heavier memory footprint
> - Poor client-server separation (why does the API server need to know about
> pyodbc just to show a UI form?)
>
> My proposal
>
> Moving the UI metadata from python code to something static / declarative
> like yaml. I want to add this information
> in the provider.yaml file that every provider already has. For example -
>
> class PostgresHook(BaseHook):
>     @classmethod
>     def get_ui_field_behaviour(cls) -> dict[str, Any]:
>         return {
>             "hidden_fields": [],
>             "relabeling": {
>                 "schema": "Database",
>             },
>         }
>
> Will become:
>
> connection-types:
>   - connection-type: postgres
>     hook-class-name: airflow.providers.postgres.hooks.postgres.PostgresHook
>
>     ui-field-behaviour:
>       hidden-fields: []
>       relabeling:
>         schema: "Database"
>
>     conn-fields:
>       sslmode:
>         type: string
>         label: SSL Mode
>         enum: ["disable", "prefer", "require"]
>         default: "prefer"
>
>       timeout:
>         type: integer
>         label: Timeout
>         range: [1, 300]
>         default: 30
>
> The schema will now consist of two new sections:
>
> 1. ui-field-behaviour
> - Used to customize the standard connection fields (host, port, login,
> etc.)
> - hidden-fields: Hide some fields
> - relabeling: Change labels for some fields (like schema -> Database above)
> - placeholders: Show hints in the form (port 5432 for example)
>
> 2. conn-fields
> - Can be used to define custom fields stored in Connection.extra
> - You can define inline validators like enum, range, pattern, min-length,
> max-length
> - Will support the standard wtforms string, integer, boolean, number types
>
> As for why this schema was chosen, see the comparison with alternatives in
> the PR description: https://github.com/apache/airflow/pull/60410
>
>
> Current Status
>
> I have a POC in https://github.com/apache/airflow/pull/60410 where I chose
> two pilot providers of varying difficulty: HTTP and SMTP (HTTP is easy,
> just a vanilla form, but SMTP has some hidden fields).
>
>
> Benefits this will offer
>
> - Once complete, the API server won't import any hook classes for UI
> rendering leading to faster startup
> - Provider dependencies don't affect API server
> - YAML is easier to read/write than python functions for form metadata
>
> Would love feedback on:
> 1. Schema design - does it cover your use cases?
> 2. Any missing field types or validators?
>
> The goal is to get the pilot providers in so we can start migrating
> providers incrementally. The old way still works, so there is no rush for
> everyone to migrate at once.
>
> Thoughts?
>
> Thanks & Regards,
> Amogh Desai
>
