Re: [DISCUSS] Move connection testing to workers

Jarek Potiuk Sun, 22 Feb 2026 06:38:25 -0800

I am all for it :)

> 1. The connection test needs to store an encrypted URI, conn_type, and
some timestamps. Is the Callback.data JSON column the right place
for that, or does it warrant its own small table?


They don't have to be stored. It's enough to send connection_id (after
saving it to the DB).  The worker can retrieve all the credentials the
usual way workers do.
I think it's reasonable to only run test connection when it has been
saved (not during editing) - and even if during editing, we could save
it automatically for tests.


> 2. Stale requests: if a worker crashes mid-test, the record stays
in a non-terminal state. Should there be a scheduler-side reaper
similar to zombie task detection, or is client-side timeout (60s
in the UI) enough?

A good idea would be to generate a random/unique ID for the test
request. This ID should be random enough to prevent easy guessing,
ensuring only the client who initiated the request can poll for its
status—which also serves as a security feature. We can simply store
such test connection requests (and eventually responses) in a
database, including a timestamp, and use our standard `db clean` to
clear them.

J.


On Sun, Feb 22, 2026 at 4:52 AM Anish Giri <[email protected]> wrote:
>
> Hi all,
>
> I'd like to discuss moving connection testing off the API server and
> onto workers. Jarek suggested this direction in a comment on #59643
> [1], and I think the Callback infrastructure being built for running
> callbacks on executors is the right foundation for it.
>
> Since 2.7.0, test_connection has been disabled by default (#32052).
> Running it on the API server has two problems: the API server
> shouldn't be executing user-supplied driver code (Jarek described the
> ODBC/JDBC risks in detail on #59643), and workers typically have
> network access to external systems that API servers don't, so test
> results from the API server can be misleading.
>
> Ramit's generic Callback model (#54796 [2]) and Ferruzzi's
> in-progress executor dispatch (#61153 [3]) together give us most of
> what's needed. The flow would be:
>
> 1. UI calls POST /connections/test
> 2. API server Fernet-encrypts the connection URI, creates an
> ExecutorCallback pointing to the test function, returns an ID
> 3. Scheduler dispatch loop (from #61153) picks it up, sends it
> to the executor
> 4. Worker decrypts the URI, builds a transient Connection, calls
> test_connection(), reports result through the callback path
> 5. UI polls GET /connections/test/{id} until it gets a terminal
> state
>
> The connection-testing-specific code would be small: a POST endpoint
> to queue the test, a GET endpoint to poll for results, and the worker
> function that decrypts and runs test_connection().
>
> One thing I noticed: #61153's _enqueue_executor_callbacks currently
> requires dag_run_id in the callback data dict, and ExecuteCallback.make
> needs a DagRun for bundle info. Connection tests don't have a DagRun.
> It would be a small change to make that optional. The dispatch query
> itself is already generic (selects all PENDING ExecutorCallbacks). I
> can take a look at decoupling that if it would be useful.
>
> A couple of other open questions:
>
> 1. The connection test needs to store an encrypted URI, conn_type, and
> some timestamps. Is the Callback.data JSON column the right place
> for that, or does it warrant its own small table?
>
> 2. Stale requests: if a worker crashes mid-test, the record stays
> in a non-terminal state. Should there be a scheduler-side reaper
> similar to zombie task detection, or is client-side timeout (60s
> in the UI) enough?
>
> I explored this earlier in #60618 [4] with a self-contained
> implementation. Now that the ExecutorCallback dispatch is taking shape
> in #61153, building on top of will be in right direction.
>
> Thoughts?
>
> Anish
>
> [1] https://github.com/apache/airflow/pull/59643
> [2] https://github.com/apache/airflow/pull/54796
> [3] https://github.com/apache/airflow/pull/61153
> [4] https://github.com/apache/airflow/pull/60618
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [DISCUSS] Move connection testing to workers

Reply via email to