Re: [DISCUSS] Move connection testing to workers

Anish Giri Sun, 22 Feb 2026 14:25:45 -0800

Thanks Jarek, thanks Jens, this is really helpful.

Understood on passing just the connection_id and having the worker
fetch credentials through the standard path. That simplifies things a
lot. So the flow would be: connection is already saved, test gets
queued with the connection_id, worker picks it up and retrieves
everything the usual way.
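
A rough sketch of that flow, just to make the shape concrete. Every name
here (enqueue_test, worker_step, the fetch_connection and test_connection
callables, the in-process queue) is a hypothetical stand-in, not Airflow or
#61153 code. The point is only that the queued payload carries the
connection_id and nothing else:

```python
import queue
from typing import Callable, Dict, Tuple

# Hypothetical in-process queue standing in for the executor dispatch path.
test_queue: "queue.Queue[Dict[str, str]]" = queue.Queue()


def enqueue_test(connection_id: str) -> None:
    """API-server side: the payload carries only the ID, never credentials."""
    test_queue.put({"connection_id": connection_id})


def worker_step(
    fetch_connection: Callable[[str], dict],
    test_connection: Callable[[dict], Tuple[bool, str]],
) -> Tuple[str, bool, str]:
    """Worker side: take one request, look the connection up, test it."""
    payload = test_queue.get_nowait()
    conn_id = payload["connection_id"]
    try:
        # fetch_connection stands in for whatever lookup the worker normally uses.
        ok, message = test_connection(fetch_connection(conn_id))
    except Exception as exc:  # lookup or driver errors become a failed result
        ok, message = False, str(exc)
    return conn_id, ok, message
```

In the real thing the lookup and the test call would be whatever the worker
already uses; they are injected here only so the sketch runs on its own.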


The crypto-random ID for polling makes a lot of sense. It covers the
authorization side and keeps cleanup simple, using timestamps and the
standard `db clean`.
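
To illustrate the polling ID and the timestamp-based cleanup, here is a
minimal sketch. It assumes an in-memory dict in place of a real table, and
the function names and the 32-byte token length are my own placeholders.
secrets.token_urlsafe is the standard-library call; the purge function only
mimics what `db clean` would do with a timestamp column:

```python
import secrets
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

# Hypothetical in-memory stand-in for a table of connection test requests.
test_requests: Dict[str, dict] = {}


def create_request(connection_id: str, now: Optional[datetime] = None) -> str:
    """Store a pending request under an unguessable token and return it."""
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
    test_requests[token] = {
        "connection_id": connection_id,
        "state": "pending",
        "created_at": now or datetime.now(timezone.utc),
    }
    return token


def poll(token: str) -> Optional[str]:
    """Return the request state, or None when the token is unknown."""
    row = test_requests.get(token)
    return row["state"] if row else None


def purge_older_than(max_age: timedelta, now: Optional[datetime] = None) -> int:
    """Delete requests older than max_age, whatever their state."""
    cutoff = (now or datetime.now(timezone.utc)) - max_age
    stale = [t for t, row in test_requests.items() if row["created_at"] < cutoff]
    for t in stale:
        del test_requests[t]
    return len(stale)
```

Since the purge ignores state, a request left pending by a crashed worker
is removed the same way as a finished one.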

On queue routing, agreed: default queue for now, as Jens mentioned.

I'll start on the endpoints and worker function and put up an
implementation PR. The dispatch side depends on #61153, and I'm happy
to look at decoupling the dag_run_id requirement there too. Will tag
you both when it's up.

Anish

On Sun, Feb 22, 2026 at 8:57 AM Jens Scheffler <[email protected]> wrote:
>
> +1 in my view. That would be a proper resolution, with the trade-off
> that processing is async and the response to the user might take a few
> seconds until "some" worker picks it up.
>
> We would need to assume that testing uses the "default" queue. If
> different workers are configured then there might be differences, but
> adding further complexity would not be reasonable in my view.
>
> On 22.02.26 15:38, Jarek Potiuk wrote:
> > I am all for it :)
> >
> > >> 1. The connection test needs to store an encrypted URI, conn_type, and
> > >> some timestamps. Is the Callback.data JSON column the right place
> > >> for that, or does it warrant its own small table?
> >
> > They don't have to be stored. It's enough to send connection_id (after
> > saving it to the DB).  The worker can retrieve all the credentials the
> > usual way workers do.
> > I think it's reasonable to only run test connection when it has been
> > saved (not during editing) - and even if during editing, we could save
> > it automatically for tests.
> >
> >
> > >> 2. Stale requests: if a worker crashes mid-test, the record stays
> > >> in a non-terminal state. Should there be a scheduler-side reaper
> > >> similar to zombie task detection, or is client-side timeout (60s
> > >> in the UI) enough?
> >
> > A good idea would be to generate a random/unique ID for the test
> > request. This ID should be random enough to prevent easy guessing,
> > ensuring only the client who initiated the request can poll for its
> > status—which also serves as a security feature. We can simply store
> > such test connection requests (and eventually responses) in a
> > database, including a timestamp, and use our standard `db clean` to
> > clear them.
> >
> > J.
> >
> >
> > On Sun, Feb 22, 2026 at 4:52 AM Anish Giri <[email protected]> wrote:
> >> Hi all,
> >>
> >> I'd like to discuss moving connection testing off the API server and
> >> onto workers. Jarek suggested this direction in a comment on #59643
> >> [1], and I think the Callback infrastructure being built for running
> >> callbacks on executors is the right foundation for it.
> >>
> >> Since 2.7.0, test_connection has been disabled by default (#32052).
> >> Running it on the API server has two problems: the API server
> >> shouldn't be executing user-supplied driver code (Jarek described the
> >> ODBC/JDBC risks in detail on #59643), and workers typically have
> >> network access to external systems that API servers don't, so test
> >> results from the API server can be misleading.
> >>
> >> Ramit's generic Callback model (#54796 [2]) and Ferruzzi's
> >> in-progress executor dispatch (#61153 [3]) together give us most of
> >> what's needed. The flow would be:
> >>
> >> 1. UI calls POST /connections/test
> >> 2. API server Fernet-encrypts the connection URI, creates an
> >> ExecutorCallback pointing to the test function, returns an ID
> >> 3. Scheduler dispatch loop (from #61153) picks it up, sends it
> >> to the executor
> >> 4. Worker decrypts the URI, builds a transient Connection, calls
> >> test_connection(), reports result through the callback path
> >> 5. UI polls GET /connections/test/{id} until it gets a terminal
> >> state
> >>
> >> The connection-testing-specific code would be small: a POST endpoint
> >> to queue the test, a GET endpoint to poll for results, and the worker
> >> function that decrypts and runs test_connection().
> >>
> >> One thing I noticed: #61153's _enqueue_executor_callbacks currently
> >> requires dag_run_id in the callback data dict, and ExecuteCallback.make
> >> needs a DagRun for bundle info. Connection tests don't have a DagRun.
> >> It would be a small change to make that optional. The dispatch query
> >> itself is already generic (selects all PENDING ExecutorCallbacks). I
> >> can take a look at decoupling that if it would be useful.
> >>
> >> A couple of other open questions:
> >>
> >> 1. The connection test needs to store an encrypted URI, conn_type, and
> >> some timestamps. Is the Callback.data JSON column the right place
> >> for that, or does it warrant its own small table?
> >>
> >> 2. Stale requests: if a worker crashes mid-test, the record stays
> >> in a non-terminal state. Should there be a scheduler-side reaper
> >> similar to zombie task detection, or is client-side timeout (60s
> >> in the UI) enough?
> >>
> >> I explored this earlier in #60618 [4] with a self-contained
> > >> implementation. Now that the ExecutorCallback dispatch is taking shape
> > >> in #61153, building on top of it seems like the right direction.
> >>
> >> Thoughts?
> >>
> >> Anish
> >>
> >> [1] https://github.com/apache/airflow/pull/59643
> >> [2] https://github.com/apache/airflow/pull/54796
> >> [3] https://github.com/apache/airflow/pull/61153
> >> [4] https://github.com/apache/airflow/pull/60618
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [email protected]
> >> For additional commands, e-mail: [email protected]
> >>
