potiuk opened a new pull request, #68147:
URL: https://github.com/apache/airflow/pull/68147

   The TCP connection-ownership check added in #67781 only accepted the
   supervisor channel when the connecting peer belonged to the spawned
   process's *exact* PID. In the real Java SDK PROD e2e the JVM's loopback
   connection is not found under that single PID, so both the `comm` and
   `logs` channels are rejected, the task subprocess dies with
   `process exited with 1 before connecting`, and every Java task fails
   (e.g. `java_annotation_example.extract`). The Java SDK e2e suite is
   canary-only, so it did not run on #67781; the breakage only surfaced in
   the nightly `main` runs (red since 2026-05-30).
   
   This widens the trust boundary to the child process **or any of its
   descendants**: a launcher (JVM, shell wrapper, or any runtime that forks a
   worker) legitimately connects back from a descendant rather than the
   launched PID. A process *outside* the spawned subtree is still rejected, so
   the hardening #67781 added is preserved. The ownership lookup is also
   retried briefly to absorb the race where a freshly established connection
   is not yet visible in `/proc`.
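
   The descendant check and the brief retry described above could be sketched
   roughly as follows. This is an illustration only, not the code in this PR:
   every name here (`is_in_subtree`, `owner_with_retry`,
   `connection_is_trusted`, the `parent_of` mapping, the `lookup` callable, and
   the retry defaults) is hypothetical, and the parent map stands in for
   whatever process-table lookup the real implementation performs.

```python
import time
from typing import Callable, Mapping, Optional


def is_in_subtree(pid: int, root_pid: int, parent_of: Mapping[int, int]) -> bool:
    """Return True if `pid` is `root_pid` or one of its descendants.

    `parent_of` is a hypothetical pid -> parent-pid mapping (an assumption
    standing in for a real process-table query).
    """
    seen = set()
    current: Optional[int] = pid
    # Walk up the parent chain; `seen` guards against cycles in stale data.
    while current is not None and current not in seen:
        if current == root_pid:
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False


def owner_with_retry(
    lookup: Callable[[], Optional[int]],
    attempts: int = 5,      # assumed value, not taken from the PR
    delay: float = 0.05,    # assumed value, not taken from the PR
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Call `lookup` until it returns an owning PID or attempts run out."""
    for attempt in range(attempts):
        owner = lookup()
        if owner is not None:
            return owner
        if attempt < attempts - 1:
            # The connection may not be visible yet; wait briefly and retry.
            sleep(delay)
    return None


def connection_is_trusted(
    lookup: Callable[[], Optional[int]],
    root_pid: int,
    parent_of: Mapping[int, int],
    **retry_kwargs,
) -> bool:
    """Accept a peer only if its owner lies inside the spawned subtree."""
    owner = owner_with_retry(lookup, **retry_kwargs)
    return owner is not None and is_in_subtree(owner, root_pid, parent_of)
```

   With a spawned PID of 100, a grandchild 102 (whose parent 101 is a child of
   100) would be accepted, while an unrelated PID 200 would still be rejected,
   which is the property the paragraph above relies on.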
   
   Validated by the canary `Java SDK e2e tests with PROD image` job (forced on
   this PR via the `canary` label), plus unit coverage for the
   descendant-connection and retry paths.
   
   related: #67781
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [X] Yes — Claude Code (Opus 4.8)
   
   Generated-by: Claude Code (Opus 4.8) following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
   

