Dev-iL opened a new pull request, #68169:
URL: https://github.com/apache/airflow/pull/68169
closes: https://github.com/apache/airflow/issues/68160
Java SDK e2e tasks have been failing in CI: every Java task dies during
startup with `RuntimeError: process exited with 1 before connecting`, while
Python tasks in the same DAG succeed.
### Root cause
The subprocess coordinator added a TCP connection-ownership check in #67781
(refined in #68147): before accepting the child's `comm`/`logs` channels it
verifies the connecting peer belongs to the child's process tree, by matching
the accepted connection's address against what `psutil` reports for that
process.
The supervisor's listening socket is `AF_INET` (`socket.socket()` bound to
`127.0.0.1`), so `getpeername()`/`getsockname()` report **plain IPv4**
(`127.0.0.1`). On an IPv6-enabled host the JVM connects back over a
**dual-stack** socket, so the kernel records its loopback connection as the
**IPv4-mapped** form `::ffff:127.0.0.1` in `/proc/net/tcp6`, which `psutil`
returns verbatim. The two never compare equal, so both channels are rejected,
the child's sockets are closed, and it exits 1.
Python tasks are unaffected because the Python runtime connects back over a
plain IPv4 socket, which matches the server's address directly. This is why the
DAG run shows `extract: failed` (the Java task) next to `python_task_1: success`,
with the remaining tasks `upstream_failed`.
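The mismatch described above can be observed without a JVM. The following is a minimal sketch, not code from this PR: `observe_dual_stack_loopback` is a hypothetical helper, and it uses the client's `getsockname()` as a stand-in for what `psutil` would report for the child's socket. It assumes the host allows dual-stack `AF_INET6` sockets and returns `None` otherwise.

```python
import socket


def observe_dual_stack_loopback():
    """Connect a dual-stack client to an AF_INET listener on loopback.

    Returns (peer host as seen by the server, local host as seen by the
    client), or None if the host cannot create a dual-stack IPv6 socket.
    Hypothetical helper for illustration only.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        try:
            client = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
        except OSError:
            return None  # IPv6 not available on this host
        try:
            # Allow IPv4 traffic over the IPv6 socket (dual-stack).
            client.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
            client.connect(("::ffff:127.0.0.1", port))
            conn, peer = server.accept()
            conn.close()
            return peer[0], client.getsockname()[0]
        except OSError:
            return None  # dual-stack connect not permitted here
        finally:
            client.close()
    finally:
        server.close()


if __name__ == "__main__":
    print(observe_dual_stack_loopback())
```

On a host where the connection succeeds, the two strings should differ in representation even though they describe the same loopback connection, which is the condition that caused the ownership check to reject the JVM.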
### Fix
Canonicalize IPv4-mapped IPv6 addresses to their IPv4 form in
`_socket_address`, the single helper through which **both** sides of the
comparison pass. `::ffff:127.0.0.1` and `127.0.0.1` now compare equal, so a
dual-stack client (the JVM, or any runtime that connects over an `AF_INET6`
socket) is matched correctly.
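As an illustration of the normalization step, here is a minimal sketch using the standard-library `ipaddress` module. It is not the actual body of `_socket_address`; the function name `canonicalize_host` is hypothetical, and the real helper may handle ports and other address forms differently.

```python
import ipaddress


def canonicalize_host(host: str) -> str:
    """Return the IPv4 form of an IPv4-mapped IPv6 address.

    Any other input (plain IPv4, native IPv6, or a string that does not
    parse as an IP address) is returned unchanged. Hypothetical helper.
    """
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return host  # not an IP literal; leave it alone
    if isinstance(addr, ipaddress.IPv6Address) and addr.ipv4_mapped is not None:
        return str(addr.ipv4_mapped)
    return host


if __name__ == "__main__":
    # Both sides of the comparison go through the same normalization.
    print(canonicalize_host("::ffff:127.0.0.1") == canonicalize_host("127.0.0.1"))
```

Applying the same function to both the accepted connection's address and the address reported for the child process means only the representation changes; a native IPv6 address such as `::1` is left as-is and would still not match `127.0.0.1`.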
The security property from #67781 is preserved: the connection must still be
on loopback and owned by the child's process tree. Only the address
*representation* is normalized, not what is accepted.
The two Java SDK e2e tests that were temporarily marked `xfail` as a CI
workaround for this issue are re-enabled in the same change.
### Verification
- Added `test_matches_dual_stack_ipv4_mapped_connection`, which connects via
an `AF_INET6` socket to reproduce the JVM's v4-mapped address. It fails against
the old code (exact reproduction of the rejection) and passes with the fix.
- Full coordinator suite passes, including the existing rejection tests
(`test_rejects_racing_connection_from_other_process`,
`test_rejects_tcp_connection_not_owned_by_child_process`), which confirms the
ownership guard still rejects connections outside the child's process tree.
- Drove a real child process connecting back over a dual-stack socket
through `_accept_connections` (the function in the original traceback): it now
accepts the connection instead of timing out.
- `ruff` and `mypy-task-sdk` clean.
---
##### Was generative AI tooling used to co-author this PR?
- [X] Yes — Claude Code (Opus 4.8)
Generated-by: Claude Code (Opus 4.8) following [the
guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]