HyukjinKwon opened a new pull request, #56676:
URL: https://github.com/apache/spark/pull/56676
### What changes were proposed in this pull request?
`test_profile_before_sc_for_connect` creates a `ResourceProfile` over Spark
Connect immediately after `SparkSession.builder.remote(...).getOrCreate()`.
This PR makes the test wait for the Connect server to be ready before doing
so, using the existing `eventually` helper from `pyspark.testing.utils` to
retry a trivial job until it succeeds:
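```python
from pyspark.testing.utils import eventually


def _server_ready() -> bool:
    # A trivial job; it raises while the Connect server is still starting up.
    spark.range(1).count()
    return True


# Retry the trivial job for up to 120 seconds, treating any exception as "not ready yet".
eventually(timeout=120, expected_exceptions=(Exception,))(_server_ready)()
# Resolve the profile ID only after the server has answered a job.
rp.id
```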
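For readers unfamiliar with the helper, the retry-until-ready pattern used above can be sketched in plain Python. This is not PySpark's implementation of `eventually`; `wait_until_ready` and `FlakyServer` are hypothetical names introduced only for illustration, and `FlakyServer` stands in for a Connect server that rejects the first few commands while it starts up.

```python
import time
from typing import Callable, Tuple, Type


def wait_until_ready(
    probe: Callable[[], object],
    timeout: float = 120.0,
    interval: float = 0.1,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> object:
    """Call `probe` until it returns without raising, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return probe()
        except retry_on as exc:
            # Give up once the deadline has passed; otherwise wait and retry.
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"probe did not succeed within {timeout}s"
                ) from exc
            time.sleep(interval)


class FlakyServer:
    """Hypothetical stand-in: fails a fixed number of calls, then succeeds."""

    def __init__(self, failures_before_ready: int) -> None:
        self.remaining = failures_before_ready
        self.calls = 0

    def run_trivial_job(self) -> int:
        self.calls += 1
        if self.remaining > 0:
            self.remaining -= 1
            raise RuntimeError("server not ready")
        return 1


server = FlakyServer(failures_before_ready=2)
result = wait_until_ready(server.run_trivial_job, timeout=5.0, interval=0.01)
```

The probe is deliberately cheap, so a successful return says only that the server can execute a job; the call under test runs afterwards, outside the retry loop.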
### Why are the changes needed?
The scheduled "Build / Python-only, Connect-only (Python 3.11)" build runs
this test in its `Run tests (local-cluster)` step, where the server is started
with `start-connect-server.sh --master "local-cluster[2, 4, 1024]"`. That
script returns before the local-cluster `SparkContext` is fully initialized,
so the first command(s) issued against it can fail server-side.
`test_connect_resources` is the first test in that step, so it races server
startup and fails intermittently (~60% of runs). The failure shows up either
as a bare `java.lang.AssertionError` on `rp.id`, or as
`SparkConnectGrpcException: Application error processing RPC` on the first
job. When the cluster happens to be ready, the test passes (in roughly
22-77s). Waiting for readiness first makes the outcome deterministic.
This is a test-only stabilization. The underlying server behavior (an internal
error leaking on a very-early command before the context is ready) is a
separate, deeper robustness concern and is not addressed here.
### Does this PR introduce _any_ user-facing change?
No. Test-only change.
### How was this patch tested?
Ran the scheduled workflow (`build_python_connect.yml`) on a fork. The
Connect-only build is green end-to-end, including the previously-flaky
`local-cluster` step, on two consecutive runs:
- https://github.com/HyukjinKwon/spark/actions/runs/27969109059 (attempt 1 and
  attempt 2: both `Run tests (local)` and `Run tests (local-cluster)` green)
The default `build_and_test` on this branch is also green:
https://github.com/HyukjinKwon/spark/actions/runs/27973120689
Note: the Connect-only build's `Run tests (local)` step also requires the
import fix in #56644 (SPARK-57598); the validation runs above were performed
on a branch carrying both changes so the `local-cluster` step is reached. This
PR contains only the `test_connect_resources.py` change.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.8)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]