abhinavgautam01 opened a new pull request, #1685:
URL: https://github.com/apache/datafusion-ballista/pull/1685
# Which issue does this PR close?
Closes #1463.
# Rationale for this change
Issue #1463 asked for integration tests to cover **push-staged** scheduling
as well as **pull-staged**. The in-process test cluster was effectively
pull-only: the standalone scheduler used a fixed identity while listening on a
random port, and the standalone executor always used the pull `poll_loop`. That
gap meant we did not exercise the same scheduler/executor path as the default
push policy in real deployments.
In addition, exposing push mode in tests surfaced a real correctness issue.
Push executors report task status via gRPC clients keyed by the scheduler’s
advertised `host:port`, and the hard-coded placeholder did not match the actual
listener, which could cause hangs or failed status updates.
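The mismatch can be illustrated with a minimal sketch. It is not the Ballista implementation: `SchedulerClients`, `get_or_connect`, and the placeholder value are hypothetical names chosen for illustration, and the "client" is just the URL that would be dialed. It only shows why an advertised ID that differs from the bound address sends status updates to the wrong target.

```rust
use std::collections::HashMap;

/// Hypothetical client cache keyed by the scheduler's advertised `host:port`.
/// The stored value stands in for a gRPC client; here it is just the dial URL.
struct SchedulerClients {
    by_id: HashMap<String, String>,
}

impl SchedulerClients {
    fn new() -> Self {
        Self { by_id: HashMap::new() }
    }

    /// Return the dial URL for `scheduler_id`, deriving it from the ID on first use.
    /// Because the ID doubles as the address, a wrong ID means a wrong target.
    fn get_or_connect(&mut self, scheduler_id: &str) -> &str {
        self.by_id
            .entry(scheduler_id.to_string())
            .or_insert_with(|| format!("http://{scheduler_id}"))
    }
}

/// True when the ID carried in task metadata points at the real listener.
fn id_matches_listener(advertised_id: &str, listener_addr: &str) -> bool {
    advertised_id == listener_addr
}
```

With an assumed placeholder such as `localhost:50050` and a listener bound to an ephemeral port, `id_matches_listener` returns `false`, and the cached client dials an address nothing is listening on.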
# What changes are included in this PR?
- **Scheduler (`standalone`)**: bind `127.0.0.1:0` first, then build the
in-memory cluster / `SchedulerServer` with `scheduler_endpoint =
addr.to_string()` so task metadata and curator IDs match the real gRPC listener.
- **Executor (`standalone`)**: add push-staged support using the existing
`executor_server::startup` path; keep pull-staged behavior on `poll_loop` as
before.
- **`executor_server::startup`**: wait until the executor gRPC address
accepts a TCP connection before `register_executor`, reducing a race with
asynchronously started listeners.
- **Client integration tests**: helpers and `rstest` cases for push
(`remote_push`, `remote_state_push`); `context_setup` remote tests
parameterized over `TaskSchedulingPolicy`; align test URLs with `127.0.0.1` for
consistency with the binder.
- **Examples test common**: same `127.0.0.1` host alignment for cluster URLs.
- **Minor**: cast ephemeral `grpc_port` to `u32` for protobuf registration
fields.
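The bind-first, readiness-wait, and port-cast steps in the list above can be sketched with the standard library alone. This is an illustration under assumptions, not the code in this PR: `bind_ephemeral`, `wait_until_accepting`, and `port_as_u32` are hypothetical helper names, and the real change uses the project's async gRPC stack rather than blocking `std::net` calls.

```rust
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

/// Bind an ephemeral loopback port first, then derive the endpoint string
/// from the address that was actually bound (not from a placeholder).
fn bind_ephemeral() -> std::io::Result<(TcpListener, String)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr: SocketAddr = listener.local_addr()?;
    Ok((listener, addr.to_string()))
}

/// Poll until `addr` accepts a TCP connection or `timeout` elapses.
/// Returns true if a connection succeeded before the deadline.
fn wait_until_accepting(addr: &str, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if TcpStream::connect(addr).is_ok() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        // Short back-off between attempts (interval chosen arbitrarily).
        thread::sleep(Duration::from_millis(20));
    }
}

/// Widen the `u16` port losslessly to `u32`, the width a protobuf
/// `uint32` field expects.
fn port_as_u32(addr: &SocketAddr) -> u32 {
    u32::from(addr.port())
}
```

Deriving the endpoint from `local_addr()` after binding is what keeps the advertised identity and the listener in agreement when the OS picks the port.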
Tests: `cargo test -p ballista --features standalone` (with `protoc`
available).
# Are there any user-facing changes?
**Yes, but small and additive.** New public APIs were added on the scheduler
and executor `standalone` modules (policy-aware constructors /
`_with_scheduling_policy` helpers). Existing entry points keep their prior
behavior (e.g. the default standalone scheduler helpers still use the
pull-staged defaults).
No changes to CLI flags or documented user workflows beyond optional use of
these APIs by embedders/tests.
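The additive shape described above might look like the following sketch. The enum is redefined locally so the example is self-contained, and `StandaloneConfig`, `standalone_config`, and `standalone_config_with_scheduling_policy` are hypothetical names, not the functions added in this PR. The point is only that the existing entry point delegates to the new policy-aware one with the pull-staged default.

```rust
/// Local stand-in for the scheduling policy enum (assumed variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TaskSchedulingPolicy {
    PullStaged,
    PushStaged,
}

/// Hypothetical configuration produced by the standalone helpers.
#[derive(Debug, PartialEq, Eq)]
struct StandaloneConfig {
    policy: TaskSchedulingPolicy,
}

/// Existing-style entry point: behavior unchanged, still pull-staged.
fn standalone_config() -> StandaloneConfig {
    standalone_config_with_scheduling_policy(TaskSchedulingPolicy::PullStaged)
}

/// New policy-aware entry point for callers that opt in to push-staged.
fn standalone_config_with_scheduling_policy(policy: TaskSchedulingPolicy) -> StandaloneConfig {
    StandaloneConfig { policy }
}
```

Callers that never mention a policy keep compiling and keep the pull-staged behavior, while tests can opt in with `TaskSchedulingPolicy::PushStaged`.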