r4ntix opened a new issue, #214: URL: https://github.com/apache/arrow-ballista/issues/214
**Describe the bug**

I started the scheduler and executor services on localhost:

```shell
./target/debug/ballista-scheduler -s push-staged --log-level-setting INFO,ballista_scheduler=DEBUG
./target/debug/ballista-executor -c 4 -s push-staged --log-level-setting INFO,ballista_executor=DEBUG
```

Then I ran [examples/src/bin/sql.rs](https://github.com/apache/arrow-ballista/blob/2e1f5d619760d3b7acce225a166a9507f9efe9a1/examples/src/bin/sql.rs), and the executor logged an error:

```shell
2022-09-14T07:18:35.502374Z ERROR tokio-runtime-worker ThreadId(08) ballista_executor::executor_server: Fail to connect to scheduler scheduler_ballista_localhost_50050 due to TonicError(tonic::transport::Error(Transport, hyper::Error(Connect, ConnectError("dns error", Custom { kind: Uncategorized, error: "failed to lookup address information: nodename nor servname provided, or not known" }))))
```

The executor can't connect to the scheduler via `scheduler_ballista_localhost_50050`.

**To Reproduce**

The scheduler sends `scheduler_id` in `LaunchTaskParams` to the executor when it launches a task:

https://github.com/apache/arrow-ballista/blob/2e1f5d619760d3b7acce225a166a9507f9efe9a1/ballista/rust/scheduler/src/state/task_manager.rs#L415-L430

The `scheduler_id` is generated by the scheduler when the service starts, and its value is `format!("scheduler_{}_{}_{}", namespace, external_host, port)`:

https://github.com/apache/arrow-ballista/blob/2e1f5d619760d3b7acce225a166a9507f9efe9a1/ballista/rust/scheduler/src/main.rs#L171

When the executor reports the task status, it calls `get_scheduler_client` and passes the `scheduler_id`:

https://github.com/apache/arrow-ballista/blob/2e1f5d619760d3b7acce225a166a9507f9efe9a1/ballista/rust/executor/src/executor_server.rs#L507-L519

The `scheduler_id` value is `format!("scheduler_{}_{}_{}", namespace, external_host, port)`, which cannot be resolved to an address via DNS:

https://github.com/apache/arrow-ballista/blob/2e1f5d619760d3b7acce225a166a9507f9efe9a1/ballista/rust/executor/src/executor_server.rs#L222-L237

So
the executor throws an error and the task fails.

**Describe the solution you'd like**

Change `scheduler_name` to `format!("{}:{}", opt.external_host, opt.bind_port)`, which defaults to `localhost:50050`. The prefix of the log file name remains `format!("scheduler_{}_{}_{}", namespace, external_host, port)`.
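
To illustrate the difference between the two strings, here is a minimal standalone sketch. It is not the Ballista code: `SchedulerOpts`, `default_opts`, `scheduler_log_prefix`, `scheduler_name`, and `split_host_port` are hypothetical names introduced for this example, and the default values (`ballista`, `localhost`, `50050`) are taken from the log line and the description above. It only shows that the proposed `host:port` form can be split into a host and a port, while the underscore-joined id cannot.

```rust
// Hypothetical stand-in for the scheduler's command-line options.
struct SchedulerOpts {
    namespace: String,
    external_host: String,
    bind_port: u16,
}

// Assumed defaults, matching the values seen in the error log above.
fn default_opts() -> SchedulerOpts {
    SchedulerOpts {
        namespace: "ballista".to_string(),
        external_host: "localhost".to_string(),
        bind_port: 50050,
    }
}

// The current id format: fine as a log file prefix, but not an address.
fn scheduler_log_prefix(opt: &SchedulerOpts) -> String {
    format!(
        "scheduler_{}_{}_{}",
        opt.namespace, opt.external_host, opt.bind_port
    )
}

// The proposed name format: a plain `host:port` pair.
fn scheduler_name(opt: &SchedulerOpts) -> String {
    format!("{}:{}", opt.external_host, opt.bind_port)
}

// Splits `host:port` on the last colon; returns None if the string
// has no colon, an empty host, or a port that is not a valid u16.
fn split_host_port(addr: &str) -> Option<(&str, u16)> {
    let (host, port) = addr.rsplit_once(':')?;
    if host.is_empty() {
        return None;
    }
    Some((host, port.parse().ok()?))
}

fn main() {
    let opt = default_opts();

    let prefix = scheduler_log_prefix(&opt);
    let name = scheduler_name(&opt);

    // The underscore-joined id carries no separable host or port.
    assert_eq!(split_host_port(&prefix), None);
    // The proposed name splits cleanly into host and port.
    assert_eq!(split_host_port(&name), Some(("localhost", 50050)));

    println!("log prefix: {}", prefix);
    println!("scheduler name: {}", name);
}
```

Keeping the two functions separate mirrors the proposal: the executor would receive something it can connect to, while log file naming stays unchanged.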
