miantalha45 opened a new pull request, #17181:
URL: https://github.com/apache/iotdb/pull/17181
## Description
Add automatic reconnection to the IoTDB CLI when the connection to the
server is lost during an interactive session (e.g. server restart, network
blip, or idle timeout). The CLI no longer exits immediately on
connection-related errors; it attempts to reconnect with the same parameters
and retries the failed command, aligning behavior with the Session API, JDBC,
and C++/Python clients.
### Content1 — Detection and reconnection flow
- **Detection**: Connection loss is detected when a command fails with a
connection-related `SQLException`. We treat an exception as connection-related
if its message or its cause's message, lowercased, contains any of: `connection`,
`refused`, `timeout`, `closed`, `reset`, `network`, `broken pipe`. This logic
lives in `AbstractCli.isConnectionRelated(SQLException)` and
`matchesConnectionFailure(String)` so it can be reused.
- **Reconnection**: On such a failure, the CLI closes the current connection
and opens a new one using the same parameters (host, port, user, password, and
options) via `DriverManager.getConnection` and the existing `info` properties.
Helper methods `openConnection()`, `setupConnection()`, and
`closeConnectionQuietly()` in `Cli` encapsulate open/setup/close so the main
loop stays clear.
- **Retry**: After a successful reconnection, the same user command (the
current line that failed) is retried with the new connection. We retry
reconnection up to **3** times with a **1 s** delay between attempts (no delay
before the first attempt). Constants `RECONNECT_RETRY_NUM` and
`RECONNECT_RETRY_INTERVAL_MS` in `Cli` control this; they are not yet
user-configurable.
- **Feedback**: On successful reconnection we print: `Connection lost.
Reconnected. Retrying command.` If all reconnection attempts fail we print:
`IoTDB: Could not reconnect after 3 attempts. Please check that the server is
running and try again.` and exit with an error code.
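
The detection heuristic described above can be sketched roughly as follows. This is an illustration based only on the description, not the code in this PR: the class name `ConnectionFailureSketch` is hypothetical, and checking only the direct cause (rather than the whole cause chain) is an assumption.

```java
import java.sql.SQLException;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the detection helpers described in AbstractCli.
final class ConnectionFailureSketch {

  // Keywords taken from the description above.
  private static final List<String> KEYWORDS =
      List.of("connection", "refused", "timeout", "closed", "reset", "network", "broken pipe");

  private ConnectionFailureSketch() {}

  /** True if the exception's message, or its direct cause's message, contains a keyword. */
  static boolean isConnectionRelated(SQLException e) {
    if (e == null) {
      return false;
    }
    if (matchesConnectionFailure(e.getMessage())) {
      return true;
    }
    Throwable cause = e.getCause();
    return cause != null && matchesConnectionFailure(cause.getMessage());
  }

  /** Case-insensitive substring match against the keyword list; null never matches. */
  private static boolean matchesConnectionFailure(String message) {
    if (message == null) {
      return false;
    }
    String lower = message.toLowerCase(Locale.ROOT);
    for (String keyword : KEYWORDS) {
      if (lower.contains(keyword)) {
        return true;
      }
    }
    return false;
  }
}
```

Because this is plain substring matching, any error whose message happens to contain one of the keywords (for example a query-level `timeout`) would also be classified as connection-related under this sketch.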
### Content2 — Class and method organization
- **AbstractCli**: Added `isConnectionRelated(SQLException)`
(package-private static) and `matchesConnectionFailure(String)` (private
static) for shared detection. In `executeQuery`, `setTimeZone`, and
`showTimeZone`, we catch `SQLException` (or `Exception` where the API does not
throw `SQLException`) and rethrow it when `isConnectionRelated(e)` returns true;
otherwise we keep the existing “print error and return error code” behavior.
`handleInputCmd` and `processCommand` now declare `throws SQLException` so
connection failures propagate to the CLI loop instead of being swallowed.
- **Cli**: Introduced `ReadLineResult` (inner class with `stop`,
`failedCommand`) and factory methods `continueLoop()`, `stopLoop()`,
`reconnectAndRetry(String)` so the read-eval loop can signal “continue”,
“exit”, or “reconnect and retry this command”. `receiveCommands()` no longer
uses try-with-resources for the connection; it holds the connection in a
variable, and when `readerReadLine()` returns a result with `failedCommand !=
null`, it runs the reconnect loop (close → retry open/setup → print message →
retry command). `readerReadLine()` wraps `processCommand()` in a try-catch: on a
connection-related `SQLException` it returns `reconnectAndRetry(s)` with the
current line; on any other `SQLException` it prints the error and returns
`stopLoop()`.
- **AbstractCliTest**: `testHandleInputInputCmd()` now declares `throws
SQLException` and imports `java.sql.SQLException` so it compiles with the
updated `handleInputCmd` signature.
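
A minimal sketch of the three-way loop signal described for `Cli` is shown below. The class name `ReadLineResultSketch` is hypothetical, and the choice that `reconnectAndRetry` leaves `stop` set to `false` is an assumption; the description only names the two fields and the three factory methods.

```java
// Hypothetical stand-in for the ReadLineResult inner class described above.
final class ReadLineResultSketch {

  /** True when the read-eval loop should exit. */
  final boolean stop;

  /** Non-null when the loop should reconnect and then retry this command. */
  final String failedCommand;

  private ReadLineResultSketch(boolean stop, String failedCommand) {
    this.stop = stop;
    this.failedCommand = failedCommand;
  }

  /** Keep reading commands on the current connection. */
  static ReadLineResultSketch continueLoop() {
    return new ReadLineResultSketch(false, null);
  }

  /** Exit the loop. */
  static ReadLineResultSketch stopLoop() {
    return new ReadLineResultSketch(true, null);
  }

  /** Ask the caller to reconnect and re-run the given command (assumed not to stop the loop). */
  static ReadLineResultSketch reconnectAndRetry(String command) {
    return new ReadLineResultSketch(false, command);
  }
}
```

With a result type like this, the caller can branch on `failedCommand != null` first and on `stop` second, which matches the order in which `receiveCommands()` is described as inspecting the result.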
### Content3 — Corner cases and alternatives
- **Corner cases**: If reconnection succeeds but the retried command fails
again with a connection-related error, the outer loop will see another
`reconnectAndRetry` and run the same reconnect/retry flow again (each time with
up to 3 reconnect attempts). Non-connection `SQLException`s still print the
error and stop the loop (exit) as before. Interrupt and EOF handling in
`readerReadLine()` are unchanged.
- **Alternatives considered**: (1) Reconnect without retrying the failed
command—simpler but worse UX. (2) Prompt “Reconnect? (y/n)”—gives control but
adds friction and is less script-friendly. (3) Leave current behavior—rejected
to align CLI with other clients and improve long-lived session UX.
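
The bounded reconnect that each failure re-enters (up to 3 attempts, a 1 s pause between attempts, and no pause before the first) could look roughly like the sketch below. The class `ReconnectSketch`, the generic `reconnect` helper, and the use of `Callable` in place of the real `openConnection()`/`setupConnection()` calls are all assumptions made for illustration; only the two constant names and their values come from the description.

```java
import java.util.concurrent.Callable;

// Hypothetical illustration of the bounded reconnect loop; not the code in this PR.
final class ReconnectSketch {

  static final int RECONNECT_RETRY_NUM = 3;
  static final long RECONNECT_RETRY_INTERVAL_MS = 1000L;

  private ReconnectSketch() {}

  /**
   * Calls opener up to maxAttempts times, sleeping intervalMs before every attempt except the
   * first. Returns the opened connection, or null if all attempts failed or the thread was
   * interrupted. The opener is expected to return a non-null value on success.
   */
  static <C> C reconnect(Callable<C> opener, int maxAttempts, long intervalMs) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        if (attempt > 1) {
          Thread.sleep(intervalMs);
        }
        return opener.call();
      } catch (InterruptedException e) {
        // Preserve the interrupt status and give up.
        Thread.currentThread().interrupt();
        return null;
      } catch (Exception e) {
        // This attempt failed; fall through to the next one.
      }
    }
    return null;
  }
}
```

A caller would pass something like `() -> DriverManager.getConnection(url, info)` as the opener and treat a `null` result as "could not reconnect", printing the failure message and exiting.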
<hr>
This PR has:
- [x] been self-reviewed.
- [x] concurrent read
- [x] concurrent write
- [x] concurrent read and write
- [ ] added documentation for new or modified features or behaviors.
- [x] added Javadocs for most classes and all non-trivial methods.
- [ ] added or updated version, __license__, or notice information
- [x] added comments explaining the "why" and the intent of the code
wherever it would not be obvious to an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for code coverage.
- [ ] added integration tests.
- [ ] been tested in a test IoTDB cluster.
<hr>
##### Key changed/added classes (or packages if there are too many classes) in this PR
- `org.apache.iotdb.cli.AbstractCli` — `isConnectionRelated`,
`matchesConnectionFailure`; rethrow connection-related `SQLException` in
`executeQuery`, `setTimeZone`, `showTimeZone`; `handleInputCmd`,
`processCommand` now `throws SQLException`
- `org.apache.iotdb.cli.Cli` — `ReadLineResult`, `openConnection()`,
`setupConnection()`, `closeConnectionQuietly()`; refactored `receiveCommands()`
and `readerReadLine()` for reconnect-and-retry flow
- `org.apache.iotdb.cli.AbstractCliTest` — `testHandleInputInputCmd()`
updated for `throws SQLException`