JackieTien97 opened a new pull request, #17794: URL: https://github.com/apache/iotdb/pull/17794
## Problem

Per [the root-cause analysis](https://timechor.feishu.cn/docx/UPU1dVSN8ocBNDx27c8cWnaLnYc): `FragmentInstanceDispatcherImpl.dispatchRemote` retries the **same** `FragmentInstance` once after a `TException`. A `TException` only means the client didn't receive the response; the server may have already executed the FI. After the first execution finishes it runs `releaseResource()` (`dataRegion = null`), but its `FragmentInstanceContext` stays cached in `FragmentInstanceManager.instanceContext` (~5 min) while `instanceExecution` is removed. The retry hits `instanceContext.computeIfAbsent`, **reuses the released context**, and a fresh (ALIVE) driver dereferences the null `dataRegion` in `init()` → **NPE**. The single-execution guards don't help because this is cross-execution reuse.

## Changes

- **`TSStatusCode`**: add `REPEATED_RPC_CALL(723)` (intentionally not in `NEED_RETRY`).
- **`FragmentInstanceManager`** (data + schema paths): when `instanceContext.computeIfAbsent` would reuse an existing context for the same `instanceId`, throw `IoTDBRuntimeException(REPEATED_RPC_CALL)` **before** the planning `try` block, so it propagates up cleanly without invoking `clearFIRelatedResources`/`createFailedInstanceInfo` on the first execution's cached resources.
- **`RegionReadExecutor`**: in both `catch` blocks, carry an `IoTDBRuntimeException`'s status code back so `REPEATED_RPC_CALL` reaches the dispatcher (instead of being downgraded to `EXECUTE_STATEMENT_ERROR`); `needRetryHelper` keeps it non-retryable.
- **`FragmentInstanceDispatcherImpl`**: before retrying `dispatchRemoteHelper`, if the query has already timed out, fail fast with a `QUERY_TIMEOUT` status wrapped in `FragmentInstanceDispatchException` instead of re-dispatching.
- **`ErrorHandlingUtils`**: map `QueryTimeoutRuntimeException` to `QUERY_TIMEOUT`.
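For reviewers who want the idea behind the `FragmentInstanceManager` guard in isolation, here is a minimal sketch. It is not the code in this PR: `RepeatedCallGuardSketch`, `StatusException`, `Context`, and `createContextOrReject` are hypothetical stand-ins, and the sketch only assumes that the cache is a concurrent map keyed by instance id. It shows one way to tell "this call created the context" apart from "`computeIfAbsent` returned a cached one" and to reject the latter with a non-retryable status.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the manager's context cache; not IoTDB code. */
final class RepeatedCallGuardSketch {

  /** Mirrors the numeric value of REPEATED_RPC_CALL(723) named in this PR. */
  static final int REPEATED_RPC_CALL = 723;

  /** Hypothetical exception carrying a status code (stand-in for IoTDBRuntimeException). */
  static final class StatusException extends RuntimeException {
    final int statusCode;

    StatusException(String message, int statusCode) {
      super(message);
      this.statusCode = statusCode;
    }
  }

  /** Hypothetical per-instance context (stand-in for FragmentInstanceContext). */
  static final class Context {
    final String instanceId;

    Context(String instanceId) {
      this.instanceId = instanceId;
    }
  }

  private final Map<String, Context> instanceContext = new ConcurrentHashMap<>();

  /**
   * Creates a context for a first-time instance id. If a context is already cached for the
   * same id, the call is treated as a repeated RPC and rejected instead of reusing it.
   */
  Context createContextOrReject(String instanceId) {
    AtomicBoolean createdHere = new AtomicBoolean(false);
    Context context =
        instanceContext.computeIfAbsent(
            instanceId,
            id -> {
              // The mapping function runs only when no context exists for this id.
              createdHere.set(true);
              return new Context(id);
            });
    if (!createdHere.get()) {
      // Thrown before any planning work, so no cleanup touches the cached context.
      throw new StatusException("Repeated RPC call for " + instanceId, REPEATED_RPC_CALL);
    }
    return context;
  }
}
```

In this sketch the first call for an id returns a fresh context and a second call with the same id throws with status 723, which a caller can then surface as non-retryable.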
## Test

- New `RegionReadExecutorTest#testRepeatedRpcCall` covers both the consensus-read and VirtualDataRegion paths, asserting the response carries `REPEATED_RPC_CALL` and `readNeedRetry == false`.
- `mvn test -pl iotdb-core/datanode -Dtest=RegionReadExecutorTest` → 6 passed.
- `mvn compile -pl iotdb-core/datanode` (incl. spotless:check) → BUILD SUCCESS.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
