dosubot[bot] commented on issue #3036: URL: https://github.com/apache/hugegraph/issues/3036#issuecomment-4517897203
Short answer: **`EtcdMetaDriver` watches can die silently; `PdMetaDriver` watches self-heal.**

***

**`EtcdMetaDriver.listen()` / `listenPrefix()`** pass only a `Consumer<WatchResponse>` to jetcd's Watch client, with no `onError` and no `onCompleted` [[1]](https://github.com/apache/hugegraph/blob/9126c80e414c073b60fc074361b211cfb409e0eb/hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/meta/EtcdMetaDriver.java#L303-L321). Whether the watch auto-reestablishes after a gRPC drop depends entirely on jetcd's internal retry behaviour, which is *not* validated or documented anywhere in the HugeGraph codebase. There is no application-level reconnection, no leader-change detection, and no health monitoring. If jetcd silently gives up (which it can, depending on version and channel config), events stop arriving with zero indication.

**`PdMetaDriver`** is a different story. Its underlying `KvClient` registers a full `StreamObserver<WatchResponse>` with explicit callbacks [[2]](https://github.com/apache/hugegraph/blob/9126c80e414c073b60fc074361b211cfb409e0eb/hugegraph-pd/hg-pd-client/src/main/java/org/apache/hugegraph/pd/client/KvClient.java#L143-L195):

- **`onError`** → resets the client ID and re-invokes `listen()`/`listenPrefix()` via a `BiConsumer` wrapper (fixed 1 s backoff) [[3]](https://github.com/apache/hugegraph/blob/9126c80e414c073b60fc074361b211cfb409e0eb/hugegraph-pd/hg-pd-client/src/main/java/org/apache/hugegraph/pd/client/KvClient.java#L357-L379).
- **`onNext` with `Leader_Changed` state** → same reconnect path.
- **`onCompleted`** → no-op, but defined.

So PD watches *do* self-heal after transport failures and leader elections.

***

**Implication for issue #3036**: The issue is *not* moot. For HStore/etcd mode, a gRPC drop will silently kill the `schema-cache-clear` watch registered in `CachedSchemaTransactionV2`, and nothing in `EtcdMetaDriver` brings it back. That teardown point (jetcd's implicit `onError` / channel-close path) is exactly where a reconnect callback must hook.

The fix has two angles:

1. **Immediate (EtcdMetaDriver):** Switch from the `Consumer`-only overload to jetcd's `Watch.Listener` (or equivalent) that exposes `onError`/`onCompleted`, then re-register the watch on error, mirroring what `KvClient` already does for PD [[2]](https://github.com/apache/hugegraph/blob/9126c80e414c073b60fc074361b211cfb409e0eb/hugegraph-pd/hg-pd-client/src/main/java/org/apache/hugegraph/pd/client/KvClient.java#L143-L195).
2. **API-level (MetaDriver interface):** Surface a `listenReconnect` / `onTransportReconnect` callback so `CachedSchemaTransactionV2.resetMetaListenerForReconnect()` can be wired as a reconnect target without depending on driver internals.

As of today there are no PRs addressing this in `EtcdMetaDriver` [[4]](https://github.com/apache/hugegraph/pull/3011), and no tests covering watch failure/recovery for either driver.
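
To make the two angles above concrete, here is a minimal, dependency-free sketch of the re-register-on-error pattern with a reconnect hook. It is not HugeGraph or jetcd code: `WatchSource`, `ResilientWatch`, `fixedBackoff`, and the `retryLater` parameter are hypothetical names introduced only for illustration, and `WatchSource` merely stands in for any watch client that exposes error and completion callbacks.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in for a watch client that reports stream errors and completion.
interface WatchSource<T> {
    void watch(String key, Consumer<T> onNext, Consumer<Throwable> onError, Runnable onCompleted);
}

// Hypothetical wrapper: re-registers the watch after each error, then runs a reconnect hook.
final class ResilientWatch<T> {
    private final WatchSource<T> source;
    private final String key;
    private final Consumer<T> onEvent;
    private final Runnable onReconnect;        // e.g. a cache reset, since events may have been missed
    private final Consumer<Runnable> retryLater; // decides when the retry task runs
    private volatile boolean closed = false;

    ResilientWatch(WatchSource<T> source, String key, Consumer<T> onEvent,
                   Runnable onReconnect, Consumer<Runnable> retryLater) {
        this.source = source;
        this.key = key;
        this.onEvent = onEvent;
        this.onReconnect = onReconnect;
        this.retryLater = retryLater;
    }

    // Fixed-delay retry on a single daemon thread (assumed policy, analogous to a 1 s backoff).
    static Consumer<Runnable> fixedBackoff(long millis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "watch-retry");
            t.setDaemon(true);
            return t;
        });
        return task -> scheduler.schedule(task, millis, TimeUnit.MILLISECONDS);
    }

    void start() {
        register();
    }

    // After close(), errors no longer trigger re-registration.
    void close() {
        closed = true;
    }

    private void register() {
        if (closed) {
            return;
        }
        // onCompleted is a deliberate no-op here, but it is defined.
        source.watch(key, onEvent, this::handleError, () -> { });
    }

    private void handleError(Throwable cause) {
        if (closed) {
            return;
        }
        retryLater.accept(() -> {
            if (closed) {
                return;
            }
            register();        // bring the watch back
            onReconnect.run(); // let the caller recover from events lost during the outage
        });
    }
}
```

In a real driver, `retryLater` would typically be something like `ResilientWatch.fixedBackoff(1000)`, and the `onReconnect` hook is the kind of place where a method such as `CachedSchemaTransactionV2.resetMetaListenerForReconnect()` could be wired. Whether jetcd already retries internally for a given version would still need to be checked, so that the two mechanisms do not register duplicate watches.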
--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
