Alchuang22-dev opened a new pull request, #16647:
URL: https://github.com/apache/iotdb/pull/16647
## Description
This PR refactors the AINode client infrastructure to support direct
communication between DataNode and AINode, removing the dependency on
ConfigNode for AI-related operations such as model loading and inference.
Should be reviewed by @CRZbulabula
## Contents
`AINodeClient`
- Added a new executeRemoteCallWithRetry() method for automatic retry and
reconnection on Thrift transport failures, following the same design pattern as
ConfigNodeClient.
- Updated the loadModel(TLoadModelReq req) API to use this retry wrapper for
improved resilience.
- Simplified connection lifecycle management (init(), close()) to ensure
stable client reuse via AINodeClientManager.
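
The retry-and-reconnect behaviour described above could look roughly like the sketch below. It is an illustration only, not the code in this PR: `RetrySketch`, `RemoteCall`, `executeWithRetry`, the `reconnect` hook, and the attempt limit are hypothetical names, and `IOException` stands in for a Thrift transport failure so the example runs without Thrift on the classpath.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class RetrySketch {
    /** Hypothetical stand-in for one remote call that may fail at the transport layer. */
    @FunctionalInterface
    public interface RemoteCall<T> {
        T call() throws IOException;
    }

    /**
     * Runs the call, and on a transport failure reconnects and tries again,
     * up to maxAttempts times. The last failure is rethrown if all attempts fail.
     */
    public static <T> T executeWithRetry(RemoteCall<T> call, Runnable reconnect, int maxAttempts)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                last = e;
                // Only reconnect if another attempt will follow.
                if (attempt < maxAttempts) {
                    reconnect.run();
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        AtomicInteger calls = new AtomicInteger();
        // Simulated request: fails twice, then succeeds on the third attempt.
        String result = executeWithRetry(
                () -> {
                    if (calls.incrementAndGet() < 3) {
                        throw new IOException("transport closed");
                    }
                    return "loaded";
                },
                () -> { /* a real client would reopen its transport here */ },
                5);
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

Keeping the retry loop in one wrapper means each API method (loadModel today, others later) only supplies the call itself.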
`ClusterConfigTaskExecutor`
- Replaced indirect ConfigNode RPCs with direct calls to
AINodeClientManager.borrowClient(TEndPoint) for model operations (currently
loadModel as an example).
- Ensured the DataNode→AINode invocation flow mirrors the ConfigNode client
style while maintaining compatibility with existing client pooling.
- Updated Thrift imports to use org.apache.iotdb.ainode.rpc.thrift.* instead
of org.apache.iotdb.confignode.rpc.thrift.*.
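
As a rough picture of the borrow-and-call flow described above, the sketch below uses a toy pool keyed by endpoint. Everything in it is hypothetical: `ClientPool` and `PooledClient` only imitate the roles of AINodeClientManager and AINodeClient, a plain `"ip:port"` string replaces TEndPoint, the address is a placeholder, and `loadModel` just returns a string instead of issuing an RPC.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class DirectCallSketch {
    /** Hypothetical pooled client; closing it returns it to the pool. */
    static final class PooledClient implements AutoCloseable {
        private final String endPoint;
        private final ClientPool pool;

        PooledClient(String endPoint, ClientPool pool) {
            this.endPoint = endPoint;
            this.pool = pool;
        }

        String loadModel(String modelId) {
            // A real client would send a Thrift request here.
            return "loaded " + modelId + " on " + endPoint;
        }

        @Override
        public void close() {
            pool.giveBack(endPoint, this);
        }
    }

    /** Hypothetical endpoint-keyed pool that reuses idle clients. */
    static final class ClientPool {
        private final Map<String, Deque<PooledClient>> idle = new HashMap<>();
        int created = 0;

        synchronized PooledClient borrowClient(String endPoint) {
            Deque<PooledClient> queue = idle.computeIfAbsent(endPoint, k -> new ArrayDeque<>());
            PooledClient client = queue.poll();
            if (client == null) {
                created++;
                client = new PooledClient(endPoint, this);
            }
            return client;
        }

        synchronized void giveBack(String endPoint, PooledClient client) {
            idle.computeIfAbsent(endPoint, k -> new ArrayDeque<>()).push(client);
        }
    }

    /** Borrow a client for the endpoint, call it directly, and return it on exit. */
    static String loadModelDirect(ClientPool pool, String endPoint, String modelId) {
        try (PooledClient client = pool.borrowClient(endPoint)) {
            return client.loadModel(modelId);
        }
    }

    public static void main(String[] args) {
        ClientPool pool = new ClientPool();
        String endPoint = "127.0.0.1:10810"; // placeholder address
        System.out.println(loadModelDirect(pool, endPoint, "demo"));
        System.out.println(loadModelDirect(pool, endPoint, "demo"));
        System.out.println("clients created: " + pool.created);
    }
}
```

The try-with-resources block is what keeps the pool consistent: the client goes back even if the call throws, so the second call reuses the first client.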
`AINodeClientManager`
- No functional changes; reused existing pool management for TEndPoint-based
clients to keep consistency with ConfigNodeClientManager.
## Impact
DataNode can now send AI-related requests (e.g., model load/unload,
inference) directly to AINode without routing through ConfigNode. In this PR,
only loadModel has been switched to the direct path.
## Next Steps
Extend the same direct invocation pattern
(AINodeClientManager.borrowClient()) to other AI APIs:
unloadModel, showModel, showLoadedModel, showAIDevices, createTraining, and
getModelInfo.
<hr>
This PR has:
- [x] been self-reviewed.
    - [ ] concurrent read
    - [ ] concurrent write
    - [ ] concurrent read and write
- [ ] added documentation for new or modified features or behaviors.
- [ ] added Javadocs for most classes and all non-trivial methods.
- [ ] added or updated version, __license__, or notice information
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage.
- [ ] added integration tests.
- [ ] been tested in a test IoTDB cluster.
<hr>
##### Key changed/added classes (or packages if there are too many classes) in this PR
Same as the classes listed under Contents above.