CRZbulabula opened a new pull request, #16819:
URL: https://github.com/apache/iotdb/pull/16819

   This PR introduces significant improvements to model storage, model loading, 
and inference pipeline management for better extensibility, efficiency, and 
ease of use. The changes include refactoring model storage to support a wider 
range of models, streamlining the model loading process, and introducing a 
unified inference pipeline. Together, these changes simplify model management, 
reduce memory usage, and improve the overall inference workflow.
   
   Model Storage Refactoring
   
   Extended Support for Models: The system now supports not only built-in 
models such as TimerXL and Sundial, but also fine-tuned and user-defined 
models.
   Unified Model Management: A new model management system enables model 
registration, deletion, and loading from both local paths and Hugging Face 
(see the sketch below).
   Code Optimization: Redundant code from previous versions has been removed, 
and hard-coded model management has been replaced by a more flexible approach 
that integrates seamlessly with the Hugging Face Transformers ecosystem.
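   A minimal sketch of what such a unified registry could look like is shown 
below. All names here (ModelRegistry, register, delete, load, and the example 
model sources) are illustrative assumptions for this sketch, not the actual 
classes or APIs introduced by the PR.

```python
# Illustrative sketch of a unified model registry; the class, its methods,
# and the example model sources are assumptions, not the PR's actual code.
import os


class ModelRegistry:
    """Tracks models by id: built-in, fine-tuned, or user-defined."""

    def __init__(self):
        # model_id -> source (local directory or Hugging Face repo id)
        self._models = {}

    def register(self, model_id: str, source: str) -> None:
        """Register a model from a local path or a Hugging Face repo id."""
        self._models[model_id] = source

    def delete(self, model_id: str) -> None:
        """Remove a model from the registry."""
        self._models.pop(model_id, None)

    def load(self, model_id: str):
        """Load a registered model through the Transformers auto classes."""
        from transformers import AutoModel  # imported lazily on first use

        source = self._models[model_id]
        # from_pretrained accepts both local directories and Hugging Face
        # repo ids, so one code path covers both kinds of sources.
        return AutoModel.from_pretrained(source)


# Usage: register a built-in and a user-defined model, then load one.
registry = ModelRegistry()
registry.register("sundial", "some-org/sundial")  # repo id is a placeholder
registry.register("my-finetuned", os.path.expanduser("~/models/my-finetuned"))
model = registry.load("my-finetuned")
```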
   Model Loading Refactoring
   
   Simplified Model Loading: The previous custom loading logic, with its 
complex if...else... branches, has been replaced by a unified model loading 
interface.
   Automatic Model Type Detection: The system now automatically detects the 
model type and selects the appropriate loading method, supporting models from 
Transformers, sktime, and PyTorch (see the sketch below).
   Lazy Loading: The PR introduces lazy loading for Python modules, so modules 
are imported only when needed rather than all at startup, reducing 
initialization time and memory consumption.
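   The sketch below illustrates how detection and lazy loading can work 
together: each backend (Transformers, sktime, PyTorch) is imported only inside 
the branch that needs it. The file-based detection heuristics and the function 
name are assumptions made for this example, not the PR's implementation.

```python
# Illustrative sketch only: the file-based detection rules and the function
# name are assumptions for this example, not the PR's actual logic.
import os


def load_model(model_dir: str):
    """Pick a loading backend based on the files found in model_dir.

    Heavy libraries (transformers, sktime, torch) are imported inside the
    branch that uses them, so only the required backend is loaded.
    """
    if os.path.exists(os.path.join(model_dir, "config.json")):
        # Hugging Face Transformers layout: config.json plus weight files.
        from transformers import AutoModel

        return AutoModel.from_pretrained(model_dir)

    zips = [f for f in os.listdir(model_dir) if f.endswith(".zip")]
    if zips:
        # sktime estimators saved with .save() are zip archives;
        # sktime.base.load expects the path without the .zip suffix.
        from sktime.base import load

        return load(os.path.join(model_dir, zips[0][: -len(".zip")]))

    # Otherwise fall back to a plain PyTorch checkpoint (.pt / .pth).
    import torch

    ckpt = next(f for f in os.listdir(model_dir) if f.endswith((".pt", ".pth")))
    return torch.load(os.path.join(model_dir, ckpt), map_location="cpu")
```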
   Inference Pipeline Addition
   
   Unified Inference Workflow: The new Inference Pipeline encapsulates the 
entire model inference process, offering a standardized interface for 
preprocessing, inference, and post-processing (see the sketch below).
   Support for Multiple Tasks: The pipeline supports a range of inference 
tasks, including prediction, classification, and dialogue.
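
   A small sketch of the pipeline idea follows: one object standardizes the 
preprocess -> infer -> postprocess flow, and task-specific pipelines subclass 
it. The class and method names are illustrative assumptions, not the PR's 
actual API.

```python
# Illustrative sketch; the class names, hooks, and the ForecastPipeline
# example are assumptions for this sketch, not the PR's actual classes.
from abc import ABC, abstractmethod


class InferencePipeline(ABC):
    """Standardizes preprocessing, inference, and post-processing."""

    def __init__(self, model):
        self.model = model

    def run(self, raw_inputs):
        inputs = self.preprocess(raw_inputs)
        outputs = self.infer(inputs)
        return self.postprocess(outputs)

    @abstractmethod
    def preprocess(self, raw_inputs): ...

    @abstractmethod
    def infer(self, inputs): ...

    @abstractmethod
    def postprocess(self, outputs): ...


class ForecastPipeline(InferencePipeline):
    """One concrete task: time-series prediction."""

    def preprocess(self, raw_inputs):
        # e.g. normalize and window the input series (placeholder here)
        return raw_inputs

    def infer(self, inputs):
        # assumes the wrapped model exposes a predict() method
        return self.model.predict(inputs)

    def postprocess(self, outputs):
        # e.g. de-normalize the predictions (placeholder here)
        return outputs
```

   Classification or dialogue tasks would follow the same pattern with their 
own preprocess and postprocess implementations.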

