chiruu12 commented on issue #183:
URL: 
https://github.com/apache/incubator-hugegraph-ai/issues/183#issuecomment-2692519805

   > I looked further into what [@chiruu12](https://github.com/chiruu12) suggests about not using off-the-shelf agentic components, since they can prevent developers from understanding critical behaviors and let the service's behavior drift out of control over time. Indeed, it would be possible to write a custom HG_agentic library that borrows only the necessary pieces, so HugeGraph can maintain control over the logic and integration details.
   > 
   > How about combining all of our suggestions by integrating LlamaIndex, Pydantic-AI, CrewFlow, and Agno into a dual-mode, modular GraphRAG system, while avoiding the dependency hell warned about by [@Aryankb](https://github.com/Aryankb)?
   > 
   > As a hybrid GraphRAG system that supports two modes, we can include:
   > 
   > * A beginner-friendly “agentic retriever”, pre-configured with robust LLM prompting for straightforward use cases, and
   > * A customizable mode for advanced developers who need to tailor retrieval, orchestration, and validation mechanisms.
   > 
   > Key design principles, so that everyone can get a good night's sleep:
   > 
   > * Modularity & Microservices: standalone services with clearly defined APIs.
   > * Dual-Mode Operation: ease of use and deep customization.
   > * Transparent Integration: extracting core functionalities and integrating them in-house.
   > * Extensive Logging & Monitoring: via Prometheus (see the metrics sketch below).
   > * Containerization: isolate dependencies.
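   > 
   > For the monitoring principle, a minimal sketch of a per-service hook using the standard prometheus_client library; the metric names and the `:9100` port are illustrative assumptions, not a fixed contract:
   > 
   > ```python
   > # Sketch: per-service metrics for Prometheus to scrape.
   > # Metric names and the port are illustrative assumptions.
   > from prometheus_client import Counter, Histogram, start_http_server
   > 
   > QUERIES = Counter("hg_queries_total", "Queries handled", ["service", "status"])
   > LATENCY = Histogram("hg_query_seconds", "Query latency in seconds", ["service"])
   > 
   > def observed(service: str, fn, *args, **kwargs):
   >     """Run fn under a latency timer and count success/failure."""
   >     with LATENCY.labels(service=service).time():
   >         try:
   >             result = fn(*args, **kwargs)
   >             QUERIES.labels(service=service, status="ok").inc()
   >             return result
   >         except Exception:
   >             QUERIES.labels(service=service, status="error").inc()
   >             raise
   > 
   > if __name__ == "__main__":
   >     start_http_server(9100)  # expose metrics at :9100/metrics
   > ```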
   > 
   > Architectural Layers & Components:
   > 
   > A. Base Layer – Agno for L1 Queries
   > 
   > Handles high-frequency, low-latency queries (e.g., simple entity lookups) with optimized parallel execution. Beyond this point, we must also find the right strategy for handling LN queries.
   > 
   > Key Features:
   > 
   > * Fast execution with a low memory footprint.
   > * Built-in Gremlin–Cypher transpiler for hybrid query support.
   > * Integration with a hybrid caching layer that combines Agno shared memory and RocksDB.
   > * Wrap Agno's core query engine in a microservice that exposes an HTTP endpoint (see the sketch below).
   > * L1-specific: queries can be configured to pass through a lightweight pre-processing step that selects between cache and live query execution.
   > 
   > This component, once abstracted into our own agentic library, will be the base for all performance optimizations.
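   > 
   > A minimal sketch of that microservice wrapper, assuming FastAPI for the HTTP surface; `run_l1_query` is a hypothetical stand-in for the Agno-backed engine, and the dict cache stands in for the shared-memory/RocksDB tier:
   > 
   > ```python
   > # Sketch: HTTP microservice for L1 queries with cache-first pre-processing.
   > from fastapi import FastAPI
   > from pydantic import BaseModel
   > 
   > app = FastAPI()
   > cache: dict[str, dict] = {}  # stand-in for the hybrid caching layer
   > 
   > class L1Query(BaseModel):
   >     statement: str
   > 
   > def run_l1_query(statement: str) -> dict:
   >     # Hypothetical: delegate to the Agno-backed query engine here.
   >     return {"rows": [], "statement": statement}
   > 
   > @app.post("/query/l1")
   > def query_l1(q: L1Query) -> dict:
   >     if q.statement in cache:            # pre-processing: serve from cache
   >         return {**cache[q.statement], "cached": True}
   >     result = run_l1_query(q.statement)  # live execution path
   >     cache[q.statement] = result
   >     return {**result, "cached": False}
   > ```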
   > 
   > B. Orchestration Layer – CrewAI for Complex Workflows
   > 
   > This layer would help us manage multi-hop, dynamic queries and agent workflows that require intent classification and asynchronous execution, while still allowing customization.
   > 
   > Key Features:
   > 
   > * Dynamic intent classification powered by domain-specific embeddings (integrated with HugeGraph).
   > * Event-driven workflow, where subtasks are dynamically generated from a user's plain-English prompt.
   > * Built-in support for sequencing (sequential/parallel) and conditional delegation of agent tasks (a sketch follows this list).
   > * Adapt core functionalities from CrewAI (CrewFlow) to create a custom orchestration module.
   > * Define a clear API contract for submitting workflows, retrieving status, and handling error/fallback logic.
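   > 
   > For the sequencing/delegation piece, a minimal sketch using CrewAI's `Agent`/`Task`/`Crew` primitives; the roles, goals, and example question are illustrative, and an LLM backend is assumed to be configured via environment variables:
   > 
   > ```python
   > # Sketch: two-step sequential workflow (plan, then retrieve).
   > from crewai import Agent, Task, Crew, Process
   > 
   > planner = Agent(
   >     role="Query planner",
   >     goal="Break a plain-English graph question into retrieval subtasks",
   >     backstory="Understands HugeGraph schemas and multi-hop traversals.",
   > )
   > retriever = Agent(
   >     role="Graph retriever",
   >     goal="Execute each subtask against HugeGraph and aggregate results",
   >     backstory="Speaks Gremlin and Cypher.",
   > )
   > 
   > plan = Task(
   >     description="Classify intent and produce subtasks for: {question}",
   >     expected_output="An ordered list of retrieval subtasks.",
   >     agent=planner,
   > )
   > fetch = Task(
   >     description="Run the planned subtasks and aggregate the results.",
   >     expected_output="Aggregated graph results.",
   >     agent=retriever,
   > )
   > 
   > crew = Crew(agents=[planner, retriever], tasks=[plan, fetch],
   >             process=Process.sequential)
   > result = crew.kickoff(inputs={"question": "Which authors co-wrote with X?"})
   > ```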
   > 
   > C. Validation Layer – Pydantic
   > 
   > This layer handles general schema consistency and data integrity across all operations. This distinction is important: its sole purpose here is schema validation, and nothing beyond that.
   > 
   > Key Features:
   > 
   > * Middleware to validate incoming queries and agent responses.
   > * Dev-friendly type hints and error reporting.
   > * Mechanisms to ensure that changes in one layer do not break API contracts.
   > * Wrap core endpoints of other layers with Pydantic models that perform input/output validation.
   > * Integrate validation middleware as a separate microservice or as decorators within the existing service codebase.
   > 
   > Note: this means the general usage of Pydantic, not its agentic tools; those are too unpredictable and unsuitable for production. A sketch of the decorator approach follows.
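   > 
   > A minimal sketch of the decorator-style middleware, using plain Pydantic v2 models; the field names and the gremlin/cypher constraint are illustrative assumptions:
   > 
   > ```python
   > # Sketch: schema-only validation wrapped around an endpoint handler.
   > from functools import wraps
   > from pydantic import BaseModel, Field
   > 
   > class QueryRequest(BaseModel):
   >     statement: str = Field(min_length=1)
   >     language: str = Field(pattern="^(gremlin|cypher)$")
   > 
   > class QueryResponse(BaseModel):
   >     rows: list[dict]
   >     cached: bool = False
   > 
   > def validated(request_model, response_model):
   >     """Schema-check an endpoint's input and output; nothing agentic."""
   >     def decorator(fn):
   >         @wraps(fn)
   >         def wrapper(payload: dict) -> dict:
   >             req = request_model.model_validate(payload)  # raises on bad input
   >             return response_model.model_validate(fn(req)).model_dump()
   >         return wrapper
   >     return decorator
   > 
   > @validated(QueryRequest, QueryResponse)
   > def handle_query(req: QueryRequest) -> dict:
   >     # Live execution elided; static rows for illustration only.
   >     return {"rows": [{"id": "person:1"}], "cached": False}
   > ```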
   > 
   > D. Retrieval Enhancement Layer – LlamaIndex
   > 
   > Finally, this layer provides recursive, multi-hop retrieval enhanced by tiered caching, ensuring that complex graph queries are answered effectively. LlamaIndex is already compatible with CrewAI, so we'll look further into how that compatibility is implemented.
   > 
   > Key Features:
   > 
   > * Recursive retrieval strategies that work well with hierarchical graph caching.
   > * Integration with HugeGraph’s OLAP engine for analytical queries.
   > * Modular “runnables” inspired by LangChain that allow flexible composition of retrieval steps.
   > * Expose LlamaIndex’s retrieval engine via an API that accepts complex, multi-hop query parameters.
   > * Use a caching strategy that combines in-memory (for fast lookups) and persistent (RocksDB) storage to accelerate repeated queries (see the tiered-cache sketch below).
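   > 
   > A sketch of that tiered lookup, assuming the rocksdict Python bindings for RocksDB (any binding would do); the cache path and promote-on-hit policy are illustrative choices:
   > 
   > ```python
   > # Sketch: in-memory tier backed by persistent RocksDB storage.
   > from rocksdict import Rdict
   > 
   > class TieredCache:
   >     def __init__(self, path: str = "/tmp/hg_cache"):
   >         self.hot: dict[str, object] = {}  # fast in-memory tier
   >         self.cold = Rdict(path)           # persistent RocksDB tier
   > 
   >     def get(self, key: str):
   >         if key in self.hot:
   >             return self.hot[key]
   >         value = self.cold.get(key)        # None if absent
   >         if value is not None:
   >             self.hot[key] = value         # promote on hit
   >         return value
   > 
   >     def put(self, key: str, value) -> None:
   >         self.hot[key] = value
   >         self.cold[key] = value            # write-through to RocksDB
   > ```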
   > 
   > Summary of the plan, with general key points and implementation steps:
   > 
   > * RESTful API endpoints for query submission, workflow orchestration, validation, and retrieval.
   > * A Python SDK (e.g., HG_agentic and HG_orchestrator) that abstracts away the internal microservices and provides simple functions for, e.g.:
   >   * Creating agents via plain-English commands.
   >   * Configuring custom workflows (sequential, parallel, conditional).
   >   * Integrating with existing agent systems (AutoGen).
   > * Define API endpoints for each core service. For example:
   >   * `/query/l1` for Agno-based L1 queries.
   >   * `/workflow/submit` for submitting orchestration tasks.
   >   * `/validate` for schema checks.
   >   * `/retrieve` for multi-hop retrieval.
   > 
   > * The Python SDK wraps these endpoints and provides high-level functions, error handling, and logging (a client sketch follows this list).
   > * Pre-configured with robust few-shot or one-shot LLM prompting.
   > * Offers a simplified interface where users only need to provide a natural 
language prompt.
   > * Customizable pipeline where developers can modify key components (LLM 
selection, prompt configuration, integration with vector databases like 
Pinecone, FAISS, Qdrant).
   > * Leverage the modular “runnables” design inspired by LangChain to allow 
easy insertion or replacement of retrieval steps.
   > * Minimize latency by combining HugeGraph’s native caching (e.g., via 
RocksDB) with Agno’s shared memory features.
   > * Develop a caching microservice that first checks an in-memory cache and 
then falls back to RocksDB.
   > * Ensure that cached results are seamlessly used across L1 and multi-hop 
retrieval layers.
   > * Package each architectural layer as its own Docker container.
   > * Use orchestration tools (e.g., Kubernetes).
   > * Define strict API contracts between services.
   > * Integrate Prometheus (or a similar tool) into each microservice to collect metrics.
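   > 
   > A hypothetical sketch of that SDK surface, wrapping the endpoints listed above with `requests`; the class name, base URL, and payload shapes are assumptions:
   > 
   > ```python
   > # Sketch: thin client over the proposed REST endpoints.
   > import requests
   > 
   > class HGAgenticClient:
   >     def __init__(self, base_url: str = "http://localhost:8080"):
   >         self.base_url = base_url.rstrip("/")
   > 
   >     def _post(self, path: str, payload: dict) -> dict:
   >         resp = requests.post(f"{self.base_url}{path}", json=payload, timeout=30)
   >         resp.raise_for_status()  # surface HTTP errors to the caller
   >         return resp.json()
   > 
   >     def l1_query(self, statement: str) -> dict:
   >         return self._post("/query/l1", {"statement": statement})
   > 
   >     def submit_workflow(self, prompt: str, mode: str = "sequential") -> dict:
   >         return self._post("/workflow/submit", {"prompt": prompt, "mode": mode})
   > 
   >     def validate(self, payload: dict) -> dict:
   >         return self._post("/validate", payload)
   > 
   >     def retrieve(self, query: str, max_hops: int = 3) -> dict:
   >         return self._post("/retrieve", {"query": query, "max_hops": max_hops})
   > ```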
   > 
   > ```mermaid
   > graph TD
   >     A[User Query/Input] --> B{HTTP API Gateway}
   >     B --> C[Agno L1 Query Service]
   >     B --> D[CrewFlow Orchestrator]
   >     D --> E[Dynamic Agent Creation]
   >     E --> F[Workflow Execution]
   >     F --> G[Pydantic Validation Middleware]
   >     D --> H[Retrieve Request]
   >     H --> I[LlamaIndex Recursive Retriever]
   >     I --> J["Hybrid Caching Layer (RocksDB + Shared Memory)"]
   >     G & J --> K[Result Aggregator]
   >     K --> L["HTTP API Gateway -> Response"]
   > ```
   > 
   > What are your thoughts on this approach, [@imbajin](https://github.com/imbajin)? I'd also like your thoughts on what I mentioned regarding LN queries and how we'd go about handling them. That said, I still stand by what I said: implementing this is a separate project in itself and would require a lot of time and expertise before it can be put into production, given the added complexity of the architecture.
   
   I searched and looked at the code for each component mentioned here, but I think it would be wiser to make the architecture simpler, as this design will only increase complexity and build time. We need something fast, robust, and easy to understand: even though devs could understand the complex architecture, they won't be willing to invest that much time just to integrate an agentic retriever, which is still a small piece in the greater scheme. Also, work nowadays is so fast-paced that no one will want to take on so many dependencies, plus a complex architecture on top of that.
   I have thought of a simpler solution for the same, and I will also be talking to Agno's team to get some insights.
   @imbajin sir, please let me know if you would be up for a little discussion about this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

