GitHub user Kryst4lDem0ni4s added a comment to the discussion: [Discussion] The 
selection of Agentic/Taskflow frame

I looked further into what @chiruu12 suggested about not using off-the-shelf 
agentic components, which can prevent developers from understanding critical 
behaviors and let the service's behavior drift out of control over time.
Indeed, it would be possible to write a custom HG_agentic library that borrows 
only the necessary pieces, so HugeGraph can maintain control over the logic and 
integration details.

How about combining all of our suggestions: integrate LlamaIndex, Pydantic-AI, 
CrewFlow, and Agno into a dual-mode, modular GraphRAG system while avoiding 
the dependency hell that @Aryankb warned about?

As a hybrid GraphRAG system, it would support two modes:
• A beginner-friendly “agentic retriever” that is pre-fine-tuned with robust 
LLM prompting for straightforward use cases, and
• A customizable mode for advanced developers who need to tailor retrieval, 
orchestration, and validation mechanisms.

Key design principles so that everyone can get a good night's sleep:
• Modularity & Microservices: standalone services with clearly defined APIs.
• Dual-Mode Operation: ease of use and deep customization.
• Transparent Integration: extracting core functionalities and integrating them 
in-house.
• Extensive Logging & Monitoring: via Prometheus.
• Containerization: isolate dependencies.

Architectural Layers & Components:
A. Base Layer – Agno for L1 Queries
Handles high-frequency, low-latency queries (e.g., simple entity lookups) with 
optimized parallel execution. Beyond this point we must also find the right 
strategy for handling LN (multi-hop) queries.

Key Features:
• Fast execution with low memory footprint.
• Built-in Gremlin-Cypher transpiler for hybrid query support.
• Integration with a hybrid caching layer that combines Agno shared memory and 
RocksDB.
• Wrap Agno’s core query engine in a microservice that exposes an HTTP endpoint.
• Queries can be configured to pass through a lightweight pre-processing step 
that selects between cached and live query execution (specific to L1).

This component, once abstracted into our own agentic library, will be the base 
of all performance optimizations.
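To make the cache-vs-live selection concrete, here is a minimal sketch of that pre-processing step. Everything here is hypothetical illustration: the dict stands in for Agno's shared memory, and the injected callable stands in for Agno's query engine hitting HugeGraph.

```python
# Sketch of the L1 pre-processing step: route a query to the cache or to
# live execution. All names are hypothetical; Agno's real engine and
# shared memory would replace the lambda and the dict.
from typing import Any, Callable, Dict

class L1QueryRouter:
    def __init__(self, execute_live: Callable[[str], Any]):
        self._cache: Dict[str, Any] = {}   # stand-in for Agno shared memory
        self._execute_live = execute_live

    def query(self, gremlin: str) -> Any:
        # Lightweight pre-processing: normalize whitespace, then check the cache.
        key = " ".join(gremlin.split())
        if key in self._cache:
            return self._cache[key]        # cache path
        result = self._execute_live(gremlin)  # live path
        self._cache[key] = result
        return result

router = L1QueryRouter(execute_live=lambda q: {"query": q, "hits": 1})
first = router.query("g.V().has('name', 'alice')")
second = router.query("g.V().has('name',  'alice')")  # normalizes to a cache hit
```

The normalization step is deliberately trivial; a real pre-processor would also classify the query as L1-eligible before taking the cache path.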

B. Orchestration Layer – CrewAI for Complex Workflows
This would help us manage multi-hop, dynamic queries and agent workflows that 
require intent classification and asynchronous execution, while still allowing 
customization.

Key Features:
• Dynamic intent classification powered by domain-specific embeddings 
(integrated with HugeGraph).
• Event-driven workflow, where subtasks are dynamically generated from a user’s 
plain-English prompt.
• Built-in support for sequencing (sequential/parallel) and conditional 
delegation of agent tasks.
• Adapt core functionalities from CrewAI (CrewFlow) to create a custom 
orchestration module.
• Define a clear API contract for submitting workflows, retrieving status, and 
handling error/fallback logic.
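A toy sketch of what that API contract could look like: submit a workflow of steps, then poll its status. The runner below executes steps sequentially; parallel and conditional delegation would extend it. Every class and method name is a hypothetical illustration, not a real CrewAI/CrewFlow API.

```python
# Minimal sketch of the orchestration contract: submit -> run steps -> status.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List
import uuid

@dataclass
class Step:
    name: str
    run: Callable[[Dict[str, Any]], Dict[str, Any]]  # reads/extends shared context

@dataclass
class Workflow:
    steps: List[Step]
    status: str = "pending"
    context: Dict[str, Any] = field(default_factory=dict)

class Orchestrator:
    def __init__(self):
        self._workflows: Dict[str, Workflow] = {}

    def submit(self, steps: List[Step]) -> str:
        wf_id = str(uuid.uuid4())
        wf = Workflow(steps=steps)
        self._workflows[wf_id] = wf
        wf.status = "running"
        for step in wf.steps:            # sequential execution only, in this toy
            wf.context.update(step.run(wf.context))
        wf.status = "done"
        return wf_id

    def status(self, wf_id: str) -> str:
        return self._workflows[wf_id].status

orch = Orchestrator()
wf_id = orch.submit([
    Step("classify_intent", lambda ctx: {"intent": "multi_hop"}),
    Step("plan_subtasks", lambda ctx: {"subtasks": [ctx["intent"] + "_retrieval"]}),
])
```

The point of the sketch is the contract shape (submit returns an id, status is queryable, steps share a context); error/fallback handling would wrap each `step.run` call.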

C. Validation Layer – Pydantic
Ensures general schema consistency and data integrity across all operations. 
This distinction is important: its sole purpose here is schema validation, 
nothing beyond that.

Key Features:
• Middleware to validate incoming queries and agent responses.
• Dev-friendly type hints and error reporting.
• Mechanisms to ensure that changes in one layer do not break API contracts.
• Wrap core endpoints of other layers with Pydantic models that perform 
input/output validation.
• Integrate validation middleware as a separate microservice or as decorators 
within the existing service codebase.

Note: this is the general usage of Pydantic, not its agentic tooling 
(Pydantic-AI), which is too unpredictable and unsuitable for production.
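As a sketch of the decorator form of that middleware, the example below wraps a hypothetical endpoint with a plain Pydantic model for input validation. The model and endpoint names are made up for illustration; only standard Pydantic is used, no agentic tooling.

```python
# Sketch: Pydantic input validation applied as a decorator around an endpoint.
from functools import wraps
from pydantic import BaseModel, ValidationError

class L1QueryRequest(BaseModel):   # hypothetical request schema
    gremlin: str
    use_cache: bool = True

def validate_with(model):
    def decorator(handler):
        @wraps(handler)
        def wrapper(payload: dict):
            try:
                request = model(**payload)    # schema check only, nothing more
            except ValidationError as exc:
                return {"error": str(exc)}    # dev-friendly error reporting
            return handler(request)
        return wrapper
    return decorator

@validate_with(L1QueryRequest)
def l1_query_endpoint(request: L1QueryRequest) -> dict:
    return {"ok": True, "cached": request.use_cache}
```

The same models could be reused on the output side to keep API contracts between layers from silently breaking.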

D. Retrieval Enhancement Layer – LlamaIndex
Finally, this layer provides recursive, multi-hop retrieval enhanced by tiered 
caching, ensuring that complex graph queries are answered effectively. 
LlamaIndex is already compatible with CrewAI, so we'll look further into how 
that compatibility is provided.
Key Features:
• Recursive retrieval strategies that work well with hierarchical graph caching.
• Integration with HugeGraph’s OLAP engine for analytical queries.
• Modular “runnables” inspired by LangChain that allow flexible composition of 
retrieval steps.
• Expose LlamaIndex’s retrieval engine via an API that accepts complex, 
multi-hop query parameters.
• Use a caching strategy that combines in-memory (for fast lookups) and 
persistent (RocksDB) storage to accelerate repeated queries.
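The tiered lookup pattern could look like the sketch below: check the in-memory tier first, fall back to the persistent tier, and promote hits. sqlite3 stands in for RocksDB purely so the sketch is runnable; the two-tier pattern is the point, not the backend.

```python
# Sketch of the tiered cache: in-memory first, then a persistent fallback.
# sqlite3 is a stand-in for RocksDB here.
import sqlite3

class TieredCache:
    def __init__(self, db_path: str = ":memory:"):
        self._hot: dict = {}                      # tier 1: in-memory
        self._db = sqlite3.connect(db_path)       # tier 2: persistent stand-in
        self._db.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")

    def get(self, key: str):
        if key in self._hot:
            return self._hot[key]
        row = self._db.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        if row is not None:
            self._hot[key] = row[0]               # promote to the hot tier
            return row[0]
        return None

    def put(self, key: str, value: str):
        self._hot[key] = value
        self._db.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))
        self._db.commit()

cache = TieredCache()
cache.put("g.V().count()", "42")
cache._hot.clear()                                # simulate in-memory eviction
value = cache.get("g.V().count()")                # served from the persistent tier
```

Sharing one such cache service across the L1 and multi-hop layers is what lets repeated sub-queries of a recursive retrieval be answered without touching the graph again.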


Summary for the plan with general key points and implementation steps:
- RESTful API endpoints for query submission, workflow orchestration, 
validation, and retrieval.
- A Python SDK (e.g., HG_agentic and HG_orchestrator) that abstracts away the 
internal microservices and provides simple functions for (examples):
> Creating agents via plain-English commands.
> Configuring custom workflows (sequential, parallel, conditional).
> Integrating with existing agent systems (AutoGen).
- Define API endpoints for each core service. For example:
> /query/l1 for Agno-based L1 queries.
> /workflow/submit for submitting orchestration tasks.
> /validate for schema checks.
> /retrieve for multi-hop retrieval. 
- The Python SDK wraps these endpoints and provides high-level functions, error 
handling, and logging.
- Pre-fine-tuned with robust LLMs using few-shot or one-shot prompting.
- Offers a simplified interface where users only need to provide a natural 
language prompt.
- Customizable pipeline where developers can modify key components (LLM 
selection, prompt configuration, integration with vector databases like 
Pinecone, FAISS, Qdrant).
- Leverage the modular “runnables” design inspired by LangChain to allow easy 
insertion or replacement of retrieval steps.
- Minimize latency by combining HugeGraph’s native caching (e.g., via RocksDB) 
with Agno’s shared memory features.
- Develop a caching microservice that first checks an in-memory cache and then 
falls back to RocksDB.
- Ensure that cached results are seamlessly used across L1 and multi-hop 
retrieval layers.
- Package each architectural layer as its own Docker container.
- Use orchestration tools (e.g., Kubernetes).
- Define strict API contracts between services.
- Integrate Prometheus (or a similar tool) into each microservice to collect 
metrics.
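To show how the SDK could wrap the endpoints listed above, here is a sketch of a client surface. The endpoint paths match the list above, but the class and method names are hypothetical, and the transport is injected so the sketch runs without a live server (in production it would be an HTTP call).

```python
# Sketch of the proposed HG_agentic SDK surface over the REST endpoints.
from typing import Callable

class HGAgenticClient:
    def __init__(self, transport: Callable[[str, dict], dict]):
        self._transport = transport   # e.g. a requests.post wrapper in production

    def l1_query(self, gremlin: str) -> dict:
        return self._transport("/query/l1", {"gremlin": gremlin})

    def submit_workflow(self, prompt: str, mode: str = "sequential") -> dict:
        return self._transport("/workflow/submit", {"prompt": prompt, "mode": mode})

    def validate(self, payload: dict) -> dict:
        return self._transport("/validate", payload)

    def retrieve(self, query: str, max_hops: int = 3) -> dict:
        return self._transport("/retrieve", {"query": query, "max_hops": max_hops})

# Fake transport standing in for HTTP, just to exercise the surface.
def fake_transport(path: str, body: dict) -> dict:
    return {"path": path, "body": body}

client = HGAgenticClient(fake_transport)
resp = client.retrieve("papers citing X", max_hops=2)
```

Keeping the transport injectable also gives a natural seam for the SDK's error handling and logging.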

```mermaid
graph TD
  A[User Query_Input] --> B{HTTP API Gateway}
  B --> C[Agno L1 Query Service]
  B --> D[CrewFlow Orchestrator]
  D --> E[Dynamic Agent Creation]
  E --> F[Workflow Execution]
  F --> G[Pydantic Validation Middleware]
  D --> H[Retrieve Request]
  H --> I[LlamaIndex Recursive Retriever]
  I --> J[Hybrid Caching Layer_RocksDB_Shared Memory]
  G & J --> K[Result Aggregator]
  K --> L[HTTP API Gateway_Response]
```

What are your thoughts on this approach @imbajin ? Further, I'd also like your 
thoughts on what I mentioned regarding LN queries and how we'd go about 
handling them. But I'd still stand by what I said: implementing this is a 
separate project in itself and would require lots of time and expertise before 
it can be put into production, due to the added complexities of the 
architecture.

GitHub link: 
https://github.com/apache/incubator-hugegraph-ai/discussions/203#discussioncomment-12666612

----
This is an automatically sent email for dev@hugegraph.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@hugegraph.apache.org
