(hugegraph-ai) branch main updated: docs: add root agent guidance and refine LLM module rules (#339)

jin Tue, 19 May 2026 19:51:19 -0700

This is an automated email from the ASF dual-hosted git repository.

imbajin pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/hugegraph-ai.git



The following commit(s) were added to refs/heads/main by this push:
     new d36135db docs: add root agent guidance and refine LLM module rules 
(#339)
d36135db is described below

commit d36135dbf4f0604effeff270da0f575e9f7d54e8
Author: imbajin <[email protected]>
AuthorDate: Wed May 20 10:51:05 2026 +0800

    docs: add root agent guidance and refine LLM module rules (#339)
    
    ## Summary
    - add root AGENTS.md with concise repo-wide AI agent guidance
    - refactor hugegraph-llm/AGENTS.md into focused module rules
    - emphasize sufficient and effective tests for code changes
---
 AGENTS.md               |  48 +++++++++++++++++++
 hugegraph-llm/AGENTS.md | 125 +++++++++++++++++-------------------------------
 2 files changed, 92 insertions(+), 81 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 00000000..0376b4e2
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,48 @@
+# AGENTS.md
+
+Guidance for AI agents working in this repository. Keep README content in 
README files; keep this file focused on decisions agents commonly get wrong.
+
+## Stack & Modules
+
+- This is a Python `uv` workspace. Prefer root-level workspace commands unless 
a module-specific file says otherwise.
+- `hugegraph-llm/` is the primary and most frequently changed module. When 
editing or reviewing it, read `hugegraph-llm/AGENTS.md` first.
+- `hugegraph-python-client/` is a supporting dependency for HugeGraph access. 
Change it only when the client contract itself must change, and verify 
`hugegraph-llm` callers when you do.
+- Treat `hugegraph-ml/` and `vermeer-python-client/` as lower-frequency 
modules. Do not expand changes into them without a direct reason.
+
+## Testing Expectations
+
+- Any code change must include sufficient and effective test coverage for the 
changed behavior, regression risk, or failure path.
+- Do not add tests that only improve coverage numbers while mocking away the 
behavior being changed.
+- If a change cannot reasonably include automated tests, state why and provide 
the manual verification performed.
+- Cross-module or shared dependency changes must test the affected downstream 
module, not only the package where the edit was made.
+
+## Code Search Anchors
+
+- `hugegraph-llm/src/hugegraph_llm/` - main LLM, RAG, KG, prompt, API, and 
vector-index code.
+- `hugegraph-python-client/src/pyhugegraph/` - Python client used by LLM code 
to talk to HugeGraph.
+- `pyproject.toml` and module `pyproject.toml` files - workspace membership, 
dependency groups, lint settings, Python versions.
+- `rules/README.md` - staged AI-assisted workflow for multi-file features, API 
contract changes, or cross-module design changes.
+
+## Build & Test
+
+```bash
+uv sync --all-extras
+uv run ruff format --check .
+uv run ruff check .
+```
+
+- Run tests for the affected module rather than defaulting to a 
full-repository test sweep.
+- For `hugegraph-llm`, use the module CI split between unit-style tests and 
integration tests.
+- For `hugegraph-python-client`, include client tests and any `hugegraph-llm` 
tests needed to validate caller compatibility.
+
+## Agent Workflow
+
+- Before editing, identify whether the change belongs to `hugegraph-llm`, 
`hugegraph-python-client`, or root workspace configuration.
+- For multi-file features, API contract changes, or cross-module design 
changes, read `rules/README.md` first.
+- Keep changes scoped to the module that owns the behavior. Avoid 
opportunistic rewrites in sibling modules.
+
+## Cross-module Notes
+
+- Root dependency or workspace changes can affect multiple packages; verify 
the package that consumes the changed dependency.
+- `hugegraph-llm` imports `hugegraph-python-client`; client API changes must 
preserve or deliberately update those call sites.
+- Do not duplicate README quick-start, Docker, or deployment instructions in 
AGENTS files.
diff --git a/hugegraph-llm/AGENTS.md b/hugegraph-llm/AGENTS.md
index 4ca973ff..bc50fb5d 100644
--- a/hugegraph-llm/AGENTS.md
+++ b/hugegraph-llm/AGENTS.md
@@ -1,93 +1,56 @@
-# Basic Introduction
+# hugegraph-llm AGENTS.md
 
-This file provides guidance to AI coding tools and developers when working 
with code in this repository.
+Module-specific guidance for AI agents. Root `../AGENTS.md` still applies; 
this file only adds rules that matter inside `hugegraph-llm`.
 
-## Project Overview
+## Module Focus
 
-HugeGraph-LLM is a comprehensive toolkit that bridges graph databases and 
large language models,
-part of the Apache HugeGraph AI ecosystem. It enables seamless integration 
between HugeGraph and LLMs for building
-intelligent applications with three main capabilities: Knowledge Graph 
Construction, Graph-Enhanced RAG,
-and Text2Gremlin query generation.
+- This module owns GraphRAG, knowledge graph construction, and Text2Gremlin 
behavior.
+- Prefer changes in the owning layer first. If a fix crosses API, flow, node, 
operator, model, prompt, or index boundaries, preserve the existing contract or 
update tests for the new contract explicitly.
+- `hugegraph-python-client` is the HugeGraph access boundary. Prefer adapting 
LLM-side code unless the client contract is actually wrong.
 
-## Tech Stack
+## Testing Expectations
 
-- **Language**: Python 3.10+ (uv package manager required)
-- **Framework**: FastAPI + Gradio for web interfaces
-- **Graph Database**: HugeGraph Server 1.5+
-- **LLM Integration**: LiteLLM (supports OpenAI, Ollama, Qianfan, etc.)
-- **Vector Operations**: FAISS, NumPy, and will support multiple Vector DB soon
-- **Code style**: ruff & mypy (on the way, soon)
-- **Key Dependencies**: hugegraph-python-client
+- Any code change must add or update tests that exercise the changed behavior, 
regression risk, or failure path.
+- For pipeline changes, cover the relevant flow, node, or operator contract 
instead of only testing a helper in isolation.
+- For API or request/response changes, cover the public model or endpoint 
behavior.
+- For prompt or Text2Gremlin changes, preserve and test the expected output 
contract, especially Gremlin-only fenced output when callers depend on it.
+- External-service tests may be skipped only through explicit, traceable skip 
controls. Do not hide failures by silently swallowing HugeGraph, LLM provider, 
or vector DB connection errors.
 
-## Essential Commands
+## Code Search Anchors
+
+- `src/hugegraph_llm/api/` and `src/hugegraph_llm/api/models/` - FastAPI 
endpoints and request/response models.
+- `src/hugegraph_llm/flows/`, `src/hugegraph_llm/nodes/`, and 
`src/hugegraph_llm/operators/` - pipeline orchestration and executable units.
+- `src/hugegraph_llm/config/` and `src/hugegraph_llm/resources/` - runtime 
config and prompt resources.
+- `src/hugegraph_llm/indices/` - vector index implementations and backends.
+- `src/tests/` - unit, integration, and contract tests for this module.
+
+## Build & Test
+
+From the repository root:
 
-### Running the Application
 ```bash
-# Install dependencies and create virtual environment (uv already installed)
-uv sync
-# Activate virtual environment
-source .venv/bin/activate
-# Launch main RAG demo application
-python -m hugegraph_llm.demo.rag_demo.app
-# Custom host/port
-python -m hugegraph_llm.demo.rag_demo.app --host 127.0.0.1 --port 18001
+uv sync --extra llm --extra dev
 ```
 
-### Testing
+From `hugegraph-llm/`, these commands mirror the CI split:
+
 ```bash
-pytest src/tests/
-# Or using unittest
-python -m unittest discover src/tests/
+SKIP_EXTERNAL_SERVICES=true uv run pytest src/tests/config/ 
src/tests/document/ src/tests/middleware/ src/tests/operators/ 
src/tests/models/ src/tests/indices/ src/tests/test_utils.py -v --tb=short
+SKIP_EXTERNAL_SERVICES=true uv run pytest 
src/tests/integration/test_graph_rag_pipeline.py 
src/tests/integration/test_kg_construction.py 
src/tests/integration/test_rag_pipeline.py -v --tb=short
 ```
-PS: we skip Docker Deployment details here.
-
-## Architecture Overview
-
-### Core Directory Structure
-- `src/hugegraph_llm/api/` - FastAPI endpoints (rag_api.py, admin_api.py)
-- `src/hugegraph_llm/demo/rag_demo/` - Main Gradio UI application
-- `src/hugegraph_llm/operators/` - Core processing pipelines
-- `src/hugegraph_llm/models/` - LLM, embedding, reranker implementations
-- `src/hugegraph_llm/indices/` - Vector and graph indexing
-- `src/hugegraph_llm/config/` - Configuration management
-- `src/hugegraph_llm/utils/` - Utilities, logging, decorators
-
-### Key Processing Pipelines
-
-1. **KG Construction** (`operators/kg_construction_task.py`)
-   - Text chunking and vectorization pipeline
-   - Schema management and validation
-   - Information extraction using LLMs
-   - Graph data commitment to HugeGraph
-
-2. **Graph RAG** (`operators/graph_rag_task.py`)
-   - Multi-modal retrieval (vector, graph, hybrid)
-   - Keyword extraction and entity matching
-   - Graph traversal and Gremlin query generation
-   - Result merging and reranking
-
-3. **Text2Gremlin** (`operators/gremlin_generate_task.py`)
-   - Natural language to Gremlin query conversion
-   - Template-based and few-shot learning approaches
-
-### Configuration Management
-
-- Main config: `.env` file (generate with `config.generate` module)
-- Prompt config: `src/hugegraph_llm/resources/demo/config_prompt.yaml`
-- HugeGraph connection settings in environment variables
-- LLM provider configuration through `LiteLLM` & `openai/ollama` client
-
-## Development Workflow
-
-1. **Prerequisites**: Ensure HugeGraph Server is running and LLM provider is 
configured
-2. **Environment Setup**: Use UV for dependency management, activate virtual 
environment
-3. **Configuration**: Generate configs and set up .env file with proper 
credentials
-4. **Development**: Use Gradio demo for interactive testing, FastAPI for 
programmatic access
-5. **Testing**: Unit tests use standard unittest framework in src/tests/
-
-## Important Notes
-
-- Always use `uv` package manager instead of `pip` for dependency management
-- HugeGraph Server must be accessible while running the app
-- The system supports multiple LLM providers through `LiteLLM` abstraction
-- Each file should be better < 600 lines for maintainability
+
+- Use narrower `pytest` targets while iterating, but finish with coverage that 
matches the touched behavior.
+- For Python code changes, run root `uv run ruff format --check .` and `uv run 
ruff check .` before handoff.
+
+## LLM-specific Rules
+
+- Preserve Text2Gremlin prompt/output contracts unless the task explicitly 
changes them.
+- Keep GraphRAG retrieval, KG construction, and Text2Gremlin paths 
behaviorally separate; shared helpers should not blur pipeline semantics.
+- Do not introduce a new LLM, embedding, reranker, or vector DB dependency 
without wiring it through existing config patterns.
+- Treat HugeGraph Server, LLM providers, and vector databases as external 
services with explicit configuration and explicit test skip behavior.
+
+## Style
+
+- Python is `>=3.10,<3.12` for this module.
+- Use `uv` for dependency management; do not document or rely on ad hoc `pip 
install` workflows.
+- Ruff and mypy behavior comes from `pyproject.toml`; do not duplicate their 
rule sets here.

(hugegraph-ai) branch main updated: docs: add root agent guidance and refine LLM module rules (#339)

Reply via email to