This is an automated email from the ASF dual-hosted git repository. jin pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/incubator-hugegraph-ai.git
The following commit(s) were added to refs/heads/main by this push: new fe94881b chore(llm): add a basic LLM/AI coding instruction file (#290) fe94881b is described below commit fe94881b7b43f24e7d49b3174d0fad7b5942bcc5 Author: imbajin <j...@apache.org> AuthorDate: Fri Jul 25 15:50:29 2025 +0800 chore(llm): add a basic LLM/AI coding instruction file (#290) PR #293 follow it to create first step --- hugegraph-llm/basic-introduction.md | 93 +++++++++++++++++++++++++++++++++++++ 1 file changed, 93 insertions(+) diff --git a/hugegraph-llm/basic-introduction.md b/hugegraph-llm/basic-introduction.md new file mode 100644 index 00000000..a15eb432 --- /dev/null +++ b/hugegraph-llm/basic-introduction.md @@ -0,0 +1,93 @@ +# Basic Introduction + +This file provides guidance to AI coding tools and developers when working with code in this repository. + +## Project Overview + +HugeGraph-LLM is a comprehensive toolkit that bridges graph databases and large language models, +part of the Apache HugeGraph AI ecosystem. It enables seamless integration between HugeGraph and LLMs for building +intelligent applications with three main capabilities: Knowledge Graph Construction, Graph-Enhanced RAG, +and Text2Gremlin query generation. + +## Tech Stack + +- **Language**: Python 3.10+ (uv package manager required) +- **Framework**: FastAPI + Gradio for web interfaces +- **Graph Database**: HugeGraph Server 1.5+ +- **LLM Integration**: LiteLLM (supports OpenAI, Ollama, Qianfan, etc.) +- **Vector Operations**: FAISS, NumPy, and will support multiple Vector DB soon +- **Code style**: ruff & mypy (on the way, soon) +- **Key Dependencies**: hugegraph-python-client + +## Essential Commands + +### Running the Application +```bash +# Install dependencies and create virtual environment (uv already installed) +uv sync +# Activate virtual environment +source .venv/bin/activate +# Launch main RAG demo application +python -m hugegraph_llm.demo.rag_demo.app +# Custom host/port +python -m hugegraph_llm.demo.rag_demo.app --host 127.0.0.1 --port 18001 +``` + +### Testing +```bash +pytest src/tests/ +# Or using unittest +python -m unittest discover src/tests/ +``` +PS: we skip Docker Deployment details here. + +## Architecture Overview + +### Core Directory Structure +- `src/hugegraph_llm/api/` - FastAPI endpoints (rag_api.py, admin_api.py) +- `src/hugegraph_llm/demo/rag_demo/` - Main Gradio UI application +- `src/hugegraph_llm/operators/` - Core processing pipelines +- `src/hugegraph_llm/models/` - LLM, embedding, reranker implementations +- `src/hugegraph_llm/indices/` - Vector and graph indexing +- `src/hugegraph_llm/config/` - Configuration management +- `src/hugegraph_llm/utils/` - Utilities, logging, decorators + +### Key Processing Pipelines + +1. **KG Construction** (`operators/kg_construction_task.py`) + - Text chunking and vectorization pipeline + - Schema management and validation + - Information extraction using LLMs + - Graph data commitment to HugeGraph + +2. **Graph RAG** (`operators/graph_rag_task.py`) + - Multi-modal retrieval (vector, graph, hybrid) + - Keyword extraction and entity matching + - Graph traversal and Gremlin query generation + - Result merging and reranking + +3. **Text2Gremlin** (`operators/gremlin_generate_task.py`) + - Natural language to Gremlin query conversion + - Template-based and few-shot learning approaches + +### Configuration Management + +- Main config: `.env` file (generate with `config.generate` module) +- Prompt config: `src/hugegraph_llm/resources/demo/config_prompt.yaml` +- HugeGraph connection settings in environment variables +- LLM provider configuration through `LiteLLM` & `openai/ollama` client + +## Development Workflow + +1. **Prerequisites**: Ensure HugeGraph Server is running and LLM provider is configured +2. **Environment Setup**: Use UV for dependency management, activate virtual environment +3. **Configuration**: Generate configs and set up .env file with proper credentials +4. **Development**: Use Gradio demo for interactive testing, FastAPI for programmatic access +5. **Testing**: Unit tests use standard unittest framework in src/tests/ + +## Important Notes + +- Always use `uv` package manager instead of `pip` for dependency management +- HugeGraph Server must be accessible while running the app +- The system supports multiple LLM providers through `LiteLLM` abstraction +- Each file should be better < 600 lines for maintainability