This is an automated email from the ASF dual-hosted git repository.
jin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hugegraph-doc.git
The following commit(s) were added to refs/heads/master by this push:
new 0f574cf4 doc: add hugegraph-ai doc (#331)
0f574cf4 is described below
commit 0f574cf4ee63e97cf4f4c2d9998fd10cb386b761
Author: Simon Cheung <[email protected]>
AuthorDate: Fri Mar 8 10:14:28 2024 +0800
doc: add hugegraph-ai doc (#331)
Link https://github.com/apache/incubator-hugegraph-ai/pull/30
---
README.md | 1 +
content/cn/docs/images/gradio-config.png | Bin 0 -> 115705 bytes
content/cn/docs/images/gradio-kg.png | Bin 0 -> 246455 bytes
content/cn/docs/images/gradio-rag-1.png | Bin 0 -> 32135 bytes
content/cn/docs/images/gradio-rag-2.png | Bin 0 -> 59572 bytes
content/cn/docs/images/gradio-rag.png | Bin 0 -> 156454 bytes
content/cn/docs/images/kg-uml.png | Bin 0 -> 49357 bytes
content/cn/docs/quickstart/hugegraph-ai.md | 147 ++++++++++++++++++++++++++++
content/en/docs/images/gradio-config.png | Bin 0 -> 115705 bytes
content/en/docs/images/gradio-kg.png | Bin 0 -> 246455 bytes
content/en/docs/images/gradio-rag-1.png | Bin 0 -> 32135 bytes
content/en/docs/images/gradio-rag-2.png | Bin 0 -> 59572 bytes
content/en/docs/images/gradio-rag.png | Bin 0 -> 156454 bytes
content/en/docs/images/kg-uml.png | Bin 0 -> 49357 bytes
content/en/docs/quickstart/hugegraph-ai.md | 151 +++++++++++++++++++++++++++++
15 files changed, 299 insertions(+)
diff --git a/README.md b/README.md
index 2069624a..2a678be2 100644
--- a/README.md
+++ b/README.md
@@ -50,6 +50,7 @@ The functions of this system include but are not limited to:
 - [HugeGraph-Computer](https://hugegraph.apache.org/docs/quickstart/hugegraph-computer): HugeGraph-Computer is a distributed graph processing system for HugeGraph (OLAP). It is an implementation of [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf). It runs on Kubernetes framework;
 - [HugeGraph-Hubble](https://hugegraph.apache.org/docs/quickstart/hugegraph-hubble): HugeGraph-Hubble is HugeGraph's web visualization management platform, a one-stop visual analysis platform. The platform covers the whole process from data modeling, to rapid data import, to online and offline analysis of data, and unified management of graphs;
 - [HugeGraph-Tools](https://hugegraph.apache.org/docs/quickstart/hugegraph-tools): HugeGraph-Tools is HugeGraph's deployment and management tools, including functions such as managing graphs, backup/restore, Gremlin execution, etc.
+- [HugeGraph-Ai (Beta)](https://hugegraph.apache.org/docs/quickstart/hugegraph-ai): HugeGraph-Ai is a tool that integrates HugeGraph and artificial intelligence (AI), including applications combined with large models, integration with graph machine learning components, etc., to provide comprehensive support for developers to use HugeGraph's AI capabilities in projects.
### Subscribe the mailing list
diff --git a/content/cn/docs/images/gradio-config.png b/content/cn/docs/images/gradio-config.png
new file mode 100644
index 00000000..b5a28252
Binary files /dev/null and b/content/cn/docs/images/gradio-config.png differ
diff --git a/content/cn/docs/images/gradio-kg.png b/content/cn/docs/images/gradio-kg.png
new file mode 100644
index 00000000..9e80b28d
Binary files /dev/null and b/content/cn/docs/images/gradio-kg.png differ
diff --git a/content/cn/docs/images/gradio-rag-1.png b/content/cn/docs/images/gradio-rag-1.png
new file mode 100644
index 00000000..0a4df78c
Binary files /dev/null and b/content/cn/docs/images/gradio-rag-1.png differ
diff --git a/content/cn/docs/images/gradio-rag-2.png b/content/cn/docs/images/gradio-rag-2.png
new file mode 100644
index 00000000..ee24cc43
Binary files /dev/null and b/content/cn/docs/images/gradio-rag-2.png differ
diff --git a/content/cn/docs/images/gradio-rag.png b/content/cn/docs/images/gradio-rag.png
new file mode 100644
index 00000000..855ed655
Binary files /dev/null and b/content/cn/docs/images/gradio-rag.png differ
diff --git a/content/cn/docs/images/kg-uml.png b/content/cn/docs/images/kg-uml.png
new file mode 100644
index 00000000..d823e7f8
Binary files /dev/null and b/content/cn/docs/images/kg-uml.png differ
diff --git a/content/cn/docs/quickstart/hugegraph-ai.md b/content/cn/docs/quickstart/hugegraph-ai.md
new file mode 100644
index 00000000..1e0f6d56
--- /dev/null
+++ b/content/cn/docs/quickstart/hugegraph-ai.md
@@ -0,0 +1,147 @@
+---
+title: "HugeGraph-Ai Quick Start (Beta)"
+linkTitle: "Using HugeGraph-Ai (Beta)"
+weight: 4
+---
+
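+### 1 HugeGraph-Ai Overview
+hugegraph-ai aims to explore the integration of HugeGraph and artificial intelligence (AI), including applications built on large models and integration with graph machine learning components, to give developers comprehensive support for using HugeGraph's AI capabilities in their projects.
+
+### 2 Environment Requirements
+- python 3.8+
+- hugegraph 1.0.0+
+
+### 3 Preparation
+- Start the HugeGraph database; you can do this with Docker. Please refer to this [link](https://hub.docker.com/r/hugegraph/hugegraph) for guidance.
+- Start the gradio interactive demo with the following commands, then open [http://127.0.0.1:8001](http://127.0.0.1:8001).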
+```bash
+# ${PROJECT_ROOT_DIR} is the root directory of hugegraph-ai; set it yourself
+export PYTHONPATH=${PROJECT_ROOT_DIR}/hugegraph-llm/src:${PROJECT_ROOT_DIR}/hugegraph-python-client/src
+python3 ./hugegraph-llm/src/hugegraph_llm/utils/gradio_demo.py
+```
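+- Configure the HugeGraph database connection information and the LLM information. There are two ways to do this:
+ 1. Edit the `./hugegraph-llm/src/config/config.ini` file
+ 2. In gradio, complete the LLM and HugeGraph settings, then click `Initialize configs`; the complete initialized configuration file is returned, as shown in the figure:
+ 
+- Download the NLTK stopwords for offline use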
+```bash
+python3 ./hugegraph_llm/operators/common_op/nltk_helper.py
+```
+
+
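+### 4 How to use
+#### 4.1 Build a knowledge graph in HugeGraph through LLM
+##### 4.1.1 Build a knowledge graph through the gradio interactive interface
+- Parameter description:
+ - Text: The input text.
+ - Schema: Accepts either of the following two types of text:
+ - A user-defined schema in JSON format.
+ - The name of a HugeGraph graph instance, whose schema is then extracted automatically.
+ - Disambiguate word sense: Whether to perform word sense disambiguation.
+ - Commit to hugegraph: Whether to commit the constructed knowledge graph to the HugeGraph server.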
+
+
+
+##### 4.1.2 Build a knowledge graph through code
+- Complete code
+```python
+from hugegraph_llm.llms.init_llm import LLMs
+from hugegraph_llm.operators.kg_construction_task import KgBuilder
+
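+# Sample input text, the same sentence used in step 3 below
+TEXT = "Meet Sarah, a 30-year-old attorney, and her roommate, James, whom she's shared a home with since 2010."
+
+llm = LLMs().get_llm()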
+builder = KgBuilder(llm)
+(
+ builder
+ .import_schema(from_hugegraph="test_graph").print_result()
+ .extract_triples(TEXT).print_result()
+ .disambiguate_word_sense().print_result()
+ .commit_to_hugegraph()
+ .run()
+)
+```
+- Sequence diagram
+
+
+
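+1. Initialize: Create an `LLMs` instance, get the LLM, and then create a `KgBuilder` task instance for graph construction. `KgBuilder` defines multiple operators that users can combine freely as needed. (Tip: `print_result()` prints the result of each step to the console without affecting the overall execution logic.)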
+
+```python
+llm = LLMs().get_llm()
+builder = KgBuilder(llm)
+```
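+2. Import Schema: Use the `import_schema` method, which supports three modes:
+ - Import from a HugeGraph instance: specify the name of the HugeGraph graph instance, and its schema is extracted automatically.
+ - Import from a user-defined schema: pass a user-defined schema in JSON format.
+ - Import from an extraction result (coming soon)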
+```python
+# Import schema from a HugeGraph instance
+builder.import_schema(from_hugegraph="test_graph").print_result()
+# Import schema from user-defined schema
+builder.import_schema(from_user_defined="xxx").print_result()
+# Import schema from an extraction result
+builder.import_schema(from_extraction="xxx").print_result()
+```
+3. Extract triples: Use the `extract_triples` method to extract triples from the text.
+
+```python
+TEXT = "Meet Sarah, a 30-year-old attorney, and her roommate, James, whom she's shared a home with since 2010."
+builder.extract_triples(TEXT).print_result()
+```
+4. Disambiguate word sense: Use the `disambiguate_word_sense` method to disambiguate word sense.
+
+```python
+builder.disambiguate_word_sense().print_result()
+```
+5. Commit to HugeGraph: Use the `commit_to_hugegraph` method to commit the constructed knowledge graph to the HugeGraph instance.
+
+```python
+builder.commit_to_hugegraph().print_result()
+```
+
+6. Run: Use the `run` method to execute the operations above.
+```python
+builder.run()
+```
+
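+#### 4.2 Retrieval augmented generation (RAG) based on HugeGraph
+##### 4.2.1 Interactive Q&A through gradio
+1. First click the `Initialize HugeGraph test data` button to initialize the HugeGraph data.
+ 
+2. Then click the `Retrieval augmented generation` button to generate the answer to the question.
+ 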
+
+##### 4.2.2 Build Graph RAG through code
+- Complete code
+```python
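+# Assumed import path (not shown in the original snippet); adjust it to your hugegraph-llm version
+from hugegraph_llm.operators.graph_rag_task import GraphRAG
+
+graph_rag = GraphRAG()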
+result = (
+ graph_rag.extract_keyword(text="Tell me about Al Pacino.").print_result()
+ .query_graph_for_rag(
+ max_deep=2,
+ max_items=30
+ ).print_result()
+ .synthesize_answer().print_result()
+ .run(verbose=True)
+)
+```
+1. extract_keyword: Extract keywords and expand them with synonyms.
+```python
+graph_rag.extract_keyword(text="Tell me about Al Pacino.").print_result()
+```
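+2. query_graph_for_rag: Retrieve the matching keywords and their multi-hop relationships from HugeGraph.
+ - max_deep: The maximum depth of the HugeGraph retrieval.
+ - max_items: The maximum number of results returned by HugeGraph.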
+```python
+graph_rag.query_graph_for_rag(
+ max_deep=2,
+ max_items=30
+).print_result()
+```
+3. synthesize_answer: Summarize the retrieved results and phrase an answer to the question.
+```python
+graph_rag.synthesize_answer().print_result()
+```
+4. run: Execute the operations above.
+```python
+graph_rag.run(verbose=True)
+```
diff --git a/content/en/docs/images/gradio-config.png b/content/en/docs/images/gradio-config.png
new file mode 100644
index 00000000..b5a28252
Binary files /dev/null and b/content/en/docs/images/gradio-config.png differ
diff --git a/content/en/docs/images/gradio-kg.png b/content/en/docs/images/gradio-kg.png
new file mode 100644
index 00000000..9e80b28d
Binary files /dev/null and b/content/en/docs/images/gradio-kg.png differ
diff --git a/content/en/docs/images/gradio-rag-1.png b/content/en/docs/images/gradio-rag-1.png
new file mode 100644
index 00000000..0a4df78c
Binary files /dev/null and b/content/en/docs/images/gradio-rag-1.png differ
diff --git a/content/en/docs/images/gradio-rag-2.png b/content/en/docs/images/gradio-rag-2.png
new file mode 100644
index 00000000..ee24cc43
Binary files /dev/null and b/content/en/docs/images/gradio-rag-2.png differ
diff --git a/content/en/docs/images/gradio-rag.png b/content/en/docs/images/gradio-rag.png
new file mode 100644
index 00000000..855ed655
Binary files /dev/null and b/content/en/docs/images/gradio-rag.png differ
diff --git a/content/en/docs/images/kg-uml.png b/content/en/docs/images/kg-uml.png
new file mode 100644
index 00000000..d823e7f8
Binary files /dev/null and b/content/en/docs/images/kg-uml.png differ
diff --git a/content/en/docs/quickstart/hugegraph-ai.md b/content/en/docs/quickstart/hugegraph-ai.md
new file mode 100644
index 00000000..39bacdac
--- /dev/null
+++ b/content/en/docs/quickstart/hugegraph-ai.md
@@ -0,0 +1,151 @@
+---
+title: "HugeGraph-Ai Quick Start (Beta)"
+linkTitle: "Explore with HugeGraph-Ai (Beta)"
+weight: 4
+---
+
+### 1 HugeGraph-Ai Overview
+hugegraph-ai aims to explore the integration of HugeGraph and artificial intelligence (AI), including applications built on large models and integration with graph machine learning components, to give developers comprehensive support for using HugeGraph's AI capabilities in their projects.
+
+### 2 Environment Requirements
+- python 3.8+
+- hugegraph 1.0.0+
+
+### 3 Preparation
+- Start the HugeGraph database; you can do this with Docker. Please refer to this [link](https://hub.docker.com/r/hugegraph/hugegraph) for guidance.
+- Start the gradio interactive demo with the following commands, then open [http://127.0.0.1:8001](http://127.0.0.1:8001).
+
+```bash
+# ${PROJECT_ROOT_DIR} is the root directory of hugegraph-ai; set it yourself
+export PYTHONPATH=${PROJECT_ROOT_DIR}/hugegraph-llm/src:${PROJECT_ROOT_DIR}/hugegraph-python-client/src
+python3 ./hugegraph-llm/src/hugegraph_llm/utils/gradio_demo.py
+```
+- Configure the HugeGraph database connection information and the LLM information. There are two ways to do this:
+ 1. Edit the `./hugegraph-llm/src/config/config.ini` file
+ 2. In gradio, complete the LLM and HugeGraph settings, then click `Initialize configs`; the complete initialized configuration file is returned.
+ 
+- Download the NLTK stopwords for offline use
+```bash
+python3 ./hugegraph_llm/operators/common_op/nltk_helper.py
+```
+
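As a cross-check on the `export PYTHONPATH=...` line in the preparation steps, the same two source directories can be assembled in Python. This is only a sketch: `build_pythonpath` is a hypothetical helper, not part of hugegraph-ai, and `/opt/hugegraph-ai` is a placeholder for your own checkout location.

```python
import os


def build_pythonpath(project_root: str) -> str:
    """Join the two source directories named in the export command."""
    src_dirs = [
        os.path.join(project_root, "hugegraph-llm", "src"),
        os.path.join(project_root, "hugegraph-python-client", "src"),
    ]
    # os.pathsep is ":" on Linux/macOS and ";" on Windows.
    return os.pathsep.join(src_dirs)


if __name__ == "__main__":
    # "/opt/hugegraph-ai" is a placeholder, not a path required by the project.
    print(build_pythonpath("/opt/hugegraph-ai"))
```

Printing the value before launching the demo is a cheap way to spot a missing or misspelled `PROJECT_ROOT_DIR`.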
+### 4 How to use
+#### 4.1 Build a knowledge graph in HugeGraph through LLM
+##### 4.1.1 Build a knowledge graph through the gradio interactive interface
+- Parameter description:
+ - Text: The input text.
+ - Schema: Accepts either of the following two types of text:
+ - A user-defined schema in JSON format.
+ - The name of a HugeGraph graph instance, whose schema is then extracted automatically.
+ - Disambiguate word sense: Whether to perform word sense disambiguation.
+ - Commit to hugegraph: Whether to commit the constructed knowledge graph to the HugeGraph server.
+
+
+
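Since the Schema field accepts either a JSON schema or a graph name, a caller has to tell the two apart. The sketch below shows one plausible way to do that. It is an assumption for illustration, not the logic hugegraph-llm actually uses; `classify_schema_input` is a hypothetical name, and the JSON keys in the comments are made up.

```python
import json
from typing import Any, Tuple


def classify_schema_input(text: str) -> Tuple[str, Any]:
    """Return ("user_defined", parsed_json) or ("graph_name", name)."""
    stripped = text.strip()
    try:
        parsed = json.loads(stripped)
    except json.JSONDecodeError:
        # Not valid JSON, so treat the text as a graph instance name.
        return "graph_name", stripped
    if isinstance(parsed, dict):
        # A JSON object is taken to be a user-defined schema.
        return "user_defined", parsed
    # Valid JSON that is not an object (for example a bare number)
    # is also treated as a graph name.
    return "graph_name", stripped


if __name__ == "__main__":
    print(classify_schema_input("test_graph"))
    # The key "vertices" is a made-up example, not a documented schema field.
    print(classify_schema_input('{"vertices": ["person"]}'))
```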
+##### 4.1.2 Build a knowledge graph through code
+- Complete code
+```python
+from hugegraph_llm.llms.init_llm import LLMs
+from hugegraph_llm.operators.kg_construction_task import KgBuilder
+
+llm = LLMs().get_llm()
+builder = KgBuilder(llm)
+(
+ builder
+ .import_schema(from_hugegraph="test_graph").print_result()
+ .extract_triples(TEXT).print_result()
+ .disambiguate_word_sense().print_result()
+ .commit_to_hugegraph()
+ .run()
+)
+```
+- Sequence Diagram
+ 
+
+1. Initialize: Create an `LLMs` instance, get the LLM, and then create a `KgBuilder` task instance for graph construction. `KgBuilder` defines multiple operators that users can combine freely as needed. (Tip: `print_result()` prints the result of each step to the console without affecting the overall execution logic.)
+
+```python
+llm = LLMs().get_llm()
+builder = KgBuilder(llm)
+```
+2. Import Schema: Use the `import_schema` method, which supports three modes:
+ - Import from a HugeGraph instance: specify the name of the HugeGraph graph instance, and its schema is extracted automatically.
+ - Import from a user-defined schema: pass a user-defined schema in JSON format.
+ - Import from an extraction result (coming soon)
+
+```python
+# Import schema from a HugeGraph instance
+builder.import_schema(from_hugegraph="test_graph").print_result()
+# Import schema from user-defined schema
+builder.import_schema(from_user_defined="xxx").print_result()
+# Import schema from an extraction result
+builder.import_schema(from_extraction="xxx").print_result()
+```
+3. Extract triples: Use the `extract_triples` method to extract triples from the text.
+
+```python
+TEXT = "Meet Sarah, a 30-year-old attorney, and her roommate, James, whom she's shared a home with since 2010."
+builder.extract_triples(TEXT).print_result()
+```
+4. Disambiguate word sense: Use the `disambiguate_word_sense` method to disambiguate word sense.
+
+```python
+builder.disambiguate_word_sense().print_result()
+```
+5. Commit to HugeGraph: Use the `commit_to_hugegraph` method to commit the constructed knowledge graph to the HugeGraph instance.
+
+```python
+builder.commit_to_hugegraph().print_result()
+```
+
+6. Run: Use the `run` method to execute the above operations.
+
+```python
+builder.run()
+```
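The steps above describe operators that are chained first and executed only when `run()` is called, with `print_result()` leaving the data flow untouched. A common way to get that behavior is a builder that queues callables. The toy below sketches the pattern under that assumption: `ToyBuilder`, `extract_triples` and `count_triples` are hypothetical stand-ins, and the source does not confirm that `KgBuilder` is implemented this way.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Operator = Callable[[Context], Context]


class ToyBuilder:
    """Queues operators when chained and executes them only in run()."""

    def __init__(self) -> None:
        self._operators: List[Operator] = []

    def add(self, operator: Operator) -> "ToyBuilder":
        self._operators.append(operator)
        return self  # returning self is what makes chaining possible

    def print_result(self) -> "ToyBuilder":
        # Queue a pass-through step: it prints the context and returns it unchanged.
        def _print(context: Context) -> Context:
            print(context)
            return context

        return self.add(_print)

    def run(self) -> Context:
        context: Context = {}
        for operator in self._operators:
            context = operator(context)
        return context


def extract_triples(context: Context) -> Context:
    # Stand-in for an LLM call: returns one hard-coded triple.
    return {**context, "triples": [("Sarah", "roommate", "James")]}


def count_triples(context: Context) -> Context:
    return {**context, "count": len(context["triples"])}


builder = ToyBuilder()
# Nothing executes until run() is called at the end of the chain.
result = builder.add(extract_triples).print_result().add(count_triples).run()
```

Because the printing step returns its input unchanged, removing it from the chain does not change `result`.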
+
+#### 4.2 Retrieval augmented generation (RAG) based on HugeGraph
+##### 4.2.1 Interactive Q&A through gradio
+1. First click the `Initialize HugeGraph test data` button to initialize the HugeGraph data.
+ 
+2. Then click the `Retrieval augmented generation` button to generate the answer to the question.
+ 
+
+##### 4.2.2 Build Graph RAG through code
+- Complete code
+```python
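+# Assumed import path (not shown in the original snippet); adjust it to your hugegraph-llm version
+from hugegraph_llm.operators.graph_rag_task import GraphRAG
+
+graph_rag = GraphRAG()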
+result = (
+ graph_rag.extract_keyword(text="Tell me about Al Pacino.").print_result()
+ .query_graph_for_rag(
+ max_deep=2,
+ max_items=30
+ ).print_result()
+ .synthesize_answer().print_result()
+ .run(verbose=True)
+)
+```
+1. extract_keyword: Extract keywords and expand synonyms.
+
+```python
+graph_rag.extract_keyword(text="Tell me about Al Pacino.").print_result()
+```
+2. query_graph_for_rag: Retrieve the matching keywords and their multi-hop relationships from HugeGraph.
+ - max_deep: The maximum depth of the HugeGraph retrieval.
+ - max_items: The maximum number of results returned by HugeGraph.
+
+```python
+graph_rag.query_graph_for_rag(
+ max_deep=2,
+ max_items=30
+).print_result()
+```
+3. synthesize_answer: Summarize the retrieved results and phrase an answer to the question.
+```python
+graph_rag.synthesize_answer().print_result()
+```
+4. run: Execute the above operations.
+
+```python
+graph_rag.run(verbose=True)
+```
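To make the `max_deep` and `max_items` parameters of `query_graph_for_rag` more concrete, here is a small in-memory sketch of a depth-limited, size-capped breadth-first walk. It is an illustration under assumptions, not the query HugeGraph actually runs: `bounded_neighbourhood` and the toy adjacency data are hypothetical, and only the two parameter names come from the document.

```python
from collections import deque
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (source, relation, target)


def bounded_neighbourhood(
    adjacency: Dict[str, List[Tuple[str, str]]],
    start: str,
    max_deep: int,
    max_items: int,
) -> List[Edge]:
    """Breadth-first walk from start: at most max_deep hops and max_items edges."""
    edges: List[Edge] = []
    visited = {start}
    queue = deque([(start, 0)])
    while queue and len(edges) < max_items:
        vertex, depth = queue.popleft()
        if depth == max_deep:
            continue  # do not expand beyond the depth limit
        for relation, target in adjacency.get(vertex, []):
            if len(edges) >= max_items:
                break
            edges.append((vertex, relation, target))
            if target not in visited:
                visited.add(target)
                queue.append((target, depth + 1))
    return edges


# Toy data for illustration only; a real query would go to HugeGraph.
graph = {
    "Al Pacino": [("acted_in", "The Godfather"), ("acted_in", "Heat")],
    "The Godfather": [("directed_by", "Francis Ford Coppola")],
    "Heat": [("directed_by", "Michael Mann")],
    "Francis Ford Coppola": [("born_in", "Detroit")],
}

if __name__ == "__main__":
    for edge in bounded_neighbourhood(graph, "Al Pacino", max_deep=2, max_items=30):
        print(edge)
```

With `max_deep=2` the walk stops before the `born_in` edge, which is three hops from the start vertex, while lowering `max_items` truncates the result regardless of depth.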