This is an automated email from the ASF dual-hosted git repository.

jin pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-hugegraph-ai.git


The following commit(s) were added to refs/heads/main by this push:
     new cbfca3c  feat(llm): added the process of text2gql in graphrag V1.0 (#105)
cbfca3c is described below

commit cbfca3c47e903c2034cde51480a3b04188839222
Author: vichayturen <[email protected]>
AuthorDate: Mon Dec 9 18:14:04 2024 +0800

    feat(llm): added the process of text2gql in graphrag V1.0 (#105)
    
    Addresses #10
    
    1. added the process of intelligently generated Gremlin retrieval (see the pipeline sketch below)
    2. added a text2gremlin block in the RAG app
    3. added the text2gremlin prompt & config
    4. fixed a log bug in py-client
    5. ...
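
    A minimal sketch of how the reworked chainable `GremlinGenerator` pipeline
    is driven (based on the code in this diff; the placeholder schema and the
    example question are illustrative):

    ```python
    from hugegraph_llm.models.embeddings.init_embedding import Embeddings
    from hugegraph_llm.models.llms.init_llm import LLMs
    from hugegraph_llm.operators.gremlin_generate_task import GremlinGenerator

    generator = GremlinGenerator(llm=LLMs().get_text2gql_llm(),
                                 embedding=Embeddings().get_embedding())
    schema = {"vertexlabels": [], "edgelabels": []}  # illustrative placeholder
    context = (generator.example_index_query(num_examples=2)   # top-N similar query-gremlin pairs
                        .gremlin_generate_synthesize(schema)   # LLM synthesizes the Gremlin
                        .run(query="Tell me about Al Pacino."))
    print(context["result"])      # gremlin initialized with the matched template
    print(context["raw_result"])  # raw LLM output without the template step
    ```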
    
    We also added a `flag` value under the graph query interface `rag/graph`:
    - `1` means the text2gql exact match succeeded
    - `0` means the (k-neighbor) generalized match succeeded
    - `-1` means no relevant graph info was found
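
    For example, a hypothetical caller of `rag/graph` could branch on the flag
    like this (`graph_result_flag` is the context key set in graph_rag_query.py
    below; the helper name is illustrative):

    ```python
    def describe_graph_recall(context: dict) -> str:
        # Flag semantics follow graph_rag_query.py in this commit.
        flag = context.get("graph_result_flag", -1)
        if flag == 1:
            return "text2gql exact match succeeded"
        if flag == 0:
            return "(k-neighbor) generalized match succeeded"
        return "no relevant graph info"
    ```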
    
    ---------
    
    Co-authored-by: Simon Cheung <[email protected]>
    Co-authored-by: imbajin <[email protected]>
    Co-authored-by: HaoJin Yang <[email protected]>
---
 hugegraph-llm/README.md                            |  11 +-
 hugegraph-llm/src/hugegraph_llm/api/rag_api.py     |   3 +-
 hugegraph-llm/src/hugegraph_llm/config/__init__.py |   2 +-
 hugegraph-llm/src/hugegraph_llm/config/config.py   |   8 +
 .../src/hugegraph_llm/config/config_data.py        |  24 +++
 .../demo/gremlin_generate_web_demo.py              | 208 ---------------------
 .../src/hugegraph_llm/demo/rag_demo/app.py         |  18 +-
 .../src/hugegraph_llm/demo/rag_demo/rag_block.py   |  12 +-
 .../demo/rag_demo/text2gremlin_block.py            | 168 +++++++++++++++++
 .../src/hugegraph_llm/indices/vector_index.py      |   4 +-
 .../src/hugegraph_llm/models/llms/base.py          |  21 +--
 .../src/hugegraph_llm/operators/graph_rag_task.py  |   9 +-
 .../operators/gremlin_generate_task.py             |  34 +++-
 .../operators/hugegraph_op/graph_rag_query.py      | 104 +++++++++--
 .../operators/hugegraph_op/schema_manager.py       |  22 +++
 .../index_op/gremlin_example_index_query.py        |  46 ++++-
 .../operators/index_op/semantic_id_query.py        |  45 +++--
 .../operators/index_op/vector_index_query.py       |   3 +-
 .../operators/llm_op/gremlin_generate.py           | 154 +++++++--------
 .../src/hugegraph_llm/resources/demo/css.py        |   9 +
 .../hugegraph_llm/resources/demo/text2gremlin.csv  |  99 ++++++++++
 .../src/hugegraph_llm/utils/graph_index_utils.py   |   2 +-
 .../src/pyhugegraph/utils/util.py                  |  43 ++---
 23 files changed, 644 insertions(+), 405 deletions(-)

diff --git a/hugegraph-llm/README.md b/hugegraph-llm/README.md
index a8a4957..aaf9a24 100644
--- a/hugegraph-llm/README.md
+++ b/hugegraph-llm/README.md
@@ -45,21 +45,18 @@ graph systems and large language models.
     ```bash
     python3 -m hugegraph_llm.demo.rag_demo.app --host 127.0.0.1 --port 18001
     ```
-6. Or start the gradio interactive demo of **Text2Gremlin**, you can run with the following command, and open http://127.0.0.1:8002 after starting. You can also change the default host `0.0.0.0` and port `8002` as above. (🚧ing)
-    ```bash
-    python3 -m hugegraph_llm.demo.gremlin_generate_web_demo
-   ```
-7. After running the web demo, the config file `.env` will be automatically generated at the path `hugegraph-llm/.env`.    Additionally, a prompt-related configuration file `config_prompt.yaml` will also be generated at the path `hugegraph-llm/src/hugegraph_llm/resources/demo/config_prompt.yaml`.
+   
+6. After running the web demo, the config file `.env` will be automatically generated at the path `hugegraph-llm/.env`. Additionally, a prompt-related configuration file `config_prompt.yaml` will also be generated at the path `hugegraph-llm/src/hugegraph_llm/resources/demo/config_prompt.yaml`.
     You can modify the content on the web page, and it will be automatically saved to the configuration file after the corresponding feature is triggered. You can also modify the file directly without restarting the web application; simply refresh the page to load your latest changes.
     (Optional) To regenerate the config file, you can use `config.generate` with `-u` or `--update`.
     ```bash
     python3 -m hugegraph_llm.config.generate --update
     ```
-8. (__Optional__) You could use 
+7. (__Optional__) You could use 
     [hugegraph-hubble](https://hugegraph.apache.org/docs/quickstart/hugegraph-hubble/#21-use-docker-convenient-for-testdev) 
     to visit the graph data; you can run it via [Docker/Docker-Compose](https://hub.docker.com/r/hugegraph/hubble) 
     for guidance. (Hubble is a graph-analysis dashboard that includes data loading/schema management/graph traversal/display.)
-9. (__Optional__) offline download NLTK stopwords  
+8. (__Optional__) offline download NLTK stopwords  
     ```bash
     python ./hugegraph_llm/operators/common_op/nltk_helper.py
     ```
diff --git a/hugegraph-llm/src/hugegraph_llm/api/rag_api.py b/hugegraph-llm/src/hugegraph_llm/api/rag_api.py
index 9d1e01a..0685292 100644
--- a/hugegraph-llm/src/hugegraph_llm/api/rag_api.py
+++ b/hugegraph-llm/src/hugegraph_llm/api/rag_api.py
@@ -39,7 +39,8 @@ def graph_rag_recall(
 ) -> dict:
     from hugegraph_llm.operators.graph_rag_task import RAGPipeline
     rag = RAGPipeline()
-    rag.extract_keywords().keywords_to_vid().query_graphdb().merge_dedup_rerank(
+
+    rag.extract_keywords().keywords_to_vid().import_schema(settings.graph_name).query_graphdb().merge_dedup_rerank(
         rerank_method=rerank_method,
         near_neighbor_first=near_neighbor_first,
         custom_related_information=custom_related_information,
diff --git a/hugegraph-llm/src/hugegraph_llm/config/__init__.py b/hugegraph-llm/src/hugegraph_llm/config/__init__.py
index 87b6d2e..1f39b4c 100644
--- a/hugegraph-llm/src/hugegraph_llm/config/__init__.py
+++ b/hugegraph-llm/src/hugegraph_llm/config/__init__.py
@@ -16,7 +16,7 @@
 # under the License.
 
 
-__all__ = ["settings", "resource_path"]
+__all__ = ["settings", "prompt", "resource_path"]
 
 import os
 from .config import Config, PromptConfig
diff --git a/hugegraph-llm/src/hugegraph_llm/config/config.py b/hugegraph-llm/src/hugegraph_llm/config/config.py
index 8209463..aca610f 100644
--- a/hugegraph-llm/src/hugegraph_llm/config/config.py
+++ b/hugegraph-llm/src/hugegraph_llm/config/config.py
@@ -118,6 +118,8 @@ class PromptConfig(PromptData):
 
     def save_to_yaml(self):
         indented_schema = "\n".join([f"  {line}" for line in self.graph_schema.splitlines()])
+        indented_text2gql_schema = "\n".join([f"  {line}" for line in self.text2gql_graph_schema.splitlines()])
+        indented_gremlin_prompt = "\n".join([f"  {line}" for line in self.gremlin_generate_prompt.splitlines()])
         indented_example_prompt = "\n".join([f"    {line}" for line in self.extract_graph_prompt.splitlines()])
         indented_question = "\n".join([f"    {line}" for line in self.default_question.splitlines()])
         indented_custom_related_information = (
@@ -132,6 +134,9 @@ class PromptConfig(PromptData):
         yaml_content = f"""graph_schema: |
 {indented_schema}
 
+text2gql_graph_schema: |
+{indented_text2gql_schema}
+
 extract_graph_prompt: |
 {indented_example_prompt}
 
@@ -147,6 +152,9 @@ answer_prompt: |
 keywords_extract_prompt: |
 {indented_keywords_extract_template}
 
+gremlin_generate_prompt: |
+{indented_gremlin_prompt}
+
 """
         with open(yaml_file_path, "w", encoding="utf-8") as file:
             file.write(yaml_content)
diff --git a/hugegraph-llm/src/hugegraph_llm/config/config_data.py b/hugegraph-llm/src/hugegraph_llm/config/config_data.py
index 57ffe78..f69e04b 100644
--- a/hugegraph-llm/src/hugegraph_llm/config/config_data.py
+++ b/hugegraph-llm/src/hugegraph_llm/config/config_data.py
@@ -219,6 +219,9 @@ Meet Sarah, a 30-year-old attorney, and her roommate, James, whom she's shared a
 }
 """
 
+    # TODO: we should provide a better example to reduce the useless information
+    text2gql_graph_schema = ConfigData.graph_name
+
     # Extracted from llm_op/keyword_extract.py
     keywords_extract_prompt = """指令:
 请对以下文本执行以下任务:
@@ -266,3 +269,24 @@ KEYWORDS:关键词1,关键词2,...,关键词n
 # Text:
 # {question}
 # """
+
+    gremlin_generate_prompt = """\
+Given the example query-gremlin pairs:
+{example}
+
+Given the graph schema:
+```json
+{schema}
+```
+
+Given the extracted vertex vid:
+{vertices}
+
+Generate gremlin from the following user query.
+{query}
+The output format must be like:
+```gremlin
+g.V().limit(10)
+```
+The generated gremlin is:
+"""
diff --git a/hugegraph-llm/src/hugegraph_llm/demo/gremlin_generate_web_demo.py b/hugegraph-llm/src/hugegraph_llm/demo/gremlin_generate_web_demo.py
deleted file mode 100644
index d21a54b..0000000
--- a/hugegraph-llm/src/hugegraph_llm/demo/gremlin_generate_web_demo.py
+++ /dev/null
@@ -1,208 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the License is distributed on an
-# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, either express or implied.  See the License for the
-# specific language governing permissions and limitations
-# under the License.
-
-
-import json
-import uvicorn
-import gradio as gr
-from fastapi import FastAPI
-from hugegraph_llm.config import settings
-from hugegraph_llm.models.llms.init_llm import LLMs
-from hugegraph_llm.models.embeddings.init_embedding import Embeddings
-from hugegraph_llm.operators.gremlin_generate_task import GremlinGenerator
-
-
-def build_example_vector_index(temp_file):
-    full_path = temp_file.name
-    if full_path.endswith(".json"):
-        with open(full_path, "r", encoding="utf-8") as f:
-            examples = json.load(f)
-    else:
-        return "ERROR: please input json file."
-    builder = GremlinGenerator(
-        llm=LLMs().get_text2gql_llm(),
-        embedding=Embeddings().get_embedding(),
-    )
-    return builder.example_index_build(examples).run()
-
-
-def gremlin_generate(inp, use_schema, use_example, example_num, schema):
-    generator = GremlinGenerator(
-        llm=LLMs().get_text2gql_llm(),
-        embedding=Embeddings().get_embedding(),
-    )
-    if use_example == "true":
-        generator.example_index_query(inp, example_num)
-    context = generator.gremlin_generate(use_schema, use_example, schema).run()
-    return context.get("match_result", "No Results"), context["result"]
-
-
-if __name__ == '__main__':
-    app = FastAPI()
-    with gr.Blocks() as demo:
-        gr.Markdown(
-            """# HugeGraph LLM Text2Gremlin Demo"""
-        )
-        gr.Markdown("## Set up the LLM")
-        llm_dropdown = gr.Dropdown(["openai", "qianfan_wenxin", "ollama/local"], value=settings.text2gql_llm_type,
-                                   label="LLM")
-
-
-        @gr.render(inputs=[llm_dropdown])
-        def llm_settings(llm_type):
-            settings.text2gql_llm_type = llm_type
-            if llm_type == "openai":
-                with gr.Row():
-                    llm_config_input = [
-                        gr.Textbox(value=settings.openai_text2gql_api_key, label="api_key"),
-                        gr.Textbox(value=settings.openai_text2gql_api_base, label="api_base"),
-                        gr.Textbox(value=settings.openai_text2gql_language_model, label="model_name"),
-                        gr.Textbox(value=str(settings.openai_text2gql_tokens), label="max_token"),
-                    ]
-            elif llm_type == "qianfan_wenxin":
-                with gr.Row():
-                    llm_config_input = [
-                        gr.Textbox(value=settings.qianfan_text2gql_api_key, label="api_key"),
-                        gr.Textbox(value=settings.qianfan_text2gql_secret_key, label="secret_key"),
-                        gr.Textbox(value=settings.qianfan_chat_url, label="chat_url"),
-                        gr.Textbox(value=settings.qianfan_text2gql_language_model, label="model_name")
-                    ]
-            elif llm_type == "ollama/local":
-                with gr.Row():
-                    llm_config_input = [
-                        gr.Textbox(value=settings.ollama_text2gql_host, label="host"),
-                        gr.Textbox(value=str(settings.ollama_text2gql_port), label="port"),
-                        gr.Textbox(value=settings.ollama_text2gql_language_model, label="model_name"),
-                        gr.Textbox(value="", visible=False)
-                    ]
-            else:
-                llm_config_input = []
-            llm_config_button = gr.Button("Apply Configuration")
-
-            def apply_configuration(arg1, arg2, arg3, arg4):
-                llm_option = settings.text2gql_llm_type
-                if llm_option == "openai":
-                    settings.openai_text2gql_api_key = arg1
-                    settings.openai_text2gql_api_base = arg2
-                    settings.openai_text2gql_language_model = arg3
-                    settings.openai_text2gql_tokens = int(arg4)
-                elif llm_option == "qianfan_wenxin":
-                    settings.qianfan_text2gql_api_key = arg1
-                    settings.qianfan_text2gql_secret_key = arg2
-                    settings.qianfan_chat_url = arg3
-                    settings.qianfan_text2gql_language_model = arg4
-                elif llm_option == "ollam/local":
-                    settings.ollama_text2gql_host = arg1
-                    settings.ollama_text2gql_port = int(arg2)
-                    settings.ollama_text2gql_language_model = arg3
-                gr.Info("configured!")
-
-            llm_config_button.click(apply_configuration, inputs=llm_config_input)  # pylint: disable=no-member
-
-        gr.Markdown("## Set up the Embedding")
-        embedding_dropdown = gr.Dropdown(
-            choices=["openai", "ollama/local"],
-            value=settings.embedding_type,
-            label="Embedding"
-        )
-
-        @gr.render(inputs=[embedding_dropdown])
-        def embedding_settings(embedding_type):
-            settings.embedding_type = embedding_type
-            if embedding_type == "openai":
-                with gr.Row():
-                    embedding_config_input = [
-                        gr.Textbox(value=settings.openai_text2gql_api_key, label="api_key"),
-                        gr.Textbox(value=settings.openai_text2gql_api_base, label="api_base"),
-                        gr.Textbox(value=settings.openai_embedding_model, label="model_name")
-                    ]
-            elif embedding_type == "ollama/local":
-                with gr.Row():
-                    embedding_config_input = [
-                        gr.Textbox(value=settings.ollama_text2gql_host, label="host"),
-                        gr.Textbox(value=str(settings.ollama_text2gql_port), label="port"),
-                        gr.Textbox(value=settings.ollama_embedding_model, label="model_name"),
-                    ]
-            else:
-                embedding_config_input = []
-            embedding_config_button = gr.Button("Apply Configuration")
-
-            def apply_configuration(arg1, arg2, arg3):
-                embedding_option = settings.embedding_type
-                if embedding_option == "openai":
-                    settings.openai_text2gql_api_key = arg1
-                    settings.openai_text2gql_api_base = arg2
-                    settings.openai_embedding_model = arg3
-                elif embedding_option == "ollama/local":
-                    settings.ollama_text2gql_host = arg1
-                    settings.ollama_text2gql_port = int(arg2)
-                    settings.ollama_embedding_model = arg3
-                gr.Info("configured!")
-            # pylint: disable=no-member
-            embedding_config_button.click(apply_configuration, inputs=embedding_config_input)
-
-        gr.Markdown("## Build Example Vector Index")
-        gr.Markdown("Uploaded json file should be in format below:\n\n"
-                    "[{\"query\":\"who is peter\", 
\"gremlin\":\"g.V().has('name', 'peter')\"}]")
-        with gr.Row():
-            file = gr.File(label="Upload Example Query-Gremlin Pairs Json")
-            out = gr.Textbox(label="Result Message")
-        with gr.Row():
-            btn = gr.Button("Build Example Vector Index")
-        btn.click(build_example_vector_index, inputs=[file], outputs=[out])  # pylint: disable=no-member
-        gr.Markdown("## Nature Language To Gremlin")
-        SCHEMA = """{
-    "vertices": [
-        {"vertex_label": "entity", "properties": []}
-    ],
-    "edges": [
-        {
-            "edge_label": "relation",
-            "source_vertex_label": "entity",
-            "target_vertex_label": "entity",
-            "properties": {}
-        }
-    ]
-}"""
-        with gr.Row():
-            with gr.Column(scale=1):
-                schema_box = gr.Textbox(value=SCHEMA, label="Schema")
-            with gr.Column(scale=1):
-                input_box = gr.Textbox(value="Tell me about Al Pacino.",
-                                       label="Nature Language Query")
-                match = gr.Textbox(label="Best-Matched Examples")
-                out = gr.Textbox(label="Structured Query Language: Gremlin")
-            with gr.Column(scale=1):
-            use_example_radio = gr.Radio(choices=["true", "false"], value="false",
-                                         label="Use example")
-            use_schema_radio = gr.Radio(choices=["true", "false"], value="false",
-                                        label="Use schema")
-                example_num_slider = gr.Slider(
-                    minimum=1,
-                    maximum=10,
-                    step=1,
-                    value=5,
-                    label="Number of examples"
-                )
-                btn = gr.Button("Text2Gremlin")
-        btn.click(  # pylint: disable=no-member
-            fn=gremlin_generate,
-            inputs=[input_box, use_schema_radio, use_example_radio, example_num_slider, schema_box],
-            outputs=[match, out]
-        )
-    app = gr.mount_gradio_app(app, demo, path="/")
-    uvicorn.run(app, host="0.0.0.0", port=8002)
diff --git a/hugegraph-llm/src/hugegraph_llm/demo/rag_demo/app.py b/hugegraph-llm/src/hugegraph_llm/demo/rag_demo/app.py
index da6a437..e08be2e 100644
--- a/hugegraph-llm/src/hugegraph_llm/demo/rag_demo/app.py
+++ b/hugegraph-llm/src/hugegraph_llm/demo/rag_demo/app.py
@@ -36,6 +36,7 @@ from hugegraph_llm.demo.rag_demo.configs_block import (
     apply_graph_config,
 )
 from hugegraph_llm.demo.rag_demo.other_block import create_other_block
+from hugegraph_llm.demo.rag_demo.text2gremlin_block import create_text2gremlin_block
 from hugegraph_llm.demo.rag_demo.rag_block import create_rag_block, rag_answer
 from hugegraph_llm.demo.rag_demo.vector_graph_block import create_vector_graph_block
 from hugegraph_llm.resources.demo.css import CSS
@@ -55,6 +56,7 @@ def authenticate(credentials: HTTPAuthorizationCredentials = Depends(sec)):
             headers={"WWW-Authenticate": "Bearer"},
         )
 
+
 # pylint: disable=C0301
 def init_rag_ui() -> gr.Interface:
     with gr.Blocks(
@@ -93,9 +95,11 @@ def init_rag_ui() -> gr.Interface:
             textbox_input_schema, textbox_info_extract_template = create_vector_graph_block()
         with gr.Tab(label="2. (Graph)RAG & User Functions 📖"):
             textbox_inp, textbox_answer_prompt_input, textbox_keywords_extract_prompt_input = create_rag_block()
-        with gr.Tab(label="3. Graph Tools 🚧"):
+        with gr.Tab(label="3. Text2gremlin ⚙️"):
+            textbox_gremlin_inp, textbox_gremlin_schema, textbox_gremlin_prompt = create_text2gremlin_block()
+        with gr.Tab(label="4. Graph Tools 🚧"):
             create_other_block()
-        with gr.Tab(label="4. Admin Tools ⚙️"):
+        with gr.Tab(label="5. Admin Tools 🛠"):
             create_admin_block()
 
         def refresh_ui_config_prompt() -> tuple:
@@ -104,10 +108,11 @@ def init_rag_ui() -> gr.Interface:
             return (
                 settings.graph_ip, settings.graph_port, settings.graph_name, settings.graph_user,
                 settings.graph_pwd, settings.graph_space, prompt.graph_schema, prompt.extract_graph_prompt,
-                prompt.default_question, prompt.answer_prompt, prompt.keywords_extract_prompt
+                prompt.default_question, prompt.answer_prompt, prompt.keywords_extract_prompt,
+                prompt.default_question, settings.graph_name, prompt.gremlin_generate_prompt
             )
 
-        hugegraph_llm_ui.load(fn=refresh_ui_config_prompt, outputs=[ #pylint: disable=E1101
+        hugegraph_llm_ui.load(fn=refresh_ui_config_prompt, outputs=[  # pylint: disable=E1101
             textbox_array_graph_config[0],
             textbox_array_graph_config[1],
             textbox_array_graph_config[2],
@@ -118,7 +123,10 @@ def init_rag_ui() -> gr.Interface:
             textbox_info_extract_template,
             textbox_inp,
             textbox_answer_prompt_input,
-            textbox_keywords_extract_prompt_input
+            textbox_keywords_extract_prompt_input,
+            textbox_gremlin_inp,
+            textbox_gremlin_schema,
+            textbox_gremlin_prompt
         ])
 
     return hugegraph_llm_ui
diff --git a/hugegraph-llm/src/hugegraph_llm/demo/rag_demo/rag_block.py b/hugegraph-llm/src/hugegraph_llm/demo/rag_demo/rag_block.py
index 761773e..fdc554d 100644
--- a/hugegraph-llm/src/hugegraph_llm/demo/rag_demo/rag_block.py
+++ b/hugegraph-llm/src/hugegraph_llm/demo/rag_demo/rag_block.py
@@ -24,7 +24,7 @@ import gradio as gr
 import pandas as pd
 from gradio.utils import NamedString
 
-from hugegraph_llm.config import resource_path, prompt
+from hugegraph_llm.config import resource_path, prompt, settings
 from hugegraph_llm.operators.graph_rag_task import RAGPipeline
 from hugegraph_llm.utils.log import log
 
@@ -35,6 +35,7 @@ def rag_answer(
         vector_only_answer: bool,
         graph_only_answer: bool,
         graph_vector_answer: bool,
+        with_gremlin_template: bool,
         graph_ratio: float,
         rerank_method: Literal["bleu", "reranker"],
         near_neighbor_first: bool,
@@ -72,9 +73,10 @@ def rag_answer(
     if vector_search:
         rag.query_vector_index()
     if graph_search:
-        rag.extract_keywords(extract_template=keywords_extract_prompt).keywords_to_vid().query_graphdb()
+        rag.extract_keywords(extract_template=keywords_extract_prompt).keywords_to_vid().import_schema(
+            settings.graph_name).query_graphdb(with_gremlin_template=with_gremlin_template)
     # TODO: add more user-defined search strategies
-    rag.merge_dedup_rerank(graph_ratio, rerank_method, near_neighbor_first, custom_related_information)
+    rag.merge_dedup_rerank(graph_ratio, rerank_method, near_neighbor_first, )
     rag.synthesize_answer(raw_answer, vector_only_answer, graph_only_answer, graph_vector_answer, answer_prompt)
 
     try:
@@ -119,7 +121,8 @@ def create_rag_block():
             with gr.Row():
                 graph_only_radio = gr.Radio(choices=[True, False], value=True, label="Graph-only Answer")
                 graph_vector_radio = gr.Radio(choices=[True, False], value=False, label="Graph-Vector Answer")
-
+            with gr.Row():
+                with_gremlin_template_radio = gr.Radio(choices=[True, False], value=True, label="With Gremlin Template")
             def toggle_slider(enable):
                 return gr.update(interactive=enable)
 
@@ -155,6 +158,7 @@ def create_rag_block():
             vector_only_radio,
             graph_only_radio,
             graph_vector_radio,
+            with_gremlin_template_radio,
             graph_ratio,
             rerank_method,
             near_neighbor_first,
diff --git a/hugegraph-llm/src/hugegraph_llm/demo/rag_demo/text2gremlin_block.py b/hugegraph-llm/src/hugegraph_llm/demo/rag_demo/text2gremlin_block.py
new file mode 100644
index 0000000..f2fb6fb
--- /dev/null
+++ b/hugegraph-llm/src/hugegraph_llm/demo/rag_demo/text2gremlin_block.py
@@ -0,0 +1,168 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import json
+import os
+from typing import Any, Tuple, Dict
+
+import gradio as gr
+import pandas as pd
+
+from hugegraph_llm.config import prompt, resource_path
+from hugegraph_llm.models.embeddings.init_embedding import Embeddings
+from hugegraph_llm.models.llms.init_llm import LLMs
+from hugegraph_llm.operators.gremlin_generate_task import GremlinGenerator
+from hugegraph_llm.operators.hugegraph_op.schema_manager import SchemaManager
+from hugegraph_llm.utils.hugegraph_utils import run_gremlin_query
+from hugegraph_llm.utils.log import log
+
+
+def store_schema(schema, question, gremlin_prompt):
+    if (prompt.text2gql_graph_schema != schema or prompt.default_question != question or
+            prompt.gremlin_generate_prompt != gremlin_prompt):
+        prompt.text2gql_graph_schema = schema
+        prompt.default_question = question
+        prompt.gremlin_generate_prompt = gremlin_prompt
+        prompt.update_yaml_file()
+
+
+def build_example_vector_index(temp_file) -> dict:
+    if temp_file is None:
+        full_path = os.path.join(resource_path, "demo", "text2gremlin.csv")
+    else:
+        full_path = temp_file.name
+    if full_path.endswith(".json"):
+        with open(full_path, "r", encoding="utf-8") as f:
+            examples = json.load(f)
+    elif full_path.endswith(".csv"):
+        examples = pd.read_csv(full_path).to_dict('records')
+    else:
+        log.critical("Unsupported file format. Please input a JSON or CSV file.")
+        return {"error": "Unsupported file format. Please input a JSON or CSV file."}
+    builder = GremlinGenerator(
+        llm=LLMs().get_text2gql_llm(),
+        embedding=Embeddings().get_embedding(),
+    )
+    return builder.example_index_build(examples).run()
+
+
+def gremlin_generate(inp, example_num, schema, gremlin_prompt) -> tuple[str, str] | tuple[str, Any, Any, Any, Any]:
+    generator = GremlinGenerator(llm=LLMs().get_text2gql_llm(), embedding=Embeddings().get_embedding())
+    sm = SchemaManager(graph_name="schema")
+    short_schema = False
+
+    if schema:
+        schema = schema.strip()
+        if not schema.startswith("{"):
+            short_schema = True
+            log.info("Try to get schema from graph '%s'", schema)
+            generator.import_schema(from_hugegraph=schema)
+            # FIXME: update the logic here
+            schema = sm.schema.getSchema()
+        else:
+            try:
+                schema = json.loads(schema)
+                generator.import_schema(from_user_defined=schema)
+            except json.JSONDecodeError as e:
+                log.error("Invalid JSON schema provided: %s", e)
+                return "Invalid JSON schema, please check the format 
carefully.", ""
+    # FIXME: schema is not used in gremlin_generate() step, no context for it 
(enhance the logic here)
+    updated_schema = sm.simple_schema(schema) if short_schema else schema
+    context = generator.example_index_query(example_num).gremlin_generate_synthesize(updated_schema,
+                                                                                     gremlin_prompt).run(query=inp)
+    try:
+        context["template_exec_res"] = 
run_gremlin_query(query=context["result"])
+    except Exception as e: # pylint: disable=broad-except
+        context["template_exec_res"] = f"{e}"
+    try:
+        context["raw_exec_res"] = 
run_gremlin_query(query=context["raw_result"])
+    except Exception as e: # pylint: disable=broad-except
+        context["raw_exec_res"] = f"{e}"
+
+    match_result = json.dumps(context.get("match_result", "No Results"), ensure_ascii=False, indent=2)
+    return match_result, context["result"], context["raw_result"], context["template_exec_res"], context["raw_exec_res"]
+
+
+def simple_schema(schema: Dict[str, Any]) -> Dict[str, Any]:
+    mini_schema = {}
+
+    # Add necessary vertexlabels items (3)
+    if "vertexlabels" in schema:
+        mini_schema["vertexlabels"] = []
+        for vertex in schema["vertexlabels"]:
+            new_vertex = {key: vertex[key] for key in ["id", "name", "properties"] if key in vertex}
+            mini_schema["vertexlabels"].append(new_vertex)
+
+    # Add necessary edgelabels items (4)
+    if "edgelabels" in schema:
+        mini_schema["edgelabels"] = []
+        for edge in schema["edgelabels"]:
+            new_edge = {key: edge[key] for key in
+                        ["name", "source_label", "target_label", "properties"] 
if key in edge}
+            mini_schema["edgelabels"].append(new_edge)
+
+    return mini_schema
+
+
+def create_text2gremlin_block() -> Tuple:
+    gr.Markdown("""## Build Vector Template Index (Optional)
+    > Uploaded CSV file should be in `query,gremlin` format below:    
+    > e.g. `who is peter?`,`g.V().has('name', 'peter')`    
+    > JSON file should be in format below:  
+    > e.g. `[{"query":"who is peter", "gremlin":"g.V().has('name', 'peter')"}]`
+    """)
+    with gr.Row():
+        file = gr.File(
+            value=os.path.join(resource_path, "demo", "text2gremlin.csv"),
+            label="Upload Text-Gremlin Pairs File"
+        )
+        out = gr.Textbox(label="Result Message")
+    with gr.Row():
+        btn = gr.Button("Build Example Vector Index", variant="primary")
+    btn.click(build_example_vector_index, inputs=[file], outputs=[out])  # pylint: disable=no-member
+    gr.Markdown("## Nature Language To Gremlin")
+
+    with gr.Row():
+        with gr.Column(scale=1):
+            input_box = gr.Textbox(value=prompt.default_question, label="Nature Language Query", show_copy_button=True)
+            match = gr.Code(label="Similar Template (TopN)", language="javascript", elem_classes="code-container-show")
+            initialized_out = gr.Textbox(label="Gremlin With Template", show_copy_button=True)
+            raw_out = gr.Textbox(label="Gremlin Without Template", show_copy_button=True)
+            tmpl_exec_out = gr.Code(label="Query With Template Output", language="json",
+                                    elem_classes="code-container-show")
+            raw_exec_out = gr.Code(label="Query Without Template Output", language="json",
+                                   elem_classes="code-container-show")
+
+        with gr.Column(scale=1):
+            example_num_slider = gr.Slider(
+                minimum=0,
+                maximum=10,
+                step=1,
+                value=2,
+                label="Number of refer examples"
+            )
+            schema_box = gr.Textbox(value=prompt.text2gql_graph_schema, label="Schema", lines=2, show_copy_button=True)
+            prompt_box = gr.Textbox(value=prompt.gremlin_generate_prompt, label="Prompt", lines=2,
+                                    show_copy_button=True)
+            btn = gr.Button("Text2Gremlin", variant="primary")
+    btn.click(  # pylint: disable=no-member
+        fn=gremlin_generate,
+        inputs=[input_box, example_num_slider, schema_box, prompt_box],
+        outputs=[match, initialized_out, raw_out, tmpl_exec_out, raw_exec_out]
+    ).then(store_schema, inputs=[schema_box, input_box, prompt_box], )
+
+    return input_box, schema_box, prompt_box
diff --git a/hugegraph-llm/src/hugegraph_llm/indices/vector_index.py b/hugegraph-llm/src/hugegraph_llm/indices/vector_index.py
index 3732a9f..4ba1983 100644
--- a/hugegraph-llm/src/hugegraph_llm/indices/vector_index.py
+++ b/hugegraph-llm/src/hugegraph_llm/indices/vector_index.py
@@ -18,7 +18,7 @@
 import os
 import pickle as pkl
 from copy import deepcopy
-from typing import List, Dict, Any, Set, Union
+from typing import List, Any, Set, Union
 
 import faiss
 import numpy as np
@@ -85,7 +85,7 @@ class VectorIndex:
         self.properties = [p for i, p in enumerate(self.properties) if i not in indices]
         return remove_num
 
-    def search(self, query_vector: List[float], top_k: int, dis_threshold: float = 0.9) -> List[Dict[str, Any]]:
+    def search(self, query_vector: List[float], top_k: int, dis_threshold: float = 0.9) -> List[Any]:
         if self.index.ntotal == 0:
             return []
 
diff --git a/hugegraph-llm/src/hugegraph_llm/models/llms/base.py b/hugegraph-llm/src/hugegraph_llm/models/llms/base.py
index 04c1c27..f2dd234 100644
--- a/hugegraph-llm/src/hugegraph_llm/models/llms/base.py
+++ b/hugegraph-llm/src/hugegraph_llm/models/llms/base.py
@@ -15,7 +15,6 @@
 # specific language governing permissions and limitations
 # under the License.
 
-
 from abc import ABC, abstractmethod
 from typing import Any, List, Optional, Callable, Dict
 
@@ -25,9 +24,9 @@ class BaseLLM(ABC):
 
     @abstractmethod
     def generate(
-        self,
-        messages: Optional[List[Dict[str, Any]]] = None,
-        prompt: Optional[str] = None,
+            self,
+            messages: Optional[List[Dict[str, Any]]] = None,
+            prompt: Optional[str] = None,
     ) -> str:
         """Comment"""
 
@@ -41,23 +40,23 @@ class BaseLLM(ABC):
 
     @abstractmethod
     def generate_streaming(
-        self,
-        messages: Optional[List[Dict[str, Any]]] = None,
-        prompt: Optional[str] = None,
-        on_token_callback: Callable = None,
+            self,
+            messages: Optional[List[Dict[str, Any]]] = None,
+            prompt: Optional[str] = None,
+            on_token_callback: Callable = None,
     ) -> List[Any]:
         """Comment"""
 
     @abstractmethod
     def num_tokens_from_string(
-        self,
-        string: str,
+            self,
+            string: str,
     ) -> str:
         """Given a string returns the number of tokens the given string 
consists of"""
 
     @abstractmethod
     def max_allowed_token_length(
-        self,
+            self,
     ) -> int:
         """Returns the maximum number of tokens the LLM can handle"""
 
diff --git a/hugegraph-llm/src/hugegraph_llm/operators/graph_rag_task.py b/hugegraph-llm/src/hugegraph_llm/operators/graph_rag_task.py
index 07dc770..8f5f81d 100644
--- a/hugegraph-llm/src/hugegraph_llm/operators/graph_rag_task.py
+++ b/hugegraph-llm/src/hugegraph_llm/operators/graph_rag_task.py
@@ -26,6 +26,7 @@ from hugegraph_llm.operators.common_op.merge_dedup_rerank import MergeDedupReran
 from hugegraph_llm.operators.common_op.print_result import PrintResult
 from hugegraph_llm.operators.document_op.word_extract import WordExtract
 from hugegraph_llm.operators.hugegraph_op.graph_rag_query import GraphRAGQuery
+from hugegraph_llm.operators.hugegraph_op.schema_manager import SchemaManager
 from hugegraph_llm.operators.index_op.semantic_id_query import SemanticIdQuery
 from hugegraph_llm.operators.index_op.vector_index_query import VectorIndexQuery
 from hugegraph_llm.operators.llm_op.answer_synthesize import AnswerSynthesize
@@ -89,6 +90,10 @@ class RAGPipeline:
         )
         return self
 
+    def import_schema(self, graph_name: str):
+        self._operators.append(SchemaManager(graph_name))
+        return self
+
     def keywords_to_vid(
         self,
         by: Literal["query", "keywords"] = "keywords",
@@ -119,6 +124,7 @@ class RAGPipeline:
         max_v_prop_len: int = 2048,
         max_e_prop_len: int = 256,
         prop_to_match: Optional[str] = None,
+        with_gremlin_template: bool = True,
     ):
         """
         Add a graph RAG query operator to the pipeline.
@@ -132,7 +138,8 @@ class RAGPipeline:
         """
         self._operators.append(
             GraphRAGQuery(max_deep=max_deep, max_items=max_items, max_v_prop_len=max_v_prop_len,
-                          max_e_prop_len=max_e_prop_len, prop_to_match=prop_to_match)
+                          max_e_prop_len=max_e_prop_len, prop_to_match=prop_to_match,
+                          with_gremlin_template=with_gremlin_template)
         )
         return self
 
diff --git a/hugegraph-llm/src/hugegraph_llm/operators/gremlin_generate_task.py b/hugegraph-llm/src/hugegraph_llm/operators/gremlin_generate_task.py
index 772dcd2..dfbf085 100644
--- a/hugegraph-llm/src/hugegraph_llm/operators/gremlin_generate_task.py
+++ b/hugegraph-llm/src/hugegraph_llm/operators/gremlin_generate_task.py
@@ -14,14 +14,16 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-
+from typing import Optional, List
 
 from hugegraph_llm.models.embeddings.base import BaseEmbedding
 from hugegraph_llm.models.llms.base import BaseLLM
+from hugegraph_llm.operators.common_op.check_schema import CheckSchema
 from hugegraph_llm.operators.common_op.print_result import PrintResult
+from hugegraph_llm.operators.hugegraph_op.schema_manager import SchemaManager
 from hugegraph_llm.operators.index_op.build_gremlin_example_index import BuildGremlinExampleIndex
 from hugegraph_llm.operators.index_op.gremlin_example_index_query import GremlinExampleIndexQuery
-from hugegraph_llm.operators.llm_op.gremlin_generate import GremlinGenerate
+from hugegraph_llm.operators.llm_op.gremlin_generate import GremlinGenerateSynthesize
 from hugegraph_llm.utils.decorators import log_time, log_operator_time, record_qps
 
 
@@ -33,16 +35,32 @@ class GremlinGenerator:
         self.result = None
         self.operators = []
 
+    def clear(self):
+        self.operators = []
+        return self
+
     def example_index_build(self, examples):
         self.operators.append(BuildGremlinExampleIndex(self.embedding, examples))
         return self
 
-    def example_index_query(self, query, num_examples):
-        self.operators.append(GremlinExampleIndexQuery(query, self.embedding, num_examples))
+    def import_schema(self, from_hugegraph=None, from_extraction=None, from_user_defined=None):
+        if from_hugegraph:
+            self.operators.append(SchemaManager(from_hugegraph))
+        elif from_user_defined:
+            self.operators.append(CheckSchema(from_user_defined))
+        elif from_extraction:
+            raise NotImplementedError("Not implemented yet")
+        else:
+            raise ValueError("No input data / invalid schema type")
+        return self
+
+    def example_index_query(self, num_examples):
+        self.operators.append(GremlinExampleIndexQuery(self.embedding, num_examples))
         return self
 
-    def gremlin_generate(self, use_schema, use_example, schema):
-        self.operators.append(GremlinGenerate(self.llm, use_schema, use_example, schema))
+    def gremlin_generate_synthesize(self, schema, gremlin_prompt: Optional[str] = None,
+                                    vertices: Optional[List[str]] = None):
+        self.operators.append(GremlinGenerateSynthesize(self.llm, schema, vertices, gremlin_prompt))
         return self
 
     def print_result(self):
@@ -51,8 +69,8 @@ class GremlinGenerator:
 
     @log_time("total time")
     @record_qps
-    def run(self):
-        context = {}
+    def run(self, **kwargs):
+        context = kwargs
         for operator in self.operators:
             context = self._run_operator(operator, context)
         return context
diff --git a/hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/graph_rag_query.py b/hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/graph_rag_query.py
index a3dc1ad..cd759ef 100644
--- a/hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/graph_rag_query.py
+++ b/hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/graph_rag_query.py
@@ -14,9 +14,14 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+
+import json
 from typing import Any, Dict, Optional, List, Set, Tuple
 
 from hugegraph_llm.config import settings
+from hugegraph_llm.models.embeddings.base import BaseEmbedding
+from hugegraph_llm.models.llms.base import BaseLLM
+from hugegraph_llm.operators.gremlin_generate_task import GremlinGenerator
 from hugegraph_llm.utils.log import log
 from pyhugegraph.client import PyHugeClient
 
@@ -25,7 +30,7 @@ VERTEX_QUERY_TPL = "g.V({keywords}).limit(8).as('subj').toList()"
 
 # TODO: we could use a simpler query (like kneighbor-api to get the edges)
 # TODO: test with profile()/explain() to speed up the query
-VID_QUERY_NEIGHBOR_TPL = """
+VID_QUERY_NEIGHBOR_TPL = """\
 g.V({keywords})
 .repeat(
    bothE({edge_labels}).limit({edge_limit}).otherV().dedup()
@@ -47,7 +52,7 @@ g.V({keywords})
 .toList()
 """
 
-PROPERTY_QUERY_NEIGHBOR_TPL = """
+PROPERTY_QUERY_NEIGHBOR_TPL = """\
 g.V().has('{prop}', within({keywords}))
 .repeat(
    bothE({edge_labels}).limit({edge_limit}).otherV().dedup()
@@ -71,8 +76,18 @@ g.V().has('{prop}', within({keywords}))
 
 class GraphRAGQuery:
 
-    def __init__(self, max_deep: int = 2, max_items: int = 20, max_v_prop_len: int = 2048,
-                 max_e_prop_len: int = 256, prop_to_match: Optional[str] = None):
+    def __init__(
+            self,
+            max_deep: int = 2,
+            max_items: int = 20,
+            prop_to_match: Optional[str] = None,
+            with_gremlin_template: bool = True,
+            llm: Optional[BaseLLM] = None,
+            embedding: Optional[BaseEmbedding] = None,
+            max_v_prop_len: int = 2048,
+            max_e_prop_len: int = 256,
+            num_gremlin_generate_example: int = 1
+    ):
         self._client = PyHugeClient(
             settings.graph_ip,
             settings.graph_port,
@@ -88,6 +103,12 @@ class GraphRAGQuery:
         self._limit_property = settings.limit_property.lower() == "true"
         self._max_v_prop_len = max_v_prop_len
         self._max_e_prop_len = max_e_prop_len
+        self._gremlin_generator = GremlinGenerator(
+            llm=llm,
+            embedding=embedding,
+        )
+        self._num_gremlin_generate_example = num_gremlin_generate_example
+        self._with_gremlin_template = with_gremlin_template
 
     def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
         # pylint: disable=R0915 (too-many-statements)
@@ -104,7 +125,58 @@ class GraphRAGQuery:
                 self._client = PyHugeClient(ip, port, graph, user, pwd, gs)
         assert self._client is not None, "No valid graph to search."
 
-        # 2. Extract params from context
+        # initial flag: -1 means no result, 0 means subgraph query, 1 means gremlin query
+        context["graph_result_flag"] = -1
+        # 1. Try to perform a query based on the generated gremlin
+        context = self._gremlin_generate_query(context)
+        # 2. Try to perform a query based on subgraph-search if the previous query failed
+        if not context.get("graph_result"):
+            context = self._subgraph_query(context)
+
+        if context.get("graph_result"):
+            log.debug("Knowledge from Graph:\n%s", 
"\n".join(context["graph_result"]))
+        else:
+            log.debug("No Knowledge Extracted from Graph")
+        return context
+
+    def _gremlin_generate_query(self, context: Dict[str, Any]) -> Dict[str, Any]:
+        query = context["query"]
+        vertices = context.get("match_vids")
+        query_embedding = context.get("query_embedding")
+
+        self._gremlin_generator.clear()
+        self._gremlin_generator.example_index_query(num_examples=self._num_gremlin_generate_example)
+        gremlin_response = self._gremlin_generator.gremlin_generate_synthesize(
+            context["simple_schema"],
+            vertices=vertices,
+        ).run(
+            query=query,
+            query_embedding=query_embedding
+        )
+        if self._with_gremlin_template:
+            gremlin = gremlin_response["result"]
+        else:
+            gremlin = gremlin_response["raw_result"]
+        log.info("Generated gremlin: %s", gremlin)
+        context["gremlin"] = gremlin
+        try:
+            result = self._client.gremlin().exec(gremlin=gremlin)["data"]
+            if result == [None]:
+                result = []
+            context["graph_result"] = [json.dumps(item, ensure_ascii=False) 
for item in result]
+            if context["graph_result"]:
+                context["graph_result_flag"] = 1
+                context["graph_context_head"] = (
+                    f"The following are graph query result "
+                    f"from gremlin query `{gremlin}`.\n"
+                )
+        except Exception as e: # pylint: disable=broad-except
+            log.error(e)
+            context["graph_result"] = ""
+        return context
+
+    def _subgraph_query(self, context: Dict[str, Any]) -> Dict[str, Any]:
+        # 1. Extract params from context
         matched_vids = context.get("match_vids")
         if isinstance(context.get("max_deep"), int):
             self._max_deep = context["max_deep"]
@@ -113,7 +185,7 @@ class GraphRAGQuery:
         if isinstance(context.get("prop_to_match"), str):
             self._prop_to_match = context["prop_to_match"]
 
-        # 3. Extract edge_labels from graph schema
+        # 2. Extract edge_labels from graph schema
         _, edge_labels = self._extract_labels_from_schema()
         edge_labels_str = ",".join("'" + label + "'" for label in edge_labels)
         # TODO: enhance the limit logic later
@@ -136,7 +208,7 @@ class GraphRAGQuery:
                 edge_limit=edge_limit_amount,
                 max_items=self._max_items,
             )
-            log.debug("Kneighbor gremlin query: %s", gremlin_query)
+            log.debug("Kneighbor gremlin query: %s", 
gremlin_query.replace("\n", "").replace(" ", ""))
             paths = self._client.gremlin().exec(gremlin=gremlin_query)["data"]
 
             graph_chain_knowledge, vertex_degree_list, knowledge_with_degree = self._format_graph_query_result(
@@ -171,15 +243,15 @@ class GraphRAGQuery:
             )
 
         context["graph_result"] = list(graph_chain_knowledge)
-        context["vertex_degree_list"] = [list(vertex_degree) for vertex_degree 
in vertex_degree_list]
-        context["knowledge_with_degree"] = knowledge_with_degree
-        context["graph_context_head"] = (
-            f"The following are graph knowledge in {self._max_deep} depth, 
e.g:\n"
-            "`vertexA --[links]--> vertexB <--[links]-- vertexC ...`"
-            "extracted based on key entities as subject:\n"
-        )
-        # TODO: set color for ↓ "\033[93mKnowledge from Graph:\033[0m"
-        log.debug("Knowledge from Graph:\n%s", 
"\n".join(context["graph_result"]))
+        if context["graph_result"]:
+            context["graph_result_flag"] = 0
+            context["vertex_degree_list"] = [list(vertex_degree) for 
vertex_degree in vertex_degree_list]
+            context["knowledge_with_degree"] = knowledge_with_degree
+            context["graph_context_head"] = (
+                f"The following are graph knowledge in {self._max_deep} depth, 
e.g:\n"
+                "`vertexA--[links]-->vertexB<--[links]--vertexC ...`"
+                "extracted based on key entities as subject:\n"
+            )
         return context
 
     def _format_graph_from_vertex(self, query_result: List[Any]) -> Set[str]:
diff --git a/hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/schema_manager.py b/hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/schema_manager.py
index ab50abf..0e706cc 100644
--- a/hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/schema_manager.py
+++ b/hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/schema_manager.py
@@ -33,6 +33,26 @@ class SchemaManager:
         )
         self.schema = self.client.schema()
 
+    def simple_schema(self, schema: Dict[str, Any]) -> Dict[str, Any]:
+        mini_schema = {}
+
+        # Add necessary vertexlabels items (3)
+        if "vertexlabels" in schema:
+            mini_schema["vertexlabels"] = []
+            for vertex in schema["vertexlabels"]:
+                new_vertex = {key: vertex[key] for key in ["id", "name", "properties"] if key in vertex}
+                mini_schema["vertexlabels"].append(new_vertex)
+
+        # Add necessary edgelabels items (4)
+        if "edgelabels" in schema:
+            mini_schema["edgelabels"] = []
+            for edge in schema["edgelabels"]:
+                new_edge = {key: edge[key] for key in
+                            ["name", "source_label", "target_label", 
"properties"] if key in edge}
+                mini_schema["edgelabels"].append(new_edge)
+
+        return mini_schema
+
     def run(self, context: Optional[Dict[str, Any]]) -> Dict[str, Any]:
         if context is None:
             context = {}
@@ -41,4 +61,6 @@ class SchemaManager:
             raise Exception(f"Can not get {self.graph_name}'s schema from 
HugeGraph!")
 
         context.update({"schema": schema})
+        # TODO: enhance the logic here
+        context["simple_schema"] = self.simple_schema(schema)
         return context
diff --git a/hugegraph-llm/src/hugegraph_llm/operators/index_op/gremlin_example_index_query.py b/hugegraph-llm/src/hugegraph_llm/operators/index_op/gremlin_example_index_query.py
index ddcf589..a95e7da 100644
--- a/hugegraph-llm/src/hugegraph_llm/operators/index_op/gremlin_example_index_query.py
+++ b/hugegraph-llm/src/hugegraph_llm/operators/index_op/gremlin_example_index_query.py
@@ -17,23 +17,53 @@
 
 
 import os
-from typing import Dict, Any
+from typing import Dict, Any, List
+
+import pandas as pd
+from tqdm import tqdm
 
 from hugegraph_llm.config import resource_path
-from hugegraph_llm.models.embeddings.base import BaseEmbedding
 from hugegraph_llm.indices.vector_index import VectorIndex
+from hugegraph_llm.models.embeddings.base import BaseEmbedding
+from hugegraph_llm.models.embeddings.init_embedding import Embeddings
+from hugegraph_llm.utils.log import log
 
 
 class GremlinExampleIndexQuery:
-    def __init__(self, query: str, embedding: BaseEmbedding, num_examples: int = 1):
-        self.query = query
-        self.embedding = embedding
+    def __init__(self, embedding: BaseEmbedding = None, num_examples: int = 1):
+        self.embedding = embedding or Embeddings().get_embedding()
         self.num_examples = num_examples
         self.index_dir = os.path.join(resource_path, "gremlin_examples")
+        self._ensure_index_exists()
         self.vector_index = VectorIndex.from_index_file(self.index_dir)
 
+    def _ensure_index_exists(self):
+        if not (os.path.exists(os.path.join(self.index_dir, "index.faiss"))
+                and os.path.exists(os.path.join(self.index_dir, "properties.pkl"))):
+            log.warning("No gremlin example index found, will generate one.")
+            self._build_default_example_index()
+
+    def _get_match_result(self, context: Dict[str, Any], query: str) -> List[Dict[str, Any]]:
+        if self.num_examples == 0:
+            return []
+
+        query_embedding = context.get("query_embedding")
+        if not isinstance(query_embedding, list):
+            query_embedding = self.embedding.get_text_embedding(query)
+        return self.vector_index.search(query_embedding, self.num_examples, dis_threshold=1.8)
+
+    def _build_default_example_index(self):
+        properties = pd.read_csv(os.path.join(resource_path, "demo",
+                                              "text2gremlin.csv")).to_dict(orient="records")
+        embeddings = [self.embedding.get_text_embedding(row["query"]) for row in tqdm(properties)]
+        vector_index = VectorIndex(len(embeddings[0]))
+        vector_index.add(embeddings, properties)
+        vector_index.to_index_file(self.index_dir)
+
     def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
-        context["query"] = self.query
-        query_embedding = self.embedding.get_text_embedding(self.query)
-        context["match_result"] = self.vector_index.search(query_embedding, 
self.num_examples)
+        query = context.get("query")
+        if not query:
+            raise ValueError("query is required")
+
+        context["match_result"] = self._get_match_result(context, query)
         return context
diff --git a/hugegraph-llm/src/hugegraph_llm/operators/index_op/semantic_id_query.py b/hugegraph-llm/src/hugegraph_llm/operators/index_op/semantic_id_query.py
index d7b5b89..f5dd8de 100644
--- a/hugegraph-llm/src/hugegraph_llm/operators/index_op/semantic_id_query.py
+++ b/hugegraph-llm/src/hugegraph_llm/operators/index_op/semantic_id_query.py
@@ -51,28 +51,6 @@ class SemanticIdQuery:
             settings.graph_space,
         )
 
-    def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
-        graph_query_list = set()
-        if self.by == "query":
-            query = context["query"]
-            query_vector = self.embedding.get_text_embedding(query)
-            results = self.vector_index.search(query_vector, top_k=self.topk_per_query)
-            if results:
-                graph_query_list.update(results[:self.topk_per_query])
-        else:  # by keywords
-            keywords = context.get("keywords", [])
-            if not keywords:
-                context["match_vids"] = []
-                return context
-
-            exact_match_vids, unmatched_vids = self._exact_match_vids(keywords)
-            graph_query_list.update(exact_match_vids)
-            fuzzy_match_vids = self._fuzzy_match_vids(unmatched_vids)
-            log.debug("Fuzzy match vids: %s", fuzzy_match_vids)
-            graph_query_list.update(fuzzy_match_vids)
-        context["match_vids"] = list(graph_query_list)
-        return context
-
     def _exact_match_vids(self, keywords: List[str]) -> Tuple[List[str], List[str]]:
         assert keywords, "keywords can't be empty, please check the logic"
         # TODO: we should add a global GraphSchemaCache to avoid calling the server every time
@@ -99,6 +77,27 @@ class SemanticIdQuery:
             keyword_vector = self.embedding.get_text_embedding(keyword)
             results = self.vector_index.search(keyword_vector, top_k=self.topk_per_keyword)
             if results:
-                # FIXME: type mismatch, got 'list[dict[str, Any]]' instead
                 fuzzy_match_result.extend(results[:self.topk_per_keyword])
         return fuzzy_match_result
+
+    def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
+        graph_query_list = set()
+        if self.by == "query":
+            query = context["query"]
+            query_vector = self.embedding.get_text_embedding(query)
+            results = self.vector_index.search(query_vector, top_k=self.topk_per_query)
+            if results:
+                graph_query_list.update(results[:self.topk_per_query])
+        else:  # by keywords
+            keywords = context.get("keywords", [])
+            if not keywords:
+                context["match_vids"] = []
+                return context
+
+            exact_match_vids, unmatched_vids = self._exact_match_vids(keywords)
+            graph_query_list.update(exact_match_vids)
+            fuzzy_match_vids = self._fuzzy_match_vids(unmatched_vids)
+            log.debug("Fuzzy match vids: %s", fuzzy_match_vids)
+            graph_query_list.update(fuzzy_match_vids)
+        context["match_vids"] = list(graph_query_list)
+        return context
diff --git a/hugegraph-llm/src/hugegraph_llm/operators/index_op/vector_index_query.py b/hugegraph-llm/src/hugegraph_llm/operators/index_op/vector_index_query.py
index 8417a9a..fbdf0ad 100644
--- a/hugegraph-llm/src/hugegraph_llm/operators/index_op/vector_index_query.py
+++ b/hugegraph-llm/src/hugegraph_llm/operators/index_op/vector_index_query.py
@@ -35,7 +35,8 @@ class VectorIndexQuery:
     def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
         query = context.get("query")
         query_embedding = self.embedding.get_text_embedding(query)
-        results = self.vector_index.search(query_embedding, self.topk)
+        # TODO: why set dis_threshold=2?
+        results = self.vector_index.search(query_embedding, self.topk, 
dis_threshold=2)
         # TODO: check format results
         context["vector_result"] = results
         log.debug("KNOWLEDGE FROM VECTOR:\n%s", "\n".join(rel for rel in 
context["vector_result"]))
diff --git 
a/hugegraph-llm/src/hugegraph_llm/operators/llm_op/gremlin_generate.py 
b/hugegraph-llm/src/hugegraph_llm/operators/llm_op/gremlin_generate.py
index f117a5e..955fc9e 100644
--- a/hugegraph-llm/src/hugegraph_llm/operators/llm_op/gremlin_generate.py
+++ b/hugegraph-llm/src/hugegraph_llm/operators/llm_op/gremlin_generate.py
@@ -15,101 +15,87 @@
 # specific language governing permissions and limitations
 # under the License.
 
-
-import re
+import asyncio
 import json
-from typing import Optional, List, Dict, Any
+import re
+from typing import Optional, List, Dict, Any, Union
 
 from hugegraph_llm.models.llms.base import BaseLLM
+from hugegraph_llm.models.llms.init_llm import LLMs
+from hugegraph_llm.utils.log import log
+from hugegraph_llm.config import prompt
 
 
-def gremlin_examples(examples: List[Dict[str, str]]) -> str:
-    example_strings = []
-    for example in examples:
-        example_strings.append(
-            f"- query: {example['query']}\n"
-            f"- gremlin: {example['gremlin']}")
-    return "\n\n".join(example_strings)
-
-
-def gremlin_generate_prompt(inp: str) -> str:
-    return f"""Generate gremlin from the following user input.
-The output format must be: "gremlin: generated gremlin".
-
-The query is: {inp}"""
-
-
-def gremlin_generate_with_schema_prompt(schema: str, inp: str) -> str:
-    return f"""Given the graph schema:
-{schema}
-Generate gremlin from the following user input.
-The output format must be: "gremlin: generated gremlin".
-
-The query is: {inp}"""
-
-
-def gremlin_generate_with_example_prompt(example: str, inp: str) -> str:
-    return f"""Given the example query-gremlin pairs:
-{example}
-
-Generate gremlin from the following user input.
-The output format must be: "gremlin: generated gremlin".
-
-The query is: {inp}"""
-
-
-def gremlin_generate_with_schema_and_example_prompt(schema: str, example: str, 
inp: str) -> str:
-    return f"""Given the graph schema:
-{schema}
-Given the example query-gremlin pairs:
-{example}
-
-Generate gremlin from the following user input.
-The output format must be: "gremlin: generated gremlin".
-
-The query is: {inp}"""
-
-
-class GremlinGenerate:
+class GremlinGenerateSynthesize:
     def __init__(
             self,
-            llm: BaseLLM,
-            use_schema: bool = False,
-            use_example: bool = False,
-            schema: Optional[dict] = None
+            llm: BaseLLM = None,
+            schema: Optional[Union[dict, str]] = None,
+            vertices: Optional[List[str]] = None,
+            gremlin_prompt: Optional[str] = None
     ) -> None:
-        self.llm = llm
-        self.use_schema = use_schema
-        self.use_example = use_example
+        self.llm = llm or LLMs().get_text2gql_llm()
+        if isinstance(schema, dict):
+            schema = json.dumps(schema, ensure_ascii=False)
         self.schema = schema
+        self.vertices = vertices
+        self.gremlin_prompt = gremlin_prompt or prompt.gremlin_generate_prompt
 
-    def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
-        query = context.get("query", "")
-        examples = context.get("match_result", [])
-        if not self.use_schema and not self.use_example:
-            prompt = gremlin_generate_prompt(query)
-        elif not self.use_schema and self.use_example:
-            prompt = 
gremlin_generate_with_example_prompt(gremlin_examples(examples), query)
-        elif self.use_schema and not self.use_example:
-            prompt = 
gremlin_generate_with_schema_prompt(json.dumps(self.schema), query)
-        else:
-            prompt = gremlin_generate_with_schema_and_example_prompt(
-                json.dumps(self.schema),
-                gremlin_examples(examples),
-                query
-            )
-        response = self.llm.generate(prompt=prompt)
-        context["result"] = self._extract_gremlin(response)
+    def _extract_gremlin(self, response: str) -> str:
+        match = re.search("```gremlin.*```", response, re.DOTALL)
+        assert match is not None, f"No gremlin found in response: {response}"
+        return match.group()[len("```gremlin"):-len("```")].strip()
+
+    def _format_examples(self, examples: Optional[List[Dict[str, str]]]) -> 
Optional[str]:
+        if not examples:
+            return None
+        example_strings = []
+        for example in examples:
+            example_strings.append(
+                f"- query: {example['query']}\n"
+                f"- gremlin:\n```gremlin\n{example['gremlin']}\n```")
+        return "\n\n".join(example_strings)
+
+    def _format_vertices(self, vertices: Optional[List[str]]) -> Optional[str]:
+        if not vertices:
+            return None
+        return "\n".join([f"- {vid}" for vid in vertices])
+
+    async def async_generate(self, context: Dict[str, Any]):
+        async_tasks = {}
+        query = context.get("query")
+        raw_example = [{'query': 'who is peter', 'gremlin': "g.V().has('name', 
'peter')"}]
+        raw_prompt = self.gremlin_prompt.format(
+            query=query,
+            schema=self.schema,
+            example=self._format_examples(examples=raw_example),
+            vertices=self._format_vertices(vertices=self.vertices)
+        )
+        async_tasks["raw_answer"] = 
asyncio.create_task(self.llm.agenerate(prompt=raw_prompt))
+
+        examples = context.get("match_result")
+        init_prompt = self.gremlin_prompt.format(
+            query=query,
+            schema=self.schema,
+            example=self._format_examples(examples=examples),
+            vertices=self._format_vertices(vertices=self.vertices)
+        )
+        async_tasks["initialized_answer"] = 
asyncio.create_task(self.llm.agenerate(prompt=init_prompt))
+
+        raw_response = await async_tasks["raw_answer"]
+        initialized_response = await async_tasks["initialized_answer"]
+        log.debug("Text2Gremlin with tmpl prompt:\n %s,\n LLM Response: %s", 
init_prompt, initialized_response)
+
+        context["result"] = 
self._extract_gremlin(response=initialized_response)
+        context["raw_result"] = self._extract_gremlin(response=raw_response)
+        context["call_count"] = context.get("call_count", 0) + 2
 
-        context["call_count"] = context.get("call_count", 0) + 1
         return context
 
-    def _extract_gremlin(self, response: str) -> str:
-        match = re.search(r'gremlin[::][^\n]+\n?', response)
-        if match is None:
-            return "Unable to generate gremlin from your query."
-        return match.group()[len("gremlin:"):].strip()
-
+    def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
+        query = context.get("query", "")
+        if not query:
+            raise ValueError("query is required")
 
-if __name__ == '__main__':
-    print(gremlin_examples([{"query": "hello", "gremlin": "g.V()"}]))
+        context = asyncio.run(self.async_generate(context))
+        return context
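
The synthesizer above now issues two generations concurrently, one from the raw built-in example and one initialized with the matched examples, and awaits both. A stripped-down sketch of that fan-out with a dummy async `agenerate` standing in for the configured text2gql LLM:

```python
import asyncio
from typing import Tuple


async def agenerate(prompt: str) -> str:
    # Dummy LLM call: simulate network latency, echo a canned answer.
    await asyncio.sleep(0.1)
    return f"g.V().limit(10)  # answer for: {prompt}"


async def generate_both(raw_prompt: str, init_prompt: str) -> Tuple[str, str]:
    # Schedule both coroutines before awaiting either so they overlap,
    # mirroring the create_task/await pairing in the diff.
    raw_task = asyncio.create_task(agenerate(raw_prompt))
    init_task = asyncio.create_task(agenerate(init_prompt))
    return await raw_task, await init_task


raw, initialized = asyncio.run(generate_both("raw prompt", "prompt with examples"))
print(initialized)
```
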
diff --git a/hugegraph-llm/src/hugegraph_llm/resources/demo/css.py 
b/hugegraph-llm/src/hugegraph_llm/resources/demo/css.py
index 0c56a40..117a068 100644
--- a/hugegraph-llm/src/hugegraph_llm/resources/demo/css.py
+++ b/hugegraph-llm/src/hugegraph_llm/resources/demo/css.py
@@ -29,4 +29,13 @@ footer {
     max-height: 250px;
     overflow-y: auto; /* enable scroll */
 }
+
+/* FIXME: code wrapping does not work as expected yet */
+.wrap-code {
+    white-space: pre-wrap; /* CSS3 */
+    white-space: -moz-pre-wrap; /* Mozilla, since 1999 */
+    white-space: -pre-wrap; /* Opera 4-6 */
+    white-space: -o-pre-wrap; /* Opera 7 */
+    word-wrap: break-word; /* Internet Explorer 5.5+ */
+}
 """
diff --git a/hugegraph-llm/src/hugegraph_llm/resources/demo/text2gremlin.csv 
b/hugegraph-llm/src/hugegraph_llm/resources/demo/text2gremlin.csv
new file mode 100644
index 0000000..9b7f07e
--- /dev/null
+++ b/hugegraph-llm/src/hugegraph_llm/resources/demo/text2gremlin.csv
@@ -0,0 +1,99 @@
+query,gremlin
+腾讯适合合作吗,"g.V().has('company','name','腾讯').as('a').project('公司信息','法人','对外投资企业数量','投资人-自然人','高管','投资人-公司','最终受益人-自然人','最终受益人-公司').by(valueMap('description',
 'email', 'phone', 'operatingStatus', 'registrationAddress', 'salaryTreatment', 
'registeredCapital', 'registeredCapitalCurrency', 
'financingInformation')).by(select('a').in('legalPerson').values('name')).by(select('a').out('companyInvest').values('name').count()).by(select('a').in('personInvest').values('name').fold()).by(select('a').i
 [...]
+四川省有哪些企业?,"g.V().has('company','province','四川').limit(20).values('name')"
+腾讯是在哪年成立的,"g.V().has('company','name','腾讯').values('establishmentYear')"
+给我一份2011年成立的公司名单,"g.V().has('company','establishmentYear',2011).limit(20).values('name')"
+5278600是哪个企业的电话?经常骚扰,"g.V().has('company','phone', 
containing('5278600')).values('name')"
+北京有哪些通用航空生产服务行业的公司,"g.V().has('company','city','北京').has('industry',containing('通用航空生产服务业')).values('name')"
+沈阳1999年成立的计算机外围设备制造企业有哪些,"g.V().has('company', 
'city','沈阳').has('establishmentYear', 
1999).has('industry',containing('计算机外围设备制造')).limit(20).values('name')"
+沈阳2000年以前成立的计算机外围设备制造企业有哪些,"g.V().has('company', 
'city','沈阳').has('establishmentYear', 
lt(2000)).has('industry',containing('计算机外围设备制造')).limit(20).values('name')"
+马化腾是哪家公司法人,"g.V().has('person', 'name', 
'马化腾').out('legalPerson').limit(20).values('name')"
+马化腾是哪家公司股东,"g.V().has('person', 'name', 
'马化腾').out('personInvest').limit(20).values('name')"
+腾讯的投资人都有谁,"g.V().has('company', 'name', '腾讯').inE('personInvest', 
'companyInvest').as('s').otherV().as('p').project('shareholdingRatio', 
'name').by(select('s').values('shareholdingRatio')).by(select('p').values('name'))"
+腾讯的注册资本是多少,"g.V().has('company', 'name', '腾讯').valueMap('registeredCapital', 
'registeredCapitalCurrency')"
+腾讯的工资待遇怎么样?,"g.V().has('company', 'name', '腾讯').values('salaryTreatment')"
+腾讯最近在招聘吗?,"g.V().has('company', 'name', '腾讯').values('recruitmentInfo')"
+腾讯的党委书记是谁,"g.V().has('company', 'name', 
'腾讯').inE('serve').as('po').outV().as('pe').project('name', 
'position').by(select('po').values('position')).by(select('pe').values('name'))"
+腾讯中马化腾的职务,"g.V().has('company', 'name', 
'腾讯').inE('serve').where(outV().has('name', '马化腾')).values('position')"
+腾讯股权构成和股东信息,"g.V().has('company', 'name', '腾讯').inE('personInvest', 
'companyInvest').as('a').outV().project('name', 
'info').by(values('name')).by(select('a').valueMap())"
+腾讯的地址和法人信息,"g.V().has('company','name','腾讯').as('a').in('legalPerson').project('registrationAddress',
 
'legalPerson').by(select('a').values('registrationAddress')).by(values('name'))"
+腾讯和美团的关系是什么,"g.V().has('company','name','腾讯').bothE().where(otherV().has('company','name',
 '美团')).label()"
+腾讯的马化腾,"g.V().has('company', 'name', '腾讯').in().has('person', 'name', 
'马化腾').outE('legalPerson','actualControllerPerson','personInvest','serve').as('a').limit(20).inV().as('b').project('edge',
 'name').by(select('a').label()).by(select('b').values('name'))"
+腾讯的统一社会信用代码,"g.V().has('company', 'name', '腾讯').values('unifiedCreditCode')"
+腾讯的员工工资是多少,"g.V().hasLabel('company').has('name', 
'腾讯').values('salaryTreatment')"
+腾讯详细工商信息,"g.V().has('company', 'name', '腾讯').match( 
__.as('a').valueMap('name', 'registrationAddress', 'registeredCapital', 
'industry', 'businessScope').fold().as('company'), 
__.as('a').in('legalPerson').values('name').fold().as('legalPerson'),__.as('a').out('branch').values('name').fold().as('branch'),__.as('a').in('personInvest').values('name').fold().as('personInvest'),__.as('a').in('companyInvest').values('name').fold().as('companyInvest')).select('company','legalPerson',
 'branch', ' [...]
+腾讯的马化腾和张志东是合作伙伴关系吗?,"g.V().has('person', 
'name','马化腾').out('partners').has('person','name','张志东').hasNext()"
+腾讯的马化腾的合作伙伴都有谁?,"g.V().has('person', 
'name','马化腾').out('partners').limit(20).values('name')"
+马化腾都投资了哪些公司?,"g.V().has('person', 
'name','马化腾').out('personInvest').limit(20).values('name')"
+腾讯都控股了哪些公司,且控股比例是多少?,"g.V().has('company', 'name', 
'腾讯').outE('controllingShareholderCompany').as('a').inV().project('name', 
'info').by(values('name')).by(select('a').valueMap())"
+马化腾在腾讯的持股比例是多少?,"g.V().has('company', 'name', 
'腾讯').inE('personInvest').where(outV().has('person','name','马化腾')).values('shareholdingRatio')"
+马化腾在腾讯的认缴出资额是多少?,"g.V().has('company', 'name', 
'腾讯').inE('personInvest').where(outV().has('person','name','马化腾')).valueMap('capitalContribution','unitOfContribution')"
+马化腾在腾讯的认缴出资日期是什么时候?,"g.V().has('company', 'name', 
'腾讯').inE('personInvest').where(outV().has('person','name','马化腾')).values('contributionDate')"
+腾讯实际人数与缴纳社保人数多少,"g.V().has('company', 'name', 
'腾讯').valueMap('insuredNumberOfPeople','numberOfEmployees')"
+腾讯已实缴的资金有多少,"g.V().has('company', 'name', 
'腾讯').valueMap('paidInCapital','paidInCapitalCurrency')"
+腾讯有多少家子公司,"g.V().has('company', 'name', '腾讯').out('branch').count()"
+腾讯有多少员工,主要做什么的,"g.V().has('company', 'name', 
'腾讯').valueMap('numberOfEmployees', 'businessScope')"
+腾讯的实际控制人和法人是谁?,"g.V().has('company', 'name', 
'腾讯').inE().hasLabel('legalPerson', 'actualControllerPerson', 
'actualControllerCompany').as('a').outV().as('b').project('name', 
'label').by(select('b').values('name')).by(select('a').label())"
+制造业有哪些公司上海,"g.V().has('company', 
'city','上海').has('industry',containing('制造业')).limit(20).values('name')"
+2000年之前成立的企业有哪些,"g.V().has('company', 'establishmentYear', 
lt(2000)).values('name')"
+和腾讯有关系的公司和人有哪些,"g.V().has('company', 'name', 
'腾讯').bothE().limit(20).as('a').otherV().as('b').project('name', 
'label').by(select('b').values('name')).by(select('a').label())"
+2008年到2015年之间成立的企业有哪些,"g.V().has('company','establishmentYear', between(2008, 
2015)).limit(20).values('name')"
+腾讯有官方的微信公众号吗?,"g.V().has('company', 'name', '腾讯').values('wechatPublicNumber')"
+腾讯旗下有几家公司,"g.V().has('company','name','腾讯').as('a').match(__.as('a').out('branch').values('name').count().fold().as('num'),__.as('a').out('branch').values('name').fold().as('company_name')).select('num','company_name')"
+腾讯的子公司或分公司有哪些,"g.V().has('company','name','腾讯').as('a').match(__.as('a').out('branch').values('name').count().fold().as('num'),__.as('a').out('branch').limit(10).values('name').fold().as('company_name')).select('num','company_name')"
+腾讯的法人代表是谁,该公司注册资本是多少?,"g.V().has('company', 'name', 
'腾讯').as('a').project('legalPerson','registeredCapital').by(select('a').in('legalPerson').values('name')).by(valueMap('registeredCapital','registeredCapitalCurrency'))"
+腾讯各股东占比,"g.V().has('company', 'name', '腾讯').inE('personInvest', 
'companyInvest').as('a').outV().as('b').project('name', 
'info').by(select('b').values('name')).by(select('a').valueMap())"
+腾讯的主要负责人,"g.V().has('company', 'name', 
'腾讯').inE('legalPerson','actualControllerPerson','serve').as('a').outV().project('label','name').by(select('a').label()).by(values('name'))"
+腾讯的实际控制人对其股权比例为多少,"g.V().has('company', 'name', 
'腾讯').match(__.as('a').in('actualControllerPerson').as('b'),__.as('a').inE('personInvest').as('c').outV().as('d'),where('b',
 eq('d')).by('name')).project('name', 
'info').by(select('d').values('name')).by(select('c').valueMap())"
+腾讯都控股了哪些公司,其认缴出资额分别是多少?,"g.V().has('company', 'name', 
'腾讯').outE('controllingShareholderCompany').as('e').inV().as('v').project('name','info').by(select('v').values('name')).by(select('e').valueMap())"
+腾讯的马化腾还在其他哪些公司任职?,"g.V().has('company', 'name', 
'腾讯').as('a').in().has('person','name', 
'马化腾').out('serve').where(neq('a')).values('name')"
+腾讯的老板信息,"g.V().has('company', 'name', 
'腾讯').inE().hasLabel('legalPerson','actualControllerPerson', 
'actualControllerCompany').as('a').outV().as('b').project('label', 
'name').by(select('a').label()).by(select('b').values('name'))"
+腾讯和美团的关系,"g.V().has('company', 'name', 
'腾讯').bothE().where(otherV().has('company', 'name', '美团')).label()"
+腾讯的法定代表人马化腾有几家公司,"g.V().has('company', 'name', 
'腾讯').in('actualControllerPerson', 'legalPerson').has('person', 'name', 
'马化腾').outE('legalPerson','actualControllerPerson','personInvest','serve').as('a').limit(20).inV().as('b').project('edge',
 'name').by(select('a').label()).by(select('b').values('name'))"
+腾讯的马化腾和王兴的股份比例,"g.V().has('company', 'name', 
'腾讯').inE('personInvest').as('a').outV().has('name',within('马化腾', 
'王兴')).as('b').project('name','shareholdingRatio').by(select('b').values('name')).by(select('a').values('shareholdingRatio'))"
+腾讯和美团的关系,"g.V().has('name', '腾讯').bothE().where(otherV().has('name', 
'美团')).label()"
+腾讯的股东马化腾的相关信息,"g.V().has('company', 'name', 
'腾讯').in('personInvest').has('person', 'name', 
'马化腾').outE('legalPerson','actualControllerPerson','personInvest','serve').as('a').limit(20).inV().as('b').project('edge',
 'name').by(select('a').label()).by(select('b').values('name'))"
+腾讯在招战略分析岗位,"g.V().has('company','name','腾讯').valueMap('recruitmentInfo')"
+腾讯和中科院有关系吗,"g.V().has('name', '腾讯').bothE().where(otherV().has('name', 
'中科院')).label()"
+腾讯的社保信息,"g.V().has('company', 'name', '腾讯').valueMap('insuredNumberOfPeople', 
'unifiedCreditCode', 'industry', 
'taxpayerIdentificationNumber','administrativeDivision', 'province')"
+腾讯的知识产权情况,"g.V().has('company', 'name', 
'腾讯').valueMap('copyrightForWorks','websiteRegistrationRecord','patentInformation')"
+腾讯的法定代表人马化腾的背景,"g.V().has('company', 'name', 
'腾讯').in('legalPerson').has('person', 'name', 
'马化腾').outE('legalPerson','actualControllerPerson','personInvest','serve').as('a').limit(20).inV().as('b').project('edge',
 'name').by(select('a').label()).by(select('b').values('name'))"
+腾讯马化腾共持股多少,"g.V().has('company', 'name', 
'腾讯').inE('personInvest').where(outV().has('name','马化腾')).valueMap()"
+腾讯的老板成立过哪些公司,"g.V().has('company', 'name', 
'腾讯').in('legalPerson').outE('legalPerson','actualControllerPerson','personInvest','serve').as('a').limit(20).inV().as('b').project('edge',
 'name').by(select('a').label()).by(select('b').values('name'))"
+腾讯的23条对外投资信息,"g.V().has('company', 'name', 
'腾讯').outE('companyInvest').as('a').inV().as('b').project('name', 
'info').by(select('b').values('name')).by(select('a').valueMap())"
+腾讯领导层,"g.V().has('company', 'name', 
'腾讯').inE('serve').as('a').outV().as('b').project('name','position').by(select('b').values('name')).by(select('a').values('position'))"
+腾讯的马化腾还有哪些企业任职,"g.V().has('company', 'name', '腾讯').as('a').in().has('person', 
'name', '马化腾').out('serve').where(neq('a')).values('name')"
+腾讯黄永刚注册资本,"g.V().has('company', 'name', 
'腾讯').inE('personInvest').where(outV().has('person', 
'name','黄永刚')).valueMap('capitalContribution','unitOfContribution')"
+腾讯的16家控股企业都是哪16家,"g.V().has('company', 'name', 
'腾讯').out('controllingShareholderCompany').values('name').limit(16)"
+腾讯历任董事会秘书,"g.V().has('company', 'name', '腾讯').inE('serve').has('position', 
containing('董事会秘书')).outV().values('name')"
+腾讯的十大股东是谁,"g.V().has('company', 'name', '腾讯').inE('personInvest', 
'companyInvest').order().by('shareholdingRatio',desc).limit(10).as('a').outV().as('b').project('name','shareholdingRatio').by(select('b').values('name')).by(select('a').values('shareholdingRatio'))"
+腾讯的法人和实际控制人都是谁,"g.V().has('company', 'name', '腾讯').as('a')
+.project('法人', '实际控制人')
+.by(__.in('legalPerson').values('name').fold())
+.by(__.in('actualControllerPerson','actualControllerCompany').values('name').fold())
+.select('法人','实际控制人').by(__.coalesce(identity(), __.constant('未知')))"
+腾讯的法人和董事长分别是谁,"g.V().has('company', 'name', '腾讯').as('a')
+.project('法人', '董事长')
+.by(__.in('legalPerson').values('name').fold())
+.by(__.inE('serve').has('position', 
containing('董事长')).outV().values('name').fold())
+.select('法人','董事长').by(__.coalesce(identity(), __.constant('未知')))"
+腾讯的CEO投资了哪些企业,"g.V().has('company', 'name', '腾讯').inE('serve').has('position', 
containing('CEO')).outV().outE('legalPerson','actualControllerPerson','personInvest','serve').as('a').limit(20).inV().as('b').project('edge',
 'name').by(select('a').label()).by(select('b').values('name'))"
+腾讯的总经理有控股公司吗?,"g.V().has('company', 'name', '腾讯').inE('serve').has('position', 
containing('总经理')).outV().out('controllingShareholderPerson').values('name')"
+腾讯老板的合作伙伴有哪些,"g.V().has('company', 'name', 
'腾讯').in('legalPerson','actualControllerPerson').dedup().project('name','partners').by(values('name')).by(out('partners').values('name').fold())"
+腾讯的最终受益人和董事长是同一个人吗?,"g.V().has('company', 'name', '腾讯').as('c')
+.project('最终受益人', '董事长')
+.by(__.in('finalBeneficiaryPerson').values('name').fold().coalesce(identity(), 
__.constant('未知')))
+.by(__.inE('serve').has('position', 
containing('董事长')).outV().values('name').fold().coalesce(identity(), 
__.constant('未知')))
+.select('最终受益人', '董事长')"
+腾讯的股东、法人、最终受益人、董事、董事长分别有哪些,"g.V().has('company', 'name', '腾讯').as('a')
+.project('股东','法人','最终受益人','董事','董事长')
+.by(__.in('personInvest').values('name').fold())
+.by(__.in('legalPerson').values('name').fold())
+.by(__.in('actualControllerPerson').values('name').fold())
+.by(__.inE('serve').has('position', 
containing('董事')).outV().values('name').fold())
+.by(__.inE('serve').has('position', 
containing('董事长')).outV().values('name').fold())"
+腾讯的关联公司,"g.V().has('company','name','腾讯').project('branch','companyInvest').by(out('branch').values('name').fold()).by(out('companyInvest').values('name').fold())"
+腾讯的实缴资本和注册资本分别是多少,"g.V().has('company', 'name', 
'腾讯').valueMap('registeredCapital', 
'registeredCapitalCurrency','paidInCapital','paidInCapitalCurrency')"
+腾讯有哪些知识产权,"g.V().has('company','name','腾讯').valueMap('copyrightForWorks', 
'patentInformation', 'websiteRegistrationRecord')"
+腾讯的作品著作权有哪些,"g.V().has('company', 'name', '腾讯').values('copyrightForWorks')"
+腾讯有哪些岗位,"g.V().has('company','name','腾讯').values('recruitmentInfo')"
diff --git a/hugegraph-llm/src/hugegraph_llm/utils/graph_index_utils.py 
b/hugegraph-llm/src/hugegraph_llm/utils/graph_index_utils.py
index 73d7057..28b867d 100644
--- a/hugegraph-llm/src/hugegraph_llm/utils/graph_index_utils.py
+++ b/hugegraph-llm/src/hugegraph_llm/utils/graph_index_utils.py
@@ -65,7 +65,7 @@ def extract_graph(input_file, input_text, schema, 
example_prompt) -> str:
             builder.import_schema(from_hugegraph=schema)
     else:
         return "ERROR: please input with correct schema/format."
-    builder.chunk_split(texts, "document", "zh").extract_info(example_prompt, 
"property_graph")
+    builder.chunk_split(texts, "document", "zh").extract_info(example_prompt, 
"triples")
 
     try:
         context = builder.run()
diff --git a/hugegraph-python-client/src/pyhugegraph/utils/util.py 
b/hugegraph-python-client/src/pyhugegraph/utils/util.py
index f0a8015..90f27c2 100644
--- a/hugegraph-python-client/src/pyhugegraph/utils/util.py
+++ b/hugegraph-python-client/src/pyhugegraph/utils/util.py
@@ -20,6 +20,7 @@ import json
 import traceback
 
 import requests
+
 from pyhugegraph.utils.exceptions import (
     NotAuthorizedError,
     NotFoundError,
@@ -29,39 +30,34 @@ from pyhugegraph.utils.log import log
 
 
 def create_exception(response_content):
-    data = json.loads(response_content)
-    if "ServiceUnavailableException" in data["exception"]:
-        raise ServiceUnavailableException(
-            f'ServiceUnavailableException, "message": "{data["message"]}",'
-            f' "cause": "{data["cause"]}"'
-        )
+    try:
+        data = json.loads(response_content)
+        if "ServiceUnavailableException" in data.get("exception", ""):
+            raise ServiceUnavailableException(
+                f'ServiceUnavailableException, "message": "{data["message"]}",'
+                f' "cause": "{data["cause"]}"'
+            )
+    except (json.JSONDecodeError, KeyError) as e:
+        raise Exception(f"Error parsing response content: {response_content}") 
from e
     raise Exception(response_content)
 
 
 def check_if_authorized(response):
     if response.status_code == 401:
-        raise NotAuthorizedError(
-            f"Please check your username and password. {str(response.content)}"
-        )
+        raise NotAuthorizedError(f"Please check your username and password. 
{str(response.content)}")
     return True
 
 
 def check_if_success(response, error=None):
-    if (not str(response.status_code).startswith("20")) and 
check_if_authorized(
-            response
-    ):
+    if (not str(response.status_code).startswith("20")) and 
check_if_authorized(response):
         if error is None:
             error = NotFoundError(response.content)
 
         req = response.request
         req_body = req.body if req.body else "Empty body"
         response_body = response.text if response.text else "Empty body"
-        # Log the detailed information
-        print(
-            f"\033[93mError-Client:\n"
-            f"Request URL: {req.url}, Request Body: {req_body}\nResponse Body: 
"
-            f"{response_body}\033[0m"
-        )
+        log.error("Error-Client: Request URL: %s, Request Body: %s, Response 
Body: %s",
+                  req.url, req_body, response_body)
         raise error
     return True
 
@@ -108,18 +104,17 @@ class ResponseValidation:
 
                 req_body = response.request.body if response.request.body else 
"Empty body"
                 req_body = req_body.encode('utf-8').decode('unicode_escape')
-                log.error(  # pylint: disable=logging-fstring-interpolation
-                    f"{method}: {e}\n[Body]: {req_body}\n[Server Exception]: 
{details}"
-                )
+                log.error("%s: %s\n[Body]: %s\n[Server Exception]: %s",
+                          method, 
str(e).encode('utf-8').decode('unicode_escape'), req_body, details)
 
                 if response.status_code == 404:
                     raise NotFoundError(response.content) from e
+                if response.status_code == 400:
+                    raise Exception(f"Server Exception: {details}") from e
                 raise e
 
         except Exception:  # pylint: disable=broad-exception-caught
-            log.error(  # pylint: disable=logging-fstring-interpolation
-                f"Unhandled exception occurred: {traceback.format_exc()}"
-            )
+            log.error("Unhandled exception occurred: %s", 
traceback.format_exc())
 
         return result
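
Both logging hunks above swap f-string (and bare `print`) formatting for lazy `%`-style arguments, so interpolation only happens when the record is actually emitted. A tiny sketch of the difference (logger name and messages illustrative):

```python
import logging

logging.basicConfig(level=logging.INFO)  # DEBUG records below are filtered out
log = logging.getLogger("pyhugegraph")

body, details = "Empty body", "IllegalArgumentException: ..."

# Eager: the f-string is built even though DEBUG is disabled.
log.debug(f"[Body]: {body}\n[Server Exception]: {details}")

# Lazy: arguments are only interpolated if the record is actually emitted.
log.debug("[Body]: %s\n[Server Exception]: %s", body, details)
```
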
 
