Re: [PR] docs: add a blog for agentic graphrag [incubator-hugegraph-doc]

via GitHub Fri, 31 Oct 2025 03:23:37 -0700


Copilot commented on code in PR #424:
URL: https://github.com/apache/incubator-hugegraph-doc/pull/424#discussion_r2480915486



##########
content/en/blog/hugegraph-ai/agentic_graphrag.md:
##########
@@ -0,0 +1,452 @@
+---
+date: 2025-10-29
+title: "Agentic GraphRAG"
+linkTitle: "Agentic GraphRAG"
+---
+
+# Project Background
+
+To address the problem of temporal discrepancies between model training data 
and real-world data, Retrieval-Augmented Generation (RAG) technology has 
emerged. RAG, as the name suggests, is a technique that retrieves relevant data 
from external data sources (Retrieval) to augment (Argument) the quality of the 
answers generated (Generation) by large language models.
+
+The earliest RAG employed a simple Retrieval-Generation architecture. We take 
the user's question, perform some pre-processing (keyword extraction, etc.), 
obtain the pre-processed question, and then use an Embedding Model to grab 
relevant information from a vast amount of data as a Prompt, which is then fed 
to the large language model to enhance the quality of its responses.
+
+However, relying solely on semantic similarity matching to retrieve relevant 
information may not handle all situations, as the information that can enhance 
answer quality may not always be semantically similar to the question itself. A 
common example is: "Tell me the ontological view of the disciple of the 
philosopher who proposed that water is the origin of all things." Our data may 
not directly contain the answer to this question. The knowledge base might 
contain:
+
+1. Thales proposed that water is the origin of all things.
+2. Anaximander was a disciple of Thales.
+3. Anaximander identified the Apeiron, which has no formal definition, as the 
origin of all things.
+
+If we rely solely on semantic similarity matching, we are likely to only 
retrieve the first sentence to augment the large language model's answer. 
However, without information from sentences 2 and 3, and if the large language 
model lacks philosophy-related knowledge in its training dxata, it will be 
unable to correctly answer the question and might even "hallucinate."

Review Comment:
   Corrected spelling of 'dxata' to 'data'.
   ```suggestion
   If we rely solely on semantic similarity matching, we are likely to only retrieve the first sentence to augment the large language model's answer. However, without information from sentences 2 and 3, and if the large language model lacks philosophy-related knowledge in its training data, it will be unable to correctly answer the question and might even "hallucinate."
   ```



##########
content/cn/blog/hugegraph-ai/agentic_graphrag.md:
##########
@@ -0,0 +1,449 @@
+---
+date: 2025-10-29

Review Comment:
   The date '2025-10-29' is in the future. This should match the correct 
publication date and should be consistent with the English version. Verify the 
intended publication date.
   ```suggestion
   date: 2024-05-29
   ```



##########
content/en/blog/hugegraph-ai/agentic_graphrag.md:
##########
@@ -0,0 +1,452 @@
+---
+date: 2025-10-29

Review Comment:
   The date '2025-10-29' is in the future. If this is intended to be the 
publication date, it should likely be '2024-10-29' or the current actual date. 
Using a future date may cause issues with date-based sorting or filtering of 
blog posts.
   ```suggestion
   date: 2024-10-29
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


