Re: [PR] Fix website content [geaflow-website]

via GitHub Mon, 13 Oct 2025 22:38:24 -0700


Leomrlin commented on code in PR #6:
URL: https://github.com/apache/geaflow-website/pull/6#discussion_r2427962785



##########
blog/28.md:
##########
@@ -1,49 +1,49 @@
 ---
-title: 流图计算之增量match原理与应用
+title: Principles and Applications of Incremental Match in Streaming Graph 
Computing
 date: 2025-6-3
 ---
 
 
![](https://intranetproxy.alipay.com/skylark/lark/0/2025/png/23857192/1743162676746-973d8e75-11b5-43d7-8832-724e7332b964.png)
 
-## 问题背景
-
-在流式计算中，数据往往不是全部一批到来，而会源源不断地进行输入和计算，在图计算/图查询领域，也存在类似的场景，图的点边不断地从数据源读取，进行构图，从而形成增量图。在增量图查询中，图随时发生着变化，在不同的图版本中，进行图查询的结果也会有所不同。对于某一次新增的点边，构成了一个新的版本的图，如果重新对全图（即当前所有点边）进行图遍历，开销较大，并且也会和历史数据有重复。由于历史的数据已经计算过一遍，理想情况下，只需要对增量所影响的部分进行计算/查询，而不需要对全图重新进行查询。
+## Problem Background
+In streaming computing, data rarely arrives all at once but is continuously 
input and processed. Similarly, in graph computing/graph querying scenarios, 
vertices and edges are constantly read from data sources to construct graphs 
incrementally. In incremental graph queries, the graph evolves continuously, 
leading to different query results across graph versions. When new 
vertices/edges form an updated graph version, recomputing through the entire 
graph incurs high overhead and duplicates historical computations. Since 
historical data has already been processed, ideally only the delta-affected 
portions should be computed/queried without full-graph re-execution.
 
 <!-- truncate -->
 
-<font style="color:rgb(51, 51, 51);">GQL（Graph Query Language）</font><font 
style="color:rgb(0, 0, 0);">是国际标准化组织（ISO）为标准化图查询语言所制定的一个标准，</font><font 
style="color:rgb(51, 51, 51);">用于在图上执行查询的语言。Geaflow 
是蚂蚁图计算团队开源的流图计算引擎，专注于处理动态变化的图数据，支持大规模、高并发的实时图计算场景。</font>本文将介绍在 Geaflow 
引擎中，对增量图使用 GQL 进行增量 Match 的方法，目的尽可能地只对增量的数据进行查询，避免冗余的全量计算。
+<font style="color:rgb(51, 51, 51);">GQL (Graph Query Language)</font> <font 
style="color:rgb(0, 0, 0);">is an international standard developed by ISO for 
graph query languages,</font> <font style="color:rgb(51, 51, 51);">used to 
execute queries on graphs. Geaflow is an open-source streaming graph engine by 
Ant Group’s graph computing team, specializing in dynamically changing graph 
data and supporting large-scale, high-concurrency real-time graph computing 
scenarios.</font> This article introduces Geaflow’s approach to incremental 
GQL-based Match queries on dynamic graphs, aiming to execute queries solely on 
delta data while avoiding redundant full computations.
 
 
![画板](https://intranetproxy.alipay.com/skylark/lark/0/2025/jpeg/23857192/1741574572676-ff7e2c56-14d0-470c-b21d-604f928c6ec9.jpeg)
 
-## 当前问题
-
-<font style="color:rgb(0, 0, 0);">Geaflow 引擎基于点中心框架（vertex 
center），通过迭代的方式，每一轮迭代中，每个点向其他点发送消息，并在下一轮收到消息时进行处理、分析。</font>在 Geaflow 的框架中，GQL 
的查询需要从前往后进行 Traversal 遍历走图，即从起始节点开始出发，进行扩散，依次进行点边匹配，直到匹配到所需要的查询 
pattern。在动态图里场景，如果只使用当前批次新增的点边触发计算，增量的结果会有缺失，例如下面例子所示。
-
-<div style="text-align: center;">
-<img 
src="https://intranetproxy.alipay.com/skylark/lark/0/2025/jpeg/23857192/1741576149930-b169b7da-0600-4fca-b6ad-5eadcfdbff5b.jpeg";
 alt='画板'  height="281" width="486"></div>
-
-如上问题关键在于如果只考虑增量的部分，则点 A1 无法触发计算，但是点 A1 实际包含于增量结果中。所以需要设法让点 A1 
参与计算，我们考虑一种从新增点扩充子图的方法，将 a 触发。将整个查询分为 2 个阶段，Evolve 扩展阶段和 Traversal 阶段。在 Evolve 
阶段中，从起始点开始，向邻居发送 EvolveMessage，后续的 iteration 中，收到 EvolveMessage 的点加入到 
EvolveVertices 集合中。而后的 Traversal 阶段则会使用 EvolveVertices 里的点触发遍历，即表示当前窗口的触发点。
+## Current Challenges
+<font style="color:rgb(0, 0, 0);">The Geaflow engine adopts a vertex-centric 
framework, where each vertex sends messages iteratively. Vertices process 
received messages in subsequent iterations.</font> For GQL queries, traversal 
starts from initial vertices for pattern matching (e.g., from node `A` to `B` 
to `C`). In dynamic graphs, if only newly added vertices/edges trigger 
computation, results may be incomplete, as illustrated below:
 
-## 方案步骤
+<div style="text-align: center;">  
+<img 
src="https://intranetproxy.alipay.com/skylark/lark/0/2025/jpeg/23857192/1741576149930-b169b7da-0600-4fca-b6ad-5eadcfdbff5b.jpeg";
 alt='画板'  height="281" width="486">  
+</div>  
 
-整体流程示例图如下：
+The key issue is that **Vertex A1 cannot trigger computation if only the delta 
is considered**, yet it belongs to the incremental results. To resolve this, we 
propose a subgraph expansion method from new vertices. The query is divided 
into two phases:
+1. **Evolve Phase**: Propagate `EvolveMessage` from new vertices to neighbors, 
adding recipients to the `EvolveVertices` set.
+2. **Traversal Phase**: Use `EvolveVertices` as traversal triggers for the 
current window.
 
+## Solution Workflow
+Overall process:  
 
![](https://intranetproxy.alipay.com/skylark/lark/0/2025/png/23857192/1741599519420-37fd1d9f-6623-44b3-87e4-5ac5275b876f.png)
 
-1. 首先得到 query 的计划的迭代次数 N，需向外扩充 N-1 度(<font 
style="color:#000000;">maxEvolveIteration=N-1)</font>，即可覆盖当前 query。框架的最大迭代数将设置为 
N + maxEvolveIteration（N>2）
+**Steps:**
+1. Determine the query’s iteration count `N`. Expand `N-1` hops outward 
(`maxEvolveIteration = N-1`) to cover the query. The max iteration becomes `N + 
maxEvolveIteration` (when `N>2`).
 
 ```sql
 例如

Review Comment:
   fixed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Fix website content [geaflow-website]

Reply via email to