Appointat commented on code in PR #716:
URL: https://github.com/apache/geaflow/pull/716#discussion_r2652565722


##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/ChatRobot.java:
##########
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.common.model;
+
+import com.google.gson.Gson;
+import java.util.List;
+
+public class ChatRobot {
+
+    private ModelInfo modelInfo;
+
+    public ChatRobot() {
+        this.modelInfo = new ModelInfo();
+    }
+
+    public ChatRobot(String model) {
+        this.modelInfo = new ModelInfo();
+        this.modelInfo.setModel(model);
+    }
+
+    public String singleSentence(String sentence) {
+        OfflineModelDirect model = new OfflineModelDirect();
+        ModelContext context = ModelContext.emptyContext();
+        context.setModelInfo(modelInfo);
+        context.userSay(sentence);
+        return model.chat(context);
+    }
+

Review Comment:
   The method name singleSentence() is inaccurate: the method sends a single
   message and returns the reply.
   Suggestion: rename it to chat() or sendMessage().



##########
geaflow-ai/src/test/resources/graph_ldbc_sf/Comment/part-00000-0ea3b18b-e932-4946-9be9-d17b750eafea-c0001.csv:
##########
@@ -0,0 +1,58 @@
+creationDate|id|locationIP|browserUsed|content|length

Review Comment:
   Can you store this CSV data somewhere other than `src`, for example in
   cloud or remote storage? Otherwise, it makes the Java package larger.
   (However, given that the current PR is still very preliminary, I think this
   comment can be ignored.)



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/OfflineModelDirect.java:
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.common.model;
+
+import com.google.gson.Gson;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.geaflow.common.utils.RetryCommand;
+
+public class OfflineModelDirect {
+
+
+    public String chat(ModelContext context) {
+        ModelInfo info = context.getModelInfo();
+        OkHttpDirectConnector connector = new OkHttpDirectConnector(
+                info.getUrl(), info.getApi(), info.getUserToken());
+        String request = new Gson().toJson(context);
+        org.apache.geaflow.ai.common.model.Response response = connector.post(request);
+        if (response.choices != null && !response.choices.isEmpty()) {
+            for (Response.Choice choice : response.choices) {
+                if (choice.message != null) {
+                    return choice.message.content;
+                }
+            }
+
+        }
+        return null;
+    }
+
+    public List<ChatRobot.EmbeddingResult> embedding(ModelEmbedding context) {
+        ModelInfo info = context.getModelInfo();
+        OkHttpDirectConnector connector = new OkHttpDirectConnector(
+                info.getUrl(), info.getApi(), info.getUserToken());

Review Comment:
   Creating a new OkHttpDirectConnector (and therefore a new OkHttpClient) on
   every call means a new connection is established each time, and the
   client's connection pool and threads are never reused, which can leak
   resources.
   Since we are not yet at the stage of optimizing this, I recommend adding a
   TODO: reuse the connector or the OkHttpClient as a singleton or as a member
   variable.
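   A minimal sketch of the singleton option. It uses the JDK's
   `java.net.http.HttpClient` as a stand-in so that the example is
   self-contained; `SharedHttpClient` is a hypothetical name, and in this PR
   the shared field would hold the OkHttpClient (or the connector) instead.

   ```java
   import java.net.http.HttpClient;
   import java.time.Duration;

   // Hypothetical holder class, not part of the PR.
   // The JDK HttpClient stands in for OkHttpClient here.
   final class SharedHttpClient {

       // Built once when the class is loaded, then shared by all callers.
       private static final HttpClient CLIENT = HttpClient.newBuilder()
               .connectTimeout(Duration.ofSeconds(300))
               .build();

       private SharedHttpClient() {
       }

       static HttpClient get() {
           return CLIENT;
       }
   }
   ```

   Every caller of `SharedHttpClient.get()` receives the same instance, so
   connections can be pooled across calls.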



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/ChatRobot.java:
##########
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.common.model;
+
+import com.google.gson.Gson;
+import java.util.List;
+
+public class ChatRobot {
+
+    private ModelInfo modelInfo;
+
+    public ChatRobot() {
+        this.modelInfo = new ModelInfo();
+    }
+
+    public ChatRobot(String model) {
+        this.modelInfo = new ModelInfo();
+        this.modelInfo.setModel(model);
+    }
+
+    public String singleSentence(String sentence) {
+        OfflineModelDirect model = new OfflineModelDirect();
+        ModelContext context = ModelContext.emptyContext();
+        context.setModelInfo(modelInfo);
+        context.userSay(sentence);
+        return model.chat(context);
+    }
+
+    public String embedding(String... inputs) {
+        OfflineModelDirect model = new OfflineModelDirect();
+        ModelEmbedding context = ModelEmbedding.embedding(modelInfo, inputs);
+        List<EmbeddingResult> embeddingResults = model.embedding(context);
+        Gson gson = new Gson();
+        StringBuilder builder = new StringBuilder();
+        for (EmbeddingResult result : embeddingResults) {
+            builder.append("\n");
+            String json = gson.toJson(result);
+            builder.append(json);
+        }
+        return builder.toString();
+    }
+
+    public EmbeddingResult embeddingSingle(String input) {
+        OfflineModelDirect model = new OfflineModelDirect();
+        ModelEmbedding context = ModelEmbedding.embedding(modelInfo, input);
+        List<EmbeddingResult> embeddingResults = model.embedding(context);
+        return embeddingResults.get(0);
+    }
+
+    public ModelInfo getModelInfo() {
+        return modelInfo;
+    }
+
+    public void setModelInfo(ModelInfo modelInfo) {
+        this.modelInfo = modelInfo;
+    }
+
+    public static class EmbeddingResult {
+        public String input;
+        public double[] embedding;
+
+        public EmbeddingResult(String input, double[] embedding) {
+            this.input = input;
+            this.embedding = embedding;
+        }

Review Comment:
   What about merging it into `EmbeddingResponse`?



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/index/vector/TraversalVector.java:
##########
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.index.vector;
+
+public class TraversalVector implements IVector {
+
+    private final String[] vec;
+
+    public TraversalVector(String... vec) {
+        if (vec.length % 3 != 0) {
+            throw new RuntimeException("Traversal vector shold be src-edge-dst pairs");

Review Comment:
   shold -> should



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/OfflineModelDirect.java:
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.common.model;
+
+import com.google.gson.Gson;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.geaflow.common.utils.RetryCommand;
+
+public class OfflineModelDirect {
+
+
+    public String chat(ModelContext context) {
+        ModelInfo info = context.getModelInfo();
+        OkHttpDirectConnector connector = new OkHttpDirectConnector(
+                info.getUrl(), info.getApi(), info.getUserToken());
+        String request = new Gson().toJson(context);
+        org.apache.geaflow.ai.common.model.Response response = connector.post(request);
+        if (response.choices != null && !response.choices.isEmpty()) {
+            for (Response.Choice choice : response.choices) {
+                if (choice.message != null) {
+                    return choice.message.content;
+                }
+            }
+
+        }
+        return null;
+    }
+
+    public List<ChatRobot.EmbeddingResult> embedding(ModelEmbedding context) {
+        ModelInfo info = context.getModelInfo();
+        OkHttpDirectConnector connector = new OkHttpDirectConnector(
+                info.getUrl(), info.getApi(), info.getUserToken());
+        ModelEmbedding requestContext = new ModelEmbedding(null, context.input);
+        requestContext.setModel(context.getModel());
+        String request = new Gson().toJson(requestContext);
+        final EmbeddingResponse response = RetryCommand.run(() -> {
+            return connector.embeddingPost(request);
+        }, 10, 3000);
+        if (response == null) {
+            return new ArrayList<>();
+        }
+        List<ChatRobot.EmbeddingResult> embeddingResults = new ArrayList<>();
+        for (EmbeddingResponse.EmbeddingVector v : response.data) {
+            int index = v.index;
+            String input = context.input[index];
+            double[] vector = v.embedding;
+            ChatRobot.EmbeddingResult result = new ChatRobot.EmbeddingResult(input, vector);
+            embeddingResults.add(result);
+        }

Review Comment:
   response.data may be null, and context.input[index] is accessed without a
   bounds check. If the index is out of bounds, an exception will be thrown.
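   A sketch of the defensive checks, assuming that silently skipping bad
   entries is acceptable (throwing a descriptive exception would be an equally
   valid choice). `EmbeddingVectorStub`, `EmbeddingResultStub`, and
   `EmbeddingMapper` are hypothetical, simplified stand-ins for the PR's
   classes.

   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Hypothetical stand-in for EmbeddingResponse.EmbeddingVector.
   final class EmbeddingVectorStub {
       final int index;
       final double[] embedding;

       EmbeddingVectorStub(int index, double[] embedding) {
           this.index = index;
           this.embedding = embedding;
       }
   }

   // Hypothetical stand-in for ChatRobot.EmbeddingResult.
   final class EmbeddingResultStub {
       final String input;
       final double[] embedding;

       EmbeddingResultStub(String input, double[] embedding) {
           this.input = input;
           this.embedding = embedding;
       }
   }

   final class EmbeddingMapper {

       // Pairs each returned vector with its input, skipping invalid entries.
       static List<EmbeddingResultStub> toResults(String[] inputs,
                                                  List<EmbeddingVectorStub> data) {
           List<EmbeddingResultStub> results = new ArrayList<>();
           if (inputs == null || data == null) {
               return results; // nothing to map
           }
           for (EmbeddingVectorStub v : data) {
               // Guard against null entries and out-of-range indices.
               if (v == null || v.index < 0 || v.index >= inputs.length) {
                   continue;
               }
               results.add(new EmbeddingResultStub(inputs[v.index], v.embedding));
           }
           return results;
       }
   }
   ```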



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/ModelInfo.java:
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.common.model;
+
+public class ModelInfo {

Review Comment:
   Renaming this to ModelConfig may be better; that is the more common naming.



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/Response.java:
##########
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.common.model;
+
+import com.google.gson.annotations.SerializedName;
+import java.util.List;
+
+
+public class Response {

Review Comment:
   If you need a response type compatible with the OpenAI/Gemini API, please
   import these variables. One point in this code confuses me: why is `choice`
   needed? If `Response` is meant to be a generic class (not only for LLM
   responses), then I think `usage` and `choice` may not be necessary; this
   information could be stored as meta attributes of the message.



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/index/vector/KeywordVector.java:
##########
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.index.vector;
+
+import java.util.Arrays;
+
+public class KeywordVector implements IVector {
+
+    private final String[] vec;
+
+    public KeywordVector(String... vec) {
+        this.vec = vec;
+    }
+
+    public String[] getVec() {
+        return vec;
+    }
+
+    @Override
+    public double match(IVector other) {
+        if (other.getType() != this.getType()) {

Review Comment:
   I would like the type check to be consistent across the vector classes (in
   general, getType() is more efficient, but the code becomes a bit
   dirtier/more redundant). Reference code:
   
   ```
   // EmbeddingVector - use instanceof
   if (!(other instanceof EmbeddingVector)) {
       return 0.0;
   }
   
   // KeywordVector - use getType()
   if (other.getType() != this.getType()) {
       return 0.0;
   }
   ```



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/graph/LocalFileGraphAccessor.java:
##########
@@ -0,0 +1,152 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.graph;
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import java.util.function.Function;
+import org.apache.geaflow.ai.graph.io.*;
+
+public class LocalFileGraphAccessor implements GraphAccessor {
+
+    private final String resourcePath;
+    private final ClassLoader resourceClassLoader;
+    private final Graph graph;
+
+    public LocalFileGraphAccessor(ClassLoader classLoader, String resourcePath, Long limit,
+                                  Function<Vertex, Vertex> vertexMapper,
+                                  Function<Edge, Edge> edgeMapper) {
+        this.resourcePath = resourcePath;
+        this.resourceClassLoader = classLoader;
+        try {
+            this.graph = GraphFileReader.getGraph(resourceClassLoader, resourcePath, limit,
+                    vertexMapper, edgeMapper);
+        } catch (Throwable e) {
+            throw new RuntimeException("Init local graph error", e);
+        }
+    }
+
+    @Override
+    public GraphSchema getGraphSchema() {
+        return graph.getGraphSchema();
+    }
+
+    @Override
+    public GraphVertex getVertex(String label, String id) {
+        return new GraphVertex(graph.getVertex(label, id));
+    }
+
+    @Override
+    public GraphEdge getEdge(String label, String src, String dst) {
+        return new GraphEdge(graph.getEdge(label, src, dst));
+    }
+
+    @Override
+    public Iterator<GraphVertex> scanVertex() {
+        return new GraphVertexIterator(graph.scanVertex());
+    }
+
+    @Override
+    public Iterator<GraphEdge> scanEdge(GraphVertex vertex) {
+        return new GraphEdgeIterator(graph.scanEdge(vertex));
+    }
+
+    @Override
+    public List<GraphEntity> expand(GraphEntity entity) {
+        List<GraphEntity> results = new ArrayList<>();
+        if (entity instanceof GraphVertex) {
+            Iterator<Edge> iterator = graph.scanEdge((GraphVertex) entity);
+            while (iterator.hasNext()) {
+                results.add(new GraphEdge(iterator.next()));
+            }
+        } else if (entity instanceof GraphEdge) {
+            GraphEdge graphEdge = (GraphEdge) entity;
+            results.add(new GraphVertex(graph.getVertex(null, graphEdge.getEdge().getSrcId())));
+            results.add(new GraphVertex(graph.getVertex(null, graphEdge.getEdge().getDstId())));
+        }
+        return results;
+    }
+
+    @Override
+    public GraphAccessor copy() {
+        return this;
+    }
+
+    @Override
+    public String getType() {
+        return this.getClass().getSimpleName();
+    }
+
+
+    private static class GraphVertexIterator implements Iterator<GraphVertex> {
+
+        private final Iterator<Vertex> vertexIterator;
+
+        public GraphVertexIterator(Iterator<Vertex> vertexIterator) {
+            this.vertexIterator = vertexIterator;
+        }
+
+        @Override
+        public boolean hasNext() {
+            return vertexIterator.hasNext();
+        }
+
+        @Override
+        public GraphVertex next() {
+            if (!hasNext()) {
+                throw new NoSuchElementException();
+            }
+            Vertex nextVertex = vertexIterator.next();
+            return new GraphVertex(nextVertex);
+        }
+
+        @Override
+        public void remove() {

Review Comment:
   This is a read-only graph accessor, so why does it support remove()? (This
   comment may reflect personal development habits; I generally think of an
   accessor as read-only.)
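   One option, sketched below, is not to override remove() at all: the default
   `Iterator.remove()` already throws UnsupportedOperationException.
   `ReadOnlyMappingIterator` is a hypothetical generic wrapper, not a class
   from this PR.

   ```java
   import java.util.Iterator;
   import java.util.NoSuchElementException;
   import java.util.function.Function;

   // Hypothetical wrapper: converts each element and never supports removal.
   final class ReadOnlyMappingIterator<S, T> implements Iterator<T> {

       private final Iterator<S> source;
       private final Function<S, T> mapper;

       ReadOnlyMappingIterator(Iterator<S> source, Function<S, T> mapper) {
           this.source = source;
           this.mapper = mapper;
       }

       @Override
       public boolean hasNext() {
           return source.hasNext();
       }

       @Override
       public T next() {
           if (!source.hasNext()) {
               throw new NoSuchElementException();
           }
           return mapper.apply(source.next());
       }

       // remove() is intentionally not overridden, so the default
       // implementation throws UnsupportedOperationException.
   }
   ```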



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/ChatRobot.java:
##########
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.common.model;
+
+import com.google.gson.Gson;
+import java.util.List;
+
+public class ChatRobot {
+
+    private ModelInfo modelInfo;
+
+    public ChatRobot() {
+        this.modelInfo = new ModelInfo();
+    }
+
+    public ChatRobot(String model) {
+        this.modelInfo = new ModelInfo();
+        this.modelInfo.setModel(model);
+    }
+
+    public String singleSentence(String sentence) {
+        OfflineModelDirect model = new OfflineModelDirect();
+        ModelContext context = ModelContext.emptyContext();
+        context.setModelInfo(modelInfo);
+        context.userSay(sentence);
+        return model.chat(context);
+    }
+
+    public String embedding(String... inputs) {
+        OfflineModelDirect model = new OfflineModelDirect();
+        ModelEmbedding context = ModelEmbedding.embedding(modelInfo, inputs);
+        List<EmbeddingResult> embeddingResults = model.embedding(context);
+        Gson gson = new Gson();
+        StringBuilder builder = new StringBuilder();
+        for (EmbeddingResult result : embeddingResults) {
+            builder.append("\n");
+            String json = gson.toJson(result);
+            builder.append(json);
+        }
+        return builder.toString();
+    }

Review Comment:
   ChatRobot handles both chat and embedding, which mixes responsibilities.
   
   Suggestion: consider splitting it into ChatService and EmbeddingService.
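   A rough sketch of what the split could look like. Only the names
   ChatService and EmbeddingService come from the suggestion; the method
   signatures are assumptions for illustration.

   ```java
   // Hypothetical interface: chat-only responsibility.
   interface ChatService {
       // Sends one user message and returns the model's reply.
       String chat(String message);
   }

   // Hypothetical interface: embedding-only responsibility.
   interface EmbeddingService {
       // Returns the embedding vector for a single input.
       double[] embed(String input);
   }
   ```

   Callers that only need embeddings would then depend on EmbeddingService
   alone, and each service could be tested or replaced independently.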



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/OfflineModelDirect.java:
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.common.model;
+
+import com.google.gson.Gson;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.geaflow.common.utils.RetryCommand;
+
+public class OfflineModelDirect {
+
+
+    public String chat(ModelContext context) {
+        ModelInfo info = context.getModelInfo();
+        OkHttpDirectConnector connector = new OkHttpDirectConnector(
+                info.getUrl(), info.getApi(), info.getUserToken());
+        String request = new Gson().toJson(context);
+        org.apache.geaflow.ai.common.model.Response response = connector.post(request);

Review Comment:
   embedding() has a retry mechanism (10 attempts, 3-second interval), but
   chat() does not retry. Suggestion: apply the same retry mechanism to both.
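   To illustrate the idea, here is a minimal fixed-interval retry helper that
   both calls could share. `SimpleRetry` is a hypothetical stand-in written
   for this comment; it is not the implementation of RetryCommand, whose
   behaviour I have not checked.

   ```java
   import java.util.concurrent.Callable;

   // Hypothetical helper, not the project's RetryCommand.
   final class SimpleRetry {

       // Runs the action up to maxAttempts times, sleeping between failures.
       static <T> T run(Callable<T> action, int maxAttempts, long intervalMs) {
           if (maxAttempts < 1) {
               throw new IllegalArgumentException("maxAttempts must be >= 1");
           }
           Exception last = null;
           for (int attempt = 1; attempt <= maxAttempts; attempt++) {
               try {
                   return action.call();
               } catch (Exception e) {
                   last = e;
                   if (attempt < maxAttempts) {
                       sleep(intervalMs);
                   }
               }
           }
           throw new RuntimeException(
                   "Failed after " + maxAttempts + " attempts", last);
       }

       private static void sleep(long millis) {
           try {
               Thread.sleep(millis);
           } catch (InterruptedException ie) {
               // Restore the interrupt flag and stop retrying.
               Thread.currentThread().interrupt();
               throw new RuntimeException("Interrupted while retrying", ie);
           }
       }
   }
   ```

   With such a helper, chat() and embedding() would wrap their connector calls
   in the same way, for example `SimpleRetry.run(() -> ..., 10, 3000)`.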



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/OfflineModelDirect.java:
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.common.model;
+
+import com.google.gson.Gson;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.geaflow.common.utils.RetryCommand;
+
+public class OfflineModelDirect {

Review Comment:
   I am not sure I understand the term 'offline' here. In this context, does
   'offline' mean calling the model (such as an LLM) in an offline (batch)
   fashion, as opposed to streaming computation? Perhaps OnlineModelClient or
   RemoteModelClient would be more accurate.



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/OkHttpDirectConnector.java:
##########
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.common.model;
+
+import com.google.gson.Gson;
+import java.io.IOException;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import okhttp3.MediaType;
+import okhttp3.OkHttpClient;
+import okhttp3.Request;
+import okhttp3.RequestBody;
+
+public class OkHttpDirectConnector {
+
+    private static Gson GSON = new Gson();
+
+    private String endpoint;
+    private String useApi;
+    private String userToken;
+    private OkHttpClient client;
+
+    public OkHttpDirectConnector(String endpoint, String useApi, String userToken) {
+        this.endpoint = endpoint;
+        this.useApi = useApi;
+        this.userToken = userToken;
+        OkHttpClient.Builder builder = new OkHttpClient.Builder();
+        builder.callTimeout(300, TimeUnit.SECONDS);
+        builder.connectTimeout(300, TimeUnit.SECONDS);
+        builder.readTimeout(300, TimeUnit.SECONDS);
+        builder.writeTimeout(300, TimeUnit.SECONDS);
+        this.client = builder.build();
+    }
+
+    public org.apache.geaflow.ai.common.model.Response post(String bodyJson) {
+        RequestBody requestBody = RequestBody.create(
+                MediaType.parse("application/json; charset=utf-8"),
+                bodyJson
+        );
+
+        String url = endpoint + useApi;
+        System.out.println(url);
+        Request request = new Request.Builder()
+                .url(url)
+                .addHeader("Authorization", "Bearer " + userToken)
+                .addHeader("Content-Type", "application/json; charset=utf-8")
+                .post(requestBody)
+                .build();
+
+        try (okhttp3.Response response = client.newCall(request).execute()) {
+            if (response.isSuccessful() && response.body() != null) {
+                String responseBody = response.body().string();
+                return GSON.fromJson(responseBody, org.apache.geaflow.ai.common.model.Response.class);
+            } else {
+                System.out.println("Request failed with code: " + response.code());
+            }
+        } catch (IOException e) {
+            e.printStackTrace();
+        }

Review Comment:
   Exception handling is inconsistent between the two methods:
   
   - post() catches the exception, prints the stack trace, and returns null;
   - embeddingPost() wraps it in a RuntimeException and throws it.
   
   A unified approach is recommended.
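   One possible unified approach, sketched under the assumption that failing
   loudly is preferred over returning null: route both methods through a
   single helper that converts every failure into one unchecked exception
   type. `ModelRequestException`, `IoCall`, and `Requests` are hypothetical
   names.

   ```java
   import java.io.IOException;

   // Hypothetical unchecked exception for any failed model request.
   final class ModelRequestException extends RuntimeException {
       ModelRequestException(String message, Throwable cause) {
           super(message, cause);
       }
   }

   // Hypothetical functional interface for a call that may throw IOException.
   @FunctionalInterface
   interface IoCall<T> {
       T call() throws IOException;
   }

   final class Requests {

       // Executes the call; a null result and an IOException are both
       // reported through ModelRequestException.
       static <T> T execute(String what, IoCall<T> call) {
           try {
               T result = call.call();
               if (result == null) {
                   throw new ModelRequestException(what + " returned no result", null);
               }
               return result;
           } catch (IOException e) {
               throw new ModelRequestException(what + " failed", e);
           }
       }
   }
   ```

   post() and embeddingPost() would then differ only in how they parse the
   response body, not in how they report errors.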



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/index/vector/TraversalVector.java:
##########
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.index.vector;
+
+public class TraversalVector implements IVector {
+
+    private final String[] vec;
+
+    public TraversalVector(String... vec) {
+        if (vec.length % 3 != 0) {
+            throw new RuntimeException("Traversal vector shold be src-edge-dst pairs");
+        }
+        this.vec = vec;
+    }
+
+    @Override
+    public double match(IVector other) {
+        return 0;
+    }
+
+    @Override
+    public VectorType getType() {
+        return VectorType.TraversalVector;
+    }
+
+    @Override
+    public String toString() {
+
+        StringBuilder builder = new StringBuilder();

Review Comment:
   ```
   @Override
   public String toString() {
       StringBuilder sb = new StringBuilder("TraversalVector{vec=");
       
       for (int i = 0; i < vec.length; i++) {
           if (i > 0) {
               sb.append(i % 3 == 0 ? "; " : "-");
           }
           sb.append(vec[i]);
           if (i % 3 == 2) {
               sb.append(">");
           }
       }
       
       return sb.append('}').toString();
   }
   ```



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/OkHttpDirectConnector.java:
##########
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.common.model;
+
+import com.google.gson.Gson;
+import java.io.IOException;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import okhttp3.MediaType;
+import okhttp3.OkHttpClient;
+import okhttp3.Request;
+import okhttp3.RequestBody;
+
+public class OkHttpDirectConnector {
+
+    private static Gson GSON = new Gson();
+
+    private String endpoint;
+    private String useApi;
+    private String userToken;
+    private OkHttpClient client;
+
+    public OkHttpDirectConnector(String endpoint, String useApi, String userToken) {
+        this.endpoint = endpoint;
+        this.useApi = useApi;
+        this.userToken = userToken;
+        OkHttpClient.Builder builder = new OkHttpClient.Builder();
+        builder.callTimeout(300, TimeUnit.SECONDS);
+        builder.connectTimeout(300, TimeUnit.SECONDS);
+        builder.readTimeout(300, TimeUnit.SECONDS);
+        builder.writeTimeout(300, TimeUnit.SECONDS);
+        this.client = builder.build();
+    }
+
+    public org.apache.geaflow.ai.common.model.Response post(String bodyJson) {
+        RequestBody requestBody = RequestBody.create(
+                MediaType.parse("application/json; charset=utf-8"),
+                bodyJson
+        );
+
+        String url = endpoint + useApi;
+        System.out.println(url);

Review Comment:
   Please use a logger instead of `System.out.println`. The other similar calls
   should be fixed as well.
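
   A minimal sketch of what that could look like. It uses `java.util.logging`
   only to stay dependency-free; which logging facade the project prefers is
   an assumption I cannot confirm from this diff. `ConnectorLoggingSketch` and
   `buildUrl` are hypothetical names.

   ```java
   import java.util.logging.Level;
   import java.util.logging.Logger;

   // Hypothetical stand-in for the URL handling inside the connector's post().
   final class ConnectorLoggingSketch {

       private static final Logger LOGGER =
               Logger.getLogger(ConnectorLoggingSketch.class.getName());

       // Builds the request URL and logs it at a debug-like level
       // instead of printing it to stdout.
       static String buildUrl(String endpoint, String useApi) {
           String url = endpoint + useApi;
           LOGGER.log(Level.FINE, "POST {0}", url);
           return url;
       }
   }
   ```

   A logger lets the deployment decide whether the URL is shown at all, which
   also helps if the endpoint ever carries sensitive parameters.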



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/index/vector/KeywordVector.java:
##########
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.index.vector;
+
+import java.util.Arrays;
+
+public class KeywordVector implements IVector {
+
+    private final String[] vec;
+
+    public KeywordVector(String... vec) {
+        this.vec = vec;
+    }
+
+    public String[] getVec() {
+        return vec;
+    }
+
+    @Override
+    public double match(IVector other) {
+        if (other.getType() != this.getType()) {
+            return 0.0;
+        }
+        KeywordVector otherKeyword = (KeywordVector) other;
+        int count = 0;
+        for (String keyword1 : this.vec) {
+            for (String keyword2 : otherKeyword.vec) {
+                if (keyword1.equals(keyword2)) {
+                    count++;
+                    break;
+                }
+            }
+        }

Review Comment:
   Add a TODO: use a HashSet to optimize this to O(n).
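
   For reference, a sketch of the HashSet variant. It keeps the semantics of
   the nested loop (each keyword on this side counts once if it occurs
   anywhere on the other side). `KeywordOverlap` and `countMatches` are
   hypothetical names, not part of the PR.

   ```java
   import java.util.Arrays;
   import java.util.HashSet;
   import java.util.Set;

   // Hypothetical helper showing the O(n + m) replacement for the nested loop.
   final class KeywordOverlap {

       // Counts how many entries of `mine` also occur in `theirs`.
       static int countMatches(String[] mine, String[] theirs) {
           // One pass to build the lookup set, one pass to probe it.
           Set<String> lookup = new HashSet<>(Arrays.asList(theirs));
           int count = 0;
           for (String keyword : mine) {
               if (lookup.contains(keyword)) {
                   count++;
               }
           }
           return count;
       }
   }
   ```

   For the small keyword lists used here the difference is probably
   negligible, which is why a TODO may be enough for now.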



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/graph/LocalFileGraphAccessor.java:
##########
@@ -0,0 +1,152 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.graph;
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import java.util.function.Function;
+import org.apache.geaflow.ai.graph.io.*;
+
+public class LocalFileGraphAccessor implements GraphAccessor {
+
+    private final String resourcePath;
+    private final ClassLoader resourceClassLoader;
+    private final Graph graph;
+
+    public LocalFileGraphAccessor(ClassLoader classLoader, String resourcePath, Long limit,
+                                  Function<Vertex, Vertex> vertexMapper,
+                                  Function<Edge, Edge> edgeMapper) {
+        this.resourcePath = resourcePath;
+        this.resourceClassLoader = classLoader;
+        try {
+            this.graph = GraphFileReader.getGraph(resourceClassLoader, resourcePath, limit,
+                    vertexMapper, edgeMapper);
+        } catch (Throwable e) {
+            throw new RuntimeException("Init local graph error", e);
+        }
+    }
+
+    @Override
+    public GraphSchema getGraphSchema() {
+        return graph.getGraphSchema();
+    }
+
+    @Override
+    public GraphVertex getVertex(String label, String id) {
+        return new GraphVertex(graph.getVertex(label, id));
+    }
+
+    @Override
+    public GraphEdge getEdge(String label, String src, String dst) {
+        return new GraphEdge(graph.getEdge(label, src, dst));
+    }
+
+    @Override
+    public Iterator<GraphVertex> scanVertex() {
+        return new GraphVertexIterator(graph.scanVertex());
+    }
+
+    @Override
+    public Iterator<GraphEdge> scanEdge(GraphVertex vertex) {
+        return new GraphEdgeIterator(graph.scanEdge(vertex));
+    }
+
+    @Override
+    public List<GraphEntity> expand(GraphEntity entity) {
+        List<GraphEntity> results = new ArrayList<>();
+        if (entity instanceof GraphVertex) {
+            Iterator<Edge> iterator = graph.scanEdge((GraphVertex) entity);
+            while (iterator.hasNext()) {
+                results.add(new GraphEdge(iterator.next()));
+            }
+        } else if (entity instanceof GraphEdge) {
+            GraphEdge graphEdge = (GraphEdge) entity;
+            results.add(new GraphVertex(graph.getVertex(null, graphEdge.getEdge().getSrcId())));
+            results.add(new GraphVertex(graph.getVertex(null, graphEdge.getEdge().getDstId())));
+        }
+        return results;
+    }
+
+    @Override
+    public GraphAccessor copy() {
+        return this;
+    }
+
+    @Override
+    public String getType() {
+        return this.getClass().getSimpleName();
+    }
+
+
+    private static class GraphVertexIterator implements Iterator<GraphVertex> {
+
+        private final Iterator<Vertex> vertexIterator;
+
+        public GraphVertexIterator(Iterator<Vertex> vertexIterator) {
+            this.vertexIterator = vertexIterator;
+        }
+
+        @Override
+        public boolean hasNext() {
+            return vertexIterator.hasNext();
+        }
+
+        @Override
+        public GraphVertex next() {
+            if (!hasNext()) {

Review Comment:
   If I’m not mistaken, vertexIterator.next() itself checks and throws 
NoSuchElementException. Therefore, GraphVertexIterator.next() performs a 
duplicate hasNext check. It is recommended to delegate this exception checking 
to the underlying iterator.
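
   A sketch of the delegating form, assuming the underlying iterator honors
   the `java.util.Iterator` contract and throws `NoSuchElementException` when
   exhausted (the standard collection iterators do; whether the graph's own
   iterator does should be checked). `MappingIterator` is a hypothetical name.

   ```java
   import java.util.Iterator;
   import java.util.function.Function;

   // Hypothetical generic wrapper: converts each element and leaves
   // exhaustion handling to the underlying iterator.
   final class MappingIterator<S, T> implements Iterator<T> {

       private final Iterator<S> delegate;
       private final Function<S, T> wrapper;

       MappingIterator(Iterator<S> delegate, Function<S, T> wrapper) {
           this.delegate = delegate;
           this.wrapper = wrapper;
       }

       @Override
       public boolean hasNext() {
           return delegate.hasNext();
       }

       @Override
       public T next() {
           // No hasNext() check here: delegate.next() throws when exhausted.
           return wrapper.apply(delegate.next());
       }
   }
   ```

   The same wrapper could serve both `GraphVertexIterator` and
   `GraphEdgeIterator`, for example with `GraphVertex::new` as the wrapper.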



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/operator/SessionOperator.java:
##########
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.operator;
+
+import java.util.*;
+import java.util.stream.Collectors;
+import org.apache.geaflow.ai.graph.GraphAccessor;
+import org.apache.geaflow.ai.graph.GraphEdge;
+import org.apache.geaflow.ai.graph.GraphEntity;
+import org.apache.geaflow.ai.graph.GraphVertex;
+import org.apache.geaflow.ai.index.IndexStore;
+import org.apache.geaflow.ai.index.vector.IVector;
+import org.apache.geaflow.ai.index.vector.VectorType;
+import org.apache.geaflow.ai.search.VectorSearch;
+import org.apache.geaflow.ai.subgraph.SubGraph;
+
+public class SessionOperator implements SearchOperator {
+
+    private final GraphAccessor graphAccessor;
+    private final IndexStore indexStore;
+
+    public SessionOperator(GraphAccessor accessor, IndexStore store) {
+        this.graphAccessor = Objects.requireNonNull(accessor);
+        this.indexStore = Objects.requireNonNull(store);
+    }
+
+    @Override
+    public List<SubGraph> apply(List<SubGraph> subGraphList, VectorSearch search) {
+        List<IVector> keyWordVectors = search.getVectorMap().get(VectorType.KeywordVector);
+        if (keyWordVectors == null || keyWordVectors.isEmpty()) {
+            if (subGraphList == null) {
+                return new ArrayList<>();
+            }
+            return new ArrayList<>(subGraphList);
+        }
+        List<String> contents = new ArrayList<>(keyWordVectors.size());
+        for (IVector v : keyWordVectors) {
+            contents.add(v.toString());
+        }
+        String query = String.join("  ", contents);
+        List<GraphEntity> globalResults = searchWithGlobalGraph(query);
+        if (subGraphList == null || subGraphList.isEmpty()) {
+            List<GraphVertex> startVertices = new ArrayList<>();
+            for (GraphEntity resEntity : globalResults) {
+                if (resEntity instanceof GraphVertex) {
+                    startVertices.add((GraphVertex) resEntity);
+                }
+            }
+            //Apply to subgraph
+            return startVertices.stream().map(v -> {
+                SubGraph subGraph = new SubGraph();
+                subGraph.addVertex(v);
+                return subGraph;
+            }).collect(Collectors.toList());
+        } else {
+            Map<GraphEntity, List<IVector>> extendEntityIndexMap = new HashMap<>();
+            //Traverse all extension points of the subgraph and search within the extension area
+            for (SubGraph subGraph : subGraphList) {
+                List<GraphEntity> extendEntities = getSubgraphExpand(subGraph);
+                for (GraphEntity extendEntity : extendEntities) {
+                    List<IVector> entityIndex = indexStore.getEntityIndex(extendEntity);
+                    extendEntityIndexMap.put(extendEntity, entityIndex);
+                }
+            }
+            //recall compute
+            GraphSearchStore searchStore = initSearchStore(extendEntityIndexMap);
+            List<GraphEntity> matchEntities = searchStore.search(query, graphAccessor);
+            Set<GraphEntity> matchEntitiesSet = new HashSet<>(matchEntities);
+
+            //Apply to subgraph
+            List<SubGraph> subGraphs = new ArrayList<>(subGraphList);
+            for (SubGraph subGraph : subGraphs) {
+                Set<GraphEntity> subgraphEntitySet = new HashSet<>(subGraph.getGraphEntityList());
+                List<GraphEntity> extendEntities = getSubgraphExpand(subGraph);
+                for (GraphEntity extendEntity : extendEntities) {
+                    if (matchEntitiesSet.contains(extendEntity)
+                            && !subgraphEntitySet.contains(extendEntity)) {
+                        subgraphEntitySet.add(extendEntity);
+                        subGraph.addEntity(extendEntity);
+                    }
+                }
+            }
+            return subGraphs;
+        }
+    }
+
+    private List<GraphEntity> getSubgraphExpand(SubGraph subGraph) {
+        List<GraphEntity> entityList = subGraph.getGraphEntityList();
+        List<GraphEntity> expandEntities = new ArrayList<>();
+        for (GraphEntity entity : entityList) {
+            List<GraphEntity> entityExpand = graphAccessor.expand(entity);
+            expandEntities.addAll(entityExpand);
+        }
+        return expandEntities;
+    }
+
+    private List<GraphEntity> searchWithGlobalGraph(String query) {
+        Map<GraphEntity, List<IVector>> entityIndexMap = new HashMap<>();
+        Iterator<GraphVertex> vertexIterator = graphAccessor.scanVertex();
+        while (vertexIterator.hasNext()) {
+            GraphVertex vertex = vertexIterator.next();
+            //Read all vertices indices from the index and add them to the candidate set.
+            List<IVector> vertexIndex = indexStore.getEntityIndex(vertex);
+            entityIndexMap.put(vertex, vertexIndex);
+        }
+        //recall compute
+        GraphSearchStore searchStore = initSearchStore(entityIndexMap);
+        return searchStore.search(query, graphAccessor);
+    }
+
+    private GraphSearchStore initSearchStore(Map<GraphEntity, List<IVector>> entityIndexMap) {
+        GraphSearchStore searchStore = new GraphSearchStore();
+        for (Map.Entry<GraphEntity, List<IVector>> entry : entityIndexMap.entrySet()) {

Review Comment:
   Do we rebuild the in-memory index on every search (GraphSearchStore includes
   Lucene)? Is there a cheaper approach, for example clearing and
   re-initializing the GraphSearchStore instead of rebuilding it each time?
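
   One option, as a rough sketch: if the graph and its index do not change
   between searches (an assumption that needs confirming), the store used for
   the global search could be built once and reused. `CachedStore` is a
   hypothetical generic holder, not part of the PR, and it says nothing about
   how GraphSearchStore itself should be reset.

   ```java
   import java.util.function.Supplier;

   // Hypothetical lazy holder: builds the expensive object on first use
   // and hands back the same instance until it is invalidated.
   final class CachedStore<T> {

       private final Supplier<T> builder;
       private T cached;

       CachedStore(Supplier<T> builder) {
           this.builder = builder;
       }

       // Builds at most once between invalidations.
       synchronized T get() {
           if (cached == null) {
               cached = builder.get();
           }
           return cached;
       }

       // Call when the underlying graph or index changes.
       synchronized void invalidate() {
           cached = null;
       }
   }
   ```

   `searchWithGlobalGraph` could then hold a `CachedStore<GraphSearchStore>`
   whose supplier performs the vertex scan and the `initSearchStore` call. The
   per-subgraph store in the else branch depends on the input and would still
   need a different strategy.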



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/graph/LocalFileGraphAccessor.java:
##########
@@ -0,0 +1,152 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.graph;
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import java.util.function.Function;
+import org.apache.geaflow.ai.graph.io.*;
+
+public class LocalFileGraphAccessor implements GraphAccessor {
+
+    private final String resourcePath;
+    private final ClassLoader resourceClassLoader;
+    private final Graph graph;
+
+    public LocalFileGraphAccessor(ClassLoader classLoader, String resourcePath, Long limit,
+                                  Function<Vertex, Vertex> vertexMapper,
+                                  Function<Edge, Edge> edgeMapper) {
+        this.resourcePath = resourcePath;
+        this.resourceClassLoader = classLoader;
+        try {
+            this.graph = GraphFileReader.getGraph(resourceClassLoader, resourcePath, limit,
+                    vertexMapper, edgeMapper);
+        } catch (Throwable e) {
+            throw new RuntimeException("Init local graph error", e);
+        }
+    }
+
+    @Override
+    public GraphSchema getGraphSchema() {
+        return graph.getGraphSchema();
+    }
+
+    @Override
+    public GraphVertex getVertex(String label, String id) {
+        return new GraphVertex(graph.getVertex(label, id));
+    }
+
+    @Override
+    public GraphEdge getEdge(String label, String src, String dst) {
+        return new GraphEdge(graph.getEdge(label, src, dst));
+    }

Review Comment:
   getVertex/getEdge may return wrapped objects that contain null, so a 
non-null check is required.
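
   A small sketch of one way to guard this. Whether a missing vertex or edge
   should yield null, an `Optional`, or an exception is a design decision for
   the PR; the helper below simply avoids wrapping null. `NullSafeWrap` and
   `wrapOrNull` are hypothetical names.

   ```java
   import java.util.function.Function;

   // Hypothetical helper: wrap a lookup result only when something was found.
   final class NullSafeWrap {

       // Returns null for a missing element instead of a wrapper around null.
       static <R, W> W wrapOrNull(R raw, Function<R, W> wrapper) {
           return raw == null ? null : wrapper.apply(raw);
       }
   }
   ```

   `getVertex` could then read
   `return NullSafeWrap.wrapOrNull(graph.getVertex(label, id), GraphVertex::new);`,
   with `getEdge` following the same pattern.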



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/index/EmbeddingIndexStore.java:
##########
@@ -0,0 +1,251 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.index;
+
+import com.google.gson.Gson;
+import java.io.*;
+import java.nio.charset.Charset;
+import java.util.*;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.geaflow.ai.common.model.ChatRobot;
+import org.apache.geaflow.ai.common.model.ModelInfo;
+import org.apache.geaflow.ai.common.model.ModelUtils;
+import org.apache.geaflow.ai.graph.GraphAccessor;
+import org.apache.geaflow.ai.graph.GraphEdge;
+import org.apache.geaflow.ai.graph.GraphEntity;
+import org.apache.geaflow.ai.graph.GraphVertex;
+import org.apache.geaflow.ai.index.vector.EmbeddingVector;
+import org.apache.geaflow.ai.index.vector.IVector;
+import org.apache.geaflow.ai.verbalization.VerbalizationFunction;
+
+public class EmbeddingIndexStore implements IndexStore {
+
+    private GraphAccessor graphAccessor;
+    private VerbalizationFunction verbFunc;
+    private String indexFilePath;
+    private ModelInfo modelInfo;
+    private Map<GraphEntity, List<ChatRobot.EmbeddingResult>> indexStoreMap;
+
+    public void initStore(GraphAccessor graphAccessor, VerbalizationFunction func,
+                          String indexFilePath, ModelInfo modelInfo) {
+        this.graphAccessor = graphAccessor;
+        this.verbFunc = func;
+        this.indexFilePath = indexFilePath;
+        this.modelInfo = modelInfo;
+        this.indexStoreMap = new HashMap<>();
+
+        //Read index items from indexFilePath
+        Map<String, GraphEntity> key2EntityMap = new HashMap<>();
+        for (Iterator<GraphVertex> itV = this.graphAccessor.scanVertex(); itV.hasNext(); ) {
+            GraphVertex vertex = itV.next();
+            key2EntityMap.put(ModelUtils.getGraphEntityKey(vertex), vertex);
+            for (Iterator<GraphEdge> itE = this.graphAccessor.scanEdge(vertex); itE.hasNext(); ) {
+                GraphEdge edge = itE.next();
+                key2EntityMap.put(ModelUtils.getGraphEntityKey(edge), edge);
+            }
+        }
+        System.out.println("Success to scan entities. total entities num: " + key2EntityMap.size());
+
+        try {
+            File indexFile = new File(this.indexFilePath);
+
+            if (!indexFile.exists()) {
+                File parentDir = indexFile.getParentFile();
+                if (parentDir != null && !parentDir.exists()) {
+                    parentDir.mkdirs();
+                }
+                indexFile.createNewFile();
+                System.out.println("Success to create new index store file. Path: " + this.indexFilePath);
+            }
+        } catch (Throwable e) {
+            throw new RuntimeException(e);
+        }
+
+
+        long count = 0;
+        try (BufferedReader reader = new BufferedReader(
+                new InputStreamReader(
+                        new FileInputStream(this.indexFilePath),
+                        Charset.defaultCharset()))) {
+            String line;
+            while ((line = reader.readLine()) != null) {
+                line = line.trim();
+                if (line.isEmpty()) {
+                    continue;
+                }
+                try {
+                    ChatRobot.EmbeddingResult embedding =
+                            new Gson().fromJson(line, ChatRobot.EmbeddingResult.class);
+                    String key = embedding.input;
+                    GraphEntity entity = key2EntityMap.get(key);
+                    if (entity != null) {
+                        this.indexStoreMap.computeIfAbsent(entity, k -> new ArrayList<>()).add(embedding);
+                    }
+                    count++;
+                } catch (Throwable e) {
+                    System.out.println("Cannot parse embedding item: " + line);
+                }
+            }
+        } catch (Throwable e) {
+            throw new RuntimeException(e);
+        }
+
+        System.out.println("Success to read index store file. items num: " + count);
+        System.out.println("Success to rebuild index with file. index num: " + this.indexStoreMap.size());
+
+
+        //Scan entities in the graph, make new index items
+        ChatRobot chatRobot = new ChatRobot();
+        chatRobot.setModelInfo(modelInfo);
+
+        final int BATCH_SIZE = 32;

Review Comment:
   Magic numbers should be extracted into named constants, configuration, or
   environment variables. The other magic numbers should be changed as well.
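
   As an illustration only: a named default plus an override read from a JVM
   system property. The property key `geaflow.ai.embedding.batch.size` and the
   class `EmbeddingConfig` are made up for this sketch; the project may
   already have its own configuration mechanism that should be used instead.

   ```java
   // Hypothetical configuration holder for the embedding batch size.
   final class EmbeddingConfig {

       // Made-up property key, used here only for illustration.
       static final String BATCH_SIZE_KEY = "geaflow.ai.embedding.batch.size";
       static final int DEFAULT_BATCH_SIZE = 32;

       // Reads the override if present and valid, otherwise falls back to the default.
       static int batchSize() {
           int value = Integer.getInteger(BATCH_SIZE_KEY, DEFAULT_BATCH_SIZE);
           return value > 0 ? value : DEFAULT_BATCH_SIZE;
       }
   }
   ```

   `initStore` would then use `EmbeddingConfig.batchSize()` where it currently
   declares the local `BATCH_SIZE`.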



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/graph/GraphVertex.java:
##########
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.graph;
+
+import java.util.Objects;
+import org.apache.geaflow.ai.graph.io.Vertex;
+
+public class GraphVertex implements GraphEntity {
+
+    public final Vertex vertex;

Review Comment:
   Should this field be `private`?



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/operator/EmbeddingOperator.java:
##########
@@ -0,0 +1,193 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.operator;
+
+import java.util.*;
+import java.util.stream.Collectors;
+import org.apache.geaflow.ai.graph.GraphAccessor;
+import org.apache.geaflow.ai.graph.GraphEntity;
+import org.apache.geaflow.ai.graph.GraphVertex;
+import org.apache.geaflow.ai.index.IndexStore;
+import org.apache.geaflow.ai.index.vector.EmbeddingVector;
+import org.apache.geaflow.ai.index.vector.IVector;
+import org.apache.geaflow.ai.index.vector.VectorType;
+import org.apache.geaflow.ai.search.VectorSearch;
+import org.apache.geaflow.ai.subgraph.SubGraph;
+
+public class EmbeddingOperator implements SearchOperator {
+
+    private final GraphAccessor graphAccessor;
+    private final IndexStore indexStore;
+    private double threshold;
+    private int topN;
+
+    public EmbeddingOperator(GraphAccessor accessor, IndexStore store) {
+        this.graphAccessor = Objects.requireNonNull(accessor);
+        this.indexStore = Objects.requireNonNull(store);
+        this.threshold = 0.50;

Review Comment:
   Should this be placed in the configuration or, as another option, exposed
   as a configurable parameter?
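
   A sketch of the second option: keep the current value as a named default
   and add a constructor that accepts the threshold. `ThresholdSettings` is a
   hypothetical stand-in for the relevant part of `EmbeddingOperator`, and the
   [0, 1] range check assumes the score is a normalized similarity, which I
   have not verified.

   ```java
   // Hypothetical stand-in for the threshold handling in EmbeddingOperator.
   final class ThresholdSettings {

       static final double DEFAULT_THRESHOLD = 0.50;

       private final double threshold;

       // Keeps today's behavior for callers that do not care.
       ThresholdSettings() {
           this(DEFAULT_THRESHOLD);
       }

       // Lets callers tune the cut-off; rejects values outside the assumed range.
       ThresholdSettings(double threshold) {
           if (Double.isNaN(threshold) || threshold < 0.0 || threshold > 1.0) {
               throw new IllegalArgumentException("threshold must be in [0, 1]: " + threshold);
           }
           this.threshold = threshold;
       }

       double getThreshold() {
           return threshold;
       }
   }
   ```

   The same treatment would apply to `topN`.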



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/graph/io/CsvFileReader.java:
##########
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one

Review Comment:
   I have not reviewed the io folder yet.



##########
geaflow-ai/src/main/java/org/apache/geaflow/ai/graph/LocalFileGraphAccessor.java:
##########
@@ -0,0 +1,152 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.geaflow.ai.graph;
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.NoSuchElementException;
+import java.util.function.Function;
+import org.apache.geaflow.ai.graph.io.*;
+
+public class LocalFileGraphAccessor implements GraphAccessor {
+
+    private final String resourcePath;
+    private final ClassLoader resourceClassLoader;
+    private final Graph graph;
+
+    public LocalFileGraphAccessor(ClassLoader classLoader, String resourcePath, Long limit,
+                                  Function<Vertex, Vertex> vertexMapper,
+                                  Function<Edge, Edge> edgeMapper) {
+        this.resourcePath = resourcePath;
+        this.resourceClassLoader = classLoader;
+        try {
+            this.graph = GraphFileReader.getGraph(resourceClassLoader, resourcePath, limit,
+                    vertexMapper, edgeMapper);
+        } catch (Throwable e) {
+            throw new RuntimeException("Init local graph error", e);
+        }
+    }
+
+    @Override
+    public GraphSchema getGraphSchema() {
+        return graph.getGraphSchema();
+    }
+
+    @Override
+    public GraphVertex getVertex(String label, String id) {
+        return new GraphVertex(graph.getVertex(label, id));
+    }
+
+    @Override
+    public GraphEdge getEdge(String label, String src, String dst) {
+        return new GraphEdge(graph.getEdge(label, src, dst));
+    }

Review Comment:
   The other parts of the code also have similar risks and need to be checked 
one by one.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
