Re: [PR] fix(llm): improve rag demo error handling [hugegraph-ai]

via GitHub Tue, 19 May 2026 21:57:26 -0700


imbajin commented on code in PR #335:
URL: https://github.com/apache/hugegraph-ai/pull/335#discussion_r3271304031



##########
hugegraph-llm/src/hugegraph_llm/nodes/base_node.py:
##########
@@ -75,7 +75,13 @@ def run(self):
             node_info = f"Node type: {type(self).__name__}, Node object: 
{self}"
             err_msg = f"Node failed: 
{exc}\n{node_info}\n{traceback.format_exc()}"
             return CStatus(-1, err_msg)
-        # For unexpected exceptions, re-raise to let them propagate or be 
caught elsewhere
+        except Exception as exc:  # pylint: disable=broad-exception-caught
+            import traceback

Review Comment:
   ⚠️ `import traceback` 在第 73 行的 `except (ValueError, ...)` 
块里已经内嵌过一次，这里又内嵌一次。建议提到模块顶部，并把这两段几乎相同的 `node_info` / `err_msg` 拼接抽成一个小工具函数（例如 
`_format_node_err(self, exc)`），避免后续 drift。另外两处的 logging 行为不一致——上面那个 `except` 没有 
`log.error`，下面这个有，建议统一。



##########
hugegraph-llm/src/hugegraph_llm/nodes/base_node.py:
##########
@@ -75,7 +75,13 @@ def run(self):
             node_info = f"Node type: {type(self).__name__}, Node object: 
{self}"
             err_msg = f"Node failed: 
{exc}\n{node_info}\n{traceback.format_exc()}"
             return CStatus(-1, err_msg)
-        # For unexpected exceptions, re-raise to let them propagate or be 
caught elsewhere
+        except Exception as exc:  # pylint: disable=broad-exception-caught

Review Comment:
   ‼️ 这里新增的 catch-all `except Exception` 会把本 PR 在 `schema_build.py` 改成 `raise 
ValueError` 的契约重新降级成 `CStatus(-1, ...)`——两层改动相互抵消，对调用方而言行为退回到改动前。请二选一：要么去掉这里的 
catch-all（保留原注释 'let them propagate'），让上层真的能感知 schema 构建失败；要么保留这里、撤回 
`schema_build.py` 的 raise，统一一层错误处理。



##########
hugegraph-llm/src/hugegraph_llm/models/llms/litellm.py:
##########
@@ -75,7 +75,7 @@ def generate(
             return response.choices[0].message.content
         except (RateLimitError, BudgetExceededError, APIError) as e:
             log.error("Error in LiteLLM call: %s", e)
-            return f"Error: {str(e)}"
+            raise

Review Comment:
   ⚠️ 改成 `raise` 后有两点要跟进：
   1. 上面的 `@retry(retry=retry_if_exception_type((RateLimitError, 
BudgetExceededError, APIError)))` 现在会真的重试 
`BudgetExceededError`——但预算超限重试一次毫无意义且浪费配额。建议从 retry 列表里移除 
`BudgetExceededError`，或单独 `except BudgetExceededError` 后立即 raise 不走 tenacity。
   2. 同文件 `generate_streaming`（约 173 行）依然 `yield f"Error: 
{str(e)}"`，错误契约只改了一半，调用方对同一个类需要写两套错误处理。请同步修改流式路径，或在 PR 描述里说明差异原因。
   3. 这两个公共方法的契约从「返回字符串」改成「抛异常」，需要 grep 现有调用方是否做过 `result.startswith("Error:")` 
之类的判断，并补一条 `RateLimitError` / `APIError` 真正向上传播的测试。



##########
hugegraph-python-client/src/pyhugegraph/utils/util.py:
##########
@@ -101,9 +101,18 @@ def __call__(self, response: requests.Response, method: 
str, path: str):
                 log.info("Resource %s not found (404)", path)
             else:
                 try:
-                    details = response.json().get("exception", "key 
'exception' not found")
+                    body = response.json()
+                    status = body.get("status")
+                    status_message = status.get("message") if 
isinstance(status, dict) else None
+                    details = (
+                        body.get("exception")
+                        or status_message
+                        or body.get("message")
+                        or response.text
+                        or "unknown error"
+                    )
                 except (ValueError, KeyError):
-                    details = "key 'exception' not found"
+                    details = response.text or "unknown error"

Review Comment:
   ⚠️ `except (ValueError, KeyError)` 没接住 `AttributeError` / `TypeError`：当 
`body` 不是 dict（例如 HugeGraph 偶发返回 list 或纯字符串错误体）时，上面的 `body.get("exception")` / 
`body.get("status")` 会抛 `AttributeError`，绕过这里的 fallback 被外层 broad except 
吞掉。建议扩到 `(ValueError, KeyError, AttributeError, TypeError)`。
   
   另外建议把 top-level `body.get("message")` 提到 `status_message` 前面——HugeGraph 
Server 错误体里 top-level `message` 通常比嵌套 `status.message` 更具可读性。



##########
hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/fetch_graph_data.py:
##########
@@ -44,8 +44,16 @@ def res = [:];
         return res;
         """
 
-        result = self.graph.gremlin().exec(groovy_code)["data"]
-
+        response = self.graph.gremlin().exec(groovy_code)
+        result = response.get("data") if isinstance(response, dict) else None
         if isinstance(result, list) and len(result) > 0:
-            graph_summary.update({key: result[i].get(key) for i, key in 
enumerate(keys)})
+            if len(result) == 1 and isinstance(result[0], dict) and any(key in 
result[0] for key in keys):
+                graph_summary.update({key: result[0].get(key) for key in keys})

Review Comment:
   ⚠️ 边界：当 Gremlin 返回 `{"data": [{}]}`（例如空图但脚本仍给出空 dict）时，`any(key in result[0] 
for key in keys)` 为 False，会落到下面的多行分支用 `result[0]` / `result[1]`...，所有字段被设为 
`None` 而不是预期的 `0` / `[]`。建议简化为「单元素 dict 永远走新分支」，去掉 `any(...)` 二次门控；并补两个回归测试：(a) 
`[{}]` 空图；(b) 5 行旧形状（每行一个 key）。
   
   ```suggestion
               if len(result) == 1 and isinstance(result[0], dict):
                   graph_summary.update({key: result[0].get(key) for key in 
keys})
   ```



##########
hugegraph-llm/src/hugegraph_llm/models/embeddings/ollama.py:
##########
@@ -81,14 +81,13 @@ def get_texts_embeddings(self, texts: List[str], 
batch_size: int = 32) -> List[L
 
     async def async_get_text_embedding(self, text: str) -> List[float]:
         """Get embedding for a single text asynchronously."""
-        response = await self.async_client.embeddings(model=self.model, 
prompt=text)
-        return list(response["embedding"])
+        response = await self.async_client.embed(model=self.model, 
input=[text])

Review Comment:
   ⚠️ async 路径切到 `embed(input=...)` 这次没有任何回归测试。建议补：(1) batch 切片在 `batch_size` 
边界正确（多 batch 串接顺序与输入一致）；(2) 响应缺少 `embeddings` key 时给出明确错误（当前 raw `KeyError` 
信息不友好）；(3) `embeddings` 为空列表 / `IndexError` 的边界。可参考同步版 `get_texts_embeddings` 
已有的 `hasattr(self.client, 'embed')` 防御一并加到 async 版。



##########
hugegraph-python-client/src/tests/api/test_response_validation.py:
##########
@@ -0,0 +1,38 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from unittest.mock import Mock
+
+import pytest
+import requests

Review Comment:
   🧹 项目（`hugegraph-llm/AGENTS.md` 明确写）测试约定是 `unittest` + `python -m unittest 
discover`，但本测试用了 pytest（`pytest.raises` + 顶层函数）。建议改成 `unittest.TestCase` + 
`assertRaisesRegex`，与同 PR 的 `test_fetch_graph_data.py` 风格保持一致；否则 unittest 
discover 跑不到这个用例。



##########
hugegraph-llm/src/hugegraph_llm/models/llms/openai.py:
##########
@@ -70,7 +70,10 @@ def generate(
                 max_tokens=self.max_tokens,
                 messages=messages,
             )
-            log.info("Token usage: %s", completions.usage.model_dump_json())
+            if not hasattr(completions, "choices"):

Review Comment:
   ⚠️ `not hasattr(completions, "choices")` 在使用官方 `openai` SDK 时实际上是死分支：SDK 
要么返回带 `choices` 的 `ChatCompletion`，要么直接 raise 
`APIError`/`APIConnectionError`，不会返回一个没有 `choices` 属性的对象。真正会发生的失败模式是 `choices 
== []`（例如 content filter 触发）。建议改成判空：
   
   ```suggestion
               if not completions.choices:
                   raise RuntimeError(f"Empty choices in LLM response: 
{str(completions)[:200]}")
               if completions.usage:
                   log.info("Token usage: %s", 
completions.usage.model_dump_json())
   ```
   并补一条 mock `choices=[]` 的单测；`agenerate` 同步处理。



##########
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/app.py:
##########
@@ -202,5 +202,5 @@ def create_app():
         host=args.host,
         port=args.port,
         factory=True,
-        reload=True,
+        reload=False,

Review Comment:
   🧹 `reload=True → False` 是 demo 开发体验的回退，但 commit / 描述没说明动机。如果是为了规避某个 reload 
引发的具体 bug（例如 multi-process / lifespan），请在代码注释或 commit message 
里写清楚；否则建议恢复，或改成读环境变量按需开关，例如 `reload=os.getenv("HG_DEV_RELOAD") == "1"`。



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] fix(llm): improve rag demo error handling [hugegraph-ai]

Reply via email to