nic-6443 commented on code in PR #13609:
URL: https://github.com/apache/apisix/pull/13609#discussion_r3479089195


##########
t/plugin/ai-proxy-kafka-log.t:
##########
@@ -137,6 +137,10 @@ X-AI-Fixture: openai/chat-basic.json
 send data to kafka:
 llm_request
 llm_summary
+tool_count
+cache_read_input_tokens
+cache_creation_input_tokens
+reasoning_tokens
 You are a mathematician
 gpt-35-turbo-instruct
 llm_response_text

Review Comment:
   Good point. Added TEST 9 in 665456727: it calls `set_logging` directly and
asserts every field, including the conditionally set `has_tool_calls`,
`end_user_id`, and `content_risk_level`. Those can't appear in TEST 2's basic
request, because they are only set for tool calls, a `user` field, and content
moderation respectively, so a focused test is the reliable way to guard all
keys against a future refactor.
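
A minimal sketch of why the basic request can't cover those keys, written in Python for illustration only. `build_summary` and its `ctx` keys (`tool_calls`, `user`, `risk_level`) are hypothetical stand-ins, not the plugin's actual `set_logging` or its context layout; only the logged field names come from this thread.

```python
# Hypothetical stand-in for the logging step; NOT APISIX's actual `set_logging`.
def build_summary(ctx: dict) -> dict:
    # Keys that are written for every request.
    summary = {
        "tool_count": ctx.get("tool_count", 0),
        "cache_read_input_tokens": ctx.get("cache_read_input_tokens", 0),
        "cache_creation_input_tokens": ctx.get("cache_creation_input_tokens", 0),
        "reasoning_tokens": ctx.get("reasoning_tokens", 0),
    }
    # Conditionally set keys: absent unless the request triggers them.
    if ctx.get("tool_calls"):
        summary["has_tool_calls"] = True
    if ctx.get("user") is not None:
        summary["end_user_id"] = ctx["user"]
    if ctx.get("risk_level") is not None:
        summary["content_risk_level"] = ctx["risk_level"]
    return summary


ALWAYS = {
    "tool_count",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
    "reasoning_tokens",
}
CONDITIONAL = {"has_tool_calls", "end_user_id", "content_risk_level"}

# A basic request (like TEST 2) never produces the conditional keys.
basic = build_summary({})

# A focused call (like TEST 9) supplies every trigger and can assert all keys.
full = build_summary({
    "tool_count": 1,
    "tool_calls": [{"name": "lookup"}],  # assumed shape, for illustration
    "user": "user-123",
    "risk_level": "low",
})

assert set(basic) == ALWAYS
assert set(full) == ALWAYS | CONDITIONAL
```

If a refactor dropped or renamed one of the conditional keys, only the focused call would fail, which is the gap the basic request leaves open.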



##########
docs/en/latest/plugins/ai-proxy.md:
##########
@@ -2082,6 +2082,17 @@ The following example demonstrates how you can log LLM request related informati
 * `llm_model`: LLM model.
 * `llm_prompt_tokens`: Number of tokens in the prompt.
 * `llm_completion_tokens`: Number of chat completion tokens in the prompt.
+* `llm_total_tokens`: Total number of tokens used (prompt plus completion).
+* `llm_cache_read_input_tokens`: Number of input tokens read from cache.

Review Comment:
   Fixed in 665456727 — corrected to "in the response" / "响应中".



##########
docs/en/latest/plugins/ai-proxy-multi.md:
##########
@@ -2606,6 +2606,17 @@ The following example demonstrates how you can log LLM request related informati
 * `llm_model`: LLM model.
 * `llm_prompt_tokens`: Number of tokens in the prompt.
 * `llm_completion_tokens`: Number of chat completion tokens in the prompt.
+* `llm_total_tokens`: Total number of tokens used (prompt plus completion).
+* `llm_cache_read_input_tokens`: Number of input tokens read from cache.

Review Comment:
   Fixed in 665456727 — corrected to "in the response" / "响应中".



##########
docs/zh/latest/plugins/ai-proxy.md:
##########
@@ -2082,6 +2082,17 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 * `llm_model`:LLM 模型。
 * `llm_prompt_tokens`:提示中的令牌数量。
 * `llm_completion_tokens`:提示中的聊天完成令牌数量。
+* `llm_total_tokens`:使用的总令牌数(提示加完成)。
+* `llm_cache_read_input_tokens`:从缓存读取的输入令牌数量。

Review Comment:
   Fixed in 665456727 — corrected to "in the response" / "响应中".



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.