nic-6443 commented on code in PR #13609:
URL: https://github.com/apache/apisix/pull/13609#discussion_r3479089195

##########
t/plugin/ai-proxy-kafka-log.t:
##########
@@ -137,6 +137,10 @@ X-AI-Fixture: openai/chat-basic.json
 send data to kafka:
 llm_request
 llm_summary
+tool_count
+cache_read_input_tokens
+cache_creation_input_tokens
+reasoning_tokens
 You are a mathematician
 gpt-35-turbo-instruct
 llm_response_text

Review Comment:
   Good point. Added TEST 9 in 665456727; it calls `set_logging` directly and asserts every field, including the conditionally-set `has_tool_calls` / `end_user_id` / `content_risk_level`. Those can't appear in TEST 2's basic request (they're only set on tool calls / a `user` field / content moderation), so a focused test is the reliable way to guard all keys against a future refactor.

##########
docs/en/latest/plugins/ai-proxy.md:
##########
@@ -2082,6 +2082,17 @@ The following example demonstrates how you can log LLM request related informati
 * `llm_model`: LLM model.
 * `llm_prompt_tokens`: Number of tokens in the prompt.
 * `llm_completion_tokens`: Number of chat completion tokens in the prompt.
+* `llm_total_tokens`: Total number of tokens used (prompt plus completion).
+* `llm_cache_read_input_tokens`: Number of input tokens read from cache.

Review Comment:
   Fixed in 665456727: corrected to "in the response" / "响应中".

##########
docs/en/latest/plugins/ai-proxy-multi.md:
##########
@@ -2606,6 +2606,17 @@ The following example demonstrates how you can log LLM request related informati
 * `llm_model`: LLM model.
 * `llm_prompt_tokens`: Number of tokens in the prompt.
 * `llm_completion_tokens`: Number of chat completion tokens in the prompt.
+* `llm_total_tokens`: Total number of tokens used (prompt plus completion).
+* `llm_cache_read_input_tokens`: Number of input tokens read from cache.

Review Comment:
   Fixed in 665456727: corrected to "in the response" / "响应中".

##########
docs/zh/latest/plugins/ai-proxy.md:
##########
@@ -2082,6 +2082,17 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 * `llm_model`:LLM 模型。
 * `llm_prompt_tokens`:提示中的令牌数量。
 * `llm_completion_tokens`:提示中的聊天完成令牌数量。
+* `llm_total_tokens`:使用的总令牌数(提示加完成)。
+* `llm_cache_read_input_tokens`:从缓存读取的输入令牌数量。

Review Comment:
   Fixed in 665456727: corrected to "in the response" / "响应中".

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
