This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 44a8a161cf6 add more information for scan bytes in audit log (#3174)
44a8a161cf6 is described below

commit 44a8a161cf6d29e1480d269d25dcc66f9df04776
Author: yiguolei <[email protected]>
AuthorDate: Tue Dec 9 16:53:17 2025 +0800

    add more information for scan bytes in audit log (#3174)
    
    ## Versions
    
    - [x] dev
    - [x] 4.x
    - [ ] 3.x
    - [ ] 2.1
    
    ## Languages
    
    - [x] Chinese
    - [x] English
    
    ## Docs Checklist
    
    - [x] Checked by AI
    - [ ] Test Cases Built
    
    Co-authored-by: yiguolei <[email protected]>
---
 docs/admin-manual/system-tables/internal_schema/audit_log.md | 12 ++++++++----
 .../admin-manual/system-tables/internal_schema/audit_log.md  | 12 ++++++++----
 .../admin-manual/system-tables/internal_schema/audit_log.md  |  4 ++++
 .../admin-manual/system-tables/internal_schema/audit_log.md  |  7 +++++++
 4 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/docs/admin-manual/system-tables/internal_schema/audit_log.md 
b/docs/admin-manual/system-tables/internal_schema/audit_log.md
index ea6097e3d7f..9f1ac0bc5f4 100644
--- a/docs/admin-manual/system-tables/internal_schema/audit_log.md
+++ b/docs/admin-manual/system-tables/internal_schema/audit_log.md
@@ -32,10 +32,10 @@ Store audit logs
 | scan_bytes        | bigint       | Amount of data scanned                    
                   |
 | scan_rows         | bigint       | Number of rows scanned                    
                   |
 | return_rows       | bigint       | Number of rows returned                   
                   |
-| shuffleSendRows             | bigint  | The number of rows transferred 
between nodes during statement execution. Supported since version 3.0. |
-| shuffleSendBytes            | bigint    | The amount of data transferred 
between nodes during statement execution. Supported since version 3.0. | 
-| scanBytesFromLocalStorage   | bigint    | The amount of data read from the 
local disk. Supported since version 3.0. |
-| scanBytesFromRemoteStorage  | bigint    | The amount of data read from the 
remote storage. Supported since version 3.0. |
+| shuffle_send_rows             | bigint  | The number of rows transferred 
between nodes during statement execution. Supported since version 3.0. |
+| shuffle_send_bytes            | bigint    | The amount of data transferred 
between nodes during statement execution. Supported since version 3.0. | 
+| scan_bytes_from_local_storage   | bigint    | The amount of data read from 
the local disk. Supported since version 3.0. |
+| scan_bytes_from_remote_storage  | bigint    | The amount of data read from 
the remote storage. Supported since version 3.0. |
 | stmt_id           | bigint       | Statement ID                              
                   |
 | stmt_type                   | string    | Statement type. Supported since 
version 3.0. |
 | is_query          | tinyint      | Whether it is a query                     
                   |
@@ -54,5 +54,9 @@ Store audit logs
 
 - `client_ip`: If a proxy service is used and the IP pass-through is not 
enabled, the proxy service IP may be recorded here instead of the real client 
IP.
 - `state`: `EOF` indicates that the query is executed successfully. `OK` 
indicates that the DDL and DML statements are executed successfully. `ERR` 
indicates that the statement execution fails.
+- `scan_bytes`: Indicates the size of data processed by the BE. It represents 
the uncompressed size of data read from disk, including data read from Doris' 
internal page cache, and reflects the amount of data that a query actually 
needs to process. Therefore, this value is not equal to 
`scan_bytes_from_local_storage` + `scan_bytes_from_remote_storage`.
+- `scan_rows`: Indicates the number of rows scanned during query execution. 
Because Doris is a columnar database, it first scans the columns that have 
predicate filters and then scans the other columns based on the filtered 
result. The number of rows scanned therefore differs between columns: more rows 
are scanned for predicate columns than for non-predicate columns, and this 
value reflects the number of rows scanned for predicate columns [...]
+- `scan_bytes_from_local_storage`: Indicates the size of data read from local 
disk, which is the size before compression. Data read from Doris' page cache is 
not counted, while data from the operating system's page cache is included in 
this statistic.
+- `scan_bytes_from_remote_storage`: Indicates the size of data read from 
remote storage, which is the size before compression.
 
 
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/system-tables/internal_schema/audit_log.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/system-tables/internal_schema/audit_log.md
index 0969f95fc5c..76633edec6b 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/system-tables/internal_schema/audit_log.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/system-tables/internal_schema/audit_log.md
@@ -32,10 +32,10 @@
 | scan_bytes                                   | bigint       | Amount of data scanned.
                   |
 | scan_rows                                    | bigint       | Number of rows scanned
                    |
 | return_rows                                  | bigint       | Number of rows returned
                   |
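-| shuffleSendRows             | bigint  | The number of rows transferred between nodes during statement execution. Supported since version 3.0. |
-| shuffleSendBytes            | bigint    | The amount of data transferred between nodes during statement execution. Supported since version 3.0. |
-| scanBytesFromLocalStorage   | bigint    | The amount of data read from the local disk. Supported since version 3.0. |
-| scanBytesFromRemoteStorage  | bigint    | The amount of data read from the remote storage. Supported since version 3.0. |
+| shuffle_send_rows             | bigint  | The number of rows transferred between nodes during statement execution. Supported since version 3.0. |
+| shuffle_send_bytes            | bigint    | The amount of data transferred between nodes during statement execution. Supported since version 3.0. |
+| scan_bytes_from_local_storage   | bigint    | The amount of data read from the local disk. Supported since version 3.0. |
+| scan_bytes_from_remote_storage  | bigint    | The amount of data read from the remote storage. Supported since version 3.0. |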
 | stmt_id                                      | bigint       | Statement ID
                      |
 | stmt_type                   | string    | Statement type. Supported since version 3.0. |
 | is_query                                     | tinyint      | Whether it is a query
                   |
@@ -53,4 +53,8 @@
 
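 - `client_ip`: If a proxy service is used and IP pass-through is not enabled, the proxy service IP may be recorded here instead of the real client IP.
 - `state`: `EOF` indicates that the query was executed successfully. `OK` indicates that the DDL or DML statement was executed successfully. `ERR` indicates that statement execution failed.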
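+- `scan_bytes`: Indicates the size of the data processed by the BE. It is the decompressed size of the data read from disk, including data read from Doris' internal page cache, and it accurately reflects the amount of data a query needs to process. Therefore, this value is not equal to `scan_bytes_from_local_storage` + `scan_bytes_from_remote_storage`.
+- `scan_rows`: Indicates the number of rows scanned during query execution. Because Doris is a columnar database, it first scans the columns that have predicate filters and then scans the other columns based on the filtered result, so the number of rows scanned differs between columns. More rows are scanned for predicate columns than for non-predicate columns, and this value reflects the number of rows scanned for predicate columns during query execution.
+- `scan_bytes_from_local_storage`: Indicates the size of the data read from the local disk; this is the data before compression. Data located in Doris' page cache is not counted, but data located in the operating system's page cache is counted.
+- `scan_bytes_from_remote_storage`: Indicates the size of the data read from remote storage; this is the data before compression.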
 
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/admin-manual/system-tables/internal_schema/audit_log.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/admin-manual/system-tables/internal_schema/audit_log.md
index 0969f95fc5c..8f747b7d8ab 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/admin-manual/system-tables/internal_schema/audit_log.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/admin-manual/system-tables/internal_schema/audit_log.md
@@ -53,4 +53,8 @@
 
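 - `client_ip`: If a proxy service is used and IP pass-through is not enabled, the proxy service IP may be recorded here instead of the real client IP.
 - `state`: `EOF` indicates that the query was executed successfully. `OK` indicates that the DDL or DML statement was executed successfully. `ERR` indicates that statement execution failed.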
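+- `scan_bytes`: Indicates the size of the data processed by the BE. It is the decompressed size of the data read from disk, including data read from Doris' internal page cache, and it accurately reflects the amount of data a query needs to process. Therefore, this value is not equal to `scan_bytes_from_local_storage` + `scan_bytes_from_remote_storage`.
+- `scan_rows`: Indicates the number of rows scanned during query execution. Because Doris is a columnar database, it first scans the columns that have predicate filters and then scans the other columns based on the filtered result, so the number of rows scanned differs between columns. More rows are scanned for predicate columns than for non-predicate columns, and this value reflects the number of rows scanned for predicate columns during query execution.
+- `scan_bytes_from_local_storage`: Indicates the size of the data read from the local disk; this is the data before compression. Data located in Doris' page cache is not counted, but data located in the operating system's page cache is counted.
+- `scan_bytes_from_remote_storage`: Indicates the size of the data read from remote storage; this is the data before compression.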
 
diff --git 
a/versioned_docs/version-4.x/admin-manual/system-tables/internal_schema/audit_log.md
 
b/versioned_docs/version-4.x/admin-manual/system-tables/internal_schema/audit_log.md
index ea6097e3d7f..c181b7070bc 100644
--- 
a/versioned_docs/version-4.x/admin-manual/system-tables/internal_schema/audit_log.md
+++ 
b/versioned_docs/version-4.x/admin-manual/system-tables/internal_schema/audit_log.md
@@ -54,5 +54,12 @@ Store audit logs
 
 - `client_ip`: If a proxy service is used and the IP pass-through is not 
enabled, the proxy service IP may be recorded here instead of the real client 
IP.
 - `state`: `EOF` indicates that the query is executed successfully. `OK` 
indicates that the DDL and DML statements are executed successfully. `ERR` 
indicates that the statement execution fails.
+- `scan_bytes`: Indicates the size of data processed by the BE. It represents 
the uncompressed size of data read from disk, including data read from Doris' 
internal page cache, and reflects the amount of data that a query actually 
needs to process. Therefore, this value is not equal to 
`scan_bytes_from_local_storage` + `scan_bytes_from_remote_storage`.
+- `scan_rows`: Indicates the number of rows scanned during query execution. 
Because Doris is a columnar database, it first scans the columns that have 
predicate filters and then scans the other columns based on the filtered 
result. The number of rows scanned therefore differs between columns: more rows 
are scanned for predicate columns than for non-predicate columns, and this 
value reflects the number of rows scanned for predicate columns [...]
+- `scan_bytes_from_local_storage`: Indicates the size of data read from local 
disk, which is the size before compression. Data read from Doris' page cache is 
not counted, while data from the operating system's page cache is included in 
this statistic.
+- `scan_bytes_from_remote_storage`: Indicates the size of data read from 
remote storage, which is the size before compression.
+
+
+
 
 


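The notes added by this commit say that `scan_bytes` is not the sum of `scan_bytes_from_local_storage` and `scan_bytes_from_remote_storage`, because data served from Doris' page cache counts toward `scan_bytes` but toward neither storage counter. The sketch below only illustrates that relationship. The `ScanCounters` class, its helper functions, and every number in it are hypothetical; they are not part of Doris and not taken from a real audit log.

```python
from dataclasses import dataclass


@dataclass
class ScanCounters:
    """The three byte counters from one audit_log row (values are hypothetical)."""
    scan_bytes: int
    scan_bytes_from_local_storage: int
    scan_bytes_from_remote_storage: int


def bytes_read_from_storage(c: ScanCounters) -> int:
    """Bytes fetched from local disk plus remote storage for this statement."""
    return c.scan_bytes_from_local_storage + c.scan_bytes_from_remote_storage


def differs_from_storage_sum(c: ScanCounters) -> bool:
    """True when scan_bytes is not the sum of the two storage counters."""
    return c.scan_bytes != bytes_read_from_storage(c)


# Hypothetical row: part of the data was served from Doris' page cache,
# so it counts toward scan_bytes but toward neither storage counter.
row = ScanCounters(
    scan_bytes=1_000_000,
    scan_bytes_from_local_storage=300_000,
    scan_bytes_from_remote_storage=200_000,
)
print(bytes_read_from_storage(row))
print(differs_from_storage_sum(row))
```

Treat the difference between the two quantities as an illustration rather than a defined metric: the documentation states only that they are not equal, not how to decompose the gap.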