This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 71b80dd261a [opt] opt debug tool doc (#3305)
71b80dd261a is described below

commit 71b80dd261a9487d116f58a4abbefcc09ed2d9f1
Author: Mingyu Chen (Rayner) <[email protected]>
AuthorDate: Mon Jan 26 20:14:11 2026 +0800

    [opt] opt debug tool doc (#3305)
---
 community/developer-guide/debug-tool.md            | 406 ++++++++++----------
 .../current/developer-guide/debug-tool.md          | 409 +++++++++++----------
 2 files changed, 434 insertions(+), 381 deletions(-)

diff --git a/community/developer-guide/debug-tool.md b/community/developer-guide/debug-tool.md
index 8e221abc967..1be773ac77c 100644
--- a/community/developer-guide/debug-tool.md
+++ b/community/developer-guide/debug-tool.md
@@ -1,7 +1,8 @@
 ---
 {
-    "title": "Debug Tool",
-    "language": "en"
+    "title": "Debugging Tools",
+    "language": "en",
+    "description": "A comprehensive guide to debugging tools and methods for Apache Doris, including FE and BE debugging techniques, memory profiling with Jemalloc and TCMalloc, memory leak detection with LSAN and ASAN, and CPU profiling with pprof and perf."
 }
 ---
 
@@ -24,185 +25,192 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Debug Tool
+# Debugging Tools
 
-In the process of using and developing Doris, we often encounter scenarios 
that need to debug Doris. Here are some common debugging tools.
+During Doris usage and development, debugging is often necessary. This 
document introduces commonly used debugging tools and methods.
 
-**The name of the BE binary that appears in this doc is `doris_be`, which was 
`palo_be` in previous versions.**
+**Note: The BE binary file name `doris_be` mentioned in this document was 
`palo_be` in earlier versions.**
 
-## FE debugging
+## FE Debugging
 
-Fe is a java process. Here are just a few simple and commonly used java 
debugging commands.
+FE is a Java process. Below are some commonly used Java debugging commands.
 
-1. Statistics of current memory usage details
+### 1. Memory Usage Statistics
 
-    ```
-    jmap -histo:live pid > 1. jmp
-    ```
+```bash
+jmap -histo:live pid > 1.jmp
+```
 
-    This command can enumerate and sort the memory occupation of living 
objects. (replace PID with Fe process ID)
+This command lists the memory usage of live objects sorted by size (replace 
pid with the FE process ID).
 
-    ```
-    num     #instances         #bytes  class name
-    ----------------------------------------------
-    1:         33528       10822024  [B
-    2:         80106        8662200  [C
-    3:           143        4688112  [Ljava.util.concurrent.ForkJoinTask;
-    4:         80563        1933512  java. lang.String
-    5:         15295        1714968  java. lang.Class
-    6:         45546        1457472  java. util. concurrent. 
ConcurrentHashMap$Node
-    7:         15483        1057416  [Ljava.lang.Object;
-    ```
+```text
+ num     #instances         #bytes  class name
+----------------------------------------------
+   1:         33528       10822024  [B
+   2:         80106        8662200  [C
+   3:           143        4688112  [Ljava.util.concurrent.ForkJoinTask;
+   4:         80563        1933512  java.lang.String
+   5:         15295        1714968  java.lang.Class
+   6:         45546        1457472  java.util.concurrent.ConcurrentHashMap$Node
+   7:         15483        1057416  [Ljava.lang.Object;
+```
 
-    You can use this method to view the total memory occupied by the currently 
living objects (at the end of the file) and analyze which objects occupy more 
memory.
+This method allows you to view the total memory occupied by live objects (at 
the end of the file) and analyze which objects consume more memory.
 
-    Note that this method will trigger fullgc because `: live 'is specified.
+**Note:** This method triggers a FullGC due to the `:live` parameter.
 
-2. Check JVM memory usage
+### 2. JVM Memory Usage
 
-    ```
-    jstat -gcutil pid 1000 1000
-    ```
+```bash
+jstat -gcutil pid 1000 1000
+```
 
-    This command can scroll through the memory usage of each region of the 
current JVM. (replace PID with Fe process ID)
+This command prints the usage of each JVM memory region once per second, 1000 times (replace `pid` with the FE process ID).
 
-    ```
-    S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     
GCT
-    0.00   0.00  22.61   3.03  95.74  92.77     68    1.249     5    0.794    
2.043
-    0.00   0.00  22.61   3.03  95.74  92.77     68    1.249     5    0.794    
2.043
-    0.00   0.00  22.61   3.03  95.74  92.77     68    1.249     5    0.794    
2.043
-    0.00   0.00  22.92   3.03  95.74  92.77     68    1.249     5    0.794    
2.043
-    0.00   0.00  22.92   3.03  95.74  92.77     68    1.249     5    0.794    
2.043
-    ```
+```text
+  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
+  0.00   0.00  22.61   3.03  95.74  92.77     68    1.249     5    0.794    2.043
+  0.00   0.00  22.61   3.03  95.74  92.77     68    1.249     5    0.794    2.043
+  0.00   0.00  22.61   3.03  95.74  92.77     68    1.249     5    0.794    2.043
+  0.00   0.00  22.92   3.03  95.74  92.77     68    1.249     5    0.794    2.043
+  0.00   0.00  22.92   3.03  95.74  92.77     68    1.249     5    0.794    2.043
 
-    The main focus is on the percentage of old area (o) (3% in the example). 
If the occupancy is too high, oom or fullgc may occur.
+Focus on the Old generation (O) percentage (3.03% in the example). High usage 
may lead to OOM or FullGC.
 
-3. Print Fe thread stack
+### 3. Print FE Thread Stack
 
-    ```
-    jstack -l pid > 1. js
-    ```
+```bash
+jstack -l pid > 1.js
+```
 
-    This command can print the thread stack of the current Fe. (replace PID 
with Fe process ID).
-    `-L ` the parameter will detect whether there is deadlock at the same 
time. This method can check the operation of Fe thread, whether there is 
deadlock, where it is stuck, etc.
+This command prints the current FE thread stack (replace pid with the FE 
process ID).
 
-## BE debugging
+The `-l` parameter also detects deadlocks. This method can be used to view FE 
thread execution status, detect deadlocks, and locate blocking positions.
 
-### Memory
+## BE Debugging
 
-Debugging memory is generally divided into two aspects. One is whether the 
total amount of memory use is reasonable. On the one hand, the excessive amount 
of memory use may be due to memory leak in the system, on the other hand, it 
may be due to improper use of program memory. The second is whether there is a 
problem of memory overrun and illegal access, such as program access to memory 
with an illegal address, use of uninitialized memory, etc. For the debugging of 
memory, we usually use [...]
+### Memory Debugging
 
-#### Jemalloc HEAP PROFILE
+Memory debugging focuses on two aspects:
 
-> Doris 1.2.2 version starts to use Jemalloc as the memory allocator by 
default.
+1. **Whether total memory usage is reasonable**: Excessive memory usage may indicate a memory leak or improper memory use by the program.
+2. **Whether memory access is legal**: This covers out-of-bounds and illegal accesses, such as accessing an invalid address or using uninitialized memory.
 
-For the principle analysis of Heap Profile, refer to [Heap Profiling Principle 
Analysis](https://cn.pingcap.com/blog/an-explanation-of-the-heap-profiling-principle/).
 It should be noted that Heap Profile records virtual memory
+The following tools can be used for tracking and analysis.
 
-Supports real-time and periodic Heap Dump, and then uses `jeprof` to parse the 
generated Heap Profile.
+#### Jemalloc Heap Profile
 
-##### 1. Real-time Heap Dump, used to analyze real-time memory
+> **Note:** Doris 1.2.2 and later versions use Jemalloc as the default memory 
allocator.
 
-Change `prof:false` in `JEMALLOC_CONF` in `be.conf` to `prof:true`, change 
`prof_active:false` to `prof_active:true` and restart Doris BE, then use the 
Jemalloc Heap Dump HTTP interface to generate a Heap Profile file on the 
corresponding BE machine.
+For Heap Profiling principles, refer to [Heap Profiling Principle 
Explanation](https://cn.pingcap.com/blog/an-explanation-of-the-heap-profiling-principle/).
 Note that Heap Profile records virtual memory.
 
-> For Doris 2.1.8 and 3.0.4 and later versions, `prof` in `JEMALLOC_CONF` is 
already `true` by default, no need to modify.
+Both real-time and periodic Heap Dumps are supported. Use the `jeprof` tool to parse the generated Heap Profile.
 
-For Doris versions before 2.1.8 and 3.0.4, there is no `prof_active` in 
`JEMALLOC_CONF`, just change `prof:false` to `prof:true`.
+##### 1. Real-time Heap Dump (for analyzing real-time memory)
 
-```shell
+In `be.conf`, change `prof:false` to `prof:true` and `prof_active:false` to 
`prof_active:true` in `JEMALLOC_CONF`, then restart Doris BE. Use the Jemalloc 
Heap Dump HTTP interface to generate Heap Profile files on the BE machine.
+
+> **Version Notes:**
+> - Doris 2.1.8, 3.0.4, and later: `prof` is already `true` by default in `JEMALLOC_CONF`, so no change is needed.
+> - Before Doris 2.1.8 and 3.0.4: `JEMALLOC_CONF` has no `prof_active` option; just change `prof:false` to `prof:true`.
+
+```bash
 curl http://be_host:be_webport/jeheap/dump
 ```
 
-The directory where the Heap Profile file is located can be configured in 
`be.conf` through the `jeprofile_dir` variable, which defaults to 
`${DORIS_HOME}/log`
+**Configuration:**
 
-The default sampling interval is 512K, which usually only records 10% of the 
memory, and the impact on performance is usually less than 10%. You can modify 
`lg_prof_sample` in `JEMALLOC_CONF` in `be.conf`, which defaults to `19` (2^19 
B = 512K). Reducing `lg_prof_sample` can sample more frequently to make the 
Heap Profile closer to the real memory, but this will bring greater performance 
loss.
+- **Heap Profile directory**: Configure via `jeprofile_dir` in `be.conf`, 
defaults to `${DORIS_HOME}/log`.
+- **Sampling interval**: Defaults to 512KB, typically recording ~10% of memory 
with <10% performance impact. Modify `lg_prof_sample` in `JEMALLOC_CONF` 
(default `19`, i.e., 2^19 B = 512KB). Decreasing `lg_prof_sample` increases 
sampling frequency for more accurate profiles but higher overhead.
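+
+As a rough illustration of the settings above, the relevant part of `be.conf` might look like the fragment below. This is a sketch, not the shipped default: the real `JEMALLOC_CONF` string contains other options that are omitted here and should be kept as they are, and the `jeprofile_dir` line only restates the documented default.
+
+```text
+# be.conf (illustrative fragment; keep the other options your JEMALLOC_CONF already has)
+# prof:true          enables heap profiling support
+# prof_active:true   starts sampling as soon as BE starts
+# lg_prof_sample:19  samples on average once per 2^19 B = 512 KB of allocation
+JEMALLOC_CONF="prof:true,prof_active:true,lg_prof_sample:19"
+jeprofile_dir = ${DORIS_HOME}/log
+```
+
+Lowering `lg_prof_sample` by one halves the sampling interval; for example, `18` means 2^18 B = 256 KB.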
 
-If you are doing performance testing, keep `prof:false` to avoid the 
performance loss of Heap Dump.
+**Performance tip:** Keep `prof:false` during performance testing to avoid 
Heap Dump overhead.
 
-##### 2. Regular Heap Dump for long-term memory observation
+##### 2. Periodic Heap Dump (for long-term memory observation)
 
-Change `prof:false` of `JEMALLOC_CONF` in `be.conf` to `prof:true`. The 
directory where the Heap Profile file is located is `${DORIS_HOME}/log` by 
default. The file name prefix is ​​`JEMALLOC_PROF_PRFIX` in `be.conf`, and the 
default is `jemalloc_heap_profile_`.
+Change `prof:false` to `prof:true` in `JEMALLOC_CONF` in `be.conf`. Heap 
Profile files default to `${DORIS_HOME}/log` with prefix specified by 
`JEMALLOC_PROF_PRFIX` (default `jemalloc_heap_profile_`).
 
-> Before Doris 2.1.6, `JEMALLOC_PROF_PRFIX` is empty and needs to be changed 
to any value as the profile file name
+> **Note:** Before Doris 2.1.6, `JEMALLOC_PROF_PRFIX` is empty by default and must be set to a non-empty value, which is used as the profile file name prefix.
 
-1. Dump when the cumulative memory application reaches a certain value:
+**Dump triggers:**
 
-Change `lg_prof_interval` of `JEMALLOC_CONF` in `be.conf` to 34. At this time, 
the profile is dumped once when the cumulative memory application reaches 16GB 
(2^35 B = 16GB). You can change it to any value to adjust the dump interval.
+1. **Dump after cumulative memory allocation**
 
-> Before Doris 2.1.6, `lg_prof_interval` defaults to 32.
+   Change `lg_prof_interval` to `34` in `JEMALLOC_CONF` to dump a profile roughly every 16GB of cumulative memory allocation (2^34 B = 16GB). Use another value to adjust the dump interval.
 
-2. Dump every time the memory reaches a new high:
+   > **Note:** Before Doris 2.1.6, `lg_prof_interval` defaulted to `32`.
 
-Change `prof_gdump` in `JEMALLOC_CONF` in `be.conf` to `true` and restart BE.
+2. **Dump each time memory reaches a new high**
 
-3. Dump when the program exits, and detect memory leaks:
+   Change `prof_gdump` to `true` in `JEMALLOC_CONF` and restart BE.
 
-Change `prof_leak` and `prof_final` in `JEMALLOC_CONF` in `be.conf` to `true` 
and restart BE.
+3. **Dump on exit and detect leaks**
 
-4. Dump the cumulative value (growth) of memory instead of the real-time value:
+   Change `prof_leak` and `prof_final` to `true` in `JEMALLOC_CONF` and 
restart BE.
 
-Change `prof_accum` in `JEMALLOC_CONF` in `be.conf` to `true` and restart BE.
+4. **Dump cumulative (growth) instead of real-time values**
 
-Use `jeprof --alloc_space` to display the cumulative value of heap dump.
+   Change `prof_accum` to `true` in `JEMALLOC_CONF` and restart BE. Use `jeprof --alloc_space` to display the cumulative values in the heap dump.
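+
+The trigger options above can be combined in a single `JEMALLOC_CONF` string. The fragment below is only a sketch: options unrelated to profiling are omitted and should be kept as they are in your `be.conf`, the exact quoting of the prefix line is an assumption, and enabling every trigger at once is for illustration rather than a recommendation.
+
+```text
+# be.conf (illustrative fragment; keep the other options your JEMALLOC_CONF already has)
+# lg_prof_interval:34             dump roughly every 2^34 B = 16GB of cumulative allocation
+# prof_gdump:true                 dump each time total memory reaches a new high
+# prof_leak:true,prof_final:true  report leaks and dump when the process exits
+# prof_accum:true                 record cumulative allocation counts in the dumps
+JEMALLOC_CONF="prof:true,lg_prof_interval:34,prof_gdump:true,prof_leak:true,prof_final:true,prof_accum:true"
+JEMALLOC_PROF_PRFIX="jemalloc_heap_profile_"
+```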
 
-##### 3. `jeprof` parses Heap Profile
+##### 3. Parse Heap Profile with `jeprof`
 
-Use `be/bin/jeprof` to parse the Heap Profile of the above dump. If the 
process memory is too large, the parsing process may take several minutes. 
Please wait patiently.
+Use `be/bin/jeprof` to parse the dumped Heap Profiles. If the process uses a lot of memory, parsing may take several minutes.
 
-If there is no `jeprof` binary in the `be/bin` directory of the Doris BE 
deployment path, you can package the `jeprof` in the `doris/tools` directory 
and upload it to the server.
+If there is no `jeprof` binary in the `be/bin` directory of the Doris BE deployment, package the `jeprof` in the `doris/tools` directory and upload it to the server.
 
-> The addr2line version is required to be 2.35.2 or above, see QA-1 below for 
details
-> Try to have Heap Dump and `jeprof` analyze Heap Profile on the same server, 
that is, analyze Heap Profile directly on the machine running Doris BE as much 
as possible, see QA-2 below for details
+> **Notes:**
+> - Requires addr2line version 2.35.2+, see QA-1 below.
+> - Execute Heap Dump and `jeprof` parsing on the same machine running Doris 
BE, see QA-2 below.
 
-1. Analyze a single Heap Profile file
+**1. Analyze single Heap Profile**
 
-```shell
+```bash
 jeprof --dot ${DORIS_HOME}/lib/doris_be ${DORIS_HOME}/log/profile_file
 ```
 
-After executing the above command, paste the text output by the terminal to 
the [online dot drawing website](http://www.webgraphviz.com/) to generate a 
memory allocation graph, and then analyze it.
+Paste the text printed to the terminal into an [online dot visualization site](http://www.webgraphviz.com/) to generate a memory allocation graph.
 
-If the server is convenient for file transfer, you can also use the following 
command to directly generate a call relationship graph. The result.pdf file is 
transferred to the local computer for viewing. You need to install the 
dependencies required for drawing.
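+If transferring files from the server is convenient, you can generate a call graph PDF directly and copy `result.pdf` to your local machine. This requires the drawing dependencies to be installed first: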
 
-```shell
+```bash
 yum install ghostscript graphviz
 jeprof --pdf ${DORIS_HOME}/lib/doris_be ${DORIS_HOME}/log/profile_file > result.pdf
 ```
 
-[graphviz](http://www.graphviz.org/): Without this library, pprof can only be 
converted to text format, but this method is not easy to view. After installing 
this library, pprof can be converted to svg, pdf and other formats, and the 
call relationship is clearer.
+**2. Analyze diff between two Heap Profiles**
 
-2. Analyze the diff of two heap profile files
-
-```shell
+```bash
 jeprof --dot ${DORIS_HOME}/lib/doris_be --base=${DORIS_HOME}/log/profile_file ${DORIS_HOME}/log/profile_file2
 ```
 
-Multiple heap files can be generated by running the above command multiple 
times over a period of time. You can select an earlier heap file as a baseline 
and compare and analyze their diff with a later heap file. The method for 
generating a call graph is the same as above.
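+Dump the heap several times over a period of time, then pick an earlier heap file as the baseline (`--base`) and compare it with a later one to analyze the difference. The call graph is generated the same way as above.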
 
-##### 4. QA
+##### 4. Common Issues (QA)
 
-1. Many errors appear after running jeprof: `addr2line: Dwarf Error: found 
dwarf version xxx, this reader only handles version xxx`.
+**QA-1: Errors after running jeprof: `addr2line: Dwarf Error: found dwarf 
version xxx, this reader only handles version xxx`**
 
-GCC 11 and later use DWARF-v5 by default, which requires Binutils 2.35.2 and 
above. Doris Ldb_toolchain uses GCC 11. See: 
https://gcc.gnu.org/gcc-11/changes.html.
+GCC 11+ defaults to DWARF-v5, requiring Binutils 2.35.2+. Doris Ldb_toolchain 
uses GCC 11.
 
-Replace addr2line to 2.35.2, refer to:
-```
-// Download addr2line source code
+Solution: Upgrade addr2line to 2.35.2.
+
+```bash
+# Download addr2line source
 wget https://ftp.gnu.org/gnu/binutils/binutils-2.35.tar.bz2
 
-// Install dependencies, if necessary
+# Install dependencies if needed
 yum install make gcc gcc-c++ binutils
 
-// Compile & install addr2line
+# Compile & install addr2line
 tar -xvf binutils-2.35.tar.bz2
 cd binutils-2.35
 ./configure --prefix=/usr/local
 make
 make install
 
-// Verify
+# Verify
 addr2line -h
 
-// Replace addr2line
+# Replace addr2line
 chmod +x addr2line
 mv /usr/bin/addr2line /usr/bin/addr2line.bak
 mv /bin/addr2line /bin/addr2line.bak
@@ -210,25 +218,26 @@ cp addr2line /bin/addr2line
 cp addr2line /usr/bin/addr2line
 hash -r
 ```
-Note that addr2line 2.3.9 cannot be used, which may be incompatible and cause 
the memory to keep growing.
-
-2. Many errors appear after running `jeprof`: `addr2line: DWARF error: invalid 
or unhandled FORM value: 0x25`, and the parsed Heap stack is the memory address 
of the code, not the function name
 
-Usually, it is because the execution of Heap Dump and the execution of 
`jeprof` to parse Heap Profile are not on the same server, which causes 
`jeprof` to fail to parse the function name using the symbol table. Try to 
complete the operation of Dump Heap and `jeprof` parsing on the same machine, 
that is, try to parse the Heap Profile directly on the machine running Doris BE.
+**Note:** Do not use addr2line 2.3.9; it may be incompatible and cause memory usage to keep growing.
 
-Or confirm the Linux kernel version of the machine running Doris BE, download 
the `be/bin/doris_be` binary file and the Heap Profile file to the machine with 
the same kernel version and execute `jeprof`.
+**QA-2: Errors after running `jeprof`: `addr2line: DWARF error: invalid or 
unhandled FORM value: 0x25`, parsed heap stacks show memory addresses instead 
of function names**
 
-3. If the Heap stack after directly parsing the Heap Profile on the machine 
running Doris BE is still the memory address of the code, not the function name
+Usually occurs when Heap Dump and `jeprof` parsing are on different servers, 
causing symbol table resolution failure.
 
-Use the following script to manually parse the Heap Profile and modify these 
variables:
+Solution:
+- Execute Dump Heap and `jeprof` parsing on the same machine running Doris BE.
+- Or download `be/bin/doris_be` binary and Heap Profile to a machine with 
matching Linux kernel version and run `jeprof`.
 
-- heap: the file name of the Heap Profile.
+**QA-3: If heap stacks still show memory addresses instead of function names 
after parsing on the BE machine**
 
-- bin: the file name of the `be/bin/doris_be` binary
+Use this script for manual parsing. Modify these variables:
 
-- llvm_symbolizer: the path of the llvm symbol table parser, the version 
should preferably be the version used to compile the `be/bin/doris_be` binary.
+- `heap`: Heap Profile filename.
+- `bin`: `be/bin/doris_be` binary filename.
+- `llvm_symbolizer`: Path to llvm symbolizer, preferably the version used to 
compile the binary.
 
-```
+```bash
 #!/bin/bash
 ## @brief
 ## @author zhoufei
@@ -279,27 +288,30 @@ fi
 # vim: et tw=80 ts=2 sw=2 cc=80:
 ```
 
-4. If all the above methods do not work
-
-- Try to recompile the `be/bin/doris_be` binary on the machine running Doris 
BE, that is, compile, run, and `jeprof` analyze on the same machine.
+**QA-4: If none of the above methods work**
 
-- After the above operation, if the Heap stack is still the memory address of 
the code, try `USE_JEMALLOC=OFF ./build.sh --be` to compile Doris BE using 
TCMalloc, and then refer to the above section to use TCMalloc Heap Profile to 
analyze memory.
+- Try recompiling `be/bin/doris_be` on the machine running Doris BE, so that compiling, running, and `jeprof` parsing all happen on the same machine.
+- If heap stacks still show addresses, try compiling Doris BE with TCMalloc using `USE_JEMALLOC=OFF ./build.sh --be`, then analyze memory with the TCMalloc Heap Profile as described below.
 
-#### TCMalloc HEAP PROFILE
+#### TCMalloc Heap Profile
 
-> Doris 1.2.1 and earlier versions use TCMalloc. Doris 1.2.2 version uses 
Jemalloc by default. To switch to TCMalloc, you can compile like this: 
`USE_JEMALLOC=OFF sh build.sh --be`.
+> **Note:** Doris 1.2.1 and earlier use TCMalloc. Doris 1.2.2 and later default to Jemalloc. To switch back to TCMalloc, compile with `USE_JEMALLOC=OFF sh build.sh --be`.
 
-When using TCMalloc, when a large memory application is encountered, the 
application stack will be printed to the be.out file, and the general 
expression is as follows:
+When using TCMalloc, large memory allocations print stacks to `be.out`:
 
-```
+```text
 tcmalloc: large alloc 1396277248 bytes == 0x3f3488000 @  0x2af6f63 0x2c4095b 0x134d278 0x134bdcb 0x133d105 0x133d1d0 0x19930ed
 ```
 
-This indicates that Doris be is trying to apply memory of '1396277248 bytes' 
on this stack. We can use the 'addr2line' command to restore the stack to a 
letter that we can understand. The specific example is shown below.
+This indicates that Doris BE tried to allocate 1396277248 bytes on this stack. Use `addr2line` to translate the stack addresses into source locations:
 
+```bash
+addr2line -e lib/doris_be 0x2af6f63 0x2c4095b 0x134d278 0x134bdcb 0x133d105 0x133d1d0 0x19930ed
 ```
-$ addr2line -e lib/doris_be  0x2af6f63 0x2c4095b 0x134d278 0x134bdcb 0x133d105 
0x133d1d0 0x19930ed
 
+Output example:
+
+```text
 /home/ssd0/zc/palo/doris/core/thirdparty/src/gperftools-gperftools-2.7/src/tcmalloc.cc:1335
 /home/ssd0/zc/palo/doris/core/thirdparty/src/gperftools-gperftools-2.7/src/tcmalloc.cc:1357
 /home/disk0/baidu-doris/baidu/bdg/doris-baidu/core/be/src/exec/hash_table.cpp:267
@@ -309,20 +321,24 @@ $ addr2line -e lib/doris_be  0x2af6f63 0x2c4095b 
0x134d278 0x134bdcb 0x133d105 0
 thread.cpp:?
 ```
 
-Sometimes the application of memory is not caused by the application of large 
memory, but by the continuous accumulation of small memory. Then there is no 
way to locate the specific application information by viewing the log, so you 
need to get the information through other ways.
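+Sometimes memory growth comes from many small allocations accumulating rather than from one large allocation, so the log shows nothing useful. In that case, use TCMalloc's [HEAP PROFILE](https://gperftools.github.io/gperftools/heapprofile.html) feature, which records the overall memory allocation of the process. To enable it, set the `HEAPPROFILE` environment variable before starting Doris BE: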
 
-At this time, we can take advantage of TCMalloc's 
[heapprofile](https://gperftools.github.io/gperftools/heapprofile.html). If the 
heapprofile function is set, we can get the overall memory application usage of 
the process. The usage is to set the 'heapprofile' environment variable before 
starting Doris be. For example:
-
-```
-export HEAPPROFILE=/tmp/doris_be.hprof
+```bash
+export TCMALLOC_SAMPLE_PARAMETER=64000 HEAP_PROFILE_ALLOCATION_INTERVAL=-1 HEAP_PROFILE_INUSE_INTERVAL=-1 HEAP_PROFILE_TIME_INTERVAL=5 HEAPPROFILE=/tmp/doris_be.hprof
 ./bin/start_be.sh --daemon
 ```
 
-In this way, when the dump condition of the heapprofile is met, the overall 
memory usage will be written to the file in the specified path. Later, we can 
use the 'pprof' tool to analyze the output content.
+> **Note:** `HEAPPROFILE` must be an absolute path, and its directory must already exist.
 
+When a `HEAPPROFILE` dump condition is met, the overall memory usage is written to a file under the specified path. Use the `pprof` tool to analyze the output:
+
+```bash
+pprof --text lib/doris_be /tmp/doris_be.hprof.0012.heap | head -30
 ```
-$ pprof --text lib/doris_be /tmp/doris_be.hprof.0012.heap | head -30
 
+Output example:
+
+```text
 Using local file lib/doris_be.
 Using local file /tmp/doris_be.hprof.0012.heap.
 Total: 668.6 MB
@@ -339,30 +355,35 @@ Total: 668.6 MB
      1.7   0.3%  98.4%      1.7   0.3% doris::SegmentReader::_load_index
 ```
 
-Contents of each column of the above documents:
+**Column meanings:**
 
-* Column 1: the memory size directly applied by the function, in MB
-* Column 4: the total memory size of the function and all the functions it 
calls.
-* The second column and the fifth column are the proportion values of the 
first column and the fourth column respectively.
-* The third column is the cumulative value of the second column.
+- **Column 1**: Memory allocated directly by the function (MB).
+- **Column 2**: Column 1 as a percentage of the total.
+- **Column 3**: Running cumulative sum of column 2.
+- **Column 4**: Total memory allocated by the function and all functions it calls (MB).
+- **Column 5**: Column 4 as a percentage of the total.
 
-Of course, it can also generate call relation pictures, which is more 
convenient for analysis. For example, the following command can generate a call 
graph in SVG format.
+Generate call relationship graph in SVG format:
 
-```
-pprof --svg lib/doris_be /tmp/doris_be.hprof.0012.heap > heap.svg 
+```bash
+pprof --svg lib/doris_be /tmp/doris_be.hprof.0012.heap > heap.svg
 ```
 
-**NOTE: turning on this option will affect the execution performance of the 
program. Please be careful to turn on the online instance.**
+**Performance tip:** Enabling `HEAPPROFILE` degrades runtime performance. Be careful when enabling it on production instances.
 
-##### pprof remote server
+##### pprof Remote Server
 
-Although heapprofile can get all the memory usage information, it has some 
limitations. 1. Restart be. 2. You need to enable this command all the time, 
which will affect the performance of the whole process.
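+HEAP PROFILE provides complete memory usage information but has two limitations: (1) it requires restarting BE, and (2) it must stay enabled the whole time, which affects the performance of the entire process.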
 
-For Doris be, you can also use the way of opening and closing the heap profile 
dynamically to analyze the memory application of the process. Doris supports 
the [remote server debugging of 
gperftools](https://gperftools.github.io/gperftools/pprof_remote_servers.html). 
Then you can use 'pprof' to directly perform dynamic head profile on the remote 
running Doris be. For example, we can check the memory usage increment of Doris 
through the following command
+Doris BE can also turn heap profiling on and off dynamically. It supports the gperftools [remote server debugging](https://gperftools.github.io/gperftools/pprof_remote_servers.html) interface, so `pprof` can profile a running remote Doris BE directly. For example, the following command shows how much memory usage grew over 60 seconds:
 
+```bash
+pprof --text --seconds=60 http://be_host:be_webport/pprof/heap
 ```
-$ pprof --text --seconds=60 http://be_host:be_webport/pprof/heap 
 
+Output example:
+
+```text
 Total: 1296.4 MB
    484.9  37.4%  37.4%    484.9  37.4% doris::StorageByteBuffer::create
    272.2  21.0%  58.4%    273.3  21.1% doris::RowBlock::init
@@ -379,27 +400,27 @@ Total: 1296.4 MB
     10.0   0.8%  93.4%     10.0   0.8% doris::PlainTextLineReader::PlainTextLineReader
 ```
 
-The output of this command is the same as the output and view mode of heap 
profile, which will not be described in detail here. Statistics will be enabled 
only during execution of this command, which has a limited impact on process 
performance compared with heap profile.
+The output is read the same way as HEAP PROFILE output. Statistics are collected only while this command runs, so the performance impact is smaller than with HEAP PROFILE.
 
-#### LSAN
+#### LSAN (Memory Leak Detection)
 
-[LSAN](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer)
 is an address checking tool, GCC has been integrated. When we compile the 
code, we can enable this function by turning on the corresponding compilation 
options. When the program has a determinable memory leak, it prints the leak 
stack. Doris be has integrated this tool, only need to compile with the 
following command to generate be binary with memory leak detection version.
+[LSAN](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer) is a memory leak detection tool integrated into GCC. It is enabled through compile options. When the program has a leak that LSAN can determine, it prints the allocation stack of the leaked memory. Doris BE integrates this tool; compile with the following command to build a BE binary with leak detection:
 
-```
+```bash
 BUILD_TYPE=LSAN ./build.sh
 ```
 
-When the system detects a memory leak, it will output the corresponding 
information in be. Out. For the following demonstration, we intentionally 
insert a memory leak code into the code. We insert the following code into the 
`open` function of `StorageEngine`.
+When a memory leak is detected, the corresponding information is written to `be.out`. For demonstration, we intentionally insert a leak into the `open` function of `StorageEngine`:
 
-```
-    char* leak_buf = new char[1024];
-    strcpy(leak_buf, "hello world");
-    LOG(INFO) << leak_buf;
+```cpp
+char* leak_buf = new char[1024];
+strcpy(leak_buf, "hello world");
+LOG(INFO) << leak_buf;
 ```
 
-We get the following output in be.out
+Then `be.out` shows:
 
-```
+```text
 =================================================================
 ==24732==ERROR: LeakSanitizer: detected memory leaks
 
@@ -412,33 +433,33 @@ Direct leak of 1024 byte(s) in 1 object(s) allocated from:
 SUMMARY: LeakSanitizer: 1024 byte(s) leaked in 1 allocation(s).
 ```
 
-From the above output, we can see that 1024 bytes have been leaked, and the 
stack information of memory application has been printed out.
+The output shows that 1024 bytes were leaked and prints the stack trace of the allocation.
 
-**NOTE: turning on this option will affect the execution performance of the 
program. Please be careful to turn on the online instance.**
+**Performance tip:** This option affects performance. Use cautiously on 
production instances.
 
-**NOTE: if the LSAN switch is turned on, the TCMalloc will be automatically 
turned off**
+**Note:** Enabling LSAN automatically disables TCMalloc.
 
-#### ASAN
+#### ASAN (Address Legality Detection)
 
-Except for the unreasonable use and leakage of memory. Sometimes there will be 
memory access illegal address and other errors. At this time, we can use 
[ASAN](https://github.com/google/sanitizers/wiki/addresssanitizer) to help us 
find the cause of the problem. Like LSAN, ASAN is integrated into GCC. Doris 
can open this function by compiling as follows
+Besides unreasonable memory usage and leaks, errors such as access to an illegal address can also occur. Use [ASAN](https://github.com/google/sanitizers/wiki/AddressSanitizer) to find the root cause. Like LSAN, ASAN is integrated into GCC. Compile Doris with:
 
-```
+```bash
 BUILD_TYPE=ASAN ./build.sh
 ```
 
-Execute the binary generated by compilation. When the detection tool finds any 
abnormal access, it will immediately exit and output the stack illegally 
accessed in be.out. The output of ASAN is the same as that of LSAN. Here we 
also actively inject an address access error to show the specific content 
output. We still inject an illegal memory access into the 'open' function of 
'storageengine'. The specific error code is as follows
+Run the compiled binary. When ASAN detects an abnormal access, the process exits immediately and writes the stack of the illegal access to `be.out`. ASAN output is read the same way as LSAN output. For demonstration, we inject an out-of-bounds write into the `open` function of `StorageEngine`:
 
-```
-    char* invalid_buf = new char[1024];
-    for (int i = 0; i < 1025; ++i) {
-        invalid_buf[i] = i;
-    }
-    LOG(INFO) << invalid_buf;
+```cpp
+char* invalid_buf = new char[1024];
+for (int i = 0; i < 1025; ++i) {
+    invalid_buf[i] = i;
+}
+LOG(INFO) << invalid_buf;
 ```
 
-We get the following output in be.out
+Then `be.out` shows:
 
-```
+```text
 =================================================================
 ==23284==ERROR: AddressSanitizer: heap-buffer-overflow on address 
0x61900008bf80 at pc 0x00000129f56a bp 0x7fff546eed90 sp 0x7fff546eed88
 WRITE of size 1 at 0x61900008bf80 thread T0
@@ -447,7 +468,7 @@ WRITE of size 1 at 0x61900008bf80 thread T0
     #2 0x7fa5580fbbd4 in __libc_start_main 
(/opt/compiler/gcc-4.8.2/lib64/libc.so.6+0x21bd4)
     #3 0xd30794  
(/home/ssd0/zc/palo/doris/core/output3/be/lib/doris_be+0xd30794)
 
-0x61900008bf80 is located 0 bytes to the right of 1024-byte region 
[0x61900008bb80,0x61900008bf80]
+0x61900008bf80 is located 0 bytes to the right of 1024-byte region 
[0x61900008bb80,0x61900008bf80)
 allocated by thread T0 here:
     #0 0xdeb040 in operator new[](unsigned long) 
../../../../gcc-7.3.0/libsanitizer/asan/asan_new_delete.cc:82
     #1 0x129f50d in doris::StorageEngine::open(doris::EngineOptions const&, 
doris::StorageEngine**) 
/home/ssd0/zc/palo/doris/core/be/src/olap/storage_engine.cpp:104
@@ -457,66 +478,69 @@ allocated by thread T0 here:
 SUMMARY: AddressSanitizer: heap-buffer-overflow 
/home/ssd0/zc/palo/doris/core/be/src/olap/storage_engine.cpp:106 in 
doris::StorageEngine::open(doris::EngineOptions const&, doris::StorageEngine**)
 ```
 
-From this message, we can see that at the address of `0x61900008bf80`, we 
tried to write a byte, but this address is illegal. We can also see the 
application stack of the address `[0x61900008bb80, 0x61900008bf80]`.
+The report shows a one-byte write to the invalid address `0x61900008bf80`, 
which lies just past the end of the 1024-byte region 
`[0x61900008bb80,0x61900008bf80)`, followed by the stack that allocated that 
region.
 
-**NOTE: turning on this option will affect the execution performance of the 
program. Please be careful to turn on the online instance.**
+**Performance tip:** Enabling this option degrades runtime performance. Use it 
with caution on production instances.
 
-**NOTE: if the ASAN switch is turned on, the TCMalloc will be automatically 
turned off**
+**Note:** Enabling ASAN automatically disables TCMalloc.
 
-In addition, if stack information is output in be.out, but there is no 
function symbol, then we need to handle it manually to get readable stack 
information. The specific processing method needs a script to parse the output 
of ASAN. At this time, we need to use 
[asan_symbolize](https://llvm.org/svn/llvm-project/compiler-rt/trunk/lib/asan/scripts/asan_symbolize.py)
 to help with parsing. The specific usage is as follows:
+If the stack trace in `be.out` has no function symbols, symbolize it manually 
with the 
[asan_symbolize](https://llvm.org/svn/llvm-project/compiler-rt/trunk/lib/asan/scripts/asan_symbolize.py)
 script:
 
-```
+```bash
 cat be.out | python asan_symbolize.py | c++filt
 ```
 
-With the above command, we can get readable stack information.
+This command produces readable stack information.
 
-### CPU
+### CPU Debugging
 
-When the CPU idle of the system is very low, it means that the CPU of the 
system has become the main bottleneck. At this time, it is necessary to analyze 
the current CPU usage. For the be of Doris, there are two ways to analyze the 
CPU bottleneck of Doris.
+When system CPU idle is very low, the CPU has become the main bottleneck and 
current CPU usage needs to be analyzed. Doris BE supports two ways to analyze 
CPU bottlenecks.
 
 #### pprof
 
-[pprof](https://github.com/google/pprof): from gperftools, it is used to 
transform the content generated by gperftools into a format that is easy for 
people to read, such as PDF, SVG, text, etc.
+[pprof](https://github.com/google/pprof), from gperftools, converts gperftools 
output into human-readable formats such as PDF, SVG, and text.
 
-Because Doris has integrated and compatible with GPERF rest interface, users 
can analyze remote Doris be through the 'pprof' tool. The specific usage is as 
follows:
+Because Doris integrates a REST interface compatible with gperftools, the 
`pprof` tool can profile a remote Doris BE:
 
-```
-pprof --svg --seconds=60 http://be_host:be_webport/pprof/profile > be.svg 
+```bash
+pprof --svg --seconds=60 http://be_host:be_webport/pprof/profile > be.svg
 ```
 
-In this way, a CPU consumption graph of be execution can be generated.
+This command generates a BE CPU consumption graph.
 
 ![CPU Pprof](/images/cpu-pprof-demo.png)
 
-#### perf + flamegragh
+#### perf + FlameGraph
 
-This is a quite common CPU analysis method. Compared with `pprof`, this method 
must be able to log in to the physical machine of the analysis object. However, 
compared with pprof, which can only collect points on time, perf can collect 
stack information through different events. The specific usage is as follows:
+This is a widely used CPU analysis method. Unlike `pprof`, it requires logging 
in to the physical machine that runs the process being analyzed. In exchange, 
whereas pprof only samples at fixed time intervals, perf can collect stack 
information on a variety of events.
 
-[perf](https://perf.wiki.kernel.org/index.php/main_page): Linux kernel comes 
with performance analysis tool. [here](http://www.brendangregg.com/perf.html) 
there are some examples of perf usage.
+**Tool introduction:**
 
-[flamegraph](https://github.com/brendangregg/flamegraph): a visualization tool 
used to show the output of perf in the form of flame graph.
+- [perf](https://perf.wiki.kernel.org/index.php/Main_Page): the performance 
analysis tool that ships with the Linux kernel. 
[Here](http://www.brendangregg.com/perf.html) are some perf usage examples.
+- [FlameGraph](https://github.com/brendangregg/FlameGraph): a visualization 
tool that renders perf output as flame graphs.
 
-```
+**Usage:**
+
+```bash
 perf record -g -p be_pid -- sleep 60
 ```
 
-This command counts the CPU operation of be for 60 seconds and generates 
perf.data. For the analysis of perf.data, the command of perf can be used for 
analysis.
+This command samples BE CPU activity for 60 seconds and writes the result to 
`perf.data`. Analyze `perf.data` with the `perf` command:
 
-```
+```bash
 perf report
 ```
 
-The analysis results in the following pictures
+Analysis example:
 
 ![Perf Report](/images/perf-report-demo.png)
 
-To analyze the generated content. Of course, you can also use flash graph to 
complete the visual display.
+Or visualize with FlameGraph:
 
-```
+```bash
 perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl 
> be.svg
 ```
 
-This will also generate a graph of CPU consumption at that time.
+This also generates a CPU consumption graph.
 
-![CPU Flame](/images/cpu-flame-demo.svg)
+![CPU Flame](/images/cpu-flame-demo.svg)
\ No newline at end of file
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/developer-guide/debug-tool.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/developer-guide/debug-tool.md
index 4060bc44ed1..0082e9583ed 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/developer-guide/debug-tool.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/developer-guide/debug-tool.md
@@ -1,7 +1,8 @@
 ---
 {
     "title": "调试工具",
-    "language": "zh-CN"
+    "language": "zh-CN",
+    "description": "介绍 Apache Doris 的常用调试工具和方法,包括 FE 和 BE 
的调试技巧,如内存分析、线程分析、性能监控等实用调试手段。"
 }
 ---
 
@@ -26,182 +27,192 @@ under the License.
 
 # 调试工具
 
-在Doris的使用、开发过程中,经常会遇到需要对Doris进行调试的场景,这里介绍一些常用的调试工具。
+在 Doris 的使用和开发过程中,经常需要对 Doris 进行调试。本文档介绍了一些常用的调试工具和方法。
 
-**文中的出现的BE二进制文件名称 `doris_be`,在之前的版本中为 `palo_be`。**
+**注意:文中出现的 BE 二进制文件名称 `doris_be` 在早期版本中为 `palo_be`。**
 
 ## FE 调试
 
-FE 是 Java 进程。这里只列举一下简单常用的 java 调试命令。
+FE 是 Java 进程,以下列举一些常用的 Java 调试命令。
 
-1. 统计当前内存使用明细
+### 1. 统计当前内存使用明细
 
-    ```
-    jmap -histo:live pid > 1.jmp
-    ```
+```bash
+jmap -histo:live pid > 1.jmp
+```
 
-    该命令可以列举存活的对象的内存占用并排序。(pid 换成 FE 进程 id)
+该命令可以列举存活对象的内存占用情况并排序(将 pid 替换为 FE 进程 ID)。
 
-    ```
-     num     #instances         #bytes  class name
-    ----------------------------------------------
-       1:         33528       10822024  [B
-       2:         80106        8662200  [C
-       3:           143        4688112  [Ljava.util.concurrent.ForkJoinTask;
-       4:         80563        1933512  java.lang.String
-       5:         15295        1714968  java.lang.Class
-       6:         45546        1457472  
java.util.concurrent.ConcurrentHashMap$Node
-       7:         15483        1057416  [Ljava.lang.Object;
-    ```
+```text
+ num     #instances         #bytes  class name
+----------------------------------------------
+   1:         33528       10822024  [B
+   2:         80106        8662200  [C
+   3:           143        4688112  [Ljava.util.concurrent.ForkJoinTask;
+   4:         80563        1933512  java.lang.String
+   5:         15295        1714968  java.lang.Class
+   6:         45546        1457472  java.util.concurrent.ConcurrentHashMap$Node
+   7:         15483        1057416  [Ljava.lang.Object;
+```
 
-    可以通过这个方法查看目前存活对象占用的总内存(在文件最后),以及分析哪些对象占用了更多的内存。
+通过该方法可以查看当前存活对象占用的总内存(在文件末尾),以及分析哪些对象占用了更多的内存。
 
-    注意,这个方法因指定了 `:live`,因此会触发 FullGC。
+**注意:** 该方法因指定了 `:live` 参数,会触发 FullGC。
 
-2. 查看 JVM 内存使用
+### 2. 查看 JVM 内存使用
 
-    ```
-    jstat -gcutil pid 1000 1000
-    ```
+```bash
+jstat -gcutil pid 1000 1000
+```
 
-    该命令可以滚动查看当前 JVM 各区域的内存使用情况。(pid 换成 FE 进程 id)
+该命令可以每隔 1 秒查看一次当前 JVM 各区域的内存使用情况(将 pid 替换为 FE 进程 ID)。
 
-    ```
-      S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     
GCT
-      0.00   0.00  22.61   3.03  95.74  92.77     68    1.249     5    0.794   
 2.043
-      0.00   0.00  22.61   3.03  95.74  92.77     68    1.249     5    0.794   
 2.043
-      0.00   0.00  22.61   3.03  95.74  92.77     68    1.249     5    0.794   
 2.043
-      0.00   0.00  22.92   3.03  95.74  92.77     68    1.249     5    0.794   
 2.043
-      0.00   0.00  22.92   3.03  95.74  92.77     68    1.249     5    0.794   
 2.043
-    ```
+```text
+  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
+  0.00   0.00  22.61   3.03  95.74  92.77     68    1.249     5    0.794    
2.043
+  0.00   0.00  22.61   3.03  95.74  92.77     68    1.249     5    0.794    
2.043
+  0.00   0.00  22.61   3.03  95.74  92.77     68    1.249     5    0.794    
2.043
+  0.00   0.00  22.92   3.03  95.74  92.77     68    1.249     5    0.794    
2.043
+  0.00   0.00  22.92   3.03  95.74  92.77     68    1.249     5    0.794    
2.043
+```
 
-    其中主要关注 Old区(O)的占用百分比(如示例中为 3%)。如果占用过高,则可能出现 OOM 或 FullGC。
+重点关注 Old 区(O)的占用百分比(如示例中为 3.03%)。如果占用过高,则可能出现 OOM 或 FullGC。
 
-3. 打印 FE 线程堆栈
+### 3. 打印 FE 线程堆栈
 
-    ```
-    jstack -l pid > 1.js
-    ```
+```bash
+jstack -l pid > 1.js
+```
 
-    该命令可以打印当前 FE 的线程堆栈。(pid 换成 FE 进程 id)。
+该命令可以打印当前 FE 的线程堆栈(将 pid 替换为 FE 进程 ID)。
 
-    `-l` 参数会同时检测是否有死锁。该方法可以查看 FE 线程运行情况,是否有死锁,哪里卡住了等问题。
+`-l` 参数会同时检测是否存在死锁。该方法可用于查看 FE 线程运行情况、是否存在死锁、定位阻塞位置等问题。
 
 ## BE 调试
 
-### 内存
+### 内存调试
 
-对于内存的调试一般分为两个方面。一个是内存使用的总量是否合理,内存使用量过大一方面可能是由于系统存在内存泄露,另一方面可能是因为程序内存使用不当。其次就是是否存在内存越界、非法访问的问题,比如程序访问一个非法地址的内存,使用了未初始化内存等。对于内存方面的调试我们一般使用如下几种方式来进行问题追踪。
+内存调试主要关注两个方面:
 
-#### Jemalloc HEAP PROFILE
+1. **内存使用量是否合理**:内存使用量过大可能是系统存在内存泄漏,或程序内存使用不当。
+2. **内存访问是否合法**:是否存在内存越界、非法访问等问题,例如访问非法地址或使用未初始化的内存。
 
-> Doris 1.2.2 版本开始默认使用 Jemalloc 作为内存分配器.
+针对这些问题,可以使用以下工具进行追踪和分析。
 
-有关 Heap Profile 的原理解析参考 [Heap Profiling 
原理解析](https://cn.pingcap.com/blog/an-explanation-of-the-heap-profiling-principle/),需要注意的是
 Heap Profile 记录的是虚拟内存
+#### Jemalloc Heap Profile
 
-支持实时和定期两种方式 Heap Dump,然后使用 `jeprof` 解析生成的 Heap Profile。
+> **说明:** Doris 1.2.2 版本开始默认使用 Jemalloc 作为内存分配器。
 
-##### 1. 实时 Heap Dump,用于分析实时内存
+Heap Profile 的原理解析可参考 [Heap Profiling 
原理解析](https://cn.pingcap.com/blog/an-explanation-of-the-heap-profiling-principle/)。需要注意的是,Heap
 Profile 记录的是虚拟内存。
 
-将 `be.conf` 中 `JEMALLOC_CONF` 的 `prof:false` 修改为 `prof:true`,将 
`prof_active:false` 修改为 `prof_active:true` 并重启 Doris BE,然后使用 Jemalloc Heap Dump 
HTTP 接口,在对应的BE机器上生成 Heap Profile 文件。
+Jemalloc 支持实时和定期两种 Heap Dump 方式,然后使用 `jeprof` 工具解析生成的 Heap Profile。
 
-> Doris 2.1.8 和 3.0.4 及之后的版本,`JEMALLOC_CONF` 中 `prof` 已经默认为 `true`,无需修改。
-> Doris 2.1.8 和 3.0.4 之前的版本, `JEMALLOC_CONF` 中没有 `prof_active`,只需将 
`prof:false` 修改为 `prof:true` 即可。
+##### 1. 实时 Heap Dump(用于分析实时内存)
 
-```shell
+将 `be.conf` 中 `JEMALLOC_CONF` 的 `prof:false` 修改为 `prof:true`,将 
`prof_active:false` 修改为 `prof_active:true`,然后重启 Doris BE。之后使用 Jemalloc Heap 
Dump HTTP 接口在 BE 机器上生成 Heap Profile 文件。
+
+> **版本说明:**
+> - Doris 2.1.8 和 3.0.4 及之后的版本:`JEMALLOC_CONF` 中 `prof` 已默认为 `true`,无需修改。
+> - Doris 2.1.8 和 3.0.4 之前的版本:`JEMALLOC_CONF` 中没有 `prof_active` 选项,只需将 
`prof:false` 修改为 `prof:true` 即可。
+
+```bash
 curl http://be_host:be_webport/jeheap/dump
 ```
 
-Heap Profile 文件所在目录可以在 `be.conf` 中通过 `jeprofile_dir` 变量进行配置,默认为 
`${DORIS_HOME}/log`
+**配置说明:**
+
+- **Heap Profile 文件目录**:可在 `be.conf` 中通过 `jeprofile_dir` 变量配置,默认为 
`${DORIS_HOME}/log`。
+- **采样间隔**:默认为 512KB,通常只记录约 10% 的内存,性能影响通常小于 10%。可以修改 `be.conf` 中 
`JEMALLOC_CONF` 的 `lg_prof_sample` 参数(默认为 `19`,即 2^19 B = 512KB)。减小 
`lg_prof_sample` 可以更频繁采样,使 Heap Profile 更接近真实内存,但会带来更大的性能损耗。
+
+**性能提示:** 如果在做性能测试,建议保持 `prof:false` 以避免 Heap Dump 的性能开销。
 
-默认采样间隔为 512K,这通常只会有 10% 的内存被记录,对性能的影响通常小于 10%,可以修改 `be.conf` 中 `JEMALLOC_CONF` 
的 `lg_prof_sample`,默认为 `19` (2^19 B = 512K),减小 `lg_prof_sample` 可以更频繁的采样使 Heap 
Profile 接近真实内存,但这会带来更大的性能损耗。
+##### 2. 定期 Heap Dump(用于长时间观测内存)
 
-如果你在做性能测试,保持 `prof:false` 来避免 Heap Dump 的性能损耗。
+将 `be.conf` 中 `JEMALLOC_CONF` 的 `prof:false` 修改为 `prof:true`。Heap Profile 
文件默认保存在 `${DORIS_HOME}/log` 目录,文件名前缀由 `be.conf` 中的 `JEMALLOC_PROF_PRFIX` 指定,默认为 
`jemalloc_heap_profile_`。
 
-##### 2. 定期 Heap Dump,用于长时间观测内存
+> **注意:** 在 Doris 2.1.6 之前,`JEMALLOC_PROF_PRFIX` 为空,需要修改为任意值作为 profile 文件名。
 
-将 `be.conf` 中 `JEMALLOC_CONF` 的 `prof:false` 修改为 `prof:true`,Heap Profile 
文件所在目录默认为 `${DORIS_HOME}/log`, 文件名前缀是 `be.conf` 中的 `JEMALLOC_PROF_PRFIX`,默认是 
`jemalloc_heap_profile_`。
+**Dump 触发方式:**
 
-> 在 Doris 2.1.6 之前,`JEMALLOC_PROF_PRFIX` 为空,需要修改为任意值作为 profile 文件名
+1. **内存累计申请一定值时 Dump**
 
-1. 内存累计申请一定值时dump:
+   将 `be.conf` 中 `JEMALLOC_CONF` 的 `lg_prof_interval` 修改为 `34`,此时内存累计申请 
16GB(2^34 B = 16GB)时会 Dump 一次 profile。可以修改为任意值来调整 Dump 间隔。
 
-   将 `be.conf` 中 `JEMALLOC_CONF` 的 `lg_prof_interval` 修改为 34,此时内存累计申请 16GB 
(2^35 B = 16GB) 时 dump 一次 profile,可以修改为任意值来调整dump间隔。
+   > **注意:** 在 Doris 2.1.6 之前,`lg_prof_interval` 默认就是 `32`。
 
-> 在 Doris 2.1.6 之前,`lg_prof_interval` 默认就是32。
+2. **内存每次达到新高时 Dump**
 
-2. 内存每次达到新高时dump:
+   将 `be.conf` 中 `JEMALLOC_CONF` 的 `prof_gdump` 修改为 `true` 并重启 BE。
 
-   将 `be.conf` 中 `JEMALLOC_CONF` 的 `prof_gdump` 修改为 `true` 并重启BE。
+3. **程序退出时 Dump 并检测内存泄漏**
 
-3. 程序退出时dump, 并检测内存泄漏:
+   将 `be.conf` 中 `JEMALLOC_CONF` 的 `prof_leak` 和 `prof_final` 修改为 `true` 并重启 
BE。
 
-   将 `be.conf` 中 `JEMALLOC_CONF` 的 `prof_leak` 和 `prof_final` 修改为 `true` 并重启BE。
+4. **Dump 内存累计值(growth)而非实时值**
 
-4. dump内存累计值(growth),而不是实时值:
+   将 `be.conf` 中 `JEMALLOC_CONF` 的 `prof_accum` 修改为 `true` 并重启 BE。使用 `jeprof 
--alloc_space` 展示 heap dump 累计值。
 
-   将 `be.conf` 中 `JEMALLOC_CONF` 的 `prof_accum` 修改为 `true` 并重启BE。
-   使用 `jeprof --alloc_space` 展示 heap dump 累计值。
+##### 3. 使用 `jeprof` 解析 Heap Profile
 
-##### 3. `jeprof` 解析 Heap Profile
+使用 `be/bin/jeprof` 解析上面 Dump 的 Heap Profile。如果进程内存较大,解析过程可能需要几分钟,请耐心等待。
 
-使用 `be/bin/jeprof` 解析上面 Dump 的 Heap Profile,如果进程内存太大,解析过程可能需要几分钟,请耐心等待。
+若 Doris BE 部署路径的 `be/bin` 目录下没有 `jeprof` 二进制文件,可以将 `doris/tools` 目录下的 `jeprof` 
打包后上传到服务器。
 
-若 Doris BE 部署路径的 `be/bin` 目录下没有 `jeprof` 这个二进制,可以将 `doris/tools` 目录下的 `jeprof` 
打包后上传到服务器。
+> **注意事项:**
+> - 需要 addr2line 版本为 2.35.2 及以上,详情见下面的 QA-1。
+> - 尽可能在运行 Doris BE 的机器上直接执行 Heap Dump 和 `jeprof` 解析,详情见下面的 QA-2。
 
-> 需要 addr2line 版本为 2.35.2 及以上, 详情见下面的 QA-1
-> 尽可能让执行 Heap Dump 和执行 `jeprof` 解析 Heap Profile 在同一台服务器上,即尽可能在运行 Doris BE 
的机器上直接解析 Heap Profile,详情见下面的 QA-2
+**1. 分析单个 Heap Profile 文件**
 
-1. 分析单个 Heap Profile 文件
+```bash
+jeprof --dot ${DORIS_HOME}/lib/doris_be ${DORIS_HOME}/log/profile_file
+```
 
-```shell
-   jeprof --dot ${DORIS_HOME}/lib/doris_be ${DORIS_HOME}/log/profile_file
-   ```
+执行完上述命令后,将终端输出的文本贴到[在线 dot 绘图网站](http://www.webgraphviz.com/),生成内存分配图进行分析。
 
-   执行完上述命令后将终端输出的文本贴到[在线dot绘图网站](http://www.webgraphviz.com/),生成内存分配图,然后进行分析。
+如果服务器方便传输文件,也可以直接生成调用关系图 PDF 文件。需要先安装绘图所需的依赖项:
 
-   如果服务器方便传输文件,也可以通过如下命令直接生成调用关系图 result.pdf 文件传输到本地后进行查看,需要安装绘图所需的依赖项。
+```bash
+yum install ghostscript graphviz
+jeprof --pdf ${DORIS_HOME}/lib/doris_be ${DORIS_HOME}/log/profile_file > 
result.pdf
+```
 
-```shell
-   yum install ghostscript graphviz
-   jeprof --pdf ${DORIS_HOME}/lib/doris_be ${DORIS_HOME}/log/profile_file > 
result.pdf
-   ```
+[graphviz](http://www.graphviz.org/):在没有这个库时 pprof 只能转换为 text 
格式,但这种方式不易查看。安装后,pprof 可以转换为 SVG、PDF 等格式,调用关系更加清晰。
 
-   [graphviz](http://www.graphviz.org/): 
在没有这个库的时候pprof只可以转化为text格式,但这种方式不易查看。那么安装这个库后,pprof可以转化为svg、pdf等格式,对于调用关系则更加清晰明了。
+**2. 分析两个 Heap Profile 文件的 diff**
 
-2.  分析两个 Heap Profile 文件的diff
+```bash
+jeprof --dot ${DORIS_HOME}/lib/doris_be --base=${DORIS_HOME}/log/profile_file 
${DORIS_HOME}/log/profile_file2
+```
 
-```shell
-   jeprof --dot ${DORIS_HOME}/lib/doris_be 
--base=${DORIS_HOME}/log/profile_file ${DORIS_HOME}/log/profile_file2
-   ```
+通过在一段时间内多次执行 Heap Dump,可以生成多个 heap 文件。选取较早时间的 heap 文件作为 baseline,与较晚时间的 heap 
文件进行对比分析 diff。生成调用关系图的方法同上。
 
-   通过在一段时间内多次运行上述命令可以生成多个 heap 文件,可以选取较早时间的 heap 文件作为 baseline,与较晚时间的 heap 
文件对比分析它们的diff,生成调用关系图的方法同上。
+##### 4. 常见问题(QA)
 
-##### 4. QA
+**QA-1:运行 jeprof 后出现大量错误:`addr2line: Dwarf Error: found dwarf version xxx, 
this reader only handles version xxx`**
 
-1. 运行 jeprof 后出现很多错误: `addr2line: Dwarf Error: found dwarf version xxx, this 
reader only handles version xxx`.
+GCC 11 之后默认使用 DWARF-v5,这要求 Binutils 2.35.2 及以上。Doris Ldb_toolchain 使用了 GCC 
11。参考:https://gcc.gnu.org/gcc-11/changes.html。
 
-GCC 11 之后默认使用 DWARF-v5 ,这要求Binutils 2.35.2 及以上,Doris Ldb_toolchain 用了 GCC 
11。see: https://gcc.gnu.org/gcc-11/changes.html。
+解决方法:升级 addr2line 到 2.35.2 版本。
 
-替换 addr2line 到 2.35.2,参考:
-```
-// 下载 addr2line 源码
+```bash
+# 下载 addr2line 源码
 wget https://ftp.gnu.org/gnu/binutils/binutils-2.35.tar.bz2
 
-// 安装依赖项,如果需要
+# 安装依赖项(如果需要)
 yum install make gcc gcc-c++ binutils
 
-// 编译&安装 addr2line
+# 编译 & 安装 addr2line
 tar -xvf binutils-2.35.tar.bz2
 cd binutils-2.35
 ./configure --prefix=/usr/local
 make
 make install
 
-// 验证
+# 验证
 addr2line -h
 
-// 替换 addr2line
+# 替换 addr2line
 chmod +x addr2line
 mv /usr/bin/addr2line /usr/bin/addr2line.bak
 mv /bin/addr2line /bin/addr2line.bak
@@ -209,23 +220,26 @@ cp addr2line /bin/addr2line
 cp addr2line /usr/bin/addr2line
 hash -r
 ```
-注意,不能使用 addr2line 2.3.9, 这可能不兼容,导致内存一直增长。
 
-2. 运行 `jeprof` 后出现很多错误: `addr2line: DWARF error: invalid or unhandled FORM 
value: 0x25`,解析后的 Heap 栈都是代码的内存地址,而不是函数名称
+**注意:** 不能使用 addr2line 2.3.9,该版本可能不兼容,导致内存一直增长。
 
-通常是因为执行 Heap Dump 和执行 `jeprof` 解析 Heap Profile 不在同一台服务器上,导致 `jeprof` 
使用符号表解析函数名称失败,尽可能在同一台机器上完成 Dump Heap 和 `jeprof` 解析的操作,,即尽可能在运行 Doris BE 
的机器上直接解析 Heap Profile。
+**QA-2:运行 `jeprof` 后出现大量错误:`addr2line: DWARF error: invalid or unhandled FORM 
value: 0x25`,解析后的 Heap 栈都是代码的内存地址而非函数名称**
 
-或者确认下运行 Doris BE 的机器 Linux 内核版本,将 `be/bin/doris_be` 二进制文件和 Heap Profile 
文件下载到相同内核版本的机器上执行 `jeprof`。
+通常是因为执行 Heap Dump 和执行 `jeprof` 解析 Heap Profile 不在同一台服务器上,导致 `jeprof` 
使用符号表解析函数名称失败。
 
-3. 如果在运行 Doris BE 的机器上直接解析 Heap Profile 后的 Heap 栈依然是代码的内存地址,而不是函数名称
+解决方法:
+- 尽可能在同一台机器上完成 Dump Heap 和 `jeprof` 解析的操作,即尽可能在运行 Doris BE 的机器上直接解析 Heap 
Profile。
+- 或者确认运行 Doris BE 的机器 Linux 内核版本,将 `be/bin/doris_be` 二进制文件和 Heap Profile 
文件下载到相同内核版本的机器上执行 `jeprof`。
 
-使用下面的脚本,手动解析 Heap Profile,修改这几个变量:
+**QA-3:如果在运行 Doris BE 的机器上直接解析 Heap Profile 后,Heap 栈依然是代码的内存地址而非函数名称**
 
-- heap: Heap Profile 的文件名。
-- bin: `be/bin/doris_be` 二进制文件名
-- llvm_symbolizer: llvm 符号表解析程序的路径,版本最好是编译 `be/bin/doris_be` 二进制使用的版本。
+使用下面的脚本手动解析 Heap Profile,修改这几个变量:
 
-```
+- `heap`:Heap Profile 的文件名。
+- `bin`:`be/bin/doris_be` 二进制文件名。
+- `llvm_symbolizer`:llvm 符号表解析程序的路径,版本最好是编译 `be/bin/doris_be` 二进制使用的版本。
+
+```bash
 #!/bin/bash
 ## @brief
 ## @author zhoufei
@@ -276,27 +290,30 @@ fi
 # vim: et tw=80 ts=2 sw=2 cc=80:
 ```
 
-4. 如果上面所有的方法都不行
-
-- 尝试在运行 Doris BE 的机器上重新编译 `be/bin/doris_be` 二进制,也就是让编译、运行、`jeprof` 解析在同一台机器上。
+**QA-4:如果上面所有的方法都不行**
 
-- 上面的操作后,如果 Heap 栈依然是代码的内存地址,尝试 `USE_JEMALLOC=OFF ./build.sh --be` 编译使用 
TCMalloc 的 Doris BE,然后参考上面的章节使用 TCMalloc Heap Profile 分析内存。
+- 尝试在运行 Doris BE 的机器上重新编译 `be/bin/doris_be` 二进制,让编译、运行、`jeprof` 解析在同一台机器上。
+- 如果上述操作后 Heap 栈依然是代码的内存地址,尝试使用 `USE_JEMALLOC=OFF ./build.sh --be` 编译使用 
TCMalloc 的 Doris BE,然后参考下面的章节使用 TCMalloc Heap Profile 分析内存。
 
-#### TCMalloc HEAP PROFILE
+#### TCMalloc Heap Profile
 
-> Doris 1.2.1 及之前版本使用 TCMalloc,Doris 1.2.2 版本开始默认使用 Jemalloc,如需切换 TCMalloc 
可以这样编译 `USE_JEMALLOC=OFF sh build.sh --be`。
+> **说明:** Doris 1.2.1 及之前版本使用 TCMalloc,Doris 1.2.2 版本开始默认使用 Jemalloc。如需切换回 
TCMalloc,可使用 `USE_JEMALLOC=OFF sh build.sh --be` 进行编译。
 
-当使用 TCMalloc 时,遇到大内存申请会将申请的堆栈打印到be.out文件中,一般的表现形式如下:
+当使用 TCMalloc 时,遇到大内存申请会将申请的堆栈打印到 `be.out` 文件中,一般的表现形式如下:
 
-```
+```text
 tcmalloc: large alloc 1396277248 bytes == 0x3f3488000 @  0x2af6f63 0x2c4095b 
0x134d278 0x134bdcb 0x133d105 0x133d1d0 0x19930ed
 ```
 
-这个表示在Doris BE在这个堆栈上尝试申请`1396277248 
bytes`的内存。我们可以通过`addr2line`命令去把堆栈还原成我们能够看懂的信,具体的例子如下所示。
+这表示 Doris BE 在该堆栈上尝试申请 `1396277248 bytes` 的内存。可以通过 `addr2line` 
命令将堆栈还原成可读的信息,具体示例如下:
 
+```bash
+addr2line -e lib/doris_be 0x2af6f63 0x2c4095b 0x134d278 0x134bdcb 0x133d105 
0x133d1d0 0x19930ed
 ```
-$ addr2line -e lib/doris_be  0x2af6f63 0x2c4095b 0x134d278 0x134bdcb 0x133d105 
0x133d1d0 0x19930ed
 
+输出示例:
+
+```text
 
/home/ssd0/zc/palo/doris/core/thirdparty/src/gperftools-gperftools-2.7/src/tcmalloc.cc:1335
 
/home/ssd0/zc/palo/doris/core/thirdparty/src/gperftools-gperftools-2.7/src/tcmalloc.cc:1357
 
/home/disk0/baidu-doris/baidu/bdg/doris-baidu/core/be/src/exec/hash_table.cpp:267
@@ -306,22 +323,26 @@ $ addr2line -e lib/doris_be  0x2af6f63 0x2c4095b 
0x134d278 0x134bdcb 0x133d105 0
 thread.cpp:?
 ```
 
-有时内存的申请并不是大内存的申请导致,而是通过小内存不断的堆积导致的。那么就没有办法通过查看日志定位到具体的申请信息,那么就需要通过其他方式来获得信息。
+有时内存申请并非由大内存申请导致,而是通过小内存不断堆积导致。这种情况下无法通过查看日志定位具体的申请信息,就需要通过其他方式来获取信息。
 
-这个时候我们可以利用TCMalloc的[HEAP 
PROFILE](https://gperftools.github.io/gperftools/heapprofile.html)的功能。如果设置了HEAPPROFILE功能,那么我们可以获得进程整体的内存申请使用情况。使用方式是在启动Doris
 BE前设置`HEAPPROFILE`环境变量。比如:
+这时可以利用 TCMalloc 的 [HEAP 
PROFILE](https://gperftools.github.io/gperftools/heapprofile.html) 功能。设置 
HEAPPROFILE 功能后,可以获得进程整体的内存申请使用情况。使用方式是在启动 Doris BE 前设置 `HEAPPROFILE` 环境变量。例如:
 
-```
-export TCMALLOC_SAMPLE_PARAMETER=64000 HEAP_PROFILE_ALLOCATION_INTERVAL=-1 
HEAP_PROFILE_INUSE_INTERVAL=-1  HEAP_PROFILE_TIME_INTERVAL=5 
HEAPPROFILE=/tmp/doris_be.hprof
+```bash
+export TCMALLOC_SAMPLE_PARAMETER=64000 HEAP_PROFILE_ALLOCATION_INTERVAL=-1 
HEAP_PROFILE_INUSE_INTERVAL=-1 HEAP_PROFILE_TIME_INTERVAL=5 
HEAPPROFILE=/tmp/doris_be.hprof
 ./bin/start_be.sh --daemon
 ```
 
-> 需要注意,HEAPPROFILE 需要是绝对路径,且已经存在。
+> **注意:** HEAPPROFILE 需要是绝对路径,且目录必须已经存在。
 
-这样,当满足HEAPPROFILE的dump条件时,就会将内存的整体使用情况写到指定路径的文件中。后续我们就可以通过使用`pprof`工具来对输出的内容进行分析。
+这样,当满足 HEAPPROFILE 的 Dump 条件时,就会将内存的整体使用情况写入到指定路径的文件中。后续可以使用 `pprof` 
工具对输出的内容进行分析。
 
+```bash
+pprof --text lib/doris_be /tmp/doris_be.hprof.0012.heap | head -30
 ```
-$ pprof --text lib/doris_be /tmp/doris_be.hprof.0012.heap | head -30
 
+输出示例:
+
+```text
 Using local file lib/doris_be.
 Using local file /tmp/doris_be.hprof.0012.heap.
 Total: 668.6 MB
@@ -338,30 +359,35 @@ Total: 668.6 MB
      1.7   0.3%  98.4%      1.7   0.3% doris::SegmentReader::_load_index
 ```
 
-上述文件各个列的内容:
+**各列的含义:**
 
-* 第一列:函数直接申请的内存大小,单位MB
-* 第四列:函数以及函数所有调用的函数总共内存大小。
-* 第二列、第五列分别是第一列与第四列的比例值。
-* 第三列是个第二列的累积值。
+- **第一列**:函数直接申请的内存大小,单位 MB。
+- **第二列**:第一列的百分比。
+- **第三列**:第二列的累积值。
+- **第四列**:函数及其所有调用的函数总共占用的内存大小,单位 MB。
+- **第五列**:第四列的百分比。
 
-当然也可以生成调用关系图片,更加方便分析。比如下面的命令就能够生成SVG格式的调用关系图。
+当然也可以生成调用关系图片,更加方便分析。例如下面的命令可以生成 SVG 格式的调用关系图:
 
-```
-pprof --svg lib/doris_be /tmp/doris_be.hprof.0012.heap > heap.svg 
+```bash
+pprof --svg lib/doris_be /tmp/doris_be.hprof.0012.heap > heap.svg
 ```
 
-**注意:开启这个选项是要影响程序的执行性能的,请慎重对线上的实例开启**
+**性能提示:** 开启该选项会影响程序的执行性能,请慎重对线上实例开启。
 
-##### pprof remote server
+##### pprof Remote Server
 
-HEAP PROFILE虽然能够获得全部的内存使用信息,但是也有比较受限的地方。1. 需要重启BE进行。2. 
需要一直开启这个命令,导致对整个进程的性能造成影响。
+HEAP PROFILE 虽然能够获得全部的内存使用信息,但也有一些限制:1. 需要重启 BE;2. 需要一直开启该功能,导致对进程性能造成持续影响。
 
-对Doris BE也可以使用动态开启、关闭heap 
profile的方式来对进程进行内存申请分析。Doris内部支持了GPerftools的[远程server调试](https://gperftools.github.io/gperftools/pprof_remote_servers.html)。那么可以通过`pprof`直接对远程运行的Doris
 BE进行动态的HEAP PROFILE。比如我们可以通过以下命令来查看Doris的内存的使用增量
+对 Doris BE 可以使用动态开启、关闭 heap profile 的方式来分析进程的内存申请情况。Doris 内部支持了 GPerftools 
的[远程 server 
调试](https://gperftools.github.io/gperftools/pprof_remote_servers.html)。可以通过 
`pprof` 工具直接对远程运行的 Doris BE 进行动态的 HEAP PROFILE。例如,通过以下命令查看 Doris 的内存使用增量:
 
+```bash
+pprof --text --seconds=60 http://be_host:be_webport/pprof/heap
 ```
-$ pprof --text --seconds=60 http://be_host:be_webport/pprof/heap 
 
+输出示例:
+
+```text
 Total: 1296.4 MB
    484.9  37.4%  37.4%    484.9  37.4% doris::StorageByteBuffer::create
    272.2  21.0%  58.4%    273.3  21.1% doris::RowBlock::init
@@ -378,27 +404,27 @@ Total: 1296.4 MB
     10.0   0.8%  93.4%     10.0   0.8% 
doris::PlainTextLineReader::PlainTextLineReader
 ```
 
-这个命令的输出与HEAP PROFILE的输出及查看方式一样,这里就不再详细说明。这个命令只有在执行的过程中才会开启统计,相比HEAP 
PROFILE对于进程性能的影响有限。
+这个命令的输出和查看方式与 HEAP PROFILE 的输出一致。该命令只在执行过程中开启统计,相比 HEAP PROFILE 对进程性能的影响更小。
 
-#### LSAN
+#### LSAN(内存泄漏检测工具)
 
-[LSAN](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer)是一个地址检查工具,GCC已经集成。在我们编译代码的时候开启相应的编译选项,就能够开启这个功能。当程序发生可以确定的内存泄露时,会将泄露堆栈打印。Doris
 BE已经集成了这个工具,只需要在编译的时候使用如下的命令进行编译就能够生成带有内存泄露检测版本的BE二进制
+[LSAN](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer)
 是一个地址检查工具,GCC 已经集成。在编译代码时开启相应的编译选项,就能够开启该功能。当程序发生可以确定的内存泄漏时,会将泄漏堆栈打印出来。Doris 
BE 已经集成了该工具,只需在编译时使用如下命令即可生成带有内存泄漏检测版本的 BE 二进制:
 
-```
+```bash
 BUILD_TYPE=LSAN ./build.sh
 ```
 
-当系统检测到内存泄露的时候,就会在be.out里面输出对应的信息。为了下面的演示,我们故意在代码中插入一段内存泄露代码。我们在`StorageEngine`的`open`函数中插入如下代码
+当系统检测到内存泄漏时,就会在 `be.out` 中输出对应的信息。为了演示,我们故意在代码中插入一段内存泄漏代码。在 `StorageEngine` 的 
`open` 函数中插入如下代码:
 
-```
-    char* leak_buf = new char[1024];
-    strcpy(leak_buf, "hello world");
-    LOG(INFO) << leak_buf;
+```cpp
+char* leak_buf = new char[1024];
+strcpy(leak_buf, "hello world");
+LOG(INFO) << leak_buf;
 ```
 
-我们就在be.out中获得了如下的输出
+然后在 `be.out` 中就能获得如下输出:
 
-```
+```text
 =================================================================
 ==24732==ERROR: LeakSanitizer: detected memory leaks
 
@@ -411,33 +437,33 @@ Direct leak of 1024 byte(s) in 1 object(s) allocated from:
 SUMMARY: LeakSanitizer: 1024 byte(s) leaked in 1 allocation(s).
 ```
 
-从上述的输出中,我们能看到有1024个字节被泄露了,并且打印出来了内存申请时的堆栈信息。
+从上述输出中可以看到有 1024 个字节被泄漏,并且打印出了内存申请时的堆栈信息。
 
-**注意:开启这个选项是要影响程序的执行性能的,请慎重对线上的实例开启**
+**性能提示:** 开启该选项会影响程序的执行性能,请慎重对线上实例开启。
 
-**注意:如果开启了LSAN开关的话,tcmalloc就会被自动关闭**
+**注意:** 开启 LSAN 后,TCMalloc 会被自动关闭。
 
-#### ASAN
+#### ASAN(地址合法性检测工具)
 
-除了内存使用不合理、泄露以外。有的时候也会发生内存访问非法地址等错误。这个时候我们可以借助[ASAN](https://github.com/google/sanitizers/wiki/AddressSanitizer)来辅助我们找到问题的原因。与LSAN一样,ASAN也集成在了GCC中。Doris通过如下的方式进行编译就能够开启这个功能
+除了内存使用不合理、泄漏以外,有时也会发生内存访问非法地址等错误。这时可以借助 
[ASAN](https://github.com/google/sanitizers/wiki/AddressSanitizer) 来帮助找到问题的原因。与 
LSAN 一样,ASAN 也集成在了 GCC 中。Doris 通过如下方式进行编译就能开启该功能:
 
-```
+```bash
 BUILD_TYPE=ASAN ./build.sh
 ```
 
-执行编译生成的二进制文件,当检测工具发现有异常访问时,就会立即退出,并将非法访问的堆栈输出在be.out中。对于ASAN的输出与LSAN是一样的分析方法。这里我们也主动注入一个地址访问错误,来展示下具体的内容输出。我们仍然在`StorageEngine`的`open`函数中注入一段非法内存访问,具体的错误代码如下
+执行编译生成的二进制文件后,当检测工具发现异常访问时,就会立即退出,并将非法访问的堆栈输出在 `be.out` 中。对于 ASAN 的输出与 LSAN 
使用相同的分析方法。为了演示,我们主动注入一个地址访问错误。仍然在 `StorageEngine` 的 `open` 函数中注入一段非法内存访问代码:
 
-```
-    char* invalid_buf = new char[1024];
-    for (int i = 0; i < 1025; ++i) {
-        invalid_buf[i] = i;
-    }
-    LOG(INFO) << invalid_buf;
+```cpp
+char* invalid_buf = new char[1024];
+for (int i = 0; i < 1025; ++i) {
+    invalid_buf[i] = i;
+}
+LOG(INFO) << invalid_buf;
 ```
 
-然后我们就会在be.out中获得如下的输出
+然后在 `be.out` 中就会获得如下输出:
 
-```
+```text
 =================================================================
 ==23284==ERROR: AddressSanitizer: heap-buffer-overflow on address 
0x61900008bf80 at pc 0x00000129f56a bp 0x7fff546eed90 sp 0x7fff546eed88
 WRITE of size 1 at 0x61900008bf80 thread T0
@@ -456,66 +482,69 @@ allocated by thread T0 here:
 SUMMARY: AddressSanitizer: heap-buffer-overflow 
/home/ssd0/zc/palo/doris/core/be/src/olap/storage_engine.cpp:106 in 
doris::StorageEngine::open(doris::EngineOptions const&, doris::StorageEngine**)
 ```
 
-从这段信息中该可以看到在`0x61900008bf80`这个地址我们尝试去写一个字节,但是这个地址是非法的。我们也可以看到 
`[0x61900008bb80,0x61900008bf80)`这个地址的申请堆栈。
+从这段信息中可以看到在 `0x61900008bf80` 这个地址尝试写入一个字节,但该地址是非法的。同时也可以看到 
`[0x61900008bb80,0x61900008bf80)` 这个地址区域的申请堆栈。
 
-**注意:开启这个选项是要影响程序的执行性能的,请慎重对线上的实例开启**
+**性能提示:** 开启该选项会影响程序的执行性能,请慎重对线上实例开启。
 
-**注意:如果开启了ASAN开关的话,tcmalloc就会被自动关闭**
+**注意:** 开启 ASAN 后,TCMalloc 会被自动关闭。
 
-另外,如果be.out中输出了堆栈信息,但是并没有函数符号,那么这个时候需要我们手动的处理下才能获得可读的堆栈信息。具体的处理方法需要借助一个脚本来解析ASAN的输出。这个时候我们需要使用[asan_symbolize](https://llvm.org/svn/llvm-project/compiler-rt/trunk/lib/asan/scripts/asan_symbolize.py)来帮忙解析下。具体的使用方式如下:
+另外,如果 `be.out` 中输出的堆栈信息没有函数符号,需要手动处理才能获得可读的堆栈信息。可以使用 
[asan_symbolize](https://llvm.org/svn/llvm-project/compiler-rt/trunk/lib/asan/scripts/asan_symbolize.py)
 脚本来解析 ASAN 的输出,具体使用方式如下:
 
-```
+```bash
 cat be.out | python asan_symbolize.py | c++filt
 ```
 
-通过上述的命令,我们就能够获得可读的堆栈信息了。
+通过上述命令就能获得可读的堆栈信息。
 
-### CPU
+### CPU 调试
 
-当系统的CPU 
Idle很低的时候,说明系统的CPU已经成为了主要瓶颈,这个时候就需要分析一下当前的CPU使用情况。对于Doris的BE可以有如下两种方式来分析Doris的CPU瓶颈。
+当系统的 CPU Idle 很低时,说明 CPU 已经成为主要瓶颈,这时需要分析当前的 CPU 使用情况。对于 Doris BE,有以下两种方式来分析 
CPU 瓶颈。
 
 #### pprof
 
-[pprof](https://github.com/google/pprof): 
来自gperftools,用于将gperftools所产生的内容转化成便于人可以阅读的格式,比如pdf, svg, text等.
+[pprof](https://github.com/google/pprof) 来自 gperftools,用于将 gperftools 
产生的内容转换成便于阅读的格式,如 PDF、SVG、Text 等。
 
-由于Doris内部已经集成了并兼容了GPerf的REST接口,那么用户可以通过`pprof`工具来分析远程的Doris BE。具体的使用方式如下:
+由于 Doris 内部已集成并兼容了 GPerf 的 REST 接口,可以通过 `pprof` 工具分析远程的 Doris BE。具体使用方式如下:
 
-```
-pprof --svg --seconds=60 http://be_host:be_webport/pprof/profile > be.svg 
+```bash
+pprof --svg --seconds=60 http://be_host:be_webport/pprof/profile > be.svg
 ```
 
-这样就能够生成一张BE执行的CPU消耗图。
+该命令会生成一张 BE 执行的 CPU 消耗图。
 
 ![CPU Pprof](/images/cpu-pprof-demo.png)
 
-#### perf + flamegragh
+#### perf + FlameGraph
 
-这个是相当通用的一种CPU分析方式,相比于`pprof`,这种方式必须要求能够登陆到分析对象的物理机上。但是相比于pprof只能定时采点,perf是能够通过不同的事件来完成堆栈信息采集的。具体的使用方式如下:
+这是一种非常通用的 CPU 分析方式。相比 `pprof`,这种方式必须要求能够登录到分析对象的物理机上。但相比 pprof 只能定时采样,perf 
能够通过不同的事件来完成堆栈信息采集。
 
-[perf](https://perf.wiki.kernel.org/index.php/Main_Page): 
linux内核自带性能分析工具。[这里](http://www.brendangregg.com/perf.html)有一些perf的使用例子。
+**工具介绍:**
 
-[FlameGraph](https://github.com/brendangregg/FlameGraph): 
可视化工具,用于将perf的输出以火焰图的形式展示出来。
+- [perf](https://perf.wiki.kernel.org/index.php/Main_Page):Linux 
内核自带的性能分析工具。[这里](http://www.brendangregg.com/perf.html)有一些 perf 的使用示例。
+- [FlameGraph](https://github.com/brendangregg/FlameGraph):可视化工具,用于将 perf 
的输出以火焰图的形式展示。
 
-```
+**使用方法:**
+
+```bash
 perf record -g -p be_pid -- sleep 60
 ```
 
-这条命令会统计60秒钟BE的CPU运行情况,并且生成perf.data。对于perf.data的分析,可以通过perf的命令来进行分析
+该命令会统计 60 秒钟 BE 的 CPU 运行情况,并生成 `perf.data` 文件。对于 `perf.data` 的分析,可以通过 perf 
命令进行:
 
-```
+```bash
 perf report
 ```
 
-分析得到如下的图片
+分析得到的示例:
 
 ![Perf Report](/images/perf-report-demo.png)
 
-来对生成的内容进行分析。当然也可以使用flamegragh完成可视化展示。
+当然也可以使用 FlameGraph 进行可视化展示:
 
-```
+```bash
 perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl 
> be.svg
 ```
 
-这样也会生成一张当时运行的CPU消耗图。
+这样也会生成一张当时运行的 CPU 消耗图。
 
 ![CPU Flame](/images/cpu-flame-demo.svg)
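
The jemalloc options that appear in the zh-CN hunks above, `lg_prof_sample` and `lg_prof_interval`, are base-2 exponents of a byte count, which is how `19` maps to 512KB and `34` to 16GB (and the pre-2.1.6 default of `32` to 4GB). As a quick sanity check of that arithmetic, here is a minimal shell sketch. The helper names `lg_to_bytes` and `human_size` are hypothetical and are not part of Doris or jemalloc:

```shell
#!/usr/bin/env bash
# Convert a jemalloc lg_* exponent (e.g. lg_prof_sample, lg_prof_interval)
# into a byte count: the setting is log2 of the size in bytes.
lg_to_bytes() {
    local lg="$1"
    echo $(( 1 << lg ))
}

# Render a byte count with binary units (1KB = 1024 B, as the doc uses them).
# Only divides while the value is an exact multiple of 1024.
human_size() {
    local bytes="$1"
    local units=(B KB MB GB TB)
    local i=0
    while (( bytes >= 1024 && bytes % 1024 == 0 && i < 4 )); do
        bytes=$(( bytes / 1024 ))
        i=$(( i + 1 ))
    done
    echo "${bytes}${units[$i]}"
}

# Values mentioned in the patch: sample interval, old and new dump interval.
for lg in 19 32 34; do
    echo "lg=${lg} -> $(human_size "$(lg_to_bytes "$lg")")"
done
# Prints:
#   lg=19 -> 512KB
#   lg=32 -> 4GB
#   lg=34 -> 16GB
```

The same helpers can be used to pick an exponent for a desired sampling or dump interval before editing `JEMALLOC_CONF` in `be.conf`.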

