FelixYBW commented on code in PR #8065:
URL: https://github.com/apache/incubator-gluten/pull/8065#discussion_r1861341945


##########
tools/workload/benchmark_velox/README.md:
##########
@@ -36,3 +36,23 @@ We also provide a script 
[run_tpc_workload.sh](./run_tpc_workload.sh). This scri
 ## Analyzing Performance Results
 
 You can check the **Show Performance** section in the output notebook after 
execution. It shows the cpu% per query, and draws charts for the cpu%, memory 
throughput, disk throughput/util%, network throughput and pagefaults.
+
+## Set up Performance Analysis Tools
+
+Please check the **Set up perf analysis tools (optional)** section in 
[initialize.ipynb](./initialize.ipynb) to set up the environment required for 
running performance analysis scripts. Once the setup is complete, update the 
following variables in your YAML file (as documented in 
[params.yaml.template](./params.yaml.template)) before running TPC-H/TPC-DS 
Benchmarks:
+
+- server: Hostname or IP to server for perf analysis. Able to connect via ssh. 
Can be localhost if you deploy the perf analysis scripts on the local cluster.
+- base_dir: Specify the directory on perf analysis server. Usually a codename 
for this run.
+- analyze_perf: Whether to upload profile to perf analysis server and run perf 
analysis scripts. Only takes effect if server is set. In this case set to `True`
+- proxy: Proxy used to connect to server for perf analysis. Only needed if the 
perf analysis server is accessed via proxy.
+
+After the workload completes, the tool generates a notebook, executes it 
automatically, and saves the output notebook in the `$HOME/PAUS/base_dir` 
directory with a suffix of `[APPLICATION ID].nbconvert.ipynb`. Additionally, 
the output notebook is converted into an HTML format for improved readability, 
with the same filename, and stored in the `html` sub-folder.
+
+The notebook also produces a trace-viewer JSON file to analyze workload 
statistics. This includes SAR metrics and stage/task-level breakdowns. Using 
this tool, users can compare statistics across stages and queries, identify 
performance bottlenecks, and target specific stages for optimization.
+
+If you have set up and launched Catapult trace-viewer server (refer to the 
**Set up perf analysis tools (optional)** section in 
[initialize.ipynb](./initialize.ipynb)), you can explore a sample trace-viewer 
JSON file. To do so: 
+
+1. Copy the sample file 
[trace_result_tpch_q1.json](./sample/trace_result_tpch_q1.json) to the 
`$HOME/trace_result` directory
+2. Open the following link in your browser to view the results: 
http://[your-host-ip]:1088/tracing_examples/trace_viewer.html#/tracing/test_data/trace_result_tpch_q1.json
+
+This visualization helps to better understand performance metrics and optimize 
accordingly.

Review Comment:
   Can you add one more note saying that the JSON file can also be opened in 
Chrome via about://tracing?
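
For context on why the file can be opened directly in Chrome's tracing page: trace-viewer JSON generally follows the Trace Event Format, where each record carries a name, a phase (`ph`), a timestamp and duration in microseconds, and process/thread ids. The sketch below builds a minimal file of that shape. The helper `make_trace` and the stage names and timings are hypothetical illustrations; they are not taken from `trace_result_tpch_q1.json` or from the notebook's actual output.

```python
import json


def make_trace(events):
    """Wrap (name, start_us, dur_us, pid, tid) tuples as Trace Event Format
    "complete" events (ph == "X"), which tracing UIs draw as bars."""
    return {
        "traceEvents": [
            {"name": name, "ph": "X", "ts": ts, "dur": dur, "pid": pid, "tid": tid}
            for name, ts, dur, pid, tid in events
        ]
    }


# Hypothetical stage timings in microseconds (not real benchmark data).
trace = make_trace([
    ("stage 0", 0, 1500, 1, 1),
    ("stage 1", 1500, 800, 1, 2),
])
trace_json = json.dumps(trace, indent=2)
```

Writing `trace_json` to a `.json` file and loading it through the tracing page's Load button should show the two bars on separate thread rows, assuming the browser build still ships the legacy tracing UI.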



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

