[ 
https://issues.apache.org/jira/browse/HDDS-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18052465#comment-18052465
 ] 

Wei-Chiu Chuang commented on HDDS-13788:
----------------------------------------

[~cgoyal] feel free to pick this up. This is more of a brain dump on my part, 
so the text still needs to be polished.

> [Docs] Add performance troubleshooting doc
> ------------------------------------------
>
>                 Key: HDDS-13788
>                 URL: https://issues.apache.org/jira/browse/HDDS-13788
>             Project: Apache Ozone
>          Issue Type: Task
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>
> I'd like to add a new page, "Troubleshoot performance issues", to the user 
> documentation under the Troubleshooting section 
> [https://ozone.apache.org/docs/edge/troubleshooting.html].
> Alternatively, it could go on the Observability page 
> [https://ozone.apache.org/docs/edge/feature/observability.html].
> It will include:
> 1. Flame graph
> If a particular operation runs slowly and CPU utilization is high, use a flame 
> graph to inspect hotspots.
> Enable the flame graph endpoint (hdds.profiler.endpoint.enabled = true), download 
> async-profiler, and start async-profiler 2.x from the endpoint or from the 
> command line. The output is exported as an SVG or HTML file.
> The flame graph is collected on a per-process basis. To generate flame graphs 
> across a cluster:
>  # Download this repo [https://github.com/jojochuang/ozone_perf.git]
>  # Download async-profiler to /opt/async-profiler-2.8.1-linux-x64/
>  # Add cluster hostnames to the file cluster_hosts.txt, one hostname per line.
>  # Update PASSWORDLESS_USER in conf.sh to a user that has passwordless SSH 
> access to the cluster hosts. This user must also have sudo privileges.
>  # Run ‘./start_profiles.sh’ to start profiling.
>  # Wait for the period you want to profile.
>  # Run ‘./collect_profiles.sh’ to stop profiling and collect the 
> flame graphs. They are downloaded and compressed into a tarball. This 
> script collects flame graphs for Ozone OM, SCM, DN and Recon, HDFS NN and DN, 
> Impala daemon and HBase RegionServer.
>  
>  
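For the doc page, a small illustration of the per-host step might help. Below is a minimal Python sketch, not taken from the ozone_perf scripts, that reads a hosts list in the cluster_hosts.txt format (one hostname per line) and builds a profiler URL for each host. The /prof path, the duration and output query parameters, and port 9874 (the OM web UI default) are assumptions on my part and should be checked against the Ozone version in use. The function names and sample hostnames are hypothetical.

```python
from urllib.parse import urlencode


def read_hosts(text):
    """Parse cluster_hosts.txt content: one hostname per line, blank lines skipped."""
    hosts = []
    for line in text.splitlines():
        name = line.strip()
        if name:
            hosts.append(name)
    return hosts


def profile_url(host, port=9874, duration=30, output="svg"):
    """Build a profiler endpoint URL for one host.

    Assumptions (not confirmed by the issue text): the endpoint lives at
    /prof and accepts 'duration' (seconds) and 'output' query parameters.
    """
    query = urlencode({"duration": duration, "output": output})
    return f"http://{host}:{port}/prof?{query}"


if __name__ == "__main__":
    # Hypothetical sample content standing in for cluster_hosts.txt.
    sample = "om1.example.com\n\nscm1.example.com\n"
    for host in read_hosts(sample):
        print(profile_url(host))
```

This only builds the URLs; fetching them (for example with curl or a browser) requires hdds.profiler.endpoint.enabled to be set to true on the target service, and each service listens on its own web UI port.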



