[
https://issues.apache.org/jira/browse/HDDS-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18052465#comment-18052465
]
Wei-Chiu Chuang commented on HDDS-13788:
----------------------------------------
[~cgoyal] feel free to pick this up. This is mostly a brain dump from me, so the
text still needs to be polished.
> [Docs] Add performance troubleshooting doc
> ------------------------------------------
>
> Key: HDDS-13788
> URL: https://issues.apache.org/jira/browse/HDDS-13788
> Project: Apache Ozone
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Priority: Major
>
> I'd like to add a new page, "Troubleshoot performance issues", to the user
> documentation under the Troubleshooting section
> [https://ozone.apache.org/docs/edge/troubleshooting.html]
> Alternatively, it can go on the Observability page.
> [https://ozone.apache.org/docs/edge/feature/observability.html]
> It will include:
> 1. Flame graph
> If a particular operation runs slowly and CPU utilization is high, use a flame
> graph to inspect hotspots.
> Enable the flame graph endpoints (hdds.profiler.endpoint.enabled = true), download
> async profiler, and start async profiler 2.x from the endpoint or the command
> line. The output is exported as an SVG/HTML file.
> Flame graphs are collected on a per-process basis. To generate flame graphs
> across a cluster:
> # Download this repo [https://github.com/jojochuang/ozone_perf.git]
> # Download async profiler to /opt/async-profiler-2.8.1-linux-x64/
> # Add cluster hostnames to the file cluster_hosts.txt, one hostname per line.
> # Update PASSWORDLESS_USER in conf.sh to a user that has passwordless SSH
> access to the cluster hosts. This user must also have sudo privileges.
> # Run './start_profiles.sh' to start profiling.
> # Wait for some time while profiling data accumulates.
> # Run './collect_profiles.sh' to stop profiling and collect the
> flame graphs. They are downloaded and compressed into a tarball. This
> script collects flame graphs for Ozone OM, SCM, DN and Recon, HDFS NN and DN,
> Impala daemon and HBase RegionServer.
>
>
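For readers who only want to hit the profiler endpoint on each host rather than use the scripts above, here is a minimal sketch, not taken from the ozone_perf repo. It reads a cluster_hosts.txt-style list (one hostname per line) and builds one profiler URL per host. The "/prof" path, the "duration" query parameter and the port numbers are assumptions on my part and should be checked against your Ozone version and configuration; the function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed default HTTP ports per service; verify against your own cluster.
ASSUMED_HTTP_PORTS = {"om": 9874, "scm": 9876, "datanode": 9882, "recon": 9888}


def parse_hosts(text):
    """Return hostnames from cluster_hosts.txt-style text, one per line.

    Blank lines are skipped and surrounding whitespace is trimmed.
    """
    hosts = []
    for line in text.splitlines():
        name = line.strip()
        if name:
            hosts.append(name)
    return hosts


def profiler_url(host, port, duration_seconds=60, path="/prof"):
    """Build a profiler URL; the path and 'duration' parameter are assumed."""
    query = urlencode({"duration": duration_seconds})
    return f"http://{host}:{port}{path}?{query}"


def build_targets(hosts, service, duration_seconds=60):
    """Build one profiler URL per host for the given service name."""
    port = ASSUMED_HTTP_PORTS[service]
    return [profiler_url(host, port, duration_seconds) for host in hosts]


if __name__ == "__main__":
    # Hypothetical hostnames, for illustration only.
    sample = "dn1.example.com\n\ndn2.example.com\n"
    for url in build_targets(parse_hosts(sample), "datanode", 30):
        print(url)
```

The sketch only builds the URLs; fetching them (for example with curl) and the endpoint being enabled via hdds.profiler.endpoint.enabled are left to the operator.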