JinHyuk Kim created HBASE-29384:
-----------------------------------

             Summary: Async Profiler occasionally fails to capture profiling data
                 Key: HBASE-29384
                 URL: https://issues.apache.org/jira/browse/HBASE-29384
             Project: HBase
          Issue Type: Bug
          Components: master, regionserver
            Reporter: JinHyuk Kim
            Assignee: JinHyuk Kim


h3. *Problem*

When using the HBase Web UI to start async-profiler, the profiler works 
correctly on some RegionServers, but on others, the SVG result is never 
rendered and the page keeps refreshing indefinitely.

h3. *Root Cause*

Currently, the profiler output (SVG) is written to the 
{{/tmp/prof-output-hbase}} directory, which is created only once when the 
RegionServer starts. However, since {{/tmp}} is often subject to periodic 
cleanup by the OS (e.g., via {{systemd-tmpfiles-clean}} on Linux), this 
directory can be removed after some time.

As a result, if async-profiler is not used for a while after the RegionServer 
starts, the output directory may no longer exist when profiling is requested. 
In that case, the profiler fails with an error message, and no SVG result is 
generated.

The error message looks like this:
{code:java}
Could not open /tmp/prof-output-hbase/async-prof-pid-xxxxxx-cpu-1.svg{code}
Related code in async-profiler: 
https://github.com/async-profiler/async-profiler/blob/v1.8.8/src/profiler.cpp#L1246

h3. *Solution*

Ensure that the profiler output directory exists every time profiling is 
requested, not only at startup.
{{ProfilerServlet}} now checks for the existence of the output directory and 
recreates it if it has been deleted.
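
The check described above could look roughly like the following. This is a minimal sketch, not the actual patch: the class name {{ProfilerOutputDir}} and the method {{ensureExists}} are hypothetical, and the real {{ProfilerServlet}} change may be structured differently. It relies only on {{java.nio.file.Files.createDirectories}}, which creates the directory (and any missing parents) and does nothing if the directory already exists.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of HBase.
final class ProfilerOutputDir {
  private ProfilerOutputDir() {}

  /**
   * Makes sure the profiler output directory exists before a profiling run.
   * Safe to call on every request: an existing directory is left untouched,
   * and a directory removed by a /tmp cleanup job is created again.
   * Throws IOException if the path exists but is not a directory, or if
   * the directory cannot be created.
   */
  static Path ensureExists(Path outputDir) throws IOException {
    return Files.createDirectories(outputDir);
  }
}
```

Under these assumptions, the servlet would call {{ProfilerOutputDir.ensureExists(Paths.get("/tmp/prof-output-hbase"))}} at the start of each profiling request, before launching async-profiler, so the profiler never sees a missing output directory.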



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
