[ https://issues.apache.org/jira/browse/HADOOP-18055?focusedWorklogId=700935&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-700935 ]

ASF GitHub Bot logged work on HADOOP-18055:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Dec/21 09:27
            Start Date: 24/Dec/21 09:27
    Worklog Time Spent: 10m 
      Work Description: virajjasani commented on a change in pull request #3824:
URL: https://github.com/apache/hadoop/pull/3824#discussion_r774947381



##########
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/http/ProfileServlet.java
##########
@@ -0,0 +1,393 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.http;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantLock;
+import javax.servlet.http.HttpServlet;
+import javax.servlet.http.HttpServletRequest;
+import javax.servlet.http.HttpServletResponse;
+
+import org.apache.hadoop.thirdparty.com.google.common.base.Joiner;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.util.ProcessUtils;
+
+/**
+ * Servlet that runs async-profiler as web-endpoint.
+ * <p>
+ * Following options from async-profiler can be specified as query parameters.
+ * //  -e event          profiling event: cpu|alloc|lock|cache-misses etc.
+ * //  -d duration       run profiling for 'duration' seconds (integer)
+ * //  -i interval       sampling interval in nanoseconds (long)
+ * //  -j jstackdepth    maximum Java stack depth (integer)
+ * //  -b bufsize        frame buffer size (long)
+ * //  -t                profile different threads separately
+ * //  -s                simple class names instead of FQN
+ * //  -o fmt[,fmt...]   output format: summary|traces|flat|collapsed|svg|tree|jfr|html
+ * //  --width px        SVG width pixels (integer)
+ * //  --height px       SVG frame height pixels (integer)
+ * //  --minwidth px     skip frames smaller than px (double)
+ * //  --reverse         generate stack-reversed FlameGraph / Call tree
+ * <p>
+ * Example:
+ * If Namenode http address is 9870, and ResourceManager http address is 8088,
+ * ProfileServlet running with async-profiler setup can be accessed with
+ * http://localhost:9870/prof and http://localhost:8088/prof
+ * Deep dive into some params:
+ * - To collect 10 second CPU profile of current process i.e. Namenode (returns FlameGraph svg)
+ * curl "http://localhost:9870/prof"
+ * - To collect 10 second CPU profile of pid 12345 (returns FlameGraph svg)
+ * curl "http://localhost:9870/prof?pid=12345" (For instance, provide pid of Datanode)
+ * - To collect 30 second CPU profile of pid 12345 (returns FlameGraph svg)
+ * curl "http://localhost:9870/prof?pid=12345&amp;duration=30"
+ * - To collect 1 minute CPU profile of current process and output in tree format (html)
+ * curl "http://localhost:9870/prof?output=tree&amp;duration=60"
+ * - To collect 10 second heap allocation profile of current process (returns FlameGraph svg)
+ * curl "http://localhost:9870/prof?event=alloc"
+ * - To collect lock contention profile of current process (returns FlameGraph svg)
+ * curl "http://localhost:9870/prof?event=lock"
+ * <p>
+ * Following event types are supported (default is 'cpu') (NOTE: not all OS'es support all events)
+ * // Perf events:
+ * //    cpu
+ * //    page-faults
+ * //    context-switches
+ * //    cycles
+ * //    instructions
+ * //    cache-references
+ * //    cache-misses
+ * //    branches
+ * //    branch-misses
+ * //    bus-cycles
+ * //    L1-dcache-load-misses
+ * //    LLC-load-misses
+ * //    dTLB-load-misses
+ * //    mem:breakpoint
+ * //    trace:tracepoint
+ * // Java events:
+ * //    alloc
+ * //    lock
+ */
+@InterfaceAudience.Private
+public class ProfileServlet extends HttpServlet {
+
+  private static final long serialVersionUID = 1L;
+  private static final Logger LOG = LoggerFactory.getLogger(ProfileServlet.class);
+
+  private static final String ACCESS_CONTROL_ALLOW_METHODS = "Access-Control-Allow-Methods";
+  private static final String ALLOWED_METHODS = "GET";
+  private static final String ACCESS_CONTROL_ALLOW_ORIGIN = "Access-Control-Allow-Origin";
+  private static final String CONTENT_TYPE_TEXT = "text/plain; charset=utf-8";
+  private static final String ASYNC_PROFILER_HOME_ENV = "ASYNC_PROFILER_HOME";
+  private static final String ASYNC_PROFILER_HOME_SYSTEM_PROPERTY = "async.profiler.home";
+  private static final String PROFILER_SCRIPT = "/profiler.sh";
+  private static final int DEFAULT_DURATION_SECONDS = 10;
+  private static final AtomicInteger ID_GEN = new AtomicInteger(0);
+
+  static final String OUTPUT_DIR = System.getProperty("java.io.tmpdir") + "/prof-output";
+
+  private enum Event {
+
+    CPU("cpu"),
+    ALLOC("alloc"),
+    LOCK("lock"),
+    PAGE_FAULTS("page-faults"),
+    CONTEXT_SWITCHES("context-switches"),
+    CYCLES("cycles"),
+    INSTRUCTIONS("instructions"),
+    CACHE_REFERENCES("cache-references"),
+    CACHE_MISSES("cache-misses"),
+    BRANCHES("branches"),
+    BRANCH_MISSES("branch-misses"),
+    BUS_CYCLES("bus-cycles"),
+    L1_DCACHE_LOAD_MISSES("L1-dcache-load-misses"),
+    LLC_LOAD_MISSES("LLC-load-misses"),
+    DTLB_LOAD_MISSES("dTLB-load-misses"),
+    MEM_BREAKPOINT("mem:breakpoint"),
+    TRACE_TRACEPOINT("trace:tracepoint");
+
+    private final String internalName;
+
+    Event(final String internalName) {
+      this.internalName = internalName;
+    }
+
+    public String getInternalName() {
+      return internalName;
+    }
+
+    public static Event fromInternalName(final String name) {
+      for (Event event : values()) {
+        if (event.getInternalName().equalsIgnoreCase(name)) {
+          return event;
+        }
+      }
+
+      return null;
+    }
+  }
+
+  private enum Output {
+    SUMMARY,
+    TRACES,
+    FLAT,
+    COLLAPSED,
+    // No SVG in 2.x asyncprofiler.
+    SVG,
+    TREE,
+    JFR,
+    // In 2.x asyncprofiler, this is how you get flamegraphs.
+    HTML
+  }
+
+  private final Lock profilerLock = new ReentrantLock();
+  private transient volatile Process process;
+  private final String asyncProfilerHome;
+  private Integer pid;
+
+  public ProfileServlet() {
+    this.asyncProfilerHome = getAsyncProfilerHome();
+    this.pid = ProcessUtils.getPid();
+    LOG.info("Servlet process PID: {} asyncProfilerHome: {}", pid, asyncProfilerHome);
+  }
+
+  @Override
+  protected void doGet(final HttpServletRequest req, final HttpServletResponse resp)
+      throws IOException {
+    if (!HttpServer2.isInstrumentationAccessAllowed(getServletContext(), req, resp)) {

Review comment:
Thanks @aajisaka. I have added `isInstrumentationAccessAllowed` here, but let
me also add it to `ProfileOutputServlet` so that the user gets an
`SC_UNAUTHORIZED` response even before the URL redirection takes place.
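
To illustrate the guard-first ordering discussed in this comment, here is a minimal, dependency-free sketch. It is not the code from the pull request: `GuardSketch`, `serveProfileOutput`, and the `accessAllowed` predicate are hypothetical stand-ins for a servlet handler and for a check such as `HttpServer2.isInstrumentationAccessAllowed`. The two status constants mirror the values of `HttpServletResponse.SC_OK` and `HttpServletResponse.SC_UNAUTHORIZED`.

```java
import java.util.function.Predicate;

public class GuardSketch {

  // Same numeric values as HttpServletResponse.SC_OK / SC_UNAUTHORIZED.
  static final int SC_OK = 200;
  static final int SC_UNAUTHORIZED = 401;

  /**
   * Hypothetical stand-in for a servlet handler: the access check runs
   * first, so an unauthorized caller is rejected before any redirect
   * or output file would be served.
   */
  static int serveProfileOutput(String user, Predicate<String> accessAllowed) {
    if (!accessAllowed.test(user)) {
      return SC_UNAUTHORIZED;
    }
    // Only authorized callers reach the part that would serve the output.
    return SC_OK;
  }

  public static void main(String[] args) {
    // Assumption for the demo: only the user named "admin" is allowed.
    Predicate<String> adminsOnly = "admin"::equals;
    System.out.println(serveProfileOutput("admin", adminsOnly));
    System.out.println(serveProfileOutput("guest", adminsOnly));
  }
}
```

Applying the same check in both servlets means a request that goes straight to the output URL is treated the same way as one that arrives via the profiling endpoint.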




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 700935)
    Time Spent: 1h 10m  (was: 1h)

> Async Profiler endpoint for Hadoop daemons
> ------------------------------------------
>
>                 Key: HADOOP-18055
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18055
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Async profiler ([https://github.com/jvm-profiling-tools/async-profiler]) is a 
> low overhead sampling profiler for Java that does not suffer from Safepoint 
> bias problem. It features HotSpot-specific APIs to collect stack traces and 
> to track memory allocations. The profiler works with OpenJDK, Oracle JDK and 
> other Java runtimes based on the HotSpot JVM.
> Async profiler can also profile heap allocations, lock contention, and HW 
> performance counters in addition to CPU.
> We have an httpserver based servlet stack hence we can use HIVE-20202 as an 
> implementation template to provide async profiler as servlet for Hadoop 
> daemons. Ideally we achieve these requirements:
>  * Retrieve flamegraph SVG generated from latest profile trace.
>  * Online enable and disable of profiling activity. (async-profiler does not 
> do instrumentation based profiling so this should not cause the code gen 
> related perf problems of that other approach and can be safely toggled on and 
> off while under production load.)
>  * CPU profiling.
>  * ALLOCATION profiling.
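
The CPU and allocation profiling requirements above correspond to the `event`, `duration`, `pid`, and `output` query parameters shown in the ProfileServlet Javadoc earlier in this message. As a rough sketch only, the request URLs could be assembled as below. `ProfUrlSketch` and `profUrl` are hypothetical names, and the host and port are taken from the Javadoc examples. The Javadoc writes `&amp;` because Javadoc is HTML; the URL itself uses a plain `&`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class ProfUrlSketch {

  /**
   * Builds "<base>/prof?k=v&k=v" and omits the '?' when there are no
   * parameters. Values are assumed to be simple tokens (for example
   * "alloc" or "30"), so no URL encoding is applied in this sketch.
   */
  static String profUrl(String base, Map<String, String> params) {
    StringJoiner query = new StringJoiner("&");
    for (Map.Entry<String, String> e : params.entrySet()) {
      query.add(e.getKey() + "=" + e.getValue());
    }
    String q = query.toString();
    return base + "/prof" + (q.isEmpty() ? "" : "?" + q);
  }

  public static void main(String[] args) {
    String nameNode = "http://localhost:9870";
    // LinkedHashMap keeps the parameters in insertion order.
    Map<String, String> params = new LinkedHashMap<>();
    params.put("event", "alloc");
    params.put("duration", "30");
    System.out.println(profUrl(nameNode, params));
  }
}
```

With an empty parameter map the helper returns the bare `/prof` URL, which per the Javadoc collects a 10 second CPU profile of the current process.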



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
