[ https://issues.apache.org/jira/browse/HBASE-23955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17059170#comment-17059170 ]

Michael Stack commented on HBASE-23955:
---------------------------------------

How about this?

{code}
diff --git a/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/NettyRpcClientConfigHelper.java b/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/NettyRpcClientConfigHelper.java
index 6107183dd4..7fddbbc2b1 100644
--- a/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/NettyRpcClientConfigHelper.java
+++ b/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/NettyRpcClientConfigHelper.java
@@ -55,6 +55,8 @@ public final class NettyRpcClientConfigHelper {
   private static final Map<String, Pair<EventLoopGroup, Class<? extends Channel>>>
     EVENT_LOOP_CONFIG_MAP = new HashMap<>();
 
+  private static Pair<EventLoopGroup, Class<? extends Channel>> DEFAULT_EVENTLOOPGROUP;
+
   /**
    * Shutdown constructor.
    */
@@ -63,7 +65,7 @@ public final class NettyRpcClientConfigHelper {
   /**
    * Set the EventLoopGroup and channel class for {@code AsyncRpcClient}.
    */
-  public static void setEventLoopConfig(Configuration conf, EventLoopGroup group,
+  public static synchronized void setEventLoopConfig(Configuration conf, EventLoopGroup group,
       Class<? extends Channel> channelClass) {
     Preconditions.checkNotNull(group, "group is null");
     Preconditions.checkNotNull(channelClass, "channel class is null");
@@ -75,21 +77,22 @@ public final class NettyRpcClientConfigHelper {
   /**
    * The {@code AsyncRpcClient} will create its own {@code NioEventLoopGroup}.
    */
-  public static void createEventLoopPerClient(Configuration conf) {
+  public static synchronized void createEventLoopPerClient(Configuration conf) {
     conf.set(EVENT_LOOP_CONFIG, "");
     EVENT_LOOP_CONFIG_MAP.clear();
   }
 
-  static Pair<EventLoopGroup, Class<? extends Channel>> getEventLoopConfig(Configuration conf) {
+  static synchronized Pair<EventLoopGroup, Class<? extends Channel>> getEventLoopConfig(
+      Configuration conf) {
     String name = conf.get(EVENT_LOOP_CONFIG);
-    if (name == null) {
-      int threadCount = conf.getInt(HBASE_NETTY_EVENTLOOP_RPCCLIENT_THREADCOUNT_KEY, 0);
-      return new Pair<>(new NioEventLoopGroup(threadCount,
-        new DefaultThreadFactory("RPCClient-NioEventLoopGroup", true,
-          Thread.NORM_PRIORITY)), NioSocketChannel.class);
-    }
-    if (StringUtils.isBlank(name)) {
-      return null;
+    if (name == null || StringUtils.isBlank(name)) {
+      if (DEFAULT_EVENTLOOPGROUP == null) {
+        int threadCount = conf.getInt(HBASE_NETTY_EVENTLOOP_RPCCLIENT_THREADCOUNT_KEY, 0);
+        DEFAULT_EVENTLOOPGROUP = new Pair<>(new NioEventLoopGroup(threadCount,
+          new DefaultThreadFactory("RPCClient-NioEventLoopGroup", true,
+            Thread.NORM_PRIORITY)), NioSocketChannel.class);
+      }
+      return DEFAULT_EVENTLOOPGROUP;
     }
     return EVENT_LOOP_CONFIG_MAP.get(name);
   }
{code}
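
The gist, for anyone skimming the diff: instead of every config-less client constructing its own NioEventLoopGroup, the absent/blank-config path now lazily creates one shared default under the class lock and returns it thereafter. A minimal sketch of that shape (class and method names here are illustrative, not the patch's):

{code}
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;

final class SharedEventLoop {

  private static EventLoopGroup defaultGroup;

  private SharedEventLoop() {
  }

  // Same pattern as the patch: lazy init guarded by the class lock, so
  // concurrent callers cannot race to create (and leak threads from) extra groups.
  static synchronized EventLoopGroup getDefault(int threadCount) {
    if (defaultGroup == null) {
      defaultGroup = new NioEventLoopGroup(threadCount);
    }
    return defaultGroup;
  }
}
{code}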


> Have test runs use less resources
> ---------------------------------
>
>                 Key: HBASE-23955
>                 URL: https://issues.apache.org/jira/browse/HBASE-23955
>             Project: HBase
>          Issue Type: Improvement
>          Components: test
>            Reporter: Michael Stack
>            Priority: Major
>
> Our tests can create thousands of threads, all up in the one JVM. Using fewer 
> means less memory, less contention, and hopefully, likelier passes.
> I've been studying the likes of TestNamespaceReplicationWithBulkLoadedData to 
> see what it does as it runs (this test puts up 4 clusters with replication 
> between them). It peaks at ~2k threads. After some configuration and using 
> less HDFS, I can get it down to ~800 threads and about half the memory used.
> (HDFS is a profligate offender: DataXceivers (server and client), jetty 
> threads, volume threads (an async disk 'worker', then another for cleanup...), 
> image savers, ipc clients -- a new thread per incoming connection w/o bound (or 
> reuse), block responder threads, anonymous threads, and so on. Many are not 
> configurable or boundable, or are hard-coded; e.g. each volume gets 4 workers. 
> The biggest impact came from lowering the count of data nodes. TODO: a 
> follow-on that turns down DN counts in all tests.)
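>
> A sketch of the kind of tuning meant here, against a bare HDFS minicluster 
> (the config keys are standard HDFS ones, but the values and the class name 
> are illustrative, not from any patch):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hdfs.MiniDFSCluster;
>
> public class SmallMiniCluster {
>   public static MiniDFSCluster start() throws Exception {
>     // Bound some of HDFS's worst thread offenders before starting the cluster.
>     Configuration conf = new Configuration();
>     conf.setInt("dfs.namenode.handler.count", 2);          // RPC handlers; default 10
>     conf.setInt("dfs.datanode.handler.count", 1);          // default 10
>     conf.setInt("dfs.datanode.max.transfer.threads", 16);  // DataXceivers; default 4096
>     // Fewer datanodes was the biggest win; one is often enough for a test.
>     return new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
>   }
> }
> {code}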
> I've been using Java Flight Recorder during this study. Here is how you get a 
> flight recording for a single test run:
> {code:java}
> MAVEN_OPTS=" -XX:StartFlightRecording=disk=true,dumponexit=true,filename=recording.jfr,settings=profile,path-to-gc-roots=true,maxsize=1024m" mvn test -Dtest=TestNamespaceReplicationWithBulkLoadedData -Dsurefire.firstPartForkCount=0 -Dsurefire.secondPartForkCount=0
> {code}
> i.e. start recording on mvn launch, bound the size of the recording, and have 
> the test run in the mvn context (DON'T fork). It is also useful to connect to 
> the running test at the same time from JDK Mission Control: the thread 
> reporting screen is overwhelmed by the count of running threads, but if you 
> connect live you can at least get a 'live threads' graph w/ count as the test 
> progresses.
> When the test finishes, it dumps a .jfr file which can be opened in JDK MC. 
> I've been compiling w/ JDK8 and then running w/ JDK11 so I can use JDK MC 
> Version 7, the non-commercial latest. Works pretty well.
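>
> If you would rather pull numbers out of the dump than click through MC, the 
> jdk.jfr consumer API (JDK 11) can read it directly. A minimal sketch, assuming 
> the recording.jfr filename used above (the class name is illustrative):
> {code:java}
> import java.nio.file.Path;
> import jdk.jfr.consumer.RecordedEvent;
> import jdk.jfr.consumer.RecordingFile;
>
> public class CountThreadEvents {
>   public static void main(String[] args) throws Exception {
>     long starts = 0, ends = 0;
>     // Walk every event in the dump and tally thread starts and ends.
>     for (RecordedEvent e : RecordingFile.readAllEvents(Path.of("recording.jfr"))) {
>       String type = e.getEventType().getName();
>       if (type.equals("jdk.ThreadStart")) starts++;
>       else if (type.equals("jdk.ThreadEnd")) ends++;
>     }
>     System.out.println("thread starts=" + starts + ", ends=" + ends);
>   }
> }
> {code}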
> Let me put up a patch for tests that cuts down thread counts where we can.


