[jira] [Created] (HELIX-535) Helix controller stops working with heavy configuration

Joy (JIRA) Tue, 28 Oct 2014 16:20:12 -0700

Joy created HELIX-535:
-------------------------

             Summary: Helix controller stops working with heavy configuration
                 Key: HELIX-535
                 URL: https://issues.apache.org/jira/browse/HELIX-535
             Project: Apache Helix
          Issue Type: Bug
          Components: helix-core
         Environment: machine:$ uname -a
Linux eat1-app373.stg 2.6.32-220.10.1.el6.x86_64 #1 SMP Fri Mar 9 12:37:51 EST 2012 x86_64 x86_64 x86_64 GNU/Linux


JVM version: $ /export/apps/jdk/current/bin/java -version
java version "1.6.0_21"
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)

            Reporter: Joy


The issue occurs consistently under a heavy configuration: a larger number of 
znodes, partitions, and databases.

The goal of our tests is to evaluate the performance of the Helix controller 
(in terms of controller latency) as the number of nodes, databases, and 
partitions increases.

In our test, we use multiple machines: one for ZooKeeper, one for the Helix 
controller, and the rest for dummy processes. The setup is as follows:
        zkr <----------> helix
         ^
         |
         v
      dummy processes

We intentionally kill the master dummy processes once every 30 seconds to 
simulate a failure event. Everything works fine with a light configuration, 
such as 27 nodes + 1 db + 729 partitions. However, with a heavy configuration, 
such as 81 nodes + 10 databases + 81 partitions for each db, the controller 
latency increases significantly after several failure events:
                  Control Latency (ms)
First event:      182
Second event:     188
Third event:      200
Fourth event:     193
Fifth event:      200
Sixth event:      185
Seventh event:    189
Eighth event:     213
Ninth event:      1082209
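
To put these numbers in perspective, the short script below summarizes the 
latencies listed above and compares the total partition counts of the two 
configurations. It is only a sketch for reading the figures: the variable names 
are illustrative and are not part of the test harness or of Helix, and the 
partition totals assume "81 partitions for each db" means 10 x 81 overall.

```python
# Control latencies (ms) for the nine failure events, copied from the report.
latencies_ms = [182, 188, 200, 193, 200, 185, 189, 213, 1082209]

# Treat the first eight events as the baseline and the ninth as the outlier.
baseline = latencies_ms[:-1]
outlier = latencies_ms[-1]

baseline_mean = sum(baseline) / len(baseline)
slowdown = outlier / baseline_mean        # how many times slower than baseline
outlier_minutes = outlier / 1000 / 60     # ms -> s -> min

# Total partitions per configuration (assumption: databases * partitions per db).
light_partitions = 1 * 729
heavy_partitions = 10 * 81

print(f"baseline mean:  {baseline_mean:.2f} ms")
print(f"ninth event:    {outlier_minutes:.2f} min ({slowdown:.0f}x baseline)")
print(f"partitions:     light={light_partitions}, heavy={heavy_partitions}")
```

The first eight events average roughly 194 ms, while the ninth takes about 18 
minutes. Under the stated assumption, the heavy configuration has only slightly 
more partitions in total (810 vs. 729), but three times as many nodes and ten 
times as many databases.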

After this extremely long failure event, the Helix controller stops working. 
The controller log is attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
