[
https://issues.apache.org/jira/browse/HBASE-10283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
chendihao reassigned HBASE-10283:
---------------------------------
Assignee: chendihao
> Client can't connect with all the running zk servers in MiniZooKeeperCluster
> ----------------------------------------------------------------------------
>
> Key: HBASE-10283
> URL: https://issues.apache.org/jira/browse/HBASE-10283
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.3
> Reporter: chendihao
> Assignee: chendihao
>
> As of HBASE-3052, multiple zk servers can run together in the minicluster.
> The problem is that the client can only connect to the first zk server: if
> you kill the first one, the client fails to access the cluster even though
> the other zk servers are still serving.
> It's easy to reproduce. First, call `TEST_UTIL.startMiniZKCluster(3)`. Second,
> call `killCurrentActiveZooKeeperServer` in MiniZooKeeperCluster. After that,
> any zk client you construct cannot connect to the zk cluster at all.
> Here is a short log excerpt for reference.
> {noformat}
> 2014-01-03 12:06:58,625 INFO [main] zookeeper.MiniZooKeeperCluster(194):
> Started MiniZK Cluster and connect 1 ZK server on client port: 55227
> ......
> 2014-01-03 12:06:59,134 INFO [main] zookeeper.MiniZooKeeperCluster(264):
> Kill the current active ZK servers in the cluster on client port: 55227
> 2014-01-03 12:06:59,134 INFO [main] zookeeper.MiniZooKeeperCluster(272):
> Activate a backup zk server in the cluster on client port: 55228
> 2014-01-03 12:06:59,366 INFO [main-EventThread] zookeeper.ZooKeeper(434):
> Initiating client connection, connectString=localhost:55227
> sessionTimeout=3000
> watcher=com.xiaomi.infra.timestamp.TimestampWatcher@a383118
> (then it throws exceptions......)
> {noformat}
> The log message itself is misleading: it always shows "Started MiniZK Cluster
> and connect 1 ZK server" even though there are actually three zk servers.
> Looking deeper, we find that the client is still trying to connect to the
> dead zk server's port. When I print out the zkQuorum it uses, only the first
> zk server's host:port is there, and it does not change whether or not you
> kill the server. The cause is in ZKConfig, which converts HBase settings
> into zk's. MiniZooKeeperCluster creates three servers with the same host
> name, "localhost", and different ports. But HBase itself forces every zk
> server to use the same port, and ZKConfig ignores the other two servers
> because they have the same host name.
> MiniZooKeeperCluster will not work properly until we fix this. The bug went
> unnoticed because our unit tests never check whether HBase keeps working
> after the active or a backup zk server is killed.
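
To illustrate the ZKConfig behavior described above, here is a minimal, self-contained Java sketch. It is a simplified model and not the actual ZKConfig code: the class QuorumStringSketch, both of its methods, and the third port 55229 are made up for illustration (only 55227 and 55228 appear in the log). It contrasts a quorum string built from unique host names plus one shared client port with one that keeps each server's own port.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class QuorumStringSketch {

    // Hypothetical model of the reported behavior: hosts are de-duplicated
    // by name and every remaining host gets the same client port.
    static String hostOnlyQuorum(List<String> hosts, int clientPort) {
        Set<String> uniqueHosts = new LinkedHashSet<>(hosts);
        List<String> parts = new ArrayList<>();
        for (String host : uniqueHosts) {
            parts.add(host + ":" + clientPort);
        }
        return String.join(",", parts);
    }

    // Hypothetical alternative: keep one host:port entry per server, so
    // servers that share a host name but differ in port all survive.
    static String hostPortQuorum(List<String> hosts, List<Integer> ports) {
        if (hosts.size() != ports.size()) {
            throw new IllegalArgumentException("hosts and ports must have the same size");
        }
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < hosts.size(); i++) {
            parts.add(hosts.get(i) + ":" + ports.get(i));
        }
        return String.join(",", parts);
    }

    public static void main(String[] args) {
        // Three mini zk servers on the same host; 55229 is an assumed port.
        List<String> hosts = Arrays.asList("localhost", "localhost", "localhost");
        List<Integer> ports = Arrays.asList(55227, 55228, 55229);

        // Only the first server is reachable through this string.
        System.out.println(hostOnlyQuorum(hosts, ports.get(0)));
        // All three servers are listed.
        System.out.println(hostPortQuorum(hosts, ports));
    }
}
```

With the first form, a client keeps using localhost:55227 even after that server is killed, which matches the connectString in the log. A connect string in the second form would let the client fail over to the remaining servers; whether the real fix takes this shape is not stated in this issue.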
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)