[
https://issues.apache.org/jira/browse/HBASE-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862461#action_12862461
]
Miklos Kurucz commented on HBASE-2504:
--------------------------------------
Yes the ZK session shutdown lines you see belong to the HCM session, not the
regionserver which is still seen as alive by the master(and thus makes the
cluster mostly unusable).
Our ulimit is:
ulimit -n 65536
and our xcievers is:
<property>
<name>dfs.datanode.max.xcievers</name>
<value>2048</value>
<final>true</final>
</property>
Our main problem is that HDFS is very unreliable due to some bad disk space
assignment behavior.
That should be fixed in the near future, but until then I would like to get
familiar with HBASE(and fix bugs if I can)
> Double assigned znodes in regionserver
> --------------------------------------
>
> Key: HBASE-2504
> URL: https://issues.apache.org/jira/browse/HBASE-2504
> Project: Hadoop HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.20.3
> Reporter: Miklos Kurucz
>
> regionserver log:
> Thu Apr 29 13:34:27 CEST 2010 Starting regionserver on dell102
> ...
> 2010-04-29 13:34:29,656 INFO org.apache.zookeeper.ClientCnxn: Opening socket
> connection to server dell147/10.1.3.147:2181
> 2010-04-29 13:34:29,657 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to dell147/10.1.3.147:2181, initiating session
> 2010-04-29 13:34:29,678 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server dell147/10.1.3.147:2181, sessionid =
> 0x284958aa550001, negotiated timeout = 60000
> ...
> 2010-04-29 14:13:30,096 INFO org.apache.zookeeper.ZooKeeper: Initiating
> client connection, connectString=dell149:2181,dell148:2181,dell147:2181
> sessionTimeout=60000
> watcher=org.apache.hadoop.hbase.client.hconnectionmanager$clientzkwatc...@46d895e1
> 2010-04-29 14:13:30,096 INFO org.apache.zookeeper.ClientCnxn: Opening socket
> connection to server dell147/10.1.3.147:2181
> 2010-04-29 14:13:30,161 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to dell147/10.1.3.147:2181, initiating session
> 2010-04-29 14:13:30,194 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server dell147/10.1.3.147:2181, sessionid =
> 0x284958aa550014, negotiated timeout = 60000
> 2010-04-29 14:13:30,195 DEBUG
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode
> /hbase/root-region-server got 10.1.3.123:60020
> 2010-04-29 14:13:30,226 DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Found ROOT at
> 10.1.3.123:60020
> 2010-04-29 14:13:30,243 DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cached
> location for .META.,,1 is 10.1.3.125:60020
> 2010-04-29 14:13:30,247 DEBUG
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master
> got 10.1.3.150:60000
> ...
> 2010-04-29 22:06:01,637 INFO org.apache.zookeeper.ZooKeeper: Session:
> 0x284958aa550014 closed
> 2010-04-29 22:06:02,012 DEBUG
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Closed connection with
> ZooKeeper
> Clearly the reinitializeZooKeeper() method was called for some reason.
> Unfortunately:
> hbase(main):005:0> zk 'get /hbase/rs/1272540873161'
> 10.1.3.102:60020
> cZxid = 0x5f0000002b
> ctime = Thu Apr 29 13:34:33 CEST 2010
> mZxid = 0x5f0000003e
> mtime = Thu Apr 29 13:34:33 CEST 2010
> pZxid = 0x5f0000002b
> cversion = 0
> dataVersion = 1
> aclVersion = 0
> ephemeralOwner = 0x284958aa550001
> dataLength = 16
> numChildren = 0
> The owner of the zookeeper node is the first session which was never closed.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.