Yes, this sounds like a ZooKeeper DNS error.

I ran into this type of issue a few months ago and wrote up my solutions
to the 3 main HBase communication/setup errors I got.

See if this helps:
http://jayunit100.blogspot.com/2013/05/debugging-hbase-installation.html

Also, make sure iptables is off on both machines, etc.
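
If it helps to separate the two cases (name resolution vs. a blocked port), here is a rough sketch of a check you could run from the slave machine. It assumes Python 3 is available there; the function name check_zookeeper and the three result strings are made up for illustration and are not part of HBase or ZooKeeper.

```python
import socket


def check_zookeeper(host, port=2181, timeout=5.0):
    """Return 'dns_failure', 'unreachable', or 'reachable' for host:port.

    Hypothetical helper for illustration only; it opens a plain TCP
    connection and does not speak the ZooKeeper protocol.
    """
    try:
        # Step 1: resolve the name the same way any client on this box would.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except (socket.gaierror, UnicodeError):
        return "dns_failure"

    # Step 2: try a TCP connect to each resolved address.
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "reachable"
        except OSError:
            # Covers timeouts and refused connections; try the next address.
            continue
    return "unreachable"
```

Calling check_zookeeper("vamshi_RS") on the slave should tell you which side to look at: "dns_failure" points at /etc/hosts or DNS, while "unreachable" means the name resolved but port 2181 could not be reached, which is more likely a firewall rule or a wrong IP for that name.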

On Aug 22, 2013, at 6:02 AM, Vamshi Krishna <[email protected]> wrote:

> Hi, I set up an HBase cluster of 2 machines.
> 
> Master machine (vamshi_RS) - running both Master and RegionServer
> Slave machine - running only RegionServer
> 
> After I ran start-hbase.sh, all the daemons started fine, but after
> some time the RegionServer on the slave machine stops.
> 
> I analysed the RegionServer log; its content is below.
> Somehow the RegionServer machine is not able to communicate with
> ZooKeeper (I guess). Is that the reason?
> 
> Please look at my hbase-site.xml below (after the log content), which is the
> same on both machines, and kindly let me know the solution for this issue.
> 
> 
> 2013-08-22 14:03:25,023 INFO org.apache.zookeeper.ZooKeeper: Initiating
> client connection, connectString=vamshi_RS:2181 sessionTimeout=180000
> watcher=regionserver:60020
> 2013-08-22 14:03:25,033 INFO
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of
> this process is 7426@vamshi
> 2013-08-22 14:03:25,038 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt
> to authenticate using SASL (Unable to locate a login configuration)
> 2013-08-22 14:04:28,171 WARN org.apache.zookeeper.ClientCnxn: Session 0x0
> for server null, unexpected error, closing socket connection and attempting
> reconnect
> java.net.ConnectException: Connection timed out
>    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>    at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>    at
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
> 2013-08-22 14:04:28,287 WARN
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
> ZooKeeper exception:
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase/master
> 2013-08-22 14:04:28,287 INFO org.apache.hadoop.hbase.util.RetryCounter:
> Sleeping 2000ms before retry #1...
> 2013-08-22 14:04:29,282 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt
> to authenticate using SASL (Unable to locate a login configuration)
> 2013-08-22 14:05:32,425 WARN org.apache.zookeeper.ClientCnxn: Session 0x0
> for server null, unexpected error, closing socket connection and attempting
> reconnect
> java.net.ConnectException: Connection timed out
>    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>    at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>    at
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
> 2013-08-22 14:05:32,526 WARN
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
> ZooKeeper exception:
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase/master
> 2013-08-22 14:05:32,526 INFO org.apache.hadoop.hbase.util.RetryCounter:
> Sleeping 4000ms before retry #2...
> 2013-08-22 14:05:33,526 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt
> to authenticate using SASL (Unable to locate a login configuration)
> 2013-08-22 14:06:36,617 WARN org.apache.zookeeper.ClientCnxn: Session 0x0
> for server null, unexpected error, closing socket connection and attempting
> reconnect
> java.net.ConnectException: Connection timed out
>    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>    at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>    at
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
> .
> .
> .
> 
> 
> hbase-site.xml:
> 
> <property>
>        <name>hbase.rootdir</name>
> 
> <!--value>hdfs://vamshi:54310/home/biginfolabs/BILSftwrs/hbase-0.94.10/data/</value-->
>    <value>/home/biginfolabs/BILSftwrs/hbase-0.94.10/hbstmp/</value>
>    </property>
> 
>    <property>
>        <name>hbase.cluster.distributed</name>
>        <value>true</value>
>    </property>
>    <property>
>        <name>hbase.master</name>
>        <value>vamshi_RS</value>
>    </property>
>    <property>
>        <name>hbase.zookeeper.property.clientPort</name>
>        <value>2181</value>
>    </property>
> 
>   <property>
>        <name>hbase.hregion.max.filesize</name>
>        <value>50</value>
>    </property>
> 
>   <property>
>        <name>hbase.balancer.period</name>
>        <value>60000</value>
>    </property>
> 
>    <property>
>        <name>hbase.zookeeper.quorum</name>
>        <value>vamshi_RS</value>
>    </property>
>    <property>
>        <name>hbase.zookeeper.property.dataDir</name>
>        <value>/home/biginfolabs/BILSftwrs/hbase-0.94.10/zkptmp</value>
>    </property>
>  <property>
>    <name>hbase.client.scanner.caching</name>
>    <value>1000</value>
>    <description>Number of rows that will be fetched when calling next
>    </description>
>  </property>
>  <property>
>    <name>hbase.zookeeper.property.maxClientCnxns</name>
>    <value>1024</value>
>  </property>
> 
> <property>
>    <name>hbase.coprocessor.user.region.classes</name>
>    <value>com.bil.coproc.ColumnAggregationEndpoint</value>
>  </property>
> 
> -- 
> Regards,
> Vamshi Krishna
