Hi,

When I start my HBase cluster, there are some errors in the master log. (node3, 192.168.1.15 and 192.168.1.13 are all the same machine, which has two NICs.)

2011-07-05 17:13:13,820 INFO org.apache.zookeeper.ClientCnxn: zookeeper.disableAutoWatchReset is false
2011-07-05 17:13:13,840 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server node3/192.168.1.15:2181
....
2011-07-05 17:13:13,975 DEBUG org.apache.hadoop.hbase.master.HMaster: Checking cluster state...
2011-07-05 17:13:13,979 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/root-region-server got 192.168.1.13:60020
....
2011-07-05 17:13:19,732 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Updated ZNode /hbase/rs/1309857199677 with data 192.168.1.15:60020
....
2011-07-05 17:22:01,041 INFO org.apache.hadoop.ipc.HbaseRPC: Server at /192.168.1.13:60020 could not be reached after 1 tries, giving up.
2011-07-05 17:22:01,042 WARN org.apache.hadoop.hbase.master.BaseScanner: Scan one META region: {server: 192.168.1.13:60020, regionname: .META.,,1, startKey: <>}
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy to /192.168.1.13:60020 after attempts=1
        at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:429)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getHRegionConnection(HConnectionManager.java:918)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getHRegionConnection(HConnectionManager.java:934)
        at org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:173)
        at org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
        at org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
        at org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:153)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:68)
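To make the mismatch in the log explicit: the master reads 192.168.1.13:60020 from /hbase/root-region-server, while the region server entry under /hbase/rs is updated with 192.168.1.15:60020. Below is a minimal Python sketch that pulls those ZNode addresses out of such log lines. The regular expression and the helper name `znode_addresses` are my own and not part of HBase; they assume the log wording is exactly as in the two ZooKeeperWrapper lines above.

```python
import re

# Pattern based only on the two ZooKeeperWrapper messages in the master log
# above ("Read ZNode <path> got <addr>" and "Updated ZNode <path> with data <addr>").
ZNODE_RE = re.compile(
    r"(?:Read|Updated) ZNode (?P<path>\S+) (?:got|with data) "
    r"(?P<addr>\d+\.\d+\.\d+\.\d+:\d+)"
)


def znode_addresses(log_lines):
    """Return {znode_path: "ip:port"} for every ZNode read/update found."""
    found = {}
    for line in log_lines:
        match = ZNODE_RE.search(line)
        if match:
            found[match.group("path")] = match.group("addr")
    return found


# The two relevant lines copied from the master log.
LOG = [
    "2011-07-05 17:13:13,979 DEBUG org.apache.hadoop.hbase.zookeeper."
    "ZooKeeperWrapper: Read ZNode /hbase/root-region-server got 192.168.1.13:60020",
    "2011-07-05 17:13:19,732 DEBUG org.apache.hadoop.hbase.zookeeper."
    "ZooKeeperWrapper: Updated ZNode /hbase/rs/1309857199677 with data 192.168.1.15:60020",
]

if __name__ == "__main__":
    for path, addr in znode_addresses(LOG).items():
        print(path, "->", addr)
```

Run against the full master log, this would list every address ZooKeeper holds for the cluster, so the two different IPs for node3 show up side by side.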

Sometimes the .META. region is not assigned to node3, which has two NICs (eth0: 192.168.1.13 and eth1: 192.168.1.15) and resolves in DNS/hosts as "192.168.1.15 node3". I mean that when .META. is assigned to one of the other servers, which have only one NIC, HBase works well.

Here is some information about my HBase cluster:

HBase version: 0.20.6
Hadoop version: 0.20-append+4
ZooKeeper version: 3.3.0

The hbase-site.xml:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://node3:54310/hbase</value>
  </property>
  <property>
    <name>hbase.master</name>
    <value>hadoop5:60000</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>node3,hadoop5,hadoopoffice85,hadoopoffice88,hdofficelj001</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!--property>
    <name>hbase.master.dns.interface</name>
    <value>eth1</value>
    <description>The name of the Network Interface from which a master should report its IP address.
    </description>
  </property>
  <property>
    <name>hbase.regionserver.dns.interface</name>
    <value>eth1</value>
    <description>The name of the Network Interface from which a region server should report its IP address.
    </description>
  </property>
  <property>
    <name>hbase.zookeeper.dns.interface</name>
    <value>eth1</value>
    <description>The name of the Network Interface from which a ZooKeeper server should report its IP address.
    </description>
  </property-->
  <!--property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2222</value>
    <description>Property from ZooKeeper's config zoo.cfg. The port at which the clients will connect.
    </description>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/opt/zookeeper/data</value>
    <description>Property from ZooKeeper's config zoo.cfg. The directory where the snapshot is stored.
    </description>
  </property-->
</configuration>

cat /opt/hbase/conf/regionservers
hadoop5
node3
hadoopoffice85
hadoopoffice88
hdofficelj001

------------------------------------------------------------------------

Below is node3's information.

The ifconfig output on node3 (192.168.1.13):

[root@node3 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0C:29:23:2E:D3
          inet addr:192.168.1.13  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe23:2ed3/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1424620 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17897973 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:150231810 (143.2 MiB)  TX bytes:2834085782 (2.6 GiB)
          Base address:0x2000 Memory:d8920000-d8940000

eth1      Link encap:Ethernet  HWaddr 00:0C:29:23:2E:DD
          inet addr:192.168.1.15  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe23:2edd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1172226 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1445 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:168873362 (161.0 MiB)  TX bytes:293447 (286.5 KiB)
          Base address:0x2040 Memory:d8940000-d8960000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:370550 errors:0 dropped:0 overruns:0 frame:0
          TX packets:370550 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:64864387 (61.8 MiB)  TX bytes:64864387 (61.8 MiB)

The hosts file:

[root@node3 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1       localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
192.168.1.27    hadoop5
192.168.1.15    node3
192.168.1.85    hadoopoffice85
192.168.1.88    hadoopoffice88
192.168.3.227   hdofficelj001

[root@node3 ~]# netstat -nap | grep 600
tcp        0      0 ::ffff:192.168.1.15:60020   :::*                        LISTEN      19064/java
tcp        0      0 :::60030                    :::*                        LISTEN      19064/java
tcp        0      0 ::ffff:192.168.1.13:44350   ::ffff:192.168.1.27:60000   ESTABLISHED 19064/java

[root@node3 ~]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
239.2.11.71     *               255.255.255.255 UH    0      0        0 eth1
192.168.1.0     *               255.255.255.0   U     0      0        0 eth0
192.168.1.0     *               255.255.255.0   U     0      0        0 eth1
169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         192.168.1.254   0.0.0.0         UG    0      0        0 eth0

------------------------------------------------------------------------

I have set the dns.interface properties to eth1, but I still get the same error.

<property>
  <name>hbase.master.dns.interface</name>
  <value>eth1</value>
  <description>The name of the Network Interface from which a master should report its IP address.
  </description>
</property>
<property>
  <name>hbase.regionserver.dns.interface</name>
  <value>eth1</value>
  <description>The name of the Network Interface from which a region server should report its IP address.
  </description>
</property>
<property>
  <name>hbase.zookeeper.dns.interface</name>
  <value>eth1</value>
  <description>The name of the Network Interface from which a ZooKeeper server should report its IP address.
  </description>
</property>

------------------------------------------------------------------------

I also changed the default route, but I still get the same error.

[root@node3 ~]# route del default
[root@node3 ~]# route add -net default gw 192.168.1.254 eth1
[root@node3 ~]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
239.2.11.71     *               255.255.255.255 UH    0      0        0 eth1
192.168.1.0     *               255.255.255.0   U     0      0        0 eth0
192.168.1.0     *               255.255.255.0   U     0      0        0 eth1
169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         192.168.1.254   0.0.0.0         UG    0      0        0 eth1

[root@node3 ~]# netstat -nap | grep 600
tcp        0      0 ::ffff:192.168.1.15:60020   :::*                        LISTEN      23282/java
tcp        0      0 :::60030                    :::*                        LISTEN      23282/java
tcp        0      0 ::ffff:192.168.1.13:45783   ::ffff:192.168.1.27:60000   ESTABLISHED 23282/java

Help.
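
In case it helps anyone reproduce this: the netstat output above suggests that port 60020 is bound only to 192.168.1.15, while the region server's outgoing connection to the master at 192.168.1.27:60000 leaves from 192.168.1.13. A minimal Python sketch for probing both addresses from another node is below. The helper name `can_connect` and the 2-second timeout are my own assumptions, not anything from HBase.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host and timeouts.
        return False
```

Calling `can_connect("192.168.1.13", 60020)` and `can_connect("192.168.1.15", 60020)` from the master host would show which of node3's two addresses actually answers on the region server port.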