Hi,

I think the problem is this: /etc/hosts resolves node3 to 192.168.1.15 (eth1),
but HBase internally sometimes uses 192.168.1.13 (eth0).
When I run "ifdown eth0" on node3 and then run stop-hbase.sh, the master log
keeps repeating these messages and the shutdown does not complete:

2011-07-06 10:25:50,683 DEBUG org.apache.hadoop.hbase.master.RegionManager:
telling meta scanner to stop
2011-07-06 10:25:50,683 DEBUG org.apache.hadoop.hbase.master.RegionManager:
meta and root scanners notified
2011-07-06 10:26:00,685 DEBUG org.apache.hadoop.hbase.master.RegionManager:
telling root scanner to stop
2011-07-06 10:26:00,685 DEBUG org.apache.hadoop.hbase.master.RegionManager:
telling meta scanner to stop
2011-07-06 10:26:00,685 DEBUG org.apache.hadoop.hbase.master.RegionManager:
meta and root scanners notified
2011-07-06 10:26:10,687 DEBUG org.apache.hadoop.hbase.master.RegionManager:
telling root scanner to stop
2011-07-06 10:26:10,687 DEBUG org.apache.hadoop.hbase.master.RegionManager:
telling meta scanner to stop
2011-07-06 10:26:10,687 DEBUG org.apache.hadoop.hbase.master.RegionManager:
meta and root scanners notified
2011-07-06 10:26:20,689 DEBUG org.apache.hadoop.hbase.master.RegionManager:
telling root scanner to stop
2011-07-06 10:26:20,689 DEBUG org.apache.hadoop.hbase.master.RegionManager:
telling meta scanner to stop
2011-07-06 10:26:20,689 DEBUG org.apache.hadoop.hbase.master.RegionManager:
meta and root scanners notified
2011-07-06 10:26:30,691 DEBUG org.apache.hadoop.hbase.master.RegionManager:
telling root scanner to stop

When I then run "ifup eth0" on node3, everything works and HBase stops
normally:
2011-07-06 10:28:47,139 INFO org.apache.hadoop.hbase.master.ServerManager:
Region server node3,60020,1309860160318 quiesced
2011-07-06 10:28:47,139 INFO org.apache.hadoop.hbase.master.ServerManager:
All user tables quiesced. Proceeding with shutdown
2011-07-06 10:28:47,139 DEBUG org.apache.hadoop.hbase.master.RegionManager:
telling root scanner to stop
2011-07-06 10:28:47,139 DEBUG org.apache.hadoop.hbase.master.RegionManager:
telling meta scanner to stop
2011-07-06 10:28:47,139 DEBUG org.apache.hadoop.hbase.master.RegionManager:
meta and root scanners notified
2011-07-06 10:28:47,338 INFO org.apache.hadoop.hbase.master.ServerManager:
Removing server's info node3,60020,1309860160318
2011-07-06 10:28:47,338 INFO org.apache.hadoop.hbase.master.ServerManager:
Region server node3,60020,1309860160318: MSG_REPORT_EXITING
2011-07-06 10:28:50,719 INFO org.apache.hadoop.hbase.master.HMaster:
Stopping infoServer
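
The mismatch described at the top of this mail can be checked mechanically. The
following is only a minimal sketch, not anything HBase itself provides: the
function names parse_hosts and advertises_other_address are made up for
illustration, and the sample text is taken from node3's /etc/hosts quoted
below. It compares the address that /etc/hosts gives for a hostname with an
"ip:port" string such as the one read from the /hbase/root-region-server ZNode.

```python
def parse_hosts(text):
    """Return {hostname: ip} from /etc/hosts-style text (first entry wins)."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)
    return mapping


def advertises_other_address(hosts_text, hostname, advertised):
    """True if 'ip:port' in `advertised` differs from the hosts entry."""
    expected = parse_hosts(hosts_text).get(hostname)
    advertised_ip = advertised.rsplit(":", 1)[0]
    return expected is not None and expected != advertised_ip


# Sample data copied from the quoted mail (hypothetical usage).
HOSTS = """\
127.0.0.1 localhost.localdomain localhost
192.168.1.27 hadoop5
192.168.1.15 node3
"""

if __name__ == "__main__":
    # Address read from /hbase/root-region-server in the master log.
    print(advertises_other_address(HOSTS, "node3", "192.168.1.13:60020"))
```

With the values from the logs, the root-region-server entry (192.168.1.13)
does not match the hosts entry for node3 (192.168.1.15), while the
/hbase/rs entry (192.168.1.15) does.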

2011/7/5 Jameson Li <hovlj...@gmail.com>

> Hi,
>
> When I start my HBase cluster, there are some errors in the master log
> (node3, 192.168.1.15 and 192.168.1.13 are all the same machine, which has
> two NICs):
> 2011-07-05 17:13:13,820 INFO org.apache.zookeeper.ClientCnxn:
> zookeeper.disableAutoWatchReset is false
> 2011-07-05 17:13:13,840 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server node3/192.168.1.15:2181
> ....
> 2011-07-05 17:13:13,975 DEBUG org.apache.hadoop.hbase.master.HMaster:
> Checking cluster state...
> 2011-07-05 17:13:13,979 DEBUG
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode
> /hbase/root-region-server got 192.168.1.13:60020
> ....
> 2011-07-05 17:13:19,732 DEBUG
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Updated ZNode
> /hbase/rs/1309857199677 with data 192.168.1.15:60020
>  ....
> 2011-07-05 17:22:01,041 INFO org.apache.hadoop.ipc.HbaseRPC: Server at /192.168.1.13:60020 could not be reached after 1 tries, giving up.
> 2011-07-05 17:22:01,042 WARN org.apache.hadoop.hbase.master.BaseScanner: Scan one META region: {server: 192.168.1.13:60020, regionname: .META.,,1, startKey: <>}
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy to /192.168.1.13:60020 after attempts=1
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:429)
>         at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getHRegionConnection(HConnectionManager.java:918)
>         at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getHRegionConnection(HConnectionManager.java:934)
>         at
> org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:173)
>         at
> org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
>         at
> org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
>         at
> org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:153)
>         at org.apache.hadoop.hbase.Chore.run(Chore.java:68)
>
> Sometimes the .META. region is not assigned to node3, which has two NICs
> (eth0: 192.168.1.13 and eth1: 192.168.1.15) and resolves in DNS/hosts as
> 192.168.1.15 node3. That is, when .META. is assigned to one of the other
> servers, which have only one NIC, HBase works well.
>
> Here is some information about my HBase cluster:
> HBase version: 0.20.6
> Hadoop version: 0.20-append+4
> ZooKeeper version: 3.3.0
>
> the hbase-site.xml:
> <configuration>
> <property>
> <name>hbase.rootdir</name>
> <value>hdfs://node3:54310/hbase</value>
> </property>
>
> <property>
> <name>hbase.master</name>
> <value>hadoop5:60000</value>
> </property>
>
> <property>
> <name>hbase.zookeeper.quorum</name>
> <value>node3,hadoop5,hadoopoffice85,hadoopoffice88,hdofficelj001</value>
> </property>
>
> <property>
>     <name>hbase.cluster.distributed</name>
>     <value>true</value>
>   </property>
>
>    <!--property>
>     <name>hbase.master.dns.interface</name>
>     <value>eth1</value>
>     <description>The name of the Network Interface from which a master
>       should report its IP address.
>     </description>
>   </property>
>
> <property>
>     <name>hbase.regionserver.dns.interface</name>
>     <value>eth1</value>
>     <description>The name of the Network Interface from which a region
> server
>       should report its IP address.
>     </description>
>   </property>
>
> <property>
>     <name>hbase.zookeeper.dns.interface</name>
>     <value>eth1</value>
>     <description>The name of the Network Interface from which a ZooKeeper
> server
>       should report its IP address.
>     </description>
>   </property-->
>
> <!--property>
>       <name>hbase.zookeeper.property.clientPort</name>
>       <value>2222</value>
>       <description>Property from ZooKeeper's config zoo.cfg.
>       The port at which the clients will connect.
>       </description>
>     </property>
> <property>
>       <name>hbase.zookeeper.property.dataDir</name>
>       <value>/opt/zookeeper/data</value>
>       <description>Property from ZooKeeper's config zoo.cfg.
>       The directory where the snapshot is stored.
>       </description>
>     </property-->
> </configuration>
>
> cat /opt/hbase/conf/regionservers
> hadoop5
> node3
> hadoopoffice85
> hadoopoffice88
> hdofficelj001
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Below is node3's network information.
> ifconfig output (eth0 is 192.168.1.13, eth1 is 192.168.1.15):
> [root@node3 ~]# ifconfig
> eth0      Link encap:Ethernet  HWaddr 00:0C:29:23:2E:D3
>           inet addr:192.168.1.13  Bcast:192.168.1.255  Mask:255.255.255.0
>           inet6 addr: fe80::20c:29ff:fe23:2ed3/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:1424620 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:17897973 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:150231810 (143.2 MiB)  TX bytes:2834085782 (2.6 GiB)
>           Base address:0x2000 Memory:d8920000-d8940000
>
> eth1      Link encap:Ethernet  HWaddr 00:0C:29:23:2E:DD
>           inet addr:192.168.1.15  Bcast:192.168.1.255  Mask:255.255.255.0
>           inet6 addr: fe80::20c:29ff:fe23:2edd/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:1172226 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:1445 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:168873362 (161.0 MiB)  TX bytes:293447 (286.5 KiB)
>           Base address:0x2040 Memory:d8940000-d8960000
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:370550 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:370550 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:64864387 (61.8 MiB)  TX bytes:64864387 (61.8 MiB)
>
> the hosts info:
> [root@node3 ~]# cat /etc/hosts
> # Do not remove the following line, or various programs
> # that require network functionality will fail.
> 127.0.0.1 localhost.localdomain localhost
> ::1 localhost6.localdomain6 localhost6
> 192.168.1.27 hadoop5
> 192.168.1.15 node3
> 192.168.1.85 hadoopoffice85
> 192.168.1.88 hadoopoffice88
> 192.168.3.227 hdofficelj001
>
> [root@node3 ~]# netstat -nap | grep 600
> tcp        0      0 ::ffff:192.168.1.15:60020   :::*                        LISTEN      19064/java
> tcp        0      0 :::60030                    :::*                        LISTEN      19064/java
> tcp        0      0 ::ffff:192.168.1.13:44350   ::ffff:192.168.1.27:60000   ESTABLISHED 19064/java
>
> [root@node3 ~]# route
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 239.2.11.71     *               255.255.255.255 UH    0      0        0 eth1
> 192.168.1.0     *               255.255.255.0   U     0      0        0 eth0
> 192.168.1.0     *               255.255.255.0   U     0      0        0 eth1
> 169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
> default         192.168.1.254   0.0.0.0         UG    0      0        0 eth0
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> I have set the dns.interface properties to eth1, but I still get the same
> error:
> <property>
>     <name>hbase.master.dns.interface</name>
>     <value>eth1</value>
>     <description>The name of the Network Interface from which a master
>       should report its IP address.
>     </description>
>   </property>
>
> <property>
>     <name>hbase.regionserver.dns.interface</name>
>     <value>eth1</value>
>     <description>The name of the Network Interface from which a region
> server
>       should report its IP address.
>     </description>
>   </property>
>
> <property>
>     <name>hbase.zookeeper.dns.interface</name>
>     <value>eth1</value>
>     <description>The name of the Network Interface from which a ZooKeeper
> server
>       should report its IP address.
>     </description>
>   </property>
>
>
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> I also changed the default route to eth1, but I still get the same error.
> [root@node3 ~]# route del default
> [root@node3 ~]# route add -net default gw 192.168.1.254 eth1
> [root@node3 ~]# route
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 239.2.11.71     *               255.255.255.255 UH    0      0        0 eth1
> 192.168.1.0     *               255.255.255.0   U     0      0        0 eth0
> 192.168.1.0     *               255.255.255.0   U     0      0        0 eth1
> 169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
> default         192.168.1.254   0.0.0.0         UG    0      0        0 eth1
> [root@node3 ~]# netstat -nap | grep 600
> tcp        0      0 ::ffff:192.168.1.15:60020   :::*                        LISTEN      23282/java
> tcp        0      0 :::60030                    :::*                        LISTEN      23282/java
> tcp        0      0 ::ffff:192.168.1.13:45783   ::ffff:192.168.1.27:60000   ESTABLISHED 23282/java
>
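
A possible reading of the two route tables and netstat outputs above: the
connection to the master (192.168.1.27:60000) is on the local 192.168.1.0/24
subnet, so the default route may never be consulted for it. The sketch below
models that with a plain longest-prefix match. It is not the kernel's actual
algorithm: the pick_interface helper is hypothetical, and the tie-break (the
first listed route wins among equal prefixes) is an assumption that was not
verified on node3.

```python
import ipaddress

# (destination, genmask, interface) rows copied from the first "route" output.
ROUTES = [
    ("239.2.11.71", "255.255.255.255", "eth1"),
    ("192.168.1.0", "255.255.255.0", "eth0"),
    ("192.168.1.0", "255.255.255.0", "eth1"),
    ("169.254.0.0", "255.255.0.0", "eth1"),
    ("0.0.0.0", "0.0.0.0", "eth0"),  # default route
]


def pick_interface(dest, routes):
    """Longest-prefix match. ASSUMPTION: ties go to the first listed route."""
    target = ipaddress.ip_address(dest)
    best_iface, best_len = None, -1
    for network, mask, iface in routes:
        net = ipaddress.ip_network(f"{network}/{mask}")
        # Strict '>' keeps the earlier route when prefix lengths are equal.
        if target in net and net.prefixlen > best_len:
            best_iface, best_len = iface, net.prefixlen
    return best_iface


# Same table with only the default route moved to eth1 (second "route" output).
ROUTES_AFTER_CHANGE = ROUTES[:-1] + [("0.0.0.0", "0.0.0.0", "eth1")]

if __name__ == "__main__":
    print(pick_interface("192.168.1.27", ROUTES))
    print(pick_interface("192.168.1.27", ROUTES_AFTER_CHANGE))
```

Under these assumptions both tables select eth0 for 192.168.1.27, which would
be consistent with the ESTABLISHED connection still showing 192.168.1.13 as
its source address after the default route was changed.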
> Help.
>
