Short of doing an overhaul of hadoop/hbase ...
Here's the issue: both Hadoop and HBase are currently designed to use one NIC to talk both to the outside world and to each other. It kind of makes sense, because Hadoop doesn't have a single point of entry to the cloud. Each client talks with the namenode and then with the specific datanode where the file is located, right? (HDFS) So there appears to be an assumption of a single interface. Since HBase sits on top of HDFS, it seems to inherit this design.

To be honest, it's a simple design and it works for most people.

Also note that if your nodes have more than 4 SATA drives, say 8 drives in a 2U box, you can exceed the capacity of a 1 GbE port. In that case, if you have two NIC ports, you'll want to bond them and still present a single IP address to the rest of the world. (In some of our testing we've found that with 4 SATA drives, under load, we're hitting about 80% of a 1 GbE port's capacity. Your mileage will vary...)

I'm not sure what you mean when you say you rent the hardware. Without changing the physical hardware, can you change your configuration in software? Can you assign IP addresses to ports, and configure HBase to use the NIC that sees the outside world?

On a side note: in theory, you could change HDFS and the other configurations to allow for multiple NICs. By this I mean changing the networking to allow one to specify which type of traffic goes on which address...

> Date: Thu, 4 Nov 2010 20:30:47 +0800
> From: [email protected]
> To: [email protected]
> Subject: Re: Re: HBase failure causes dual NIC ?
>
> Ted, I appreciate your help!
> I have read HBASE-2502 and the mail thread "hbase with multiple interfaces".
>
> I rent the cluster to test HBase, so I can't modify the hardware
> configuration.
> (Maybe others need the dual NICs)
>
> Can I temporarily modify some code to fix the issue?
> (e.g. replace the process that looks up the IP address in DNS with a
> hardcoded (fixed) IP address)
> Can anybody give me some clues?
>
> 2010-11-04
>
> Pan.W
>
> From: Ted Yu
> Sent: 2010-11-04 17:37:56
> To: user
> Cc:
> Subject: Re: HBase failure causes dual NIC ?
>
> See https://issues.apache.org/jira/browse/HBASE-2502
> Deactivate one of the dual NICs.
> On Thu, Nov 4, 2010 at 1:13 AM, Pan.W <[email protected]> wrote:
> > Hi, HBaser
> >
> > I'm currently trying to run HBase, but some errors occur.
> >
> > Running environment:
> > CentOS release 5.5
> > hadoop-0.20.2
> > hbase-0.20.6
> >
> > I use two machines to run hbase (just to illustrate this issue).
> > Master is : 192.168.22.18 /192.168.25.18
> > RegionServer is : 192.168.22.19 /192.168.25.19
> > In my cluster, every machine has dual NICs. Maybe that's the problem, I
> > guess...
> > ~~~~~~~
> >
> > In hbase-site.xml, some of the configuration is:
> > <property>
> > <name>hbase.zookeeper.quorum</name>
> > <value>192.168.25.18, 192.168.25.19</value>
> > ...
> > </property>
> >
> > After running start-hbase.sh, the relevant processes have been started!
> > Run "hbase shell" to execute some commands:
> > --------------------------------------------------------
> > hbase(main):002:0> create "table1","cf1"
> > NativeException: java.io.IOException: java.io.IOException:
> > java.lang.NullPointerException
> > at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:785)
> > at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:762)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > at java.lang.reflect.Method.invoke(Method.java:597)
> > at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
> > at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> >
> > hbase(main):013:0> list
> > 10/11/04 15:43:28 INFO ipc.HbaseRPC: Server at /192.168.25.19:60020 could
> > not be reached after 1 tries, giving up.
> > -----------------------------------------------------------
> >
> > Then executing "zk_dump" gives the following information:
> > -------------------------------------------------------------------
> > Version: 0.20.6, r965666, Mon Jul 19 16:54:48 PDT 2010
> > hbase(main):001:0> zk_dump
> > HBase tree in ZooKeeper is rooted at /hbase
> > Cluster up? true
> > In safe mode? false
> > Master address: 192.168.25.18:60000
> > Region server holding ROOT: 192.168.25.19:60020
> > Region servers:
> > - 192.168.22.19:60020
> > Quorum Server Statistics:
> > - 192.168.25.19:2181
> > Zookeeper version: 3.2.2-888565, built on 12/08/2009 21:51 GMT
> > Clients:
> > /192.168.25.18:53266[1](queued=0,recved=0,sent=0)
> > Latency min/avg/max: 0/0/0
> > Received: 0
> > Sent: 0
> > Outstanding: 0
> > Zxid: 0xc0000000d
> > Mode: leader
> > Node count: 11
> > - 192.168.25.18:2181
> > Zookeeper version: 3.2.2-888565, built on 12/08/2009 21:51 GMT
> > Clients:
> > /192.168.25.18:52198[1](queued=0,recved=0,sent=0)
> > /192.168.25.18:59354[1](queued=0,recved=4115,sent=0)
> > /192.168.25.19:41012[1](queued=0,recved=4106,sent=0)
> > /192.168.25.18:52195[1](queued=0,recved=10,sent=0)
> > Latency min/avg/max: 0/1/22
> > Received: 8251
> > Sent: 0
> > Outstanding: 0
> > Zxid: 0xc0000000d
> > Mode: follower
> > Node count: 11
> > ------------------------------------------------------------
> >
> > From the information returned by zk_dump, it looks like inconsistent
> > IP addresses are being used simultaneously.
> >
> > Any help is greatly appreciated!
> >
> > 2010-11-04
> >
> > Pan.W
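
The mismatch in the quoted zk_dump output (ROOT held at 192.168.25.19:60020 while the region server registers as 192.168.22.19:60020) can be spotted mechanically. Below is a minimal sketch, not a supported tool. It assumes the dump text has been saved into a string, and that both NICs of a machine share the last octet of their address, which happens to be true for this cluster but is nothing HBase guarantees. The function name and the default port list are made up for illustration.

```python
import re
from collections import defaultdict

# Matches a dotted-quad IPv4 address followed by a port, e.g. 192.168.25.19:60020
ADDR_RE = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def hosts_on_multiple_subnets(dump_text, ports=("60000", "60020")):
    """Report hosts whose HBase server address shows up on more than one subnet.

    Assumption (true for this cluster only): the two NICs of a machine share
    the last octet, so 192.168.22.19 and 192.168.25.19 are the same host.
    """
    subnets_by_host = defaultdict(set)
    for ip, port in ADDR_RE.findall(dump_text):
        if port not in ports:
            continue  # ignore ZooKeeper (2181) and ephemeral client ports
        subnet, _, host = ip.rpartition(".")
        subnets_by_host[host].add(subnet)
    # Keep only the hosts that were seen on two or more subnets
    return {h: sorted(s) for h, s in subnets_by_host.items() if len(s) > 1}


# The relevant lines from the zk_dump output quoted in this thread
sample = """\
Master address: 192.168.25.18:60000
Region server holding ROOT: 192.168.25.19:60020
Region servers:
 - 192.168.22.19:60020
"""

print(hosts_on_multiple_subnets(sample))
```

Any host that appears in the result is being advertised under both networks, which is the inconsistency described above.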
