Sorry, it is:

Changed
127.0.0.1 localhost localhost.localdomain
127.0.1.1 hsreekumar-lt.Clickablecorp.com hsreekumar-lt

to
127.0.0.1 localhost localhost.localdomain hsreekumar-lt.Clickablecorp.com hsreekumar-lt
#127.0.1.1 hsreekumar-lt.Clickablecorp.com hsreekumar-lt

On Thu, Jun 2, 2011 at 11:18 AM, Hari Sreekumar <[email protected]> wrote:

> Hey,
>
> I had the same problem.. it seems it's because of the 127.0.1.1 entry in
> /etc/hosts (which is default in ubuntu I think, but I haven't seen it in
> CentOS systems).
>
> Changed
> 127.0.0.1 localhost localhost.localdomain
> 127.0.1.1 hsreekumar-lt.corp1.com hsreekumar-lt
>
> to
> 127.0.0.1 localhost localhost.localdomain hsreekumar-lt.Clickablecorp.com hsreekumar-lt
> #127.0.1.1 hsreekumar-lt.corp1.com hsreekumar-lt
>
> See if it fixes your problem.. though I am not sure what the side
> effects of this will be, or whether some other programs will break.
>
> Thanks,
> Hari
>
> On Wed, Jun 1, 2011 at 11:29 PM, Stack <[email protected]> wrote:
>
>> On Tue, May 31, 2011 at 11:45 PM, Sean Bigdatafun
>> <[email protected]> wrote:
>> > Sure. Thanks, St.Ack. Here are the attached HBase logs, plus the screenshot
>> > of the region server. The /etc/hosts should be Ok I think because my Hadoop
>> > (pseudo distributed) cluster runs well and healthy.
>>
>> FYI, what works for hadoop may not work for hbase.
>>
>> > But I post it here in
>> > case I missed something :-0
>> >
>> > 127.0.0.1 localhost
>> > 127.0.1.1 sean-PowerEdge
>> >
>> > # The following lines are desirable for IPv6 capable hosts
>> > ::1 ip6-localhost ip6-loopback localhost6
>> > fe00::0 ip6-localnet
>> > ff00::0 ip6-mcastprefix
>> > ff02::1 ip6-allnodes
>> > ff02::2 ip6-allrouters
>> >
>>
>> Try turning off ipv6. In the past it's been fingered as problem-causing.
>>
>> Looking in your logs:
>>
>> + Make sure you fix this before you put any significant data into
>> hbase 'ulimit -n 1024'
>>
>> So, yeah, it looks like your /etc/hosts needs fixing. When the
>> regionserver does its lookup it's finding its hostname to be localhost:
>>
>> 2011-05-31 23:32:44,742 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=localhost,60020,1306909960650, regionCount=0, userLoad=false
>>
>> But then when the master tries to send it a region, it's trying to send it to
>>
>> 2011-05-31 23:32:47,671 INFO org.apache.hadoop.ipc.HbaseRPC: Server at /127.0.0.1:60020 could not be reached after 1 tries, giving up.
>>
>> .... notice the 127.0.0.1 above.
>>
>> Fix this discrepancy.
>>
>> St.Ack
>>
>> > Thanks,
>> > Sean
>> >
>> > On Mon, May 30, 2011 at 7:34 PM, Stack <[email protected]> wrote:
>> >>
>> >> Odd. I don't see the regionserver checking into the master (maybe
>> >> that's the way it is in pseudo-distributed and I just forgot). Can you
>> >> paste more master log? I don't see the regionserver coming in in the
>> >> snippet you've pasted so not sure how it's registering itself (I see
>> >> the timeout when we try to assign it -ROOT-).
>> >>
>> >> What's in your /etc/hosts? I see lots of localhost and 127.0.0.1.
>> >> Maybe the two are not equated in your resolve setup?
>> >>
>> >> St.Ack
>> >>
>> >> On Sat, May 28, 2011 at 11:28 PM, Sean Bigdatafun
>> >> <[email protected]> wrote:
>> >> > I am trying for 0.90.1 (hbase-0.90.1-CDH3B4) under pseudo-dist mode, and met
>> >> > the problem of HMaster crashing. Here is how I did.
>> >> >
>> >> > I. First I installed Hadoop pseudo cluster (hadoop-0.20.2-CDH3B4) with the
>> >> > following conf edited.
>> >> >
>> >> > 1) core-site.xml ==>
>> >> > <property>
>> >> >   <name>fs.default.name</name>
>> >> >   <value>hdfs://localhost:9000</value>
>> >> > </property>
>> >> >
>> >> > 2) hdfs-site.xml ==>
>> >> > <property>
>> >> >   <name>dfs.replication</name>
>> >> >   <value>1</value>
>> >> > </property>
>> >> >
>> >> > (with above confs, start-all.sh was run, and the hadoop pseudo cluster
>> >> > started to run happily)
>> >> >
>> >> > Secondly, I installed hbase-0.90.1-CDH3B4 with the following conf
>> >> > edited.
>> >> >
>> >> > hbase-site.xml ==>
>> >> > <property>
>> >> >   <name>hbase.rootdir</name>
>> >> >   <value>hdfs://localhost:9000/hbase</value>
>> >> > </property>
>> >> >
>> >> > <property>
>> >> >   <name>hbase.cluster.distributed</name>
>> >> >   <value>true</value>
>> >> > </property>
>> >> >
>> >> > <property>
>> >> >   <name>hbase.zookeeper.quorum</name>
>> >> >   <value>localhost</value>
>> >> > </property>
>> >> >
>> >> > <property>
>> >> >   <name>dfs.replication</name>
>> >> >   <value>1</value>
>> >> >   <description>The replication count for HLog and HFile storage. Should
>> >> >   not be greater than HDFS datanode count.
>> >> >   </description>
>> >> > </property>
>> >> >
>> >> > (with the above conf, I run the command of hbase-start.sh, and I realised
>> >> > that HMaster did not function well -- i can't access localhost:60010)
>> >> >
>> >> > II. Here is the HMaster error log:
>> >> >
>> >> > 2011-05-28 23:22:55,292 WARN org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable location to assign region -ROOT-,,0.70236052
>> >> > 2011-05-28 23:23:35,291 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: -ROOT-,,0.70236052 state=OFFLINE, ts=1306650175292
>> >> > 2011-05-28 23:23:35,291 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE for too long, reassigning -ROOT-,,0.70236052 to a random server
>> >> > 2011-05-28 23:23:35,291 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=-ROOT-,,0.70236052 state=OFFLINE, ts=1306650175292
>> >> > 2011-05-28 23:23:35,291 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for region -ROOT-,,0.70236052; plan=hri=-ROOT-,,0.70236052, src=, dest=localhost,60020,1306648534687
>> >> > 2011-05-28 23:23:35,291 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region -ROOT-,,0.70236052 to localhost,60020,1306648534687
>> >> > 2011-05-28 23:23:35,291 DEBUG org.apache.hadoop.hbase.master.ServerManager: New connection to localhost,60020,1306648534687
>> >> > 2011-05-28 23:23:35,292 INFO org.apache.hadoop.ipc.HbaseRPC: Server at /127.0.0.1:60020 could not be reached after 1 tries, giving up.
>> >> > 2011-05-28 23:23:35,292 WARN org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of -ROOT-,,0.70236052 to serverName=localhost,60020,1306648534687, load=(requests=0, regions=0, usedHeap=22, maxHeap=996), trying to assign elsewhere instead; retry=0
>> >> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to /127.0.0.1:60020 after attempts=1
>> >> >   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:355)
>> >> >   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:954)
>> >> >   at org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManager.java:606)
>> >> >   at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:541)
>> >> >   at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:901)
>> >> >   at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:730)
>> >> >   at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:710)
>> >> >   at org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor.chore(AssignmentManager.java:1605)
>> >> >   at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
>> >> > Caused by: java.net.ConnectException: Connection refused
>> >> >   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>> >> >   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>> >> >   at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>> >> >   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
>> >> >   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
>> >> >   at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
>> >> >   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
>> >> >   at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>> >> >   at $Proxy6.getProtocolVersion(Unknown Source)
>> >> >   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
>> >> >   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
>> >> >   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
>> >> >   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
>> >> >   ... 8 more
>> >> > 2011-05-28 23:23:35,292 WARN org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable location to assign region -ROOT-,,0.70236052
>> >> >
>> >> > III. Here is the zk status from http://localhost:60010/zk.jsp
>> >> >
>> >> > HBase is rooted at /hbase
>> >> > Master address: sean-PowerEdge:60000
>> >> > Region server holding ROOT: null
>> >> > Region servers:
>> >> >   sean-PowerEdge:60020
>> >> > Quorum Server Statistics:
>> >> >   localhost:2181
>> >> >   Zookeeper version: 3.3.2-CDH3B4--1, built on 02/21/2011 20:16 GMT
>> >> >   Clients:
>> >> >   /127.0.0.1:42221[0](queued=0,recved=1,sent=0)
>> >> >   /127.0.0.1:44071[1](queued=0,recved=39,sent=44)
>> >> >   /127.0.0.1:44078[1](queued=0,recved=23,sent=24)
>> >> >   /127.0.0.1:44085[1](queued=0,recved=23,sent=23)
>> >> >   /127.0.0.1:44077[1](queued=0,recved=19,sent=19)
>> >> >
>> >> >   Latency min/avg/max: 0/6/164
>> >> >   Received: 105
>> >> >   Sent: 110
>> >> >   Outstanding: 0
>> >> >   Zxid: 0x148
>> >> >   Mode: standalone
>> >> >   Node count: 12
>> >> >
>> >> > What's the problem causing the above symptom?
>> >> >
>> >> > Thanks,
>> >> > --
>> >> > --Sean
>> >> >
>> >
>> > --
>> > --Sean
>> >
>
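
For anyone else hitting this: a rough way to spot the pattern discussed in this thread is to scan the hosts file for names bound to a loopback address other than 127.0.0.1 (such as the 127.0.1.1 entry). The sketch below is only an illustration. The function names are made up here, the BEFORE sample is the /etc/hosts excerpt posted above, and the AFTER sample is an assumption (the change described at the top of this mail applied to that file, not something that was actually posted or tested on that machine).

```python
import ipaddress


def parse_hosts(text):
    """Return a list of (address, [names]) pairs from hosts-file text."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address with no names is not a usable entry
        entries.append((fields[0], fields[1:]))
    return entries


def extra_loopback_names(text):
    """Map each name bound to a 127.x.x.x address other than 127.0.0.1 to that address."""
    found = {}
    for addr, names in parse_hosts(text):
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            continue  # skip anything that is not a valid IP address
        if ip.version == 4 and ip.is_loopback and str(ip) != "127.0.0.1":
            for name in names:
                found.setdefault(name, str(ip))
    return found


# Excerpt of the hosts file posted earlier in the thread.
BEFORE = """\
127.0.0.1 localhost
127.0.1.1 sean-PowerEdge
::1 ip6-localhost ip6-loopback localhost6
"""

# Assumption: the same file after moving the hostname onto the 127.0.0.1 line.
AFTER = """\
127.0.0.1 localhost sean-PowerEdge
#127.0.1.1 sean-PowerEdge
::1 ip6-localhost ip6-loopback localhost6
"""

print(extra_loopback_names(BEFORE))  # {'sean-PowerEdge': '127.0.1.1'}
print(extra_loopback_names(AFTER))   # {}
```

An empty result only means the hosts file no longer has the extra loopback entry; it does not prove that the master and regionserver now agree on the address, so the master log is still the thing to check.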
