Hi Laxmi,

Sorry for my late reply. Is your total RAM 6 GB? If that is the total memory of the computer, the operating system and other processes use it too, so you should decrease your heap size. You can read about every property in the HBase book, but I will try to explain why these properties are used.
- You should increase your *hbase.client.scanner.caching* property from 1 to 200. Gora has no filter method yet, so in Nutch 2.x every row is fetched from HBase for the GeneratorJob or FetchJob, which is expensive. If you increase this property as much as possible, each map/reduce task needs far fewer round trips.
- *hbase.regionserver.handler.count* is the number of RPC listener instances spun up on RegionServers. A little note on why the RPC count matters: "The importance of reducing the number of separate RPC calls is tied to the round-trip time, which is the time it takes for a client to send a request and the server to send a response over the network. This does not include the time required for the data transfer; it is simply the overhead of sending packets over the wire. On average, a round trip takes about 1 ms on a LAN, which means you can handle only 1,000 round trips per second. The other important factor is the message size: if you send large requests over the network, you already need a much lower number of round trips, as most of the time is spent transferring data. But when doing, for example, counter increments, which are small in size, you will see better performance when batching updates into fewer requests." (From: HBase: The Definitive Guide, page 86.)
- *hbase.client.write.buffer* is the default size of the HTable client write buffer in bytes. A bigger buffer takes more memory on both the client and server side, since the server instantiates the passed write buffer to process it, but a larger buffer reduces the number of RPCs made. For an estimate of the server-side memory used (your region server runs on the same machine), evaluate hbase.client.write.buffer * hbase.regionserver.handler.count per region server.
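To make the round-trip arithmetic above concrete, here is a small back-of-the-envelope sketch. It assumes the ~1 ms LAN round trip quoted from the book, ignores data-transfer time, and uses illustrative row counts only:

```python
# Rough model of scan RPC overhead. Assumes ~1 ms per round trip on a LAN
# (the figure quoted above) and ignores actual data-transfer time.
ROUND_TRIP_MS = 1.0

def scan_overhead_ms(rows, caching):
    """Round-trip overhead for scanning `rows` rows with a given
    hbase.client.scanner.caching value (one RPC per cached batch)."""
    rpcs = -(-rows // caching)  # ceiling division
    return rpcs * ROUND_TRIP_MS

# Scanning 100,000 rows:
print(scan_overhead_ms(100_000, 1))    # caching=1   -> 100000.0 ms of overhead
print(scan_overhead_ms(100_000, 200))  # caching=200 -> 500.0 ms
```

With caching at 1, round-trip overhead alone dominates a full-table scan, which is why Nutch jobs that read every row suffer so much.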
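The write-buffer memory estimate can be worked through with the values that appear later in this thread (20 MB buffer, 20 handlers; these are Talat's numbers, not universal recommendations):

```python
# Server-side memory that client write buffers can pin per region server:
# hbase.client.write.buffer * hbase.regionserver.handler.count
write_buffer_bytes = 20_971_520   # hbase.client.write.buffer (20 MB)
handler_count = 20                # hbase.regionserver.handler.count

per_regionserver_bytes = write_buffer_bytes * handler_count
print(per_regionserver_bytes // (1024 * 1024))  # -> 400 (MB per region server)
```

So on a 6 GB machine that also runs Hadoop and Nutch, those two settings alone can claim 400 MB of region-server heap in the worst case.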
In my opinion, your problem is *hbase.client.scanner.caching*. If you increase it, your problem should be solved. If not, you can try the other properties.
Have a nice day,
Talat

On 21-10-2013 04:31, A Laxmi wrote:
Hi Talat - Since I am running HBase in pseudo-distributed mode for Nutch, I have changed some of your properties: *hbase.client.scanner.caching* to 1, *hbase.regionserver.handler.count* to 10 and *hbase.client.write.buffer* to 2097152. I emailed the hbase-user list with the out-of-memory error, hoping to find some help. I don't understand why crawling goes fine until about the 5th iteration, and then, during the parsing stage of some URLs in that iteration, Nutch just hangs with an out-of-memory error: heap space. I have 6 GB of RAM, but I am not crawling a million records; as a test run I am just trying to crawl a URL with depth 7 and topN 1000. I am not sure what can be done in this case. Thanks for your help, Laxmi

On Sun, Oct 20, 2013 at 3:55 AM, Talat UYARER <[email protected]> wrote:

Hey Laxmi, First of all, please send your email to our mailing list; maybe somebody can share their experiences. If you use my settings without changing any values, your heap will run out of memory. That is normal: I have 64 GB of RAM on my datanodes. You should change my settings to suit your computer.

On 20-10-2013 05:13, A Laxmi wrote:

Hi Talat! Update - So I added some of the properties you recommended for tuning, and I have some good and bad news. The good news is that the RegionServer was not getting disconnected under heavy crawl (thanks to Talat!!!!), and the bad news is that I am getting an out-of-memory: heap space exception in the 5th crawl iteration. I have set 8 GB for the heap in hbase-env.sh. Not sure why I have this out-of-memory: heap space issue. Please comment. Thanks, Laxmi

On Fri, Oct 18, 2013 at 4:16 PM, A Laxmi <[email protected]> wrote:

Thanks Talat! I will try it out again, and you will be the first person I notify if mine works. I will keep you posted. Thanks, Laxmi

On Fri, Oct 18, 2013 at 3:52 PM, Talat UYARER <[email protected]> wrote:

For the issue, I didn't see any problem. You need some properties for tuning.
But it is not our subject. I share my hbase-site.xml; if I remember correctly, this issue is related only to HBase.

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hdpnn01.secret.local:8080/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>10737418240</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hdpzk01.secret.local,hdpzk02.secret.local,hdpzk03.secret.local</value>
  </property>
  <property>
    <name>hbase.client.scanner.caching</name>
    <value>200</value>
  </property>
  <property>
    <name>hbase.client.scanner.timeout.period</name>
    <value>120000</value>
  </property>
  <property>
    <name>hbase.regionserver.lease.period</name>
    <value>900000</value>
  </property>
  <property>
    <name>hbase.rpc.timeout</name>
    <value>900000</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.mslab.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>20</value>
  </property>
  <property>
    <name>hbase.client.write.buffer</name>
    <value>20971520</value>
  </property>
</configuration>

On 18-10-2013 20:34, A Laxmi wrote:

Thanks, Talat! I will try with those properties you recommended. Please look at my other properties here and let me know your comments.
HBase: 0.90.6
Hadoop: 0.20.205.0
Nutch: 2.2.1

Note: I have set 8 GB for the heap size in *hbase/conf/hbase-env.sh*

=========================================
Below is my *hbase/conf/hbase-site.xml*:
=========================================

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://localhost:8020/hbase</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>localhost</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/home/hadoop/hbase-0.90.6/zookeeper</value>
</property>
<property>
  <name>zookeeper.session.timeout</name>
  <value>60000</value>
</property>

=========================================
Below is my *hadoop/conf/hdfs-site.xml*:
=========================================

<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.safemode.extension</name>
  <value>0</value>
</property>
<property>
  <name>dfs.safemode.min.datanodes</name>
  <value>1</value>
</property>
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/dfs/tmp</value>
</property>

=========================================
Below is my *hadoop/conf/core-site.xml*:
=========================================

<property>
  <name>fs.default.name</name>
  <!-- <value>hdfs://0.0.0.0:8020</value> -->
  <value>hdfs://localhost:8020</value>
</property>

=========================================
Below is my *hadoop/conf/mapred-site.xml*:
=========================================

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>0.0.0.0:8021</value>
  </property>
  <property>
    <name>mapred.task.timeout</name>
    <value>3600000</value>
  </property>
</configuration>

On Fri, Oct 18, 2013 at 12:01 PM, Talat UYARER <[email protected]> wrote:

Ooh no :(. Sorry Laxmi, I had the same issue; I gave you the wrong settings :) You should set:

- hbase.client.scanner.caching (my value: 200)
- hbase.regionserver.handler.count (my value: 20)
- hbase.client.write.buffer (my value: 20971520)

You should set these for your region load. Then it will be solved. Talat

On 18-10-2013 17:52, A Laxmi wrote:

Hi Talat, I am sorry to say I have not fixed it yet. I have spent literally sleepless nights debugging that issue. No matter what I do, the RegionServer always gets disconnected. :( Since I now have a deadline in two days, I will go with the advice from one of your emails to use HBase *standalone* mode, since I am crawling about 10 URLs to reach about 300,000 URLs. Once I get that done, I will retry debugging the RegionServer issue. I remember you use Hadoop 1.x and not 0.20.205.0 like me; I am not sure if there is a bug in the version I am using? Thanks, Laxmi

On Fri, Oct 18, 2013 at 10:47 AM, Talat UYARER <[email protected]> wrote:

Hi Laxmi, You are welcome; I know that feeling very well. I haven't used Cloudera; I use a plain Hadoop/HBase cluster installed on CentOS. I am happy that you fixed your issue. Talat

On 18-10-2013 17:36, A Laxmi wrote:

Thanks for the article, Talat! It was so annoying to see the RegionServer getting disconnected under heavy load while everything else works. Have you used Cloudera for Nutch?
On Thu, Oct 17, 2013 at 6:50 PM, Talat UYARER <[email protected]> wrote:

Hi Laxmi, It didn't reach me. I understand: your RegionServer has gone away, because your HBase heap size or xceivers count is not enough. I had the same issue and raised my xceivers count. I am not sure what the count should be, but you should first check your heap usage; if that is enough, you can raise this property. This article [1] is very good about this property.

[1] http://blog.cloudera.com/blog/2012/03/hbase-hadoop-xceivers/

Talat

On 17-10-2013 22:20, A Laxmi wrote:

Hi Talat, I hope all is well. Could you please advise me on this issue? Thanks for your help!

---------- Forwarded message ----------
From: A Laxmi <[email protected]>
Date: Tue, Oct 15, 2013 at 12:07 PM
Subject: HBase Pseudo mode - RegionServer disconnects after some time
To: [email protected]

Hi - Please find below the log of the HBase master. I have tried all sorts of fixes mentioned in various threads, yet I could not overcome this issue. I made sure I don't have 127.0.1.1 in my /etc/hosts file. I pinged my localhost (hostname), which gives back the actual IP and not 127.0.0.1, using ping -c 1 localhost. I have 'localhost' in my /etc/hostname and the actual IP address mapped to localhost.localdomain with localhost as an alias, something like:

/etc/hosts - 192.***.*.*** localhost.localdomain localhost
/etc/hostname - localhost

I am using *Hadoop 0.20.205.0 and HBase 0.90.6 in pseudo-distributed mode* for storing data crawled by Apache Nutch 2.2.1. I can start Hadoop and HBase, and when I do jps everything looks good; then, after I start the Nutch crawl, after about 40 minutes of crawling or so I can see Nutch hanging in about the 4th iteration of parsing, and at the same time, when I do jps, I can see everything except HRegionServer.
Below is the log. I tried all possible ways but couldn't overcome this issue. I really need someone from the HBase list to help me with it.

2013-10-15 02:02:08,285 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Pushed=56 entries from hdfs://localhost:8020/hbase/.logs/127.0.0.1,60020,1381814216471/127.0.0.1%3A60020.1381816329235
2013-10-15 02:02:08,285 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting hlog 28 of 29: hdfs://localhost:8020/hbase/.logs/127.0.0.1,60020,1381814216471/127.0.0.1%3A60020.1381816367672, length=64818440
2013-10-15 02:02:08,285 WARN org.apache.hadoop.hbase.util.FSUtils: Running on HDFS without append enabled may result in data loss
2013-10-15 02:02:08,554 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because processing dead regionserver(s): [127.0.0.1,60020,1381814216471]
2013-10-15 02:02:08,556 INFO org.apache.hadoop.hbase.catalog.CatalogTracker: Failed verification of .META.,,1 at address=127.0.0.1:60020; java.net.ConnectException: Connection refused
2013-10-15 02:02:08,559 INFO org.apache.hadoop.hbase.catalog.CatalogTracker: Current cached META location is not valid, resetting
2013-10-15 02:02:08,601 WARN org.apache.hadoop.hbase.master.CatalogJanitor: Failed scan of catalog table
org.apache.hadoop.hbase.NotAllMetaRegionsOnlineException: Timed out (2147483647ms)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:390)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:422)
        at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:255)
        at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:237)
        at org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:120)
        at org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:88)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
2013-10-15 02:02:08,842 INFO org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: syncFs -- HDFS-200 -- not available, dfs.support.append=false
2013-10-15 02:02:08,842 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Creating writer path=hdfs://localhost:8020/hbase/1_webpage/853ef78be7c0853208e865a9ff13d5fb/recovered.edits/0000000000000001556.temp region=853ef78be7c0853208e865a9ff13d5fb
2013-10-15 02:02:09,443 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Pushed=39 entries from hdfs://localhost:8020/hbase/.logs/127.0.0.1,60020,1381814216471/127.0.0.1%3A60020.1381816367672
2013-10-15 02:02:09,444 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting hlog 29 of 29: hdfs://localhost:8020/hbase/.logs/127.0.0.1,60020,1381814216471/127.0.0.1%3A60020.1381816657239, length=0

Thanks for your help!
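As an aside, the xceivers limit Talat mentions earlier in the thread is set in hdfs-site.xml. A typical fragment looks like the following; the value 4096 is only a commonly cited example (e.g. in the linked Cloudera article), not something this thread settled on, so treat it as a starting point:

```xml
<!-- hdfs-site.xml: raise the per-DataNode transceiver limit.
     Note the property name is historically misspelled ("xcievers"). -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
```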

