Here you go. The HBase performance tuning page http://wiki.apache.org/hadoop/Hbase/FAQ#A7 refers to the following Hadoop URL:
http://wiki.apache.org/hadoop/PerformanceTuning

Thanks,
Charan

On Thu, Feb 3, 2011 at 10:22 PM, Todd Lipcon <t...@cloudera.com> wrote:
> Does the wiki really recommend that? Got a link handy?
>
> On Thu, Feb 3, 2011 at 10:20 PM, charan kumar <charan.ku...@gmail.com> wrote:
>
> > Todd,
> >
> > That did the trick. I think the wiki should be updated as well; no point
> > in recommending a 6M ParNew, or is there?
> >
> > Thanks,
> > Charan.
> >
> > On Thu, Feb 3, 2011 at 2:06 PM, Charan K <charan.ku...@gmail.com> wrote:
> >
> > > Thanks, Todd. I will try it out.
> > >
> > > On Feb 3, 2011, at 1:43 PM, Todd Lipcon <t...@cloudera.com> wrote:
> > >
> > > > Hi Charan,
> > > >
> > > > Your GC settings are way off: a 6m new size will promote way too
> > > > much to the old gen.
> > > >
> > > > Try this:
> > > >
> > > > -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -Xmn256m
> > > > -XX:CMSInitiatingOccupancyFraction=70
> > > >
> > > > -Todd
> > > >
> > > > On Thu, Feb 3, 2011 at 12:28 PM, charan kumar <charan.ku...@gmail.com> wrote:
> > > >
> > > > > Hi Jonathan,
> > > > >
> > > > > Thanks for your quick reply.
> > > > >
> > > > > Heap is set to 4G.
> > > > >
> > > > > The JVM opts are as follows:
> > > > >
> > > > > export HBASE_OPTS="$HBASE_OPTS -XX:+HeapDumpOnOutOfMemoryError
> > > > > -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:NewSize=6m
> > > > > -XX:MaxNewSize=6m"
> > > > >
> > > > > Are there any other options, apart from increasing the RAM?
> > > > >
> > > > > Some more info about the app:
> > > > >
> > > > > - We are storing web page data in HBase.
> > > > > - The row key is the hashed URL, for random distribution, since we
> > > > >   don't plan to do scans.
> > > > > - We have LZO compression set on this column family.
> > > > > - We were seeing about 1500 reads/sec when reading the page content.
> > > > > - We have a column family which stores just metadata of the page
> > > > >   ("title", etc.). When reading this, the performance is a whopping
> > > > >   12000 TPS.
> > > > >
> > > > > We thought the issue could be the network bandwidth used between
> > > > > HBase and the clients, so we disabled LZO compression on the column
> > > > > family and started compressing the raw page on the client and
> > > > > decompressing it (LZO) when reading.
> > > > >
> > > > > - With this, my write performance jumped from 2000 to 5000 at peak.
> > > > > - With this approach, the servers are crashing... not sure why, only
> > > > >   after turning off LZO and doing the same from the client.
> > > > >
> > > > > On Thu, Feb 3, 2011 at 12:13 PM, Jonathan Gray <jg...@fb.com> wrote:
> > > > >
> > > > > > How much heap are you running on your RegionServers?
> > > > > >
> > > > > > 6GB of total RAM is on the low end. For high-throughput
> > > > > > applications, I would recommend at least 6-8GB of heap
> > > > > > (so 8+ GB of RAM).
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: charan kumar [mailto:charan.ku...@gmail.com]
> > > > > > > Sent: Thursday, February 03, 2011 11:47 AM
> > > > > > > To: user@hbase.apache.org
> > > > > > > Subject: Region Servers Crashing during Random Reads
> > > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > I am using HBase 0.90.0 with hadoop-append, on Dell 1950
> > > > > > > hardware (2 CPUs, 6 GB RAM).
> > > > > > >
> > > > > > > I had 9 region servers (out of 30) crash in a span of 30 minutes
> > > > > > > during heavy reads. It looks like a GC / ZooKeeper connection
> > > > > > > timeout thingy to me.
> > > > > > > I did all the recommended configuration from the HBase wiki...
> > > > > > > Any other suggestions?
> > > > > > >
> > > > > > > 2011-02-03T09:43:07.890-0800: 70693.632: [GC 70693.632: [ParNew
> > > > > > > (promotion failed): 5555K->5540K(5568K), 0.0280950 secs]70693.660:
> > > > > > > [CMS2011-02-03T09:43:16.864-0800: 70702.606: [CMS-concurrent-mark:
> > > > > > > 12.549/69.323 secs] [Times: user=11.90 sys=1.26, real=69.31 secs]
> > > > > > >
> > > > > > > 2011-02-03T09:53:35.165-0800: 71320.785: [GC 71320.785: [ParNew
> > > > > > > (promotion failed): 5568K->5568K(5568K), 0.4384530 secs]71321.224:
> > > > > > > [CMS2011-02-03T09:53:45.111-0800: 71330.731: [CMS-concurrent-mark:
> > > > > > > 17.511/51.564 secs] [Times: user=38.72 sys=5.67, real=51.60 secs]
> > > > > > >
> > > > > > > The following are the log entries on the region server:
> > > > > > >
> > > > > > > 2011-02-03 10:37:43,946 INFO org.apache.zookeeper.ClientCnxn: Client
> > > > > > > session timed out, have not heard from server in 47172ms for sessionid
> > > > > > > 0x12db9f722421ce3, closing socket connection and attempting reconnect
> > > > > > > 2011-02-03 10:37:43,947 INFO org.apache.zookeeper.ClientCnxn: Client
> > > > > > > session timed out, have not heard from server in 48159ms for sessionid
> > > > > > > 0x22db9f722501d93, closing socket connection and attempting reconnect
> > > > > > > 2011-02-03 10:37:44,401 INFO org.apache.zookeeper.ClientCnxn: Opening
> > > > > > > socket connection to server XXXXXXXXXXXXXXXX
> > > > > > > 2011-02-03 10:37:44,402 INFO org.apache.zookeeper.ClientCnxn: Socket
> > > > > > > connection established to XXXXXXXXX, initiating session
> > > > > > > 2011-02-03 10:37:44,709 INFO org.apache.zookeeper.ClientCnxn: Opening
> > > > > > > socket connection to server XXXXXXXXXXXXXXX
> > > > > > > 2011-02-03 10:37:44,709 INFO org.apache.zookeeper.ClientCnxn: Socket
> > > > > > > connection established to XXXXXXXXXXXXXXXXXXXXX, initiating session
> > > > > > > 2011-02-03 10:37:44,767 DEBUG
> > > > > > > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> > > > > > > eviction started; Attempting to free 81.93 MB of total=696.25 MB
> > > > > > > 2011-02-03 10:37:44,784 DEBUG
> > > > > > > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> > > > > > > eviction completed; freed=81.94 MB, total=614.81 MB, single=379.98 MB,
> > > > > > > multi=309.77 MB, memory=0 KB
> > > > > > > 2011-02-03 10:37:45,205 INFO org.apache.zookeeper.ClientCnxn: Unable
> > > > > > > to reconnect to ZooKeeper service, session 0x22db9f722501d93 has
> > > > > > > expired, closing socket connection
> > > > > > > 2011-02-03 10:37:45,206 INFO
> > > > > > > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> > > > > > > This client just lost it's session with ZooKeeper, trying to reconnect.
> > > > > > > 2011-02-03 10:37:45,453 INFO
> > > > > > > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> > > > > > > Trying to reconnect to zookeeper
> > > > > > > 2011-02-03 10:37:45,206 INFO org.apache.zookeeper.ClientCnxn: Unable
> > > > > > > to reconnect to ZooKeeper service, session 0x12db9f722421ce3 has
> > > > > > > expired, closing socket connection
> > > > > > > gionserver:60020-0x22db9f722501d93 regionserver:60020-0x22db9f722501d93
> > > > > > > received expired from ZooKeeper, aborting
> > > > > > > org.apache.zookeeper.KeeperException$SessionExpiredException:
> > > > > > > KeeperErrorCode = Session expired
> > > > > > >         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:328)
> > > > > > >         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:246)
> > > > > > >         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
> > > > > > >         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
> > > > > > > handled exception: org.apache.hadoop.hbase.YouAreDeadException: Server
> > > > > > > REPORT rejected; currently processing XXXXXXXXXXXX,60020,1296684296172
> > > > > > > as dead server
> > > > > > > org.apache.hadoop.hbase.YouAreDeadException:
> > > > > > > org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected;
> > > > > > > currently processing XXXXXXXXXXXX,60020,1296684296172 as dead server
> > > > > > >         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> > > > > > >         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> > > > > > >         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> > > > > > >         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> > > > > > >         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
> > > > > > >         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:80)
> > > > > > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:729)
> > > > > > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:586)
> > > > > > >         at java.lang.Thread.run(Thread.java:619)
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Charan
> > > >
> > > > --
> > > > Todd Lipcon
> > > > Software Engineer, Cloudera
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
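
A footnote for readers who find this thread later: the "hashed URL" row-key
scheme Charan describes is easy to sketch. The thread doesn't say which hash
function was used, so the MD5 choice and the names below are assumptions; the
point is just that a fixed-width hash spreads rows uniformly across regions,
at the cost of meaningful range scans:

    import java.security.MessageDigest;

    public class RowKeys {
        // Hash the URL so row keys distribute randomly across regions.
        // This gives up ordered scans over URLs, which is fine for a
        // pure random-read workload like the one in this thread.
        public static byte[] rowKeyFor(String url) throws Exception {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return md5.digest(url.getBytes("UTF-8"));
        }

        public static void main(String[] args) throws Exception {
            byte[] key = rowKeyFor("http://example.com/page");
            System.out.println("row key is " + key.length + " bytes"); // 16
        }
    }

A common variant, if scans might matter later, is to prefix a short hash salt
onto a readable key instead of replacing the key entirely.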
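
Similarly, the client-side compression pattern Charan switched to (compress
before the Put, decompress after the Get) looks roughly like this. The thread
used LZO on the client; since LZO bindings aren't part of the JDK, this sketch
substitutes java.util.zip's Deflater/Inflater just to show the shape of the
round trip, with the byte[] values being what you would hand to Put.add() and
read back from Result.getValue():

    import java.io.ByteArrayOutputStream;
    import java.util.Arrays;
    import java.util.zip.Deflater;
    import java.util.zip.Inflater;

    public class ClientSideCompression {
        // Compress the raw page on the client, so the bytes travel over
        // the wire (and are stored) compressed.
        static byte[] compress(byte[] raw) {
            Deflater deflater = new Deflater(Deflater.BEST_SPEED);
            deflater.setInput(raw);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                out.write(buf, 0, deflater.deflate(buf));
            }
            deflater.end();
            return out.toByteArray();
        }

        // Decompress on the client after reading the cell back.
        static byte[] decompress(byte[] stored) throws Exception {
            Inflater inflater = new Inflater();
            inflater.setInput(stored);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                out.write(buf, 0, inflater.inflate(buf));
            }
            inflater.end();
            return out.toByteArray();
        }

        public static void main(String[] args) throws Exception {
            byte[] page = "<html>...page content...</html>".getBytes("UTF-8");
            byte[] stored = compress(page);
            System.out.println(Arrays.equals(page, decompress(stored))); // true
        }
    }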
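
Finally, on the session expiries themselves: the root cause above is the
50-70 second promotion-failure pauses, so Todd's GC fix is the real answer. As
a complementary knob, the HBase property zookeeper.session.timeout can be
raised so a regionserver rides over shorter pauses. A minimal sketch; the
60-second value is only an example, and the ZooKeeper server's
maxSessionTimeout must permit whatever you request:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class ZkTimeout {
        public static void main(String[] args) {
            // For a regionserver this setting normally lives in
            // hbase-site.xml; shown via the Configuration API for brevity.
            Configuration conf = HBaseConfiguration.create();
            conf.setInt("zookeeper.session.timeout", 60000); // request 60s
            System.out.println(conf.get("zookeeper.session.timeout"));
        }
    }

Note this only buys headroom; it does not fix a collector that pauses for a
minute, which is why the -Xmn256m change resolved the crashes here.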