Hi Ram,

Yes, the block cache is on, but there is another in-memory column which might be pushing other blocks out of the block cache, so we might be hitting more disk seeks. I see that you have seen this trace before on HBASE-5898. Did that issue resolve things for you?
Thanks
Varun

On Wed, Dec 5, 2012 at 10:04 PM, ramkrishna vasudevan <[email protected]> wrote:

> Is block cache ON? Check out HBASE-5898?
>
> Regards
> Ram
>
> On Thu, Dec 6, 2012 at 9:55 AM, Anoop Sam John <[email protected]> wrote:
>
> > >is the META table cached just like other tables
> > Yes Varun I think so.
> >
> > -Anoop-
> > ________________________________________
> > From: Varun Sharma [[email protected]]
> > Sent: Thursday, December 06, 2012 6:10 AM
> > To: [email protected]; lars hofhansl
> > Subject: Re: .META. region server DDOSed by too many clients
> >
> > We only see this on the .META. region not otherwise...
> >
> > On Wed, Dec 5, 2012 at 4:37 PM, Varun Sharma <[email protected]> wrote:
> >
> > > I see but is this pointing to the fact that we are heading to disk for
> > > scanning META - if yes, that would be pretty bad, no ? Currently I am
> > > trying to see if the freeze coincides with Block Cache being full (we have
> > > an inmemory column) - is the META table cached just like other tables ?
> > >
> > > Varun
> > >
> > >
> > > On Wed, Dec 5, 2012 at 4:20 PM, lars hofhansl <[email protected]> wrote:
> > >
> > >> Looks like you're running into HBASE-5898.
> > >>
> > >>
> > >>
> > >> ----- Original Message -----
> > >> From: Varun Sharma <[email protected]>
> > >> To: [email protected]
> > >> Cc:
> > >> Sent: Wednesday, December 5, 2012 3:51 PM
> > >> Subject: .META. region server DDOSed by too many clients
> > >>
> > >> Hi,
> > >>
> > >> I am running hbase 0.94.0 and I have a significant write load being put on
> > >> a table with 98 regions on a 15 node cluster - also this write load comes
> > >> from a very large number of clients (~ 1000). I am running with 10 priority
> > >> IPC handlers and 200 IPC handlers. It seems the region server holding .META.
> > >> is DDOSed. All the 200 handlers are busy serving the .META. region and they
> > >> are all locked onto one object. The jstack for the region server is here:
> > >>
> > >> "IPC Server handler 182 on 60020" daemon prio=10 tid=0x00007f329872c800 nid=0x4401 waiting on condition [0x00007f328807f000]
> > >>    java.lang.Thread.State: WAITING (parking)
> > >>         at sun.misc.Unsafe.park(Native Method)
> > >>         - parking to wait for <0x0000000542d72e30> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> > >>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> > >>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
> > >>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:871)
> > >>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1201)
> > >>         at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
> > >>         at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
> > >>         at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:445)
> > >>         at java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:925)
> > >>         at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:71)
> > >>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:290)
> > >>         at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:213)
> > >>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:455)
> > >>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:493)
> > >>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:242)
> > >>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:167)
> > >>         at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> > >>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:299)
> > >>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:244)
> > >>         at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:521)
> > >>         - locked <0x000000063b4965d0> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
> > >>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:402)
> > >>         - locked <0x000000063b4965d0> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
> > >>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
> > >>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3354)
> > >>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3310)
> > >>         - locked <0x0000000523c211e0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
> > >>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3327)
> > >>         - locked <0x0000000523c211e0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
> > >>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4066)
> > >>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4039)
> > >>         at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1941)
> > >>
> > >> The client-side trace shows that we are looking for the META region.
> > >>
> > >> "thrift-worker-3499" daemon prio=10 tid=0x00007f789dd98800 nid=0xb52 waiting for monitor entry [0x00007f778672d000]
> > >>    java.lang.Thread.State: BLOCKED (on object monitor)
> > >>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:943)
> > >>         - waiting to lock <0x0000000707978298> (a java.lang.Object)
> > >>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:836)
> > >>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1482)
> > >>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
> > >>         at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:729)
> > >>         - locked <0x000000070821d5a0> (a org.apache.hadoop.hbase.client.HTable)
> > >>         at org.apache.hadoop.hbase.client.HTable.get(HTable.java:698)
> > >>         at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.get(HTablePool.java:371)
> > >>
> > >> On the RS page, I see 68 million read requests for the META region, while
> > >> for the other 98 regions we have done about 20 million write requests in
> > >> total. Regions have not moved around at all and no crashes have happened.
> > >> Why do we have such an incredible number of scans over META, and is there
> > >> something I can do about this issue?
> > >>
> > >> Varun
