On a second look, this is the bold line:

org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:411)
On Sun, Jan 26, 2014 at 1:43 PM, Ted Yu <[email protected]> wrote:

> I don't see bold line.
> Is the following line what you referred to ?
> org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.reseek(MemStore.java:805)
>
> BTW I assume you're using 0.94 HBase in production.
>
> Cheers
>
>
> On Sun, Jan 26, 2014 at 1:17 PM, Varun Sharma <[email protected]> wrote:
>
>> Here is the hotthread result. The line in bold is "SEEK_NEXT_COL" and it
>> degenerates to a reseek+skiplist lookup when it does not have to because
>> we already seeked to the row and getting all the columns is just a matter
>> of iterator.next() calls.
>>
>> org.apache.hadoop.hbase.KeyValue$KeyComparator.compareWithoutRow(KeyValue.java:2196)
>> org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(KeyValue.java:2086)
>> org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1535)
>> org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1523)
>> java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(ConcurrentSkipListMap.java:606)
>> java.util.concurrent.ConcurrentSkipListMap.findPredecessor(ConcurrentSkipListMap.java:685)
>> java.util.concurrent.ConcurrentSkipListMap.findNear(ConcurrentSkipListMap.java:1345)
>> java.util.concurrent.ConcurrentSkipListMap$SubMap.loNode(ConcurrentSkipListMap.java:2583)
>> java.util.concurrent.ConcurrentSkipListMap$SubMap.access$300(ConcurrentSkipListMap.java:2486)
>> java.util.concurrent.ConcurrentSkipListMap$SubMap$SubMapIter.<init>(ConcurrentSkipListMap.java:3022)
>> java.util.concurrent.ConcurrentSkipListMap$SubMap$SubMapValueIterator.<init>(ConcurrentSkipListMap.java:3092)
>> java.util.concurrent.ConcurrentSkipListMap$SubMap.valueIterator(ConcurrentSkipListMap.java:3002)
>> java.util.concurrent.ConcurrentSkipListMap$Values.iterator(ConcurrentSkipListMap.java:2402)
>> org.apache.hadoop.hbase.regionserver.KeyValueSkipListSet.iterator(KeyValueSkipListSet.java:87)
>> org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.reseek(MemStore.java:805)
>> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:320)
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:265)
>> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:545)
>> *org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:411)*
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:143)
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3884)
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3956)
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3827)
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3808)
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3851)
>> org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4777)
>> org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4750)
>> org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2152)
>> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3700)
>>
>>
>> On Sun, Jan 26, 2014 at 1:14 PM, Varun Sharma <[email protected]> wrote:
>>
>> > Hi,
>> >
>> > We are seeing some unfortunately low performance in the memstore - we
>> > have researched some of the previous JIRA(s) and seen some
>> > inefficiencies in the ConcurrentSkipListMap.
>> > The symptom is a RegionServer hitting 100% CPU at weird points in
>> > time - the bug is hard to reproduce and there isn't a huge number of
>> > extra reads going to that region server or any substantial hotspot
>> > happening. The region server recovers the moment we flush the
>> > memstores or restart the region server. Our queries retrieve wide rows
>> > which are up to 10-20 columns. A stack trace shows two things:
>> >
>> > 1) Time spent inside MemStoreScanner.reseek() and inside the
>> > ConcurrentSkipListMap
>> > 2) The reseek() is being called at the "SEEK_NEXT_COL" case inside
>> > StoreScanner - this is understandable since the rows contain many
>> > columns and StoreScanner iterates one KeyValue at a time.
>> >
>> > So, I was looking at the code and it seems that every single time
>> > there is a reseek call on the same memstore scanner, we make a fresh
>> > call to build an iterator() on the skip list set - this means we do an
>> > additional skip list lookup for every column retrieved. Skip list
>> > lookups are O(log n) and not O(1).
>> >
>> > Related JIRA HBASE-3855 made the reseek() scan some KVs and, if that
>> > number is exceeded, do a lookup. However, it seems this behaviour was
>> > reverted by HBASE-4195 and every next row/next column is now a
>> > reseek() and a skip list lookup rather than being an iterator step.
>> >
>> > Are there any strong reasons against having the previous behaviour of
>> > scanning a small # of keys before degenerating to a skip list lookup ?
>> > Seems like it would really help for sequential memstore scans and for
>> > memstore gets with wide tables (even 10-20 columns).
>> >
>> > Thanks
>> > Varun
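
For illustration, the bounded-scan idea asked about in the original message (step forward with iterator.next() a few times, and only fall back to a skip list lookup when the target is further away) can be sketched in plain Java. This is not HBase code: ReseekSketch, SketchScanner, scanLimit and countLookups are hypothetical names, String keys stand in for KeyValues, and java.util.concurrent.ConcurrentSkipListSet stands in for the memstore's KeyValueSkipListSet.

```java
import java.util.Iterator;
import java.util.concurrent.ConcurrentSkipListSet;

// Sketch only: a hypothetical stand-in for a memstore scanner, not HBase code.
class ReseekSketch {

    static final class SketchScanner {
        private final ConcurrentSkipListSet<String> keys;
        private final int scanLimit;   // assumed tunable: max next() calls tried before a lookup
        private Iterator<String> it;
        private String current;
        int skipListLookups = 0;       // counts fresh tailSet() lookups, for illustration

        SketchScanner(ConcurrentSkipListSet<String> keys, int scanLimit) {
            this.keys = keys;
            this.scanLimit = scanLimit;
        }

        // Position at the first key >= target by building a new iterator (a skip list lookup).
        String seek(String target) {
            skipListLookups++;
            it = keys.tailSet(target, true).iterator();
            current = it.hasNext() ? it.next() : null;
            return current;
        }

        // Forward-only reseek: walk the existing iterator for up to scanLimit steps,
        // and only then fall back to a fresh lookup.
        String reseek(String target) {
            if (it == null) {
                return seek(target);   // not positioned yet
            }
            int steps = 0;
            while (current != null && current.compareTo(target) < 0) {
                if (steps++ == scanLimit) {
                    return seek(target);   // target is too far ahead: do the lookup
                }
                current = it.hasNext() ? it.next() : null;
            }
            return current;
        }
    }

    // Reads every column of one row in order and reports how many lookups were needed.
    static int countLookups(int scanLimit, int columns) {
        ConcurrentSkipListSet<String> keys = new ConcurrentSkipListSet<>();
        for (int c = 0; c < columns; c++) {
            keys.add(String.format("row1/c%02d", c));
        }
        SketchScanner scanner = new SketchScanner(keys, scanLimit);
        scanner.seek("row1/c00");
        for (int c = 1; c < columns; c++) {
            scanner.reseek(String.format("row1/c%02d", c));
        }
        return scanner.skipListLookups;
    }

    public static void main(String[] args) {
        System.out.println("lookups, scanLimit=0: " + countLookups(0, 20));
        System.out.println("lookups, scanLimit=4: " + countLookups(4, 20));
    }
}
```

Under these assumptions, reading 20 columns of one row costs 20 skip list lookups with scanLimit = 0 (every reseek is a lookup) and a single lookup with scanLimit = 4, since each next column is one iterator step away. The right value for such a limit in a real memstore would have to be measured.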
