Re: Sporadic memstore slowness for Read Heavy workloads

Ted Yu Sun, 26 Jan 2014 13:56:45 -0800

On a second look, this is the bold line:

org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:411)



On Sun, Jan 26, 2014 at 1:43 PM, Ted Yu <[email protected]> wrote:

> I don't see a bold line.
> Is the following line the one you were referring to?
> org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.reseek(MemStore.java:805)
>
> BTW, I assume you're running HBase 0.94 in production.
>
> Cheers
>
>
> On Sun, Jan 26, 2014 at 1:17 PM, Varun Sharma <[email protected]> wrote:
>
>> Here is the hotthread result. The line in bold is the "SEEK_NEXT_COL" case.
>> It degenerates to a reseek plus a skip list lookup when it does not have to,
>> because we have already seeked to the row and getting all the columns is
>> just a matter of iterator.next() calls.
>>
>>
>>
>> org.apache.hadoop.hbase.KeyValue$KeyComparator.compareWithoutRow(KeyValue.java:2196)
>> org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(KeyValue.java:2086)
>> org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1535)
>> org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1523)
>> java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(ConcurrentSkipListMap.java:606)
>> java.util.concurrent.ConcurrentSkipListMap.findPredecessor(ConcurrentSkipListMap.java:685)
>> java.util.concurrent.ConcurrentSkipListMap.findNear(ConcurrentSkipListMap.java:1345)
>> java.util.concurrent.ConcurrentSkipListMap$SubMap.loNode(ConcurrentSkipListMap.java:2583)
>> java.util.concurrent.ConcurrentSkipListMap$SubMap.access$300(ConcurrentSkipListMap.java:2486)
>> java.util.concurrent.ConcurrentSkipListMap$SubMap$SubMapIter.<init>(ConcurrentSkipListMap.java:3022)
>> java.util.concurrent.ConcurrentSkipListMap$SubMap$SubMapValueIterator.<init>(ConcurrentSkipListMap.java:3092)
>> java.util.concurrent.ConcurrentSkipListMap$SubMap.valueIterator(ConcurrentSkipListMap.java:3002)
>> java.util.concurrent.ConcurrentSkipListMap$Values.iterator(ConcurrentSkipListMap.java:2402)
>> org.apache.hadoop.hbase.regionserver.KeyValueSkipListSet.iterator(KeyValueSkipListSet.java:87)
>> org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.reseek(MemStore.java:805)
>> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:320)
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:265)
>> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:545)
>> *org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:411)*
>> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:143)
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3884)
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3956)
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3827)
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3808)
>> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3851)
>> org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4777)
>> org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4750)
>> org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2152)
>> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3700)
>>
>>
>>
>> On Sun, Jan 26, 2014 at 1:14 PM, Varun Sharma <[email protected]> wrote:
>>
>> > Hi,
>> >
>> > We are seeing some unfortunately low performance in the memstore. We have
>> > researched some of the previous JIRAs and seen some inefficiencies in the
>> > ConcurrentSkipListMap. The symptom is a RegionServer hitting 100% CPU at
>> > seemingly random points in time. The bug is hard to reproduce, and there
>> > is no large number of extra reads going to that region server or any
>> > substantial hotspot. The region server recovers the moment we flush the
>> > memstores or restart the region server. Our queries retrieve wide rows of
>> > up to 10-20 columns. A stack trace shows two things:
>> >
>> > 1) Time is spent inside MemStoreScanner.reseek() and inside the
>> > ConcurrentSkipListMap.
>> > 2) The reseek() is being called for the "SEEK_NEXT_COL" case inside
>> > StoreScanner. This is understandable, since the rows contain many columns
>> > and StoreScanner iterates one KeyValue at a time.
>> >
>> > So, I was looking at the code, and it seems that every single time there
>> > is a reseek call on the same memstore scanner, we make a fresh call to
>> > build an iterator() on the skip list set. This means we incur an
>> > additional skip list lookup for every column retrieved. Skip list lookups
>> > are O(log n) and not O(1).
>> >
>> > Related JIRA HBASE-3855 made reseek() scan some KVs and, if that number
>> > is exceeded, do a lookup. However, it seems this behaviour was reverted by
>> > HBASE-4195, and every next row/next column is now a reseek() and a skip
>> > list lookup rather than an iterator step.
>> >
>> > Are there any strong reasons against restoring the previous behaviour of
>> > scanning a small number of keys before degenerating to a skip list lookup?
>> > It seems like it would really help for sequential memstore scans and for
>> > memstore gets with wide tables (even 10-20 columns).
>> >
>> > Thanks
>> > Varun
>> >
>>
>
>
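
To make the trade-off discussed in the quoted thread concrete, here is a minimal, self-contained sketch. It is not HBase code: `ReseekSketch`, `SkipListScanner`, `reseekWithLookup`, `reseekWithBoundedScan`, and `scanLimit` are hypothetical names invented for illustration, and plain integers stand in for KeyValues. Strategy A builds a fresh `tailSet(...).iterator()` on every reseek, which is roughly the pattern visible in the stack trace. Strategy B walks the existing iterator for a bounded number of entries and only then falls back to a skip list lookup, which is the kind of heuristic the thread attributes to HBASE-3855. A counting comparator makes the difference in comparator calls observable.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicLong;

public class ReseekSketch {
    // Counts comparator invocations so the two strategies can be compared.
    static final AtomicLong comparisons = new AtomicLong();

    static final Comparator<Integer> COUNTING = (a, b) -> {
        comparisons.incrementAndGet();
        return Integer.compare(a, b);
    };

    // Hypothetical stand-in for a memstore scanner; NOT HBase's MemStoreScanner.
    // Assumes forward-only reseeks and no concurrent writers, for simplicity.
    static final class SkipListScanner {
        private final ConcurrentSkipListSet<Integer> set;
        private final int scanLimit;   // max entries to walk before falling back
        private Iterator<Integer> it;
        private Integer current;       // entry the scanner is positioned on, or null

        SkipListScanner(ConcurrentSkipListSet<Integer> set, int scanLimit) {
            this.set = set;
            this.scanLimit = scanLimit;
            this.it = set.iterator();
            this.current = it.hasNext() ? it.next() : null;
        }

        Integer peek() {
            return current;
        }

        // Strategy A: every reseek builds a fresh iterator via a skip list lookup.
        boolean reseekWithLookup(int key) {
            it = set.tailSet(key, true).iterator();
            current = it.hasNext() ? it.next() : null;
            return current != null;
        }

        // Strategy B: walk forward up to scanLimit entries with the existing
        // iterator; fall back to a skip list lookup only if the key is not reached.
        boolean reseekWithBoundedScan(int key) {
            for (int i = 0; i < scanLimit && current != null; i++) {
                if (COUNTING.compare(current, key) >= 0) {
                    return true;
                }
                current = it.hasNext() ? it.next() : null;
            }
            // Either the set is exhausted or the scan budget is used up.
            return current != null && reseekWithLookup(key);
        }
    }

    public static void main(String[] args) {
        ConcurrentSkipListSet<Integer> set = new ConcurrentSkipListSet<>(COUNTING);
        for (int i = 0; i < 100_000; i++) {
            set.add(i);
        }

        // Simulate reading 20 adjacent "columns" of one "row".
        SkipListScanner a = new SkipListScanner(set, 0);
        comparisons.set(0);
        for (int key = 50_000; key < 50_020; key++) {
            a.reseekWithLookup(key);
        }
        System.out.println("lookup per column: " + comparisons.get() + " comparisons");

        SkipListScanner b = new SkipListScanner(set, 8);
        b.reseekWithLookup(50_000);    // initial seek to the "row"
        comparisons.set(0);
        for (int key = 50_000; key < 50_020; key++) {
            b.reseekWithBoundedScan(key);
        }
        System.out.println("bounded scan:      " + comparisons.get() + " comparisons");
    }
}
```

The exact counts depend on the JDK's skip list implementation and on the randomly chosen index levels, so no specific numbers are claimed here. The sketch also ignores everything that makes the real problem harder, such as concurrent writers, MVCC read points, and the interaction with other scanners in the KeyValueHeap.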
