[ 
https://issues.apache.org/jira/browse/HBASE-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837141#action_12837141
 ] 

Dan Washusen commented on HBASE-2248:
-------------------------------------

@Todd: I didn't author the change but it relates to the 
[tests|http://svn.apache.org/viewvc/hadoop/hbase/branches/0.20/src/test/org/apache/hadoop/hbase/regionserver/TestHRegion.java?p2=/hadoop/hbase/branches/0.20/src/test/org/apache/hadoop/hbase/regionserver/TestHRegion.java&p1=/hadoop/hbase/branches/0.20/src/test/org/apache/hadoop/hbase/regionserver/TestHRegion.java&r1=896138&r2=896137&view=diff&pathrev=896138]
 added with the change.

@Ryan: The tests added to PE as a result of HBASE-2249 seem to indicate that 
even with a fully loaded MemStore it takes 9ms to complete a scan for ~100 rows 
with 10 concurrent client VMs hitting a single region server.  That seems to 
contradict the 1-2 seconds seen by Dave.   The thread dump does seems to 
indicate the clone but maybe something else is coming into play as well?  Maybe 
the additional 4KB memory allocation is bringing GC into it?

> New MemStoreScanner copies memstore for each scan, makes short scans slow
> -------------------------------------------------------------------------
>
>                 Key: HBASE-2248
>                 URL: https://issues.apache.org/jira/browse/HBASE-2248
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3
>            Reporter: Dave Latham
>             Fix For: 0.20.4
>
>         Attachments: threads.txt
>
>
> HBASE-2037 introduced a new MemStoreScanner which triggers a 
> ConcurrentSkipListMap.buildFromSorted clone of the memstore and snapshot when 
> starting a scan.
> After upgrading to 0.20.3, we noticed a big slowdown in our use of short 
> scans.  Some of our data repesent a time series.   The data is stored in time 
> series order, MR jobs often insert/update new data at the end of the series, 
> and queries usually have to pick up some or all of the series.  These are 
> often scans of 0-100 rows at a time.  To load one page, we'll observe about 
> 20 such scans being triggered concurrently, and they take 2 seconds to 
> complete.  Doing a thread dump of a region server shows many threads in 
> ConcurrentSkipListMap.biuldFromSorted which traverses the entire map of key 
> values to copy it.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to