[ https://issues.apache.org/jira/browse/HBASE-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837591#action_12837591 ]
Yoram Kulbak commented on HBASE-2248:
-------------------------------------

I did the following sanity check: I rolled back MemStore to its state just before HBASE-2037 was applied (last commit on 21 Oct 2009). To get things going I had to put back the MemStore#numKeyValues method and change the MemStore#clearSnapshot argument to SortedSet.

I then ran TestHRegion and two tests failed:
- testFlushCacheWhileScanning - demonstrates the issue of incorrect scans while a snapshot exists
- testWritesWhileScanning - demonstrates 'partial puts' being visible to the scanner

I also ran TestMemStore, but all the tests there passed. I didn't try running the whole suite.

It took me a while to figure out exactly what goes wrong when a snapshot exists. The short (and vague) explanation is that the scanner may return keys out of order: a KV with a higher row may be returned before a KV with a lower row, because the result list which aggregates results from both snapshot and kvset doesn't guarantee that the KVs are added in sorted order.

I think there's a way to add a simple test to TestMemStore which will demonstrate that.

> New MemStoreScanner copies memstore for each scan, makes short scans slow
> -------------------------------------------------------------------------
>
>                 Key: HBASE-2248
>                 URL: https://issues.apache.org/jira/browse/HBASE-2248
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3
>            Reporter: Dave Latham
>             Fix For: 0.20.4
>
>         Attachments: hbase-2248.gc, Screen shot 2010-02-23 at 10.33.38 AM.png, threads.txt
>
>
> HBASE-2037 introduced a new MemStoreScanner which triggers a
> ConcurrentSkipListMap.buildFromSorted clone of the memstore and snapshot when
> starting a scan.
> After upgrading to 0.20.3, we noticed a big slowdown in our use of short
> scans. Some of our data represents a time series. The data is stored in time
> series order, MR jobs often insert/update new data at the end of the series,
> and queries usually have to pick up some or all of the series. These are
> often scans of 0-100 rows at a time. To load one page, we'll observe about
> 20 such scans being triggered concurrently, and they take 2 seconds to
> complete. Doing a thread dump of a region server shows many threads in
> ConcurrentSkipListMap.buildFromSorted, which traverses the entire map of key
> values to copy it.
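To make the ordering problem from the comment above concrete, here is a minimal sketch. It is not the pre-HBASE-2037 MemStore code: plain strings stand in for KeyValues, two TreeSets stand in for kvset and snapshot, and the class and method names (MemStoreOrderingSketch, naiveAggregate, mergedAggregate, isSorted) are made up for illustration. It only shows that appending results from two individually sorted sets into one list does not give a sorted list, while a two-way merge does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in: strings play the role of KeyValues, and the two
// TreeSets play the role of the memstore's kvset and snapshot.
final class MemStoreOrderingSketch {

    // Appends everything from kvset, then everything from snapshot.
    // Each input is sorted, but the combined list need not be.
    static List<String> naiveAggregate(TreeSet<String> kvset, TreeSet<String> snapshot) {
        List<String> out = new ArrayList<>(kvset);
        out.addAll(snapshot);
        return out;
    }

    // Two-way merge: always emits the smaller of the two current heads,
    // so the output is sorted whenever both inputs are.
    static List<String> mergedAggregate(TreeSet<String> kvset, TreeSet<String> snapshot) {
        List<String> out = new ArrayList<>();
        Iterator<String> a = kvset.iterator();
        Iterator<String> b = snapshot.iterator();
        String x = a.hasNext() ? a.next() : null;
        String y = b.hasNext() ? b.next() : null;
        while (x != null || y != null) {
            if (y == null || (x != null && x.compareTo(y) <= 0)) {
                out.add(x);
                x = a.hasNext() ? a.next() : null;
            } else {
                out.add(y);
                y = b.hasNext() ? b.next() : null;
            }
        }
        return out;
    }

    // True if every element is >= the one before it.
    static boolean isSorted(List<String> rows) {
        for (int i = 1; i < rows.size(); i++) {
            if (rows.get(i - 1).compareTo(rows.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        TreeSet<String> kvset = new TreeSet<>(Arrays.asList("row2", "row4"));
        TreeSet<String> snapshot = new TreeSet<>(Arrays.asList("row1", "row3"));
        System.out.println(naiveAggregate(kvset, snapshot));  // [row2, row4, row1, row3]
        System.out.println(mergedAggregate(kvset, snapshot)); // [row1, row2, row3, row4]
    }
}
```

A test along the lines suggested in the comment could build a similar kvset/snapshot pair and assert that the scan output is sorted.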
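The cost described in the issue can be sketched in the same spirit. This is an assumption-laden illustration, not the MemStoreScanner code: an Integer-keyed ConcurrentSkipListMap stands in for the memstore, and the names ScanCopySketch, copyThenScan, viewThenScan and countFrom are hypothetical. The point is only that ConcurrentSkipListMap.clone() copies every entry before the scan starts, even when the scan itself reads a handful of rows, whereas a tailMap view reads the rows it needs without copying.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for a memstore: row number -> value.
final class ScanCopySketch {

    // Copies the whole map first (work proportional to the map size),
    // then reads at most `limit` rows starting at startRow.
    static int copyThenScan(ConcurrentSkipListMap<Integer, String> memstore,
                            int startRow, int limit) {
        ConcurrentSkipListMap<Integer, String> copy = memstore.clone();
        return countFrom(copy, startRow, limit);
    }

    // Reads at most `limit` rows through a tailMap view of the live map,
    // with no up-front copy.
    static int viewThenScan(ConcurrentSkipListMap<Integer, String> memstore,
                            int startRow, int limit) {
        return countFrom(memstore, startRow, limit);
    }

    // Counts up to `limit` keys that are >= startRow.
    static int countFrom(ConcurrentNavigableMap<Integer, String> map,
                         int startRow, int limit) {
        int n = 0;
        for (Integer ignored : map.tailMap(startRow, true).keySet()) {
            if (n == limit) {
                break;
            }
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, String> memstore = new ConcurrentSkipListMap<>();
        for (int i = 0; i < 100_000; i++) {
            memstore.put(i, "v" + i);
        }
        // Both calls read the same 100 rows; only the first copies 100,000 entries.
        System.out.println(copyThenScan(memstore, 99_900, 100));
        System.out.println(viewThenScan(memstore, 99_900, 100));
    }
}
```

A view over the live map does not give the snapshot-style isolation that a copy does, so this sketch says nothing about how the scanner should handle concurrent writes.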