Lars Hofhansl created HBASE-10015:
-------------------------------------
Summary: Major performance improvement: Avoid synchronization in StoreScanner
Key: HBASE-10015
URL: https://issues.apache.org/jira/browse/HBASE-10015
Project: HBase
Issue Type: Bug
Reporter: Lars Hofhansl
Attachments: 10015-0.94.txt
I did some more profiling (this time with a sampling profiler), and
StoreScanner.peek() showed up in many of the samples. At first that was
surprising, but peek() is synchronized, so it seems much of the
synchronization cost is incurred there.
It seems the only reason we have to synchronize all these methods is that a
concurrent flush or compaction can change the scanner stack. Other than that,
only a single thread should access a StoreScanner at any given time.
So I replaced updateReaders() with some code that just signals to the scanner
that the readers should be updated, and made it the using thread's
responsibility to do the work.
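
To make the idea concrete, here is a minimal sketch of the "flag now, update
later" pattern described above. It is not the attached patch and not the real
StoreScanner: the class LazyRefreshScanner, its fields, and the use of plain
strings as stand-ins for store file readers are all assumptions made for
illustration. Only the method names updateReaders() and peek() come from the
description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a scanner whose set of readers can change underneath it. */
class LazyRefreshScanner {
    // Raised by the flush/compaction thread, consumed by the scanning thread.
    private final AtomicBoolean readersChanged = new AtomicBoolean(false);
    // Shared list of store files (hypothetical); guarded by its own monitor.
    private final List<String> storeFiles;
    // Snapshot owned by the scanning thread; never touched by other threads.
    private List<String> currentReaders;
    private int refreshCount = 0;

    LazyRefreshScanner(List<String> storeFiles) {
        this.storeFiles = storeFiles;
        this.currentReaders = snapshot();
    }

    /** Called from the flush/compaction thread: only raise the flag, do no work. */
    void updateReaders() {
        readersChanged.set(true);
    }

    /** Called from the single scanning thread; the method itself is not synchronized. */
    List<String> peek() {
        checkUpdatedReaders();
        return currentReaders;
    }

    /** The scanning thread does the deferred work the first time it notices the flag. */
    private void checkUpdatedReaders() {
        // getAndSet clears the flag atomically, so one signal is applied once.
        if (readersChanged.getAndSet(false)) {
            currentReaders = snapshot();
            refreshCount++;
        }
    }

    /** Copies the shared list under its lock; only reached when the flag was set. */
    private List<String> snapshot() {
        synchronized (storeFiles) {
            return new ArrayList<>(storeFiles);
        }
    }

    int refreshCount() {
        return refreshCount;
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>(Arrays.asList("file-1"));
        LazyRefreshScanner scanner = new LazyRefreshScanner(files);
        synchronized (files) {
            files.add("file-2");       // a "flush" adds a new store file
        }
        scanner.updateReaders();       // the flush thread only signals the change
        System.out.println(scanner.peek().size()); // the scanning thread refreshes here
    }
}
```

In this sketch the common path through peek() costs one atomic read-and-clear
instead of a monitor acquisition, and the lock on the shared list is taken only
after a flush or compaction has signalled a change. A real scanner would
presumably also have to reposition itself at its last returned key after
swapping readers, which the sketch leaves out.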
The perf improvement from this is staggering. I am seeing somewhere around 3x
scan performance improvement across all scenarios.
Now, the hard part is to reason about whether this is 100% correct. I ran
TestAtomicOperation and TestAcidGuarantees a few times in a loop, and they all
still pass.
Will attach a sample patch.