[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

Ted Yu (JIRA) Thu, 21 Nov 2013 10:47:25 -0800

    [ https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13829205#comment-13829205 ]


Ted Yu commented on HBASE-10015:
--------------------------------

I tried to run the unit test but got:
{code}
testScanFilterPerformance(org.apache.hadoop.hbase.regionserver.TestScanFilterPerformance)  Time elapsed: 0.007 sec  <<< ERROR!
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/tyu/hbase/hbase.version could only be replicated to 0 nodes, instead of 1
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
{code}

> Major performance improvement: Avoid synchronization in StoreScanner
> --------------------------------------------------------------------
>
>                 Key: HBASE-10015
>                 URL: https://issues.apache.org/jira/browse/HBASE-10015
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>         Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, TestLoad.java
>
>
> I did some more profiling (this time with a sampling profiler), and 
> StoreScanner.peek() showed up a lot in the samples. At first that was 
> surprising, but peek() is synchronized, so it seems much of the 
> synchronization cost is paid there.
> It seems the only reason we have to synchronize all these methods is that 
> a concurrent flush or compaction can change the scanner stack. Other than 
> that, only a single thread should access a StoreScanner at any given time.
> So I replaced updateReaders() with code that just signals to the scanner 
> that the readers should be updated, making it the using thread's 
> responsibility to do the work.
> The perf improvement from this is staggering: I am seeing roughly a 3x 
> scan performance improvement across all scenarios.
> Now, the hard part is to reason about whether this is 100% correct. I ran 
> TestAtomicOperation and TestAcidGuarantees a few times in a loop, and all 
> still pass.
> I will attach a sample patch.
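
For readers following the description above, the idea of deferring the reader update to the scanning thread might be sketched roughly as follows. This is not the attached patch and not HBase's StoreScanner: the class LazyRefreshScanner, its fields, and the use of an AtomicReference as the hand-off are all assumptions made for illustration, and sorted Integer keys stand in for KeyValues. Only the method name updateReaders() comes from the issue text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a scanner; NOT HBase's StoreScanner. */
class LazyRefreshScanner {
    // State below is touched only by the single scanning thread,
    // so peek() and next() need no synchronized keyword.
    private List<Integer> current;   // sorted keys currently being scanned
    private int pos = 0;             // index of the next key to return
    private Integer lastReturned;    // last key handed out, or null if none

    // Hand-off slot (an assumption of this sketch): a flush or compaction
    // thread publishes a replacement here instead of mutating scanner state.
    private final AtomicReference<List<Integer>> pending = new AtomicReference<>();

    LazyRefreshScanner(List<Integer> initial) {
        current = new ArrayList<>(initial);
    }

    /** Called by the flush/compaction thread: only records the new readers. */
    void updateReaders(List<Integer> replacement) {
        pending.set(new ArrayList<>(replacement));
    }

    /** Called by the scanning thread: applies a pending update, if there is one. */
    private void checkUpdate() {
        List<Integer> replacement = pending.getAndSet(null);
        if (replacement == null) {
            return; // common case: one atomic read-and-clear, no lock taken
        }
        current = replacement;
        pos = 0;
        // Resume just after the last key already returned to the caller.
        while (pos < current.size()
                && lastReturned != null
                && current.get(pos) <= lastReturned) {
            pos++;
        }
    }

    /** Returns the next key without consuming it, or null when exhausted. */
    Integer peek() {
        checkUpdate();
        return pos < current.size() ? current.get(pos) : null;
    }

    /** Returns and consumes the next key, or null when exhausted. */
    Integer next() {
        checkUpdate();
        if (pos >= current.size()) {
            return null;
        }
        lastReturned = current.get(pos++);
        return lastReturned;
    }
}
```

In this sketch the flush or compaction thread never touches the scanner's position, so the only shared state is the single hand-off slot. Whether the real patch preserves all of StoreScanner's guarantees is exactly the correctness question raised in the description, and the sketch does not answer it.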



--
This message was sent by Atlassian JIRA
(v6.1#6144)
