[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu resolved HBASE-8639.
---------------------------
Resolution: Fixed
> very poor performance of htable#getscanner in multithreaded environment
> -----------------------------------------------------------------------
>
> Key: HBASE-8639
> URL: https://issues.apache.org/jira/browse/HBASE-8639
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.7
> Reporter: Raymond Liu
> Assignee: Ted Yu
> Fix For: 0.98.0, 0.95.2, 0.94.9
>
> Attachments: 8639-0.94.txt, 8639-v1.txt
>
>
> Hi, I am running an app on top of Phoenix that forks around 100+ threads,
> each calling HTable.getScanner(scan) to run scans in parallel (each scan
> targets a single region). Each scan matches only a few results and returns,
> so it should be very fast.
> In this case, I found that the HTable.getScanner(scan) call itself runs
> quite slowly. Profiling with jvisualvm showed that 90% of the app's time is
> spent in org.apache.hadoop.net.DNS.getDefaultHost, which is invoked by
> ScannerCallable.checkIfRegionServerIsRemote.
> The root cause is that DNS.getDefaultHost involves synchronized methods in
> java.net.Inet4AddressImpl, which makes the 100+ threads lock and wait on
> each other. Each call to DNS.getDefaultHost costs around 30ms, whereas when
> I run a single thread that calls DNS.getDefaultHost 100K times, each call
> costs less than 0.06ms.
> After hacking the code to remove the call to checkIfRegionServerIsRemote, my
> app runs 5 times faster: 50K ops in my app take 200+ seconds instead of
> 1000+ seconds.
> Checking the code further, I found that checkIfRegionServerIsRemote seems
> to be used only for metrics collection (or maybe retry logic?). Could it
> be removed or switched to some other implementation, so that cases like
> mine, which run a large number of small scans across many threads, perform
> much better?
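
One way to avoid the contention described above is to resolve the local
host name once and reuse the cached value for every remote check. The sketch
below only illustrates that idea and is not taken from the attached patches.
LocalHostCache, usingJdkLookup and isRemote are hypothetical names, and the
JDK call InetAddress.getLocalHost() stands in for Hadoop's
DNS.getDefaultHost. Treating a failed lookup as "remote" is also an
assumption made for the example.

```java
import java.net.InetAddress;
import java.util.concurrent.Callable;

/**
 * Hypothetical helper: resolves the local host name once at construction
 * and answers "is this region server remote?" from the cached value, so
 * concurrent callers never re-enter the synchronized DNS code path.
 */
final class LocalHostCache {
    // Resolved exactly once; null if the lookup failed.
    private final String localHost;

    LocalHostCache(Callable<String> resolver) {
        String host;
        try {
            host = resolver.call();
        } catch (Exception e) {
            // Assumption: if the local name cannot be resolved,
            // every server is treated as remote.
            host = null;
        }
        this.localHost = host;
    }

    /** Factory using the plain JDK lookup as a stand-in for DNS.getDefaultHost. */
    static LocalHostCache usingJdkLookup() {
        return new LocalHostCache(() -> InetAddress.getLocalHost().getHostName());
    }

    /** Lock-free after construction: only reads a final field. */
    boolean isRemote(String regionServerHost) {
        return localHost == null || !localHost.equalsIgnoreCase(regionServerHost);
    }
}
```

A single shared instance (for example a static final field) could then be
consulted by all scanner threads. The trade-off is that a host name change
while the process is running would not be noticed.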