[
https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529642#comment-13529642
]
Lars Hofhansl commented on HBASE-7336:
--------------------------------------
Here's my test case: 20m rows, single column family, single column, blockcache
disabled for the scan, no HDFS short circuiting, all data fits into the OS
buffer cache. Full scan over all rows.
One client scanning: 15s (regionserver keeps one core busy ~120%)
Two clients scanning along the same set of blocks: they both time out.
(regionserver is at 5-6% CPU, clearly just waiting)
Then I hacked readAtOffset to always do preads. Now:
One client scanning: 39s (regionserver at ~120%)
Two clients scanning: 39s each (regionserver at ~210%)
With short circuiting enabled:
One client: 15s (120% CPU)
Two clients: 41s each (160% CPU)
with pread:
One client: 18s (160% CPU)
Two clients: 19s each (250% CPU)
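To illustrate the contention: a stream-style read shares a single file position, so seek + read must be serialized across readers, while a positional read (pread) passes the offset explicitly and needs no lock. This is a minimal sketch using java.nio.FileChannel, not the actual HFileBlock code (which goes through Hadoop's FSDataInputStream); the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PreadSketch {
    // Stream-style read: seek and read share the channel's position, so
    // concurrent readers must serialize on one lock -- the same contention
    // readAtOffset hits when it synchronizes on istream.
    static synchronized byte[] streamRead(FileChannel ch, long offset, int len)
            throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        ch.position(offset);                       // mutates shared position
        while (buf.hasRemaining() && ch.read(buf) >= 0) { }
        return buf.array();
    }

    // Positional read (pread): the offset is an argument, the shared position
    // is untouched, so readers can proceed in parallel without locking.
    static byte[] pread(FileChannel ch, long offset, int len)
            throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            int n = ch.read(buf, offset + buf.position());
            if (n < 0) break;
        }
        return buf.array();
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("pread", ".bin");
        Files.write(p, "0123456789".getBytes());
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            System.out.println(new String(streamRead(ch, 2, 4))); // 2345
            System.out.println(new String(pread(ch, 5, 3)));      // 567
        }
        Files.delete(p);
    }
}
```

The numbers above show the trade-off: a single stream reader is faster for one sequential scanner, but pread's lock-free reads keep the regionserver's cores busy once multiple scanners hit the same blocks.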
> HFileBlock.readAtOffset does not work well with multiple threads
> ----------------------------------------------------------------
>
> Key: HBASE-7336
> URL: https://issues.apache.org/jira/browse/HBASE-7336
> Project: HBase
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Priority: Critical
> Fix For: 0.96.0, 0.94.4
>
>
> HBase grinds to a halt when many threads scan along the same set of blocks
> and neither read short circuiting nor block caching is enabled for the DFS
> client ... disabling the block cache makes sense on very large scans.
> It turns out that synchronizing on istream in HFileBlock.readAtOffset is the
> culprit.