[jira] [Comment Edited] (HBASE-13291) Lift the scan ceiling

Andrew Purtell (JIRA) Thu, 19 Mar 2015 17:09:07 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370416#comment-14370416
 ]


Andrew Purtell edited comment on HBASE-13291 at 3/20/15 12:07 AM:
------------------------------------------------------------------

Those flame graphs show on-CPU differences. It would also be really interesting 
to see where and for how long handler threads are waiting or blocked, and how 
that differs between the two cases. 
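
One rough way to get at the off-CPU side is to sample thread states and 
contention counters through the JVM's ThreadMXBean. The sketch below is only an 
illustration: the class name and the thread-name prefix argument are 
hypothetical, and the handler thread names would need to be taken from a real 
thread dump. It inspects the JVM it runs in, so for a regionserver it would have 
to run in-process or be adapted to a remote JMX connection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper, not part of HBase: samples thread states in the local JVM.
final class HandlerWaitSampler {

    /** Counts live threads, grouped by state, whose name starts with namePrefix. */
    static Map<Thread.State, Integer> countByState(String namePrefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // No lock details requested; only names and states are needed here.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadName().startsWith(namePrefix)) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption: the caller passes the handler thread-name prefix; "" matches all threads.
        String prefix = args.length > 0 ? args[0] : "";
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (mx.isThreadContentionMonitoringSupported()) {
            // Without this, getBlockedTime()/getWaitedTime() return -1.
            mx.setThreadContentionMonitoringEnabled(true);
        }
        // Take a few one-second samples of the state distribution.
        for (int i = 0; i < 5; i++) {
            System.out.println(countByState(prefix));
            Thread.sleep(1000);
        }
        // Then print per-thread blocked/waited totals accumulated since monitoring was enabled.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadName().startsWith(prefix)) {
                System.out.printf("%s blocked=%dms (%d times) waited=%dms (%d times)%n",
                        info.getThreadName(),
                        info.getBlockedTime(), info.getBlockedCount(),
                        info.getWaitedTime(), info.getWaitedCount());
            }
        }
    }
}
```

Comparing the BLOCKED and WAITING counts between the nofilter and filterall runs 
would give a first hint of whether handlers are starved by queueing or held up 
on locks. Repeated jstack dumps of the regionserver would give similar 
information without any code.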

A quick look over the nofilter and filterall graphs suggests that the filterall 
case removes a lot of RPC work, so it shifts on-CPU processing from a mix of:
- RPC handling and related data copies
- skipping to desired cells (skipping through blocks in blockcache, and logic 
in SQM#match)

to mostly skipping to desired cells. We could maybe optimize this with an index 
into the blockcache, but the actual bottleneck may lie elsewhere, in queueing, 
threading, or locking issues.


was (Author: apurtell):
Those flame graphs show on-CPU differences. Would be really interesting to see 
where and how long handler threads are waiting and how that differs among the 
two cases. 

I quick look over the nofilter and filterall graphs suggest that the filterall 
case removes a lot of RPC work so shifts on-CPU processing from a mix of:
- RPC handling and related data copies
- skipping to desired cells (skipping through blocks in blockcache, and logic 
in SQM#match)

to mostly skipping to desired cells. We maybe could optimize this with an index 
into the blockcache but the actual bottleneck may lie elsewhere with 
queueing/threading issues.

> Lift the scan ceiling
> ---------------------
>
>                 Key: HBASE-13291
>                 URL: https://issues.apache.org/jira/browse/HBASE-13291
>             Project: HBase
>          Issue Type: Improvement
>          Components: Scanners
>    Affects Versions: 1.0.0
>            Reporter: stack
>            Assignee: stack
>         Attachments: traces.filterall.svg, traces.nofilter.svg
>
>
> Scanning medium sized rows with multiple concurrent scanners exhibits 
> interesting 'ceiling' properties. A server runs at about 6.7k ops a second 
> using 450% of a possible 1600% of CPU when 4 clients, each with 10 threads, 
> are doing 1000-row scans. If I add the '--filterAll' argument (do not return 
> results), then we run at 1450% of a possible 1600% but we do 8k ops a 
> second.
> Let me attach flame graphs for the two cases. Unfortunately, there is some 
> frustrating dark art going on. Let me try to figure it out... Filing issue in 
> the meantime to keep score in.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
