[ 
https://issues.apache.org/jira/browse/HBASE-9811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13807648#comment-13807648
 ] 

Vladimir Rodionov commented on HBASE-9811:
------------------------------------------

>>Not much improvement when there is more than one HFile. This may be due to 
>>the use of KeyValueHeap, as performance drops greatly as the number of 
>>HFiles grows. 

KeyValueHeap is slow (JIRA to be opened). StoreScanner's ScanQueryMatcher is 
slow (+1 JIRA). I ran all my tests directly on StoreFileScanner (we can do 
this for 1 HFile), bypassing all of them. 

> ColumnPaginationFilter is slow when offset is large
> ---------------------------------------------------
>
>                 Key: HBASE-9811
>                 URL: https://issues.apache.org/jira/browse/HBASE-9811
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Chao Shi
>
> Hi there, we are trying to migrate an app from MySQL to HBase. One kind of 
> query is pagination with a large offset and a small limit. We don't have too 
> many such queries, so both MySQL and HBase should survive. (MySQL has no 
> index for offset either.)
> When comparing the performance of both systems, we found something 
> interesting: we write ~1M values in a single row and query with offset = 1M, 
> so all values should be scanned on the RS side.
> When running the query on MySQL, the first query is pretty slow (more than 1 
> second), but repeating the same query gives very low latency.
> On HBase, on the other hand, repeating the query does not help much (~1s 
> every time). I can confirm that all data is in the block cache and all the 
> time is spent on in-memory data processing. (We have flushed data to disk.)
> I found "reseek" is the hot spot. It is caused by ColumnPaginationFilter 
> returning NEXT_COL. If I replace this line with one returning SKIP (which 
> causes next to be called rather than reseek), the latency is reduced to ~100ms.
> So I think there must be some room for optimization.
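
To illustrate the NEXT_COL versus SKIP observation in the description above, 
here is a minimal toy cost model in Java. It is not HBase code: the class 
ReseekVsNext, the Hint enum, and the "work" counter are hypothetical names 
made up for this sketch, and modelling reseek as a binary search is only an 
assumption (the real reseek cost in HBase comes from its scanner stack). The 
sketch only shows that when every column holds a single cell, a seek per 
column lands on the same cell as a plain next while doing more work.

```java
import java.util.ArrayList;
import java.util.List;

public class ReseekVsNext {
    /** Toy stand-in for a filter hint; NOT HBase's Filter.ReturnCode. */
    enum Hint { SKIP, NEXT_COL }

    /** Units of work spent by the most recent call to skipColumns. */
    static long work;

    /**
     * Skips `offset` columns in a sorted list holding exactly one cell per
     * column and returns the index of the first cell that is not skipped.
     */
    static int skipColumns(List<Integer> columns, int offset, Hint hint) {
        work = 0;
        int pos = 0;
        for (int skipped = 0; skipped < offset && pos < columns.size(); skipped++) {
            if (hint == Hint.SKIP) {
                pos++;      // modelled next(): step to the adjacent cell
                work++;     // counted as one unit of work
            } else {
                // modelled reseek(): search for the first cell of the next column
                pos = seekAfter(columns, columns.get(pos), pos);
            }
        }
        return pos;
    }

    /** Binary search in [from, size) for the first index whose key exceeds `key`. */
    static int seekAfter(List<Integer> columns, int key, int from) {
        int lo = from;
        int hi = columns.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            work++;         // one key comparison per probe
            if (columns.get(mid) <= key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        List<Integer> columns = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            columns.add(i);  // one cell per column qualifier
        }
        int offset = 999_990;
        int viaSkip = skipColumns(columns, offset, Hint.SKIP);
        long skipWork = work;
        int viaSeek = skipColumns(columns, offset, Hint.NEXT_COL);
        long seekWork = work;
        System.out.println("same position: " + (viaSkip == viaSeek));
        System.out.println("work with SKIP: " + skipWork);
        System.out.println("work with NEXT_COL: " + seekWork);
    }
}
```

In this model both hints stop at the same cell. The difference is only in the 
work counter, which is the shape of the effect reported above, not a 
measurement of HBase itself.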



--
This message was sent by Atlassian JIRA
(v6.1#6144)
