[ https://issues.apache.org/jira/browse/HBASE-9778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930849#comment-13930849 ]
Hudson commented on HBASE-9778:
-------------------------------
FAILURE: Integrated in HBase-0.94-JDK7 #79 (See [https://builds.apache.org/job/HBase-0.94-JDK7/79/])
HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576456)
* /hbase/branches/0.94/src/docbkx/performance.xml
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
> Add hint to ExplicitColumnTracker to avoid seeking
> --------------------------------------------------
>
> Key: HBASE-9778
> URL: https://issues.apache.org/jira/browse/HBASE-9778
> Project: HBase
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.18
>
> Attachments: 9778-0.94-v2.txt, 9778-0.94-v3.txt, 9778-0.94-v4.txt,
> 9778-0.94-v5.txt, 9778-0.94-v6.txt, 9778-0.94-v7.txt, 9778-0.94-v8.txt,
> 9778-0.94-v9.txt, 9778-0.94.txt, 9778-trunk-v2.txt, 9778-trunk-v3.txt,
> 9778-trunk-v6.txt, 9778-trunk-v7.txt, 9778-trunk-v8.txt, 9778-trunk-v9.txt,
> 9778-trunk.txt
>
>
> The issue of slow seeking in ExplicitColumnTracker was brought up by
> [~vrodionov] on the dev list.
> My idea here is to avoid the seeking if we know that there aren't many
> versions to skip.
> How do we know? We'll use the column family's VERSIONS setting as a hint. If
> VERSIONS is set to 1 (or maybe some value < 10) we'll avoid the seek and call
> SKIP repeatedly.
> HBASE-9769 has some initial numbers for this approach.
> Interestingly, the result depends on which columns are selected.
> Some numbers: 4m rows, 5 cols each, 1 cf, 10-byte values, VERSIONS=1,
> everything filtered at the server with a ValueFilter. All times are in
> seconds.
> Without patch:
> ||Wildcard||Col 1||Col 2||Col 4||Col 5||Col 2+4||
> |6.4|8.5|14.3|14.6|11.1|20.3|
> With patch:
> ||Wildcard||Col 1||Col 2||Col 4||Col 5||Col 2+4||
> |6.4|8.4|8.9|9.9|6.4|10.0|
> Variation between runs was +/- 0.2s.
> So with this patch, scanning is about 2x faster in some cases (Col 2+4:
> 20.3s without vs. 10.0s with) and never slower. No special hint is needed
> beyond declaring VERSIONS correctly.
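
The heuristic described in the issue can be sketched in a few lines of plain Java. This is only an illustration of the idea, not the code from the committed patch: the class name, the enum, the method and the cutoff constant are all hypothetical, and the cutoff of 10 is taken from the "maybe some value < 10" remark in the description rather than from the final implementation.

```java
public class SkipOrSeekSketch {
    /** What the scanner does after it has the version(s) it needs of a column. */
    enum NextStep { SKIP, SEEK }

    // Hypothetical cutoff; the issue only suggests "1 (or maybe some value < 10)".
    static final int SKIP_THRESHOLD = 10;

    /**
     * Uses the column family's VERSIONS setting as a hint: with few versions
     * to step over, repeated SKIPs are assumed cheaper than a single seek.
     */
    static NextStep chooseNextStep(int maxVersions) {
        return maxVersions < SKIP_THRESHOLD ? NextStep.SKIP : NextStep.SEEK;
    }

    public static void main(String[] args) {
        for (int versions : new int[] {1, 3, 10, 100}) {
            System.out.println("VERSIONS=" + versions + " -> " + chooseNextStep(versions));
        }
    }
}
```

The point of the sketch is that the decision depends only on schema metadata that is already known when the scan starts, which is why no extra hint is required from the client.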
--
This message was sent by Atlassian JIRA
(v6.2#6252)