[
https://issues.apache.org/jira/browse/PHOENIX-29?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905900#comment-13905900
]
Lars Hofhansl commented on PHOENIX-29:
--------------------------------------
FilterList has had its issues in the past (it's actually not entirely easy to
combine filter that can request a seek with other filters), that just as an
aside :)
Why not add this filter in the beginning?
Lastly, doing this as a filter in Phoenix is the right approach, I think. We
can add various hints to HBase about small rows, etc, but in the end with such
a filter we can just take control.
> Add custom filter to more efficiently navigate KeyValues in row
> ---------------------------------------------------------------
>
> Key: PHOENIX-29
> URL: https://issues.apache.org/jira/browse/PHOENIX-29
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Attachments: PHOENIX-29.patch, PHOENIX-29_V2.patch
>
>
> Currently HBase is 50% faster at selecting the first KV in a row than in
> selecting any other column. The reason is that when you project a column into
> a Scan, HBase uses its ExplicitColumTracker which does a reseek to the
> column. The only case where this is not necessary is when the column is the
> first one.
> In most cases (unless you have thousands of versions), it'd be more efficient
> to just do a NEXT instead of a reseek (especially if your KV is the next
> one). We can provide our own custom filter through which we pass two lists:
> 1) all KVs referenced in the select expressions. These are the only ones that
> need to be returned back to the client which is another advantage we'd get
> writing this custom filter.
> 2) all KVs referenced in the WHERE clause.
> The filter could sort the KVs using the standard KeyValue.COMPARATOR and
> merge between them and the incoming KVs, using NEXT instead of a reseek. We
> could potentially use a reseek if the number of columns in the table is
> beyond a certain threshold.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)