Lars Hofhansl created PHOENIX-2797:
--------------------------------------
Summary: Ideas to speed up MIN/MAX/DISTINCT for prefixes of the PK
Key: PHOENIX-2797
URL: https://issues.apache.org/jira/browse/PHOENIX-2797
Project: Phoenix
Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor
All of MIN, MAX, and DISTINCT always perform a full scan, even when they are
for a prefix of a compound key.
For MIN and MAX we only need to find the first and last row (resp.) and we'll
have our answer. This works for the full key or a prefix of the key.
This should work fine with or without a WHERE clause, as long as we can
identify the first and last row.
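
To illustrate the MIN/MAX idea, here is a minimal sketch in Python. It is not
Phoenix code: a sorted list of (K1, K2) tuples stands in for the row-key-ordered
table, bisect stands in for a seek, and the function name and the optional
lower/upper bounds (a stand-in for a simple range WHERE clause on K1) are
assumptions made for the example.

```python
import bisect

def min_max_leading_key(rows, lower=None, upper=None):
    """Return (MIN(K1), MAX(K1)) by looking only at the first and last row.

    rows  -- list of (k1, k2) tuples, sorted like a table ordered by row key
    lower -- hypothetical WHERE bound: k1 >= lower (inclusive), or None
    upper -- hypothetical WHERE bound: k1 < upper (exclusive), or None
    """
    # One "seek" to the first qualifying row, one to just past the last.
    start = 0 if lower is None else bisect.bisect_left(rows, (lower,))
    end = len(rows) if upper is None else bisect.bisect_left(rows, (upper,))
    if start >= end:
        return None  # no row satisfies the bounds
    # Rows are sorted by (k1, k2), so the first row holds the smallest k1
    # and the last row holds the largest k1; nothing in between is read.
    return rows[start][0], rows[end - 1][0]

rows = sorted((k1, k2) for k1 in (10, 20, 30, 40) for k2 in range(100))
no_where = min_max_leading_key(rows)
with_where = min_max_leading_key(rows, lower=15, upper=40)
```

Here no_where is (10, 40) and with_where is (20, 30); in both cases only two
rows out of 400 are inspected.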
For DISTINCT we could do a skip scan to the next prefix (this only helps with
a true prefix of a compound key).
Say the key is (K1, K2), and say that we're doing DISTINCT(K1). We can skip to
the next value of K1 once we found a value. This should have a dramatic impact
when the cardinality of K2 is high.
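
The DISTINCT(K1) case can be sketched the same way. Again this is only an
illustration under assumptions, not Phoenix code: the table is a sorted Python
list of (K1, K2) tuples with string K1 values, bisect plays the role of a seek,
and appending a zero character to K1 is used as the smallest key that sorts
after every (K1, *) row.

```python
import bisect

def distinct_leading_key(rows):
    """Return (distinct k1 values, number of seeks) for sorted (k1, k2) rows.

    Assumes k1 is a string, so k1 + "\x00" is its immediate successor.
    """
    distinct = []
    seeks = 0
    pos = 0
    while pos < len(rows):
        k1 = rows[pos][0]
        distinct.append(k1)
        # Skip every remaining (k1, *) row: seek to the first row whose
        # leading key sorts after k1 instead of reading them one by one.
        next_prefix = (k1 + "\x00",)
        pos = bisect.bisect_left(rows, next_prefix, lo=pos)
        seeks += 1
    return distinct, seeks

# 3 distinct K1 values, each with 1000 K2 values (high K2 cardinality).
rows = sorted((k1, k2) for k1 in ("a", "b", "c") for k2 in range(1000))
values, seeks = distinct_leading_key(rows)
```

For these 3000 rows the loop runs three times, once per distinct K1, which is
where the gain comes from when the cardinality of K2 is high.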
With a WHERE clause that might itself be causing a SKIP SCAN, this might be
quite tricky. Would need to think about it.
Both of these statements hold equally when querying against an index.
Anyway... Just filing this as an idea for now.