Fwd: Issues w/ count (distinct col) queries via phoenix table with secondary indexes

Jude K Tue, 28 Apr 2015 18:09:07 -0700

Hi,

Yesterday, we created a secondary index on one our tables to help improve
the read speed of performing select count queries like the one below. The
query w/ index runs fast but provides the same counts. The query w/o index
(/*+ NO_INDEX */) runs slower but provides the right counts for each
column.


The index is covered and includes all the columns specified below. The bug
we are observing seems similar to
https://issues.apache.org/jira/browse/PHOENIX-814 and
https://issues.apache.org/jira/browse/PHOENIX-1203.

Can anyone points us in the right direction to correct this issue (or
workaround) as we need fast retrieval and proper counting?

We have the following environment:

Hbase 0.98.4.2.2.0.0
Phoenix: 4.2.0.2.2.0.0-2041
Core (CPU) - 8
RAM - 29 Gb
OS:centos6 (x86_64)



++++++++ QUERY +++++++++++++++
SELECT A.COL1, A.COL2, COUNT(DISTINCT A.COL3), COUNT(DISTINCT A.COL4)

FROM A

WHERE A.COL5 >= 1427864399000 AND  A.COL5 <= 1427947199000

GROUP BY  A.COL1, A.COL2

HAVING COUNT(DISTINCT A.COL3) > 10;

A.COL1 and A.COL2 are varchars. The rest of the columns are unsigned_longs.
++++++++ QUERY +++++++++++++++

Fwd: Issues w/ count (distinct col) queries via phoenix table with secondary indexes

Reply via email to