Re: Slow query help

Flavio Pompermaier Fri, 16 Mar 2018 10:26:55 -0700

Thanks for the tip James. I didn't know that syntax for doing the count on
a distinct value!
This version is able to end, the first one wasn't able to finish even
giving a huge amount of memory to HBase (the cardinality of SOMEFIELD is
very big indeed).


Thanks a lot,
Flavio

On Fri, Mar 16, 2018 at 5:45 PM, James Taylor <[email protected]>
wrote:

> Hi Flavio,
> You'll need to add a secondary index to SOMEFIELD (or SOMEFIELD + VALID)
> to speed that up. You can write it more simply as SELECT COUNT(DISTINCT
> SOMEFIELD) FROM TEST.MYTABLE WHERE VALID AND SOMEFIELD IS NOT NULL.
>
> Otherwise, you'll end up doing a full table scan (and use a fair amount of
> memory depending on the cardinality of SOMEFIELD). The above with a
> secondary index would skip to the distinct values instead.
>
> Thanks,
> James
>
> On Fri, Mar 16, 2018 at 8:30 AM, Flavio Pompermaier <[email protected]>
> wrote:
>
>> Hi to all,
>> I'm running a query like this one on my Phoenix 4.13 (on CDH 5.11.2):
>>
>> SELECT COUNT(*) FROM (
>>   SELECT DISTINCT(SOMEFIELD)
>>   FROM TEST.MYTABLE
>>   WHERE VALID = TRUE AND SOMEFIELD IS NOT NULL
>> )
>>
>> Unfortunately the query timeouts (timeout is 10 min).... Any suggestion
>> about how to tune my installation?
>>
>> Best,
>> Flavi
>>
>

Re: Slow query help

Reply via email to