Re: count distinct

2015-06-23 Thread Michael McAllister

Your second query should work with Phoenix 4.3 or later.


Thanks, unfortunately at the moment I’m stuck with Phoenix 4.2.


I will investigate the problem with the first one and get back to you.


Appreciate this.

Michael McAllister
Staff Data Warehouse Engineer | Decision Systems
mmcallis...@homeaway.commailto:mmcallis...@homeaway.com | C: 512.423.7447 | 
skype: michael.mcallister.hamailto:zimmk...@hotmail.com | webex: 
https://h.a/mikewebex

[cid:image002.png@01D080DC.77AD4930]
This electronic communication (including any attachment) is confidential.  If 
you are not an intended recipient of this communication, please be advised that 
any disclosure, dissemination, distribution, copying or other use of this 
communication or any attachment is strictly prohibited.  If you have received 
this communication in error, please notify the sender immediately by reply 
e-mail and promptly destroy all electronic and printed copies of this 
communication and any attachment.

On Jun 23, 2015, at 7:54 PM, Maryann Xue 
maryann@gmail.commailto:maryann@gmail.com wrote:

Sorry, I missed the first line. Your second query should work with Phoenix 4.3 
or later.


I will investigate the problem with the first one and get back to you.


Thanks,
Maryann

On Tuesday, June 23, 2015, Michael McAllister 
mmcallis...@homeaway.commailto:mmcallis...@homeaway.com wrote:
Hi

(This questions relates to Phoenix 4.2 on HDP 2.2)

I have a situation where I want to count the distinct combination of a couple 
of columns.

When I try the following:-

select count(distinct a.col1, b.col2)
from table tab1 a
inner join tab2 b on b.joincol = a.joincol
where a.col3 = ‘some condition’
and b.col4 = ‘some other condition';

I get the following error:-

Error: ERROR 605 (42P00): Syntax error. Unknown function: DISTINCT_COUNT. 
(state=42P00,code=605)

Playing around with this it looks like count(distinct) works with a single 
column, but not more.

So I try this:-

SELECT count(*)
FROM
  (SELECT a.col1,
  b.col2
   FROM TABLE tab1 a
   INNER JOIN tab2 b ON b.joincol = a.joincol
   WHERE a.col3 = ‘SOME condition’
 AND b.col4 = ‘SOME other condition'
   GROUP BY a.col1,
b.col2) ;

I get the following error:-

Error: Complex nested queries not supported. (state=,code=0)

So, my question … is there any way to get what I’m looking for?

Regards,

Mike





Re: count distinct

2015-06-23 Thread James Taylor
Michael,
You're correct, count distinct doesn't support multiple arguments currently
(I filed PHOENIX-2062 for this). Another workaround is to combine a.col1
and b.col2 into an expression, for example concatenating them. If order
matters, you could do this:

select count(distinct col1 || col2) ...

If order doesn't matter, you could do this:

select count(distinct case when col1  col2 then col1 || col2 else col2 ||
col1 end) ...

Thanks,
James

On Tue, Jun 23, 2015 at 6:56 PM, Michael McAllister 
mmcallis...@homeaway.com wrote:

  

 Your second query should work with Phoenix 4.3 or later.

  

  Thanks, unfortunately at the moment I’m stuck with Phoenix 4.2.

  

 I will investigate the problem with the first one and get back to you.

  

  Appreciate this.

   Michael McAllister
  Staff Data Warehouse Engineer | Decision Systems
  mmcallis...@homeaway.com | C: 512.423.7447 | skype: michael.mcallister.ha
 zimmk...@hotmail.com | webex: https://h.a/mikewebex


 This electronic communication (including any attachment) is confidential.
 If you are not an intended recipient of this communication, please be
 advised that any disclosure, dissemination, distribution, copying or other
 use of this communication or any attachment is strictly prohibited.  If you
 have received this communication in error, please notify the sender
 immediately by reply e-mail and promptly destroy all electronic and printed
 copies of this communication and any attachment.

  On Jun 23, 2015, at 7:54 PM, Maryann Xue maryann@gmail.com wrote:

 Sorry, I missed the first line. Your second query should work with Phoenix
 4.3 or later.


  I will investigate the problem with the first one and get back to you.


  Thanks,
 Maryann

 On Tuesday, June 23, 2015, Michael McAllister mmcallis...@homeaway.com
 wrote:

 Hi

  (This questions relates to Phoenix 4.2 on HDP 2.2)

  I have a situation where I want to count the distinct combination of a
 couple of columns.

  When I try the following:-

  select count(distinct a.col1, b.col2)
 from table tab1 a
 inner join tab2 b on b.joincol = a.joincol
 where a.col3 = ‘some condition’
 and b.col4 = ‘some other condition';

  I get the following error:-

  Error: ERROR 605 (42P00): Syntax error. Unknown function:
 DISTINCT_COUNT. (state=42P00,code=605)

  Playing around with this it looks like count(distinct) works with a
 single column, but not more.

  So I try this:-

  SELECT count(*)
 FROM
   (SELECT a.col1,
   b.col2
FROM TABLE tab1 a
INNER JOIN tab2 b ON b.joincol = a.joincol
WHERE a.col3 = ‘SOME condition’
  AND b.col4 = ‘SOME other condition'
GROUP BY a.col1,
 b.col2) ;

  I get the following error:-

  Error: Complex nested queries not supported. (state=,code=0)

  So, my question … is there any way to get what I’m looking for?

  Regards,

  Mike






Re: count distinct

2015-06-23 Thread Maryann Xue
Which version of Phoenix are you using?

On Tuesday, June 23, 2015, Michael McAllister mmcallis...@homeaway.com
wrote:

  Hi

  (This questions relates to Phoenix 4.2 on HDP 2.2)

  I have a situation where I want to count the distinct combination of a
 couple of columns.

  When I try the following:-

  select count(distinct a.col1, b.col2)
 from table tab1 a
 inner join tab2 b on b.joincol = a.joincol
 where a.col3 = ‘some condition’
 and b.col4 = ‘some other condition';

  I get the following error:-

  Error: ERROR 605 (42P00): Syntax error. Unknown function:
 DISTINCT_COUNT. (state=42P00,code=605)

  Playing around with this it looks like count(distinct) works with a
 single column, but not more.

  So I try this:-

  SELECT count(*)
 FROM
   (SELECT a.col1,
   b.col2
FROM TABLE tab1 a
INNER JOIN tab2 b ON b.joincol = a.joincol
WHERE a.col3 = ‘SOME condition’
  AND b.col4 = ‘SOME other condition'
GROUP BY a.col1,
 b.col2) ;

  I get the following error:-

  Error: Complex nested queries not supported. (state=,code=0)

  So, my question … is there any way to get what I’m looking for?

  Regards,

  Mike





Re: count distinct

2015-06-23 Thread Maryann Xue
Sorry, I missed the first line. Your second query should work with Phoenix
4.3 or later.


I will investigate the problem with the first one and get back to you.


Thanks,
Maryann

On Tuesday, June 23, 2015, Michael McAllister mmcallis...@homeaway.com
wrote:

  Hi

  (This questions relates to Phoenix 4.2 on HDP 2.2)

  I have a situation where I want to count the distinct combination of a
 couple of columns.

  When I try the following:-

  select count(distinct a.col1, b.col2)
 from table tab1 a
 inner join tab2 b on b.joincol = a.joincol
 where a.col3 = ‘some condition’
 and b.col4 = ‘some other condition';

  I get the following error:-

  Error: ERROR 605 (42P00): Syntax error. Unknown function:
 DISTINCT_COUNT. (state=42P00,code=605)

  Playing around with this it looks like count(distinct) works with a
 single column, but not more.

  So I try this:-

  SELECT count(*)
 FROM
   (SELECT a.col1,
   b.col2
FROM TABLE tab1 a
INNER JOIN tab2 b ON b.joincol = a.joincol
WHERE a.col3 = ‘SOME condition’
  AND b.col4 = ‘SOME other condition'
GROUP BY a.col1,
 b.col2) ;

  I get the following error:-

  Error: Complex nested queries not supported. (state=,code=0)

  So, my question … is there any way to get what I’m looking for?

  Regards,

  Mike





Fwd: Issues w/ count (distinct col) queries via phoenix table with secondary indexes

2015-04-28 Thread Jude K
Hi,

Yesterday, we created a secondary index on one our tables to help improve
the read speed of performing select count queries like the one below. The
query w/ index runs fast but provides the same counts. The query w/o index
(/*+ NO_INDEX */) runs slower but provides the right counts for each
column.

The index is covered and includes all the columns specified below. The bug
we are observing seems similar to
https://issues.apache.org/jira/browse/PHOENIX-814 and
https://issues.apache.org/jira/browse/PHOENIX-1203.

Can anyone points us in the right direction to correct this issue (or
workaround) as we need fast retrieval and proper counting?

We have the following environment:

Hbase 0.98.4.2.2.0.0
Phoenix: 4.2.0.2.2.0.0-2041
Core (CPU) - 8
RAM - 29 Gb
OS:centos6 (x86_64)



 QUERY +++
SELECT A.COL1, A.COL2, COUNT(DISTINCT A.COL3), COUNT(DISTINCT A.COL4)

FROM A

WHERE A.COL5 = 1427864399000 AND  A.COL5 = 1427947199000

GROUP BY  A.COL1, A.COL2

HAVING COUNT(DISTINCT A.COL3)  10;

A.COL1 and A.COL2 are varchars. The rest of the columns are unsigned_longs.
 QUERY +++


Re: Issues w/ count (distinct col) queries via phoenix table with secondary indexes

2015-04-28 Thread James Taylor
Hey Jude,
Would you mind trying with 4.3.1 release and letting us know if the
issue is resolved?
Thanks,
James

On Tue, Apr 28, 2015 at 6:07 PM, Jude K j2k...@gmail.com wrote:
 Hi,

 Yesterday, we created a secondary index on one our tables to help improve
 the read speed of performing select count queries like the one below. The
 query w/ index runs fast but provides the same counts. The query w/o index
 (/*+ NO_INDEX */) runs slower but provides the right counts for each column.

 The index is covered and includes all the columns specified below. The bug
 we are observing seems similar to
 https://issues.apache.org/jira/browse/PHOENIX-814 and
 https://issues.apache.org/jira/browse/PHOENIX-1203.

 Can anyone points us in the right direction to correct this issue (or
 workaround) as we need fast retrieval and proper counting?

 We have the following environment:

 Hbase 0.98.4.2.2.0.0
 Phoenix: 4.2.0.2.2.0.0-2041
 Core (CPU) - 8
 RAM - 29 Gb
 OS:centos6 (x86_64)



  QUERY +++
 SELECT A.COL1, A.COL2, COUNT(DISTINCT A.COL3), COUNT(DISTINCT A.COL4)

 FROM A

 WHERE A.COL5 = 1427864399000 AND  A.COL5 = 1427947199000

 GROUP BY  A.COL1, A.COL2

 HAVING COUNT(DISTINCT A.COL3)  10;

 A.COL1 and A.COL2 are varchars. The rest of the columns are unsigned_longs.
  QUERY +++










Issues w/ count (distinct col) queries via phoenix table with secondary indexes

2015-04-28 Thread Jude K
Yesterday, we created a secondary index on one our tables to help improve
the read speed of performing select count queries like the one below. The
query w/ index runs fast but provides the same counts. The query w/o index
(/*+ NO_INDEX */) runs slower but provides the right counts for each
column.

The index is covered and includes all the columns specified below. The bug
we are observing seems similar to
https://issues.apache.org/jira/browse/PHOENIX-814 and
https://issues.apache.org/jira/browse/PHOENIX-1203.

Can anyone points us in the right direction to correct this issue (or
workaround) as we need fast retrieval and proper counting?

We have the following environment:

Hbase 0.98.4.2.2.0.0
Phoenix: 4.2.0.2.2.0.0-2041
Core (CPU) - 8
RAM - 29 Gb
OS:centos6 (x86_64)



 QUERY +++
SELECT A.COL1, A.COL2, COUNT(DISTINCT A.COL3), COUNT(DISTINCT A.COL4)

FROM A

WHERE A.COL5 = 1427864399000 AND  A.COL5 = 1427947199000

GROUP BY  A.COL1, A.COL2

HAVING COUNT(DISTINCT A.COL3)  10;

A.COL1 and A.COL2 are varchars. The rest of the columns are unsigned_longs.
 QUERY +++