Aggregates

2007-03-21 Thread Shelley L

I need to use several aggregate functions, including most of the built-in
aggregates (count, min, max, avg, etc.) as well as some custom aggregates
(standard deviation and various percentiles).

Is there any documentation or example on how to create custom aggregate
functions?  I have found one reference [1] which only mentions that custom
aggregates can be created, but contains no other details.

(Also, are there any plans to include standard deviation or percentiles as
built-in aggregates in derby?)

[1]
http://db.apache.org/derby/docs/10.2/ref/rrefsqlj33923.html#rrefsqlj33923__sqlj92398


Re: Aggregates

2007-03-21 Thread Rick Hillegas

Hi Shelley,

There is an enhancement request logged for this feature: 
http://issues.apache.org/jira/browse/DERBY-672. No-one has picked up 
this enhancement request yet--you are welcome to! Most of the code 
needed for this feature is actually in the Derby codeline. I think the 
blocking issue is that the community must agree on syntax for declaring 
user defined aggregates.


Right now you can fake a user defined aggregate. An example of how to do 
this can be found in the scores demo package in java/demo/scores. For 
more details, please grep that subtree for getMedianTestScore.


Hope this helps,
-Rick

Shelley L wrote:
I need to use several aggregate functions, including most of the 
built-in aggregates (count, min, max, avg, etc.) as well as some 
custom aggregates (standard deviation and various percentiles).


Is there any documentation or example on how to create custom 
aggregate functions?  I have found one reference [1] which only 
mentions that custom aggregates can be created, but contains no other 
details.


(Also, are there any plans to include standard deviation or 
percentiles as built-in aggregates in derby?)


[1] 
http://db.apache.org/derby/docs/10.2/ref/rrefsqlj33923.html#rrefsqlj33923__sqlj92398 
http://db.apache.org/derby/docs/10.2/ref/rrefsqlj33923.html#rrefsqlj33923__sqlj92398




Re: How to emulate multiple DISTINCT aggregates?

2006-08-29 Thread Robert Enyedi
I can confirm what Ali said because {3,3,3} is the result I also receive 
on Derby.


However, it seems the correct result to me. Just think about it:

SELECT COUNT(tmp1.f1)
FROM tmp1
WHERE tmp1.f1 in ( SELECT DISTINCT b.f1 FROM tmp1 b)

This basically counts the number of values field f1 has.

Regards,
Robert

[EMAIL PROTECTED] wrote:


That doesn’t make sense.

Each sub select statement should have returned the following

Field A {1,2}

Field B {1,2}

Field C {1,2,3}

Then in the original select, your counts should be {2,2,3};

Note: I used Informix.

There is something interesting going on. You’re using a dummy table as 
a wrapper for 3.


Now I wonder what the optimizer thinks of this and how, if at all, 
does it parallelize the query?


I’ll try my sql method on a similar table to yours, but I do think the 
response you saw wasn’t correct.




*From:* Suavi Ali Demir [mailto:[EMAIL PROTECTED]
*Sent:* Monday, August 28, 2006 3:52 PM
*To:* Derby Discussion; [EMAIL PROTECTED]
*Subject:* RE: How to emulate multiple DISTINCT aggregates?

That query needs to be modified little bit. It does not work when

my table contains:

1



1



1

2



2



2

1



1



3

result is 3,3,3, where as it should have been 2,2,3.

This one works:

SELECT
(SELECT COUNT(DISTINCT b.field1) FROM table_a b) as field1_count,
(SELECT COUNT(DISTINCT c.field2) FROM table_a c) as field2_count,
(SELECT COUNT(DISTINCT d.field3) FROM table_a d) as field3_count
FROM SYSIBM.SYSDUMMY1

Regards,

Ali


*/[EMAIL PROTECTED]/* wrote:


Hi,

Uhm you can't do what you want to do as written

When you say SELECT COUNT(DISTINCT blah), it has to be unique to
your query.
So no multiple COUNT(DISTINCTS xxx) allowed in a single query.
However here's a work around that may help...

SELECT COUNT(a.field1), COUNT(a.field2), COUNT(a.field3) ...
FROM table_a a
WHERE a.field1 in ( SELECT DISTINCT b.field1 FROM table_a b)
AND a.field2 in ( SELECT DISTINCT b.field2 FROM table_a b)
AND a.field3 in ( SELECT DISTINCT b.field3 FROM table_a b)
...

Well you get the idea.

It's then pretty straight forward.
Each subselect returns the distinct set of elements and then your
outer
select returns the count on these unique subsets.

I just tried it in Informix (I happen to have a table with a couple of
string columns of data that I am working on.)
Should work in Derby. Its all standard SQL

But Hey! What do I know?

-G

 -Original Message-
 From: Robert Enyedi [mailto:[EMAIL PROTECTED]
 Sent: Monday, August 28, 2006 10:58 AM
 To: Derby Discussion
 Subject: How to emulate multiple DISTINCT aggregates?

 Hi,

 When trying to use multiple DISTINCT aggregates in Derby:

 /SELECT COUNT(distinct editable), COUNT( distinct visible )
 FROM item/

 the following error is reported:

 /ERROR 42Z02: Multiple DISTINCT aggregates are not supported at this
 time./

 For a simple query one could write:

 /SELECT COUNT(distinct editable) AS editable_count, COUNT( (SELECT
 DISTINCT visible FROM item) )
 FROM item/

 but this is actually pretty uncomfortable and unmaintainable in a
larger
 query.

 Does anyone know an elegant overcome for this Derby limitation?

 Thanks,
 Robert





RE: How to emulate multiple DISTINCT aggregates?

2006-08-29 Thread Michael Segel
Ok,

Yeah, you're right. 
I'm trying to think about what I did in my query that gave me the right
result.

In my test table, there were 65K rows and I was able to get the correct
count.

So I guess I'll have to go back to see what I did.


 -Original Message-
 From: Robert Enyedi [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, August 29, 2006 2:43 AM
 To: Derby Discussion
 Subject: Re: How to emulate multiple DISTINCT aggregates?
 
 I can confirm what Ali said because {3,3,3} is the result I also receive
 on Derby.
 
 However, it seems the correct result to me. Just think about it:
 
 SELECT COUNT(tmp1.f1)
 FROM tmp1
 WHERE tmp1.f1 in ( SELECT DISTINCT b.f1 FROM tmp1 b)
 
 This basically counts the number of values field f1 has.
 
 Regards,
 Robert
 
 [EMAIL PROTECTED] wrote:
 
  That doesn't make sense.
 
  Each sub select statement should have returned the following
 
  Field A {1,2}
 
  Field B {1,2}
 
  Field C {1,2,3}
 
  Then in the original select, your counts should be {2,2,3};
 
  Note: I used Informix.
 
  There is something interesting going on. You're using a dummy table as
  a wrapper for 3.
 
  Now I wonder what the optimizer thinks of this and how, if at all,
  does it parallelize the query?
 
  I'll try my sql method on a similar table to yours, but I do think the
  response you saw wasn't correct.
 
  
 
  *From:* Suavi Ali Demir [mailto:[EMAIL PROTECTED]
  *Sent:* Monday, August 28, 2006 3:52 PM
  *To:* Derby Discussion; [EMAIL PROTECTED]
  *Subject:* RE: How to emulate multiple DISTINCT aggregates?
 
  That query needs to be modified little bit. It does not work when
 
  my table contains:
 
  1
 
 
 
  1
 
 
 
  1
 
  2
 
 
 
  2
 
 
 
  2
 
  1
 
 
 
  1
 
 
 
  3
 
  result is 3,3,3, where as it should have been 2,2,3.
 
  This one works:
 
  SELECT
  (SELECT COUNT(DISTINCT b.field1) FROM table_a b) as field1_count,
  (SELECT COUNT(DISTINCT c.field2) FROM table_a c) as field2_count,
  (SELECT COUNT(DISTINCT d.field3) FROM table_a d) as field3_count
  FROM SYSIBM.SYSDUMMY1
 
  Regards,
 
  Ali
 
 
  */[EMAIL PROTECTED]/* wrote:
 
 
  Hi,
 
  Uhm you can't do what you want to do as written
 
  When you say SELECT COUNT(DISTINCT blah), it has to be unique to
  your query.
  So no multiple COUNT(DISTINCTS xxx) allowed in a single query.
  However here's a work around that may help...
 
  SELECT COUNT(a.field1), COUNT(a.field2), COUNT(a.field3) ...
  FROM table_a a
  WHERE a.field1 in ( SELECT DISTINCT b.field1 FROM table_a b)
  AND a.field2 in ( SELECT DISTINCT b.field2 FROM table_a b)
  AND a.field3 in ( SELECT DISTINCT b.field3 FROM table_a b)
  ...
 
  Well you get the idea.
 
  It's then pretty straight forward.
  Each subselect returns the distinct set of elements and then your
  outer
  select returns the count on these unique subsets.
 
  I just tried it in Informix (I happen to have a table with a couple
 of
  string columns of data that I am working on.)
  Should work in Derby. Its all standard SQL
 
  But Hey! What do I know?
 
  -G
 
   -Original Message-
   From: Robert Enyedi [mailto:[EMAIL PROTECTED]
   Sent: Monday, August 28, 2006 10:58 AM
   To: Derby Discussion
   Subject: How to emulate multiple DISTINCT aggregates?
  
   Hi,
  
   When trying to use multiple DISTINCT aggregates in Derby:
  
   /SELECT COUNT(distinct editable), COUNT( distinct visible )
   FROM item/
  
   the following error is reported:
  
   /ERROR 42Z02: Multiple DISTINCT aggregates are not supported at
 this
   time./
  
   For a simple query one could write:
  
   /SELECT COUNT(distinct editable) AS editable_count, COUNT( (SELECT
   DISTINCT visible FROM item) )
   FROM item/
  
   but this is actually pretty uncomfortable and unmaintainable in a
  larger
   query.
  
   Does anyone know an elegant overcome for this Derby limitation?
  
   Thanks,
   Robert
 





How to emulate multiple DISTINCT aggregates?

2006-08-28 Thread Robert Enyedi

Hi,

When trying to use multiple DISTINCT aggregates in Derby:

/SELECT COUNT(distinct editable), COUNT( distinct visible )
FROM item/

the following error is reported:

/ERROR 42Z02: Multiple DISTINCT aggregates are not supported at this time./

For a simple query one could write:

/SELECT COUNT(distinct editable) AS editable_count, COUNT( (SELECT 
DISTINCT visible FROM item) )

FROM item/

but this is actually pretty uncomfortable and unmaintainable in a larger 
query.


Does anyone know an elegant overcome for this Derby limitation?

Thanks,
Robert


RE: How to emulate multiple DISTINCT aggregates?

2006-08-28 Thread derby

Hi,

Uhm you can't do what you want to do as written

When you say SELECT COUNT(DISTINCT blah), it has to be unique to your query.
So no multiple COUNT(DISTINCTS xxx) allowed in a single query.
However here's a work around that may help...

SELECT COUNT(a.field1), COUNT(a.field2), COUNT(a.field3) ...
FROM table_a a
WHERE  a.field1 in ( SELECT DISTINCT b.field1 FROM table_a b)
ANDa.field2 in ( SELECT DISTINCT b.field2 FROM table_a b)
ANDa.field3 in ( SELECT DISTINCT b.field3 FROM table_a b)
...

Well you get the idea.

It's then pretty straight forward.
Each subselect returns the distinct set of elements and then your outer
select returns the count on these unique subsets.

I just tried it in Informix (I happen to have  a table with a couple of
string columns of data that I am working on.)
Should work in Derby. Its all standard SQL

But Hey! What do I know? 

-G

 -Original Message-
 From: Robert Enyedi [mailto:[EMAIL PROTECTED]
 Sent: Monday, August 28, 2006 10:58 AM
 To: Derby Discussion
 Subject: How to emulate multiple DISTINCT aggregates?
 
 Hi,
 
 When trying to use multiple DISTINCT aggregates in Derby:
 
 /SELECT COUNT(distinct editable), COUNT( distinct visible )
 FROM item/
 
 the following error is reported:
 
 /ERROR 42Z02: Multiple DISTINCT aggregates are not supported at this
 time./
 
 For a simple query one could write:
 
 /SELECT COUNT(distinct editable) AS editable_count, COUNT( (SELECT
 DISTINCT visible FROM item) )
 FROM item/
 
 but this is actually pretty uncomfortable and unmaintainable in a larger
 query.
 
 Does anyone know an elegant overcome for this Derby limitation?
 
 Thanks,
 Robert




RE: How to emulate multiple DISTINCT aggregates?

2006-08-28 Thread Suavi Ali Demir
That query needs to be modified little bit. It does notwork when  my table contains:1  1  12  2  21  1  3result is 3,3,3, where as it should have been 2,2,3.This one works:SELECT (SELECT COUNT(DISTINCT b.field1) FROM table_a b) as field1_count, (SELECT COUNT(DISTINCT c.field2) FROM table_a c) as field2_count, (SELECT COUNT(DISTINCT d.field3) FROM table_a d) as field3_countFROM SYSIBM.SYSDUMMY1Regards,  Ali  [EMAIL PROTECTED] wrote:  Hi,Uhm
 you can't do what you want to do as writtenWhen you say SELECT COUNT(DISTINCT blah), it has to be unique to your query.So no multiple COUNT(DISTINCTS xxx) allowed in a single query.However here's a work around that may help...SELECT COUNT(a.field1), COUNT(a.field2), COUNT(a.field3) ...FROM table_a aWHERE a.field1 in ( SELECT DISTINCT b.field1 FROM table_a b)AND a.field2 in ( SELECT DISTINCT b.field2 FROM table_a b)AND a.field3 in ( SELECT DISTINCT b.field3 FROM table_a b)...Well you get the idea.It's then pretty straight forward.Each subselect returns the distinct set of elements and then your outerselect returns the count on these unique subsets.I just tried it in Informix (I happen to have a table with a couple ofstring columns of data that I am working on.)Should work in Derby. Its all standard SQLBut Hey! What do I know? -G -Original
 Message- From: Robert Enyedi [mailto:[EMAIL PROTECTED] Sent: Monday, August 28, 2006 10:58 AM To: Derby Discussion Subject: How to emulate multiple DISTINCT aggregates?  Hi,  When trying to use multiple DISTINCT aggregates in Derby:  /SELECT COUNT(distinct editable), COUNT( distinct visible ) FROM item/  the following error is reported:  /ERROR 42Z02: Multiple DISTINCT aggregates are not supported at this time./  For a simple query one could write:  /SELECT COUNT(distinct editable) AS editable_count, COUNT( (SELECT DISTINCT visible FROM item) ) FROM item/  but this is actually pretty uncomfortable and unmaintainable in a larger query.  Does anyone know an elegant overcome for this Derby limitation?  Thanks,
 Robert

RE: How to emulate multiple DISTINCT aggregates?

2006-08-28 Thread derby








That doesnt make sense.

Each sub select statement should have
returned the following

Field A {1,2}

Field B {1,2}

Field C {1,2,3}



Then in the original select, your counts
should be {2,2,3};



Note: I used Informix. 



There is something interesting going on.
Youre using a dummy table as a wrapper for 3.

Now I wonder what the optimizer thinks of
this and how, if at all, does it parallelize the query?



Ill try my sql method on a similar
table to yours, but I do think the response you saw wasnt correct.













From: Suavi Ali Demir
[mailto:[EMAIL PROTECTED] 
Sent: Monday, August 28, 2006 3:52
PM
To: Derby Discussion;
[EMAIL PROTECTED]
Subject: RE: How to emulate
multiple DISTINCT aggregates?







That query needs to be modified little bit. It does notwork when





my table contains:












 
  
  1
  
  
  1
  
  
  1
  
 
 
  
  2
  
  
  2
  
  
  2
  
 
 
  
  1
  
  
  1
  
  
  3
  
 












result is 3,3,3, where as it should have been 2,2,3.











This one works:











SELECT 
(SELECT COUNT(DISTINCT b.field1) FROM table_a b) as field1_count, 
(SELECT COUNT(DISTINCT c.field2) FROM table_a c) as field2_count, 
(SELECT COUNT(DISTINCT d.field3) FROM table_a d) as field3_count
FROM SYSIBM.SYSDUMMY1











Regards,





Ali






[EMAIL PROTECTED]
wrote:






Hi,

Uhm you can't do what you want to do as written

When you say SELECT COUNT(DISTINCT blah), it has to be unique to your query.
So no multiple COUNT(DISTINCTS xxx) allowed in a single query.
However here's a work around that may help...

SELECT COUNT(a.field1), COUNT(a.field2), COUNT(a.field3) ...
FROM table_a a
WHERE a.field1 in ( SELECT DISTINCT b.field1 FROM table_a b)
AND a.field2 in ( SELECT DISTINCT b.field2 FROM table_a b)
AND a.field3 in ( SELECT DISTINCT b.field3 FROM table_a b)
...

Well you get the idea.

It's then pretty straight forward.
Each subselect returns the distinct set of elements and then your outer
select returns the count on these unique subsets.

I just tried it in Informix (I happen to have a table with a couple of
string columns of data that I am working on.)
Should work in Derby.
Its all standard SQL

But Hey! What do I know? 

-G

 -Original Message-
 From: Robert Enyedi [mailto:[EMAIL PROTECTED]
 Sent: Monday, August 28, 2006 10:58 AM
 To: Derby Discussion
 Subject: How to emulate multiple DISTINCT aggregates?
 
 Hi,
 
 When trying to use multiple DISTINCT aggregates in Derby:
 
 /SELECT COUNT(distinct editable), COUNT( distinct visible )
 FROM item/
 
 the following error is reported:
 
 /ERROR 42Z02: Multiple DISTINCT aggregates are not supported at this
 time./
 
 For a simple query one could write:
 
 /SELECT COUNT(distinct editable) AS editable_count, COUNT( (SELECT
 DISTINCT visible FROM item) )
 FROM item/
 
 but this is actually pretty uncomfortable and unmaintainable in a larger
 query.
 
 Does anyone know an elegant overcome for this Derby limitation?
 
 Thanks,
 Robert