Aggregates
I need to use several aggregate functions, including most of the built-in aggregates (count, min, max, avg, etc.) as well as some custom aggregates (standard deviation and various percentiles). Is there any documentation or example on how to create custom aggregate functions? I have found one reference [1] which only mentions that custom aggregates can be created, but contains no other details. (Also, are there any plans to include standard deviation or percentiles as built-in aggregates in derby?) [1] http://db.apache.org/derby/docs/10.2/ref/rrefsqlj33923.html#rrefsqlj33923__sqlj92398
Re: Aggregates
Hi Shelley, There is an enhancement request logged for this feature: http://issues.apache.org/jira/browse/DERBY-672. No-one has picked up this enhancement request yet--you are welcome to! Most of the code needed for this feature is actually in the Derby codeline. I think the blocking issue is that the community must agree on syntax for declaring user defined aggregates. Right now you can fake a user defined aggregate. An example of how to do this can be found in the scores demo package in java/demo/scores. For more details, please grep that subtree for getMedianTestScore. Hope this helps, -Rick Shelley L wrote: I need to use several aggregate functions, including most of the built-in aggregates (count, min, max, avg, etc.) as well as some custom aggregates (standard deviation and various percentiles). Is there any documentation or example on how to create custom aggregate functions? I have found one reference [1] which only mentions that custom aggregates can be created, but contains no other details. (Also, are there any plans to include standard deviation or percentiles as built-in aggregates in derby?) [1] http://db.apache.org/derby/docs/10.2/ref/rrefsqlj33923.html#rrefsqlj33923__sqlj92398 http://db.apache.org/derby/docs/10.2/ref/rrefsqlj33923.html#rrefsqlj33923__sqlj92398
Re: How to emulate multiple DISTINCT aggregates?
I can confirm what Ali said because {3,3,3} is the result I also receive on Derby. However, it seems the correct result to me. Just think about it: SELECT COUNT(tmp1.f1) FROM tmp1 WHERE tmp1.f1 in ( SELECT DISTINCT b.f1 FROM tmp1 b) This basically counts the number of values field f1 has. Regards, Robert [EMAIL PROTECTED] wrote: That doesn’t make sense. Each sub select statement should have returned the following Field A {1,2} Field B {1,2} Field C {1,2,3} Then in the original select, your counts should be {2,2,3}; Note: I used Informix. There is something interesting going on. You’re using a dummy table as a wrapper for 3. Now I wonder what the optimizer thinks of this and how, if at all, does it parallelize the query? I’ll try my sql method on a similar table to yours, but I do think the response you saw wasn’t correct. *From:* Suavi Ali Demir [mailto:[EMAIL PROTECTED] *Sent:* Monday, August 28, 2006 3:52 PM *To:* Derby Discussion; [EMAIL PROTECTED] *Subject:* RE: How to emulate multiple DISTINCT aggregates? That query needs to be modified little bit. It does not work when my table contains: 1 1 1 2 2 2 1 1 3 result is 3,3,3, where as it should have been 2,2,3. This one works: SELECT (SELECT COUNT(DISTINCT b.field1) FROM table_a b) as field1_count, (SELECT COUNT(DISTINCT c.field2) FROM table_a c) as field2_count, (SELECT COUNT(DISTINCT d.field3) FROM table_a d) as field3_count FROM SYSIBM.SYSDUMMY1 Regards, Ali */[EMAIL PROTECTED]/* wrote: Hi, Uhm you can't do what you want to do as written When you say SELECT COUNT(DISTINCT blah), it has to be unique to your query. So no multiple COUNT(DISTINCTS xxx) allowed in a single query. However here's a work around that may help... SELECT COUNT(a.field1), COUNT(a.field2), COUNT(a.field3) ... FROM table_a a WHERE a.field1 in ( SELECT DISTINCT b.field1 FROM table_a b) AND a.field2 in ( SELECT DISTINCT b.field2 FROM table_a b) AND a.field3 in ( SELECT DISTINCT b.field3 FROM table_a b) ... Well you get the idea. It's then pretty straight forward. Each subselect returns the distinct set of elements and then your outer select returns the count on these unique subsets. I just tried it in Informix (I happen to have a table with a couple of string columns of data that I am working on.) Should work in Derby. Its all standard SQL But Hey! What do I know? -G -Original Message- From: Robert Enyedi [mailto:[EMAIL PROTECTED] Sent: Monday, August 28, 2006 10:58 AM To: Derby Discussion Subject: How to emulate multiple DISTINCT aggregates? Hi, When trying to use multiple DISTINCT aggregates in Derby: /SELECT COUNT(distinct editable), COUNT( distinct visible ) FROM item/ the following error is reported: /ERROR 42Z02: Multiple DISTINCT aggregates are not supported at this time./ For a simple query one could write: /SELECT COUNT(distinct editable) AS editable_count, COUNT( (SELECT DISTINCT visible FROM item) ) FROM item/ but this is actually pretty uncomfortable and unmaintainable in a larger query. Does anyone know an elegant overcome for this Derby limitation? Thanks, Robert
RE: How to emulate multiple DISTINCT aggregates?
Ok, Yeah, you're right. I'm trying to think about what I did in my query that gave me the right result. In my test table, there were 65K rows and I was able to get the correct count. So I guess I'll have to go back to see what I did. -Original Message- From: Robert Enyedi [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 29, 2006 2:43 AM To: Derby Discussion Subject: Re: How to emulate multiple DISTINCT aggregates? I can confirm what Ali said because {3,3,3} is the result I also receive on Derby. However, it seems the correct result to me. Just think about it: SELECT COUNT(tmp1.f1) FROM tmp1 WHERE tmp1.f1 in ( SELECT DISTINCT b.f1 FROM tmp1 b) This basically counts the number of values field f1 has. Regards, Robert [EMAIL PROTECTED] wrote: That doesn't make sense. Each sub select statement should have returned the following Field A {1,2} Field B {1,2} Field C {1,2,3} Then in the original select, your counts should be {2,2,3}; Note: I used Informix. There is something interesting going on. You're using a dummy table as a wrapper for 3. Now I wonder what the optimizer thinks of this and how, if at all, does it parallelize the query? I'll try my sql method on a similar table to yours, but I do think the response you saw wasn't correct. *From:* Suavi Ali Demir [mailto:[EMAIL PROTECTED] *Sent:* Monday, August 28, 2006 3:52 PM *To:* Derby Discussion; [EMAIL PROTECTED] *Subject:* RE: How to emulate multiple DISTINCT aggregates? That query needs to be modified little bit. It does not work when my table contains: 1 1 1 2 2 2 1 1 3 result is 3,3,3, where as it should have been 2,2,3. This one works: SELECT (SELECT COUNT(DISTINCT b.field1) FROM table_a b) as field1_count, (SELECT COUNT(DISTINCT c.field2) FROM table_a c) as field2_count, (SELECT COUNT(DISTINCT d.field3) FROM table_a d) as field3_count FROM SYSIBM.SYSDUMMY1 Regards, Ali */[EMAIL PROTECTED]/* wrote: Hi, Uhm you can't do what you want to do as written When you say SELECT COUNT(DISTINCT blah), it has to be unique to your query. So no multiple COUNT(DISTINCTS xxx) allowed in a single query. However here's a work around that may help... SELECT COUNT(a.field1), COUNT(a.field2), COUNT(a.field3) ... FROM table_a a WHERE a.field1 in ( SELECT DISTINCT b.field1 FROM table_a b) AND a.field2 in ( SELECT DISTINCT b.field2 FROM table_a b) AND a.field3 in ( SELECT DISTINCT b.field3 FROM table_a b) ... Well you get the idea. It's then pretty straight forward. Each subselect returns the distinct set of elements and then your outer select returns the count on these unique subsets. I just tried it in Informix (I happen to have a table with a couple of string columns of data that I am working on.) Should work in Derby. Its all standard SQL But Hey! What do I know? -G -Original Message- From: Robert Enyedi [mailto:[EMAIL PROTECTED] Sent: Monday, August 28, 2006 10:58 AM To: Derby Discussion Subject: How to emulate multiple DISTINCT aggregates? Hi, When trying to use multiple DISTINCT aggregates in Derby: /SELECT COUNT(distinct editable), COUNT( distinct visible ) FROM item/ the following error is reported: /ERROR 42Z02: Multiple DISTINCT aggregates are not supported at this time./ For a simple query one could write: /SELECT COUNT(distinct editable) AS editable_count, COUNT( (SELECT DISTINCT visible FROM item) ) FROM item/ but this is actually pretty uncomfortable and unmaintainable in a larger query. Does anyone know an elegant overcome for this Derby limitation? Thanks, Robert
How to emulate multiple DISTINCT aggregates?
Hi, When trying to use multiple DISTINCT aggregates in Derby: /SELECT COUNT(distinct editable), COUNT( distinct visible ) FROM item/ the following error is reported: /ERROR 42Z02: Multiple DISTINCT aggregates are not supported at this time./ For a simple query one could write: /SELECT COUNT(distinct editable) AS editable_count, COUNT( (SELECT DISTINCT visible FROM item) ) FROM item/ but this is actually pretty uncomfortable and unmaintainable in a larger query. Does anyone know an elegant overcome for this Derby limitation? Thanks, Robert
RE: How to emulate multiple DISTINCT aggregates?
Hi, Uhm you can't do what you want to do as written When you say SELECT COUNT(DISTINCT blah), it has to be unique to your query. So no multiple COUNT(DISTINCTS xxx) allowed in a single query. However here's a work around that may help... SELECT COUNT(a.field1), COUNT(a.field2), COUNT(a.field3) ... FROM table_a a WHERE a.field1 in ( SELECT DISTINCT b.field1 FROM table_a b) ANDa.field2 in ( SELECT DISTINCT b.field2 FROM table_a b) ANDa.field3 in ( SELECT DISTINCT b.field3 FROM table_a b) ... Well you get the idea. It's then pretty straight forward. Each subselect returns the distinct set of elements and then your outer select returns the count on these unique subsets. I just tried it in Informix (I happen to have a table with a couple of string columns of data that I am working on.) Should work in Derby. Its all standard SQL But Hey! What do I know? -G -Original Message- From: Robert Enyedi [mailto:[EMAIL PROTECTED] Sent: Monday, August 28, 2006 10:58 AM To: Derby Discussion Subject: How to emulate multiple DISTINCT aggregates? Hi, When trying to use multiple DISTINCT aggregates in Derby: /SELECT COUNT(distinct editable), COUNT( distinct visible ) FROM item/ the following error is reported: /ERROR 42Z02: Multiple DISTINCT aggregates are not supported at this time./ For a simple query one could write: /SELECT COUNT(distinct editable) AS editable_count, COUNT( (SELECT DISTINCT visible FROM item) ) FROM item/ but this is actually pretty uncomfortable and unmaintainable in a larger query. Does anyone know an elegant overcome for this Derby limitation? Thanks, Robert
RE: How to emulate multiple DISTINCT aggregates?
That query needs to be modified little bit. It does notwork when my table contains:1 1 12 2 21 1 3result is 3,3,3, where as it should have been 2,2,3.This one works:SELECT (SELECT COUNT(DISTINCT b.field1) FROM table_a b) as field1_count, (SELECT COUNT(DISTINCT c.field2) FROM table_a c) as field2_count, (SELECT COUNT(DISTINCT d.field3) FROM table_a d) as field3_countFROM SYSIBM.SYSDUMMY1Regards, Ali [EMAIL PROTECTED] wrote: Hi,Uhm you can't do what you want to do as writtenWhen you say SELECT COUNT(DISTINCT blah), it has to be unique to your query.So no multiple COUNT(DISTINCTS xxx) allowed in a single query.However here's a work around that may help...SELECT COUNT(a.field1), COUNT(a.field2), COUNT(a.field3) ...FROM table_a aWHERE a.field1 in ( SELECT DISTINCT b.field1 FROM table_a b)AND a.field2 in ( SELECT DISTINCT b.field2 FROM table_a b)AND a.field3 in ( SELECT DISTINCT b.field3 FROM table_a b)...Well you get the idea.It's then pretty straight forward.Each subselect returns the distinct set of elements and then your outerselect returns the count on these unique subsets.I just tried it in Informix (I happen to have a table with a couple ofstring columns of data that I am working on.)Should work in Derby. Its all standard SQLBut Hey! What do I know? -G -Original Message- From: Robert Enyedi [mailto:[EMAIL PROTECTED] Sent: Monday, August 28, 2006 10:58 AM To: Derby Discussion Subject: How to emulate multiple DISTINCT aggregates? Hi, When trying to use multiple DISTINCT aggregates in Derby: /SELECT COUNT(distinct editable), COUNT( distinct visible ) FROM item/ the following error is reported: /ERROR 42Z02: Multiple DISTINCT aggregates are not supported at this time./ For a simple query one could write: /SELECT COUNT(distinct editable) AS editable_count, COUNT( (SELECT DISTINCT visible FROM item) ) FROM item/ but this is actually pretty uncomfortable and unmaintainable in a larger query. Does anyone know an elegant overcome for this Derby limitation? Thanks, Robert
RE: How to emulate multiple DISTINCT aggregates?
That doesnt make sense. Each sub select statement should have returned the following Field A {1,2} Field B {1,2} Field C {1,2,3} Then in the original select, your counts should be {2,2,3}; Note: I used Informix. There is something interesting going on. Youre using a dummy table as a wrapper for 3. Now I wonder what the optimizer thinks of this and how, if at all, does it parallelize the query? Ill try my sql method on a similar table to yours, but I do think the response you saw wasnt correct. From: Suavi Ali Demir [mailto:[EMAIL PROTECTED] Sent: Monday, August 28, 2006 3:52 PM To: Derby Discussion; [EMAIL PROTECTED] Subject: RE: How to emulate multiple DISTINCT aggregates? That query needs to be modified little bit. It does notwork when my table contains: 1 1 1 2 2 2 1 1 3 result is 3,3,3, where as it should have been 2,2,3. This one works: SELECT (SELECT COUNT(DISTINCT b.field1) FROM table_a b) as field1_count, (SELECT COUNT(DISTINCT c.field2) FROM table_a c) as field2_count, (SELECT COUNT(DISTINCT d.field3) FROM table_a d) as field3_count FROM SYSIBM.SYSDUMMY1 Regards, Ali [EMAIL PROTECTED] wrote: Hi, Uhm you can't do what you want to do as written When you say SELECT COUNT(DISTINCT blah), it has to be unique to your query. So no multiple COUNT(DISTINCTS xxx) allowed in a single query. However here's a work around that may help... SELECT COUNT(a.field1), COUNT(a.field2), COUNT(a.field3) ... FROM table_a a WHERE a.field1 in ( SELECT DISTINCT b.field1 FROM table_a b) AND a.field2 in ( SELECT DISTINCT b.field2 FROM table_a b) AND a.field3 in ( SELECT DISTINCT b.field3 FROM table_a b) ... Well you get the idea. It's then pretty straight forward. Each subselect returns the distinct set of elements and then your outer select returns the count on these unique subsets. I just tried it in Informix (I happen to have a table with a couple of string columns of data that I am working on.) Should work in Derby. Its all standard SQL But Hey! What do I know? -G -Original Message- From: Robert Enyedi [mailto:[EMAIL PROTECTED] Sent: Monday, August 28, 2006 10:58 AM To: Derby Discussion Subject: How to emulate multiple DISTINCT aggregates? Hi, When trying to use multiple DISTINCT aggregates in Derby: /SELECT COUNT(distinct editable), COUNT( distinct visible ) FROM item/ the following error is reported: /ERROR 42Z02: Multiple DISTINCT aggregates are not supported at this time./ For a simple query one could write: /SELECT COUNT(distinct editable) AS editable_count, COUNT( (SELECT DISTINCT visible FROM item) ) FROM item/ but this is actually pretty uncomfortable and unmaintainable in a larger query. Does anyone know an elegant overcome for this Derby limitation? Thanks, Robert