Re: [FastBit-users] Fastbit aggregate functions?

K. John Wu Fri, 02 Oct 2009 11:05:41 -0700

Dear Jan,

Thanks for your interested in our work.  We appreciate your 
suggestions and will put them on our list of things to do. 
Unfortunately, some of the items that takes a lot of programming 
effort might take a long time to come about.

Regards,

John

On 10/1/2009 11:45 PM, Jan Steemann wrote:
> Hi,
> 
> I have been looking into Fastbit's support of aggregate functions and I have 
> a few questions on that (or probably they turn out to be all feature 
> requests):
> 
> 1. is there currently any way to count the number of distinct values in a 
> group?
> In some SQL products, I'd issue something like
> SELECT event_id, /* id of the event */ 
>        COUNT(*) /* how many times did the event occur */,
>        COUNT(DISTINCT(item_id)) /* how many distinct items where affected by 
> the event */
> FROM table
> GROUP BY event_id;
> 
> I have peeked at the ibis source code and did not find any equivalent for a 
> COUNT(DISTINCT()) or a DISTINCT.

Currently, there are some support for count function, however, there 
is some inconsistency in the display of the information.  If you do 
need to know the how many entries there are in a group, the easiest 
thing to do is to look at the last column generated by the function 
ibis::table::select, which is currently always COUNT(*).

We don't current support the keyword DISTINCT.  Because our target 
applications do not currently use it, it might take a while for this 
to be moved close to the top of our to-do list.  With that said, we 
are always looking for compelling applications.  If you have a good 
use-case, we might be able to make a case for moving it higher on our 
to-do list.

> 
> 
> 2. Native Fastbit support for more aggregate functions like STD(), VAR() 
> would be great.

Yes, it would be nice to have these functions natively supported. 
This is mostly a convenience issue.

> 
> 
> 3. Conditional logic for aggregate functions would be absolutely great to 
> create cross-breaks.
> In some SQL products, this would look like
> SELECT event_id, SUM(IF(event_type = 1, 1, 0)), SUM(IF(event_type = 2, 1, 0)) 
> FROM table
> GROUP BY event_id

This can potentially be broken into a number of queries as follows

select event_id, count(*) from table where event_type = 1;
select event_id, count(*) from table where event_type = 2;
select event_id, count(*) from table where event_type = 3;
...

> 
> 
> 4. Finally, HAVING clauses operating on the results of grouped values would 
> be a nice-to-have extension to Fastbit though not absolutely necessary (can 
> be implemented outside of Fastbit).

The HAVING clause is generally a shorthand for nested queries, most of 
which can be implemented as nested queries as follows in FastBit.

select store_name, sum(sales) from store_table group by store_name 
having sum(sales) >1500;

ibis::table* result1 = store_table.select("store_name, sum(sales) as 
total", "sales > 0"); // need a dummy where clause here
ibis:;table* result2 = result1->select("store_name, total", "total > 
1500");

If you do have a chance to try this, please let us know if you 
encounters any problems.
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users

Re: [FastBit-users] Fastbit aggregate functions?

Reply via email to