Hi, Avi,

You are absolutely right that FastBit could have implemented something 
smarter for these aggregation functions.  We will probably get around 
to implementing something new in a few months.  In the meantime, if you 
have some ideas, feel free to give them a try.  We would love to have 
some contributions to speed up the computation of aggregation functions.

Thanks.

John


On 2/15/2011 6:39 AM, Avi Haleva wrote:
> Hello,
> I'm exploring the use of FastBit, using the mensa interface to generate
> queries on a table that is constructed from many partitions.
> The queries will use an aggregation function (either distinct or sum)
> on one or two columns (e.g. sum(col_1), distinct(col_2) ), so the
> final result set will contain 1 row with 1 or 2 columns.
> I was working on a large data set of about 160 million records using
> 90 partitions (each partition has ~1.8 million records).
> I've noticed that for the sum column, FastBit allocates memory for the
> number of hits the sum is working on (based on the where criteria) x
> sizeof(double), and after the calculation of the sum, this memory is released.
> I was wondering why this is needed, as the sum aggregation function
> could iterate over the actual column that is cached, based on the hits
> array (by the way, I performed the sum on a byte/short column),
> and use an array of results based on the groupby size (in my case 1, as
> no column was retrieved without an aggregation function).
> When the SUM was performed on all of the 160 million records, the
> amount of memory that was allocated and released later was ~1.2GB.
> Am I missing something?
> Is there a workaround for this use case (avoiding the need to allocate
> this memory)?
> Thanks in advance,
> Avi
>
>
>
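The approach described in the quoted message can be sketched as follows.
This is only an illustration under stated assumptions, not FastBit code:
the names sumSelected and tempBufferBytes are hypothetical, and the hit
list is modelled as a plain std::vector<bool> rather than whatever hit
representation FastBit uses internally.  The first function accumulates
directly from the narrow column into a single running total, so no
per-hit array of doubles is needed.  The second works out the size of
such a temporary array: for 160 million hits at 8 bytes each it comes to
1.28 billion bytes, about 1.19 GiB, in line with the ~1.2GB reported.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of FastBit): sums the values of `column`
// at the rows marked true in `hits`, keeping only one running total.
// T may be a narrow type such as std::int8_t or std::int16_t.
template <typename T>
double sumSelected(const std::vector<T>& column,
                   const std::vector<bool>& hits) {
    // Assumption: one hit flag per row of the column.
    assert(column.size() == hits.size());
    double total = 0.0;
    for (std::size_t i = 0; i < column.size(); ++i) {
        if (hits[i]) {
            // Convert one value at a time instead of first copying
            // every selected value into a temporary array of doubles.
            total += static_cast<double>(column[i]);
        }
    }
    return total;
}

// Bytes a temporary array of doubles would need for `numHits` hits.
inline std::uint64_t tempBufferBytes(std::uint64_t numHits) {
    return numHits * sizeof(double);
}
```

With a group-by, the single total would become one accumulator per
group; in the case described above there is exactly one group.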
> _______________________________________________
> FastBit-users mailing list
> [email protected]
> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users