Hi Sean,

I presume that you want to perform some operations on the selected rows and, hopefully, those operations do not involve medians or percentiles. If so, you can partition the data into smaller pieces, say 50M or 100M rows per partition; FastBit would then need only a relatively modest amount of memory to complete the operations.

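
To illustrate why partitioning works for operations such as counts and sums, here is a minimal sketch in plain Python. It is not FastBit code: the function names and the tiny in-memory list are hypothetical stand-ins for a partitioned dataset, and the partition size of 4 is chosen only to keep the example small.

```python
from typing import Iterator, List, Sequence, Tuple


def partitions(rows: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of `rows`, each holding at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def combine_count_sum(parts: Iterator[Sequence[int]]) -> Tuple[int, int]:
    """Accumulate a count and a sum one partition at a time.

    Only one partition is examined at any moment, so peak memory depends
    on the partition size rather than on the total number of rows.
    """
    total_count = 0
    total_sum = 0
    for part in parts:
        total_count += len(part)
        total_sum += sum(part)
    return total_count, total_sum


# Hypothetical stand-in for a large column of values.
rows: List[int] = list(range(1, 11))
count, total = combine_count_sum(partitions(rows, 4))
mean = total / count
print(count, total, mean)
```

Counts, sums, minima, maxima and means can be merged from per-partition results in this way. An exact median or percentile cannot be derived from per-partition medians in general, which is why those operations are the exception.
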
In the particular example, 'ibis -d . -q "select id1, id2"', I suspect that the majority of the execution time is spent on writing out the two IDs to /dev/null. Try 'ibis -d . -q "select id1, id2, count(*)"' to see if it makes any difference.

John

On 3/21/14, 7:05 PM, Sean McNamara wrote:
> Hi John-
>
> I had a question about large result sizes. For our queries it seems
> that past a point, the size of the result begins to have a large
> impact on completion time. If our queries do heavy event filtering,
> fastbit is crazy insane fast! Some of our other queries aren't able
> to filter down as much though, and may return 200M rows from a 250M
> dataset (the dataset has many columns, but we only select out 2 long
> columns, all other columns are for filtering).
>
>
> So I suppose my question is: is there a way to iterate faster over
> larger results? (we are currently using the table interface /w a
> cursor). For example doing a plain select on a pair of longs /w 250M
> rows in ibis takes almost a minute:
>
> time ibis -d . -q "select id1, id2" > /dev/null 2>&1
>
> real 0m48.010s
> user 0m44.265s
> sys 0m3.761s
>
>
> Do you have any insight into places we could look to trim down the
> time /w larger results?
>
> Thanks!
>
> Sean
>
>
>
> _______________________________________________
> FastBit-users mailing list
> [email protected]
> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
