That's fantastic! I'm currently trying to get FastBit to build as a static library and run on my iPad before I go any further. I'll try out your latest change right after that.
Thanks! On Mon, Aug 20, 2012 at 9:25 PM, K. John Wu <[email protected]> wrote: > Hi, Michael, > > It turned out that there is a relatively straightforward way to > support group_concat in FastBit. The new code is checked in as SVN > revision 543. Please give it a try when get the chance. > > Thanks for the suggestion. > > John > > > On 8/19/12 9:57 PM, Michael Beauregard wrote: >> John, Thanks for your help in getting me off the ground. >> >> It turned out that some columns were imported from csv incorrectly and >> my initial experiment queries just happened to use the columns that >> had import problems. Ultimately, this just caused confusing query >> results at a time when I was trying to build my understanding of how >> to use FastBit. Now that I have the csv import cleaned up, I am seeing >> exactly what I expect for almost everything I need. >> >> I have one query that relies upon the non-standard 'group_concat' >> aggregation function as documented at >> http://www.sqlite.org/lang_aggfunc.html. It allows me to determine >> which rows were used in each aggregated value which is required for >> one particular feature of my application. Is it possible to get >> similar information from FastBit? >> >> Thanks! >> >> Michael >> >> On Fri, Aug 17, 2012 at 3:19 PM, K. John Wu <[email protected]> wrote: >>> Hi, Michael, >>> >>> The command line tool ibis always expects a where clause. In a pinch, >>> you can try something like "where 1 = 1". Your query "SELECT >>> IncidentBeginMonth, Priority, count(*)" needs to be written as "SELECT >>> IncidentBeginMonth, Priority, count(*) WHERE 1 = 1". >>> >>> John >>> >>> >>> >>> On 8/17/12 3:15 PM, Michael Beauregard wrote: >>>> Thanks for your quick reply. I plan to convert the IncidentBeginTime >>>> to unix epoch timestamps in my source data anyway - which should also >>>> work with FastBit. >>>> >>>> Based on your input, I've simplified my query to be: SELECT >>>> IncidentBeginMonth, Priority, count(*) >>>> >>>> Executing this results in the following unhelpful error: >>>> >>>> tableSelect:: select(IncidentBeginMonth, Priority, count(*), ) failed >>>> on table T-era >>>> >>>> I cranked up the logging (-v=10), but I don't see any details that >>>> give me any clue as to what is wrong. >>>> >>>> Thanks again, >>>> >>>> Michael >>>> >>>> >>>> On Fri, Aug 17, 2012 at 2:56 PM, K. John Wu <[email protected]> wrote: >>>>> >>>>> Hi, Michael, >>>>> >>>>> Thanks for your interest in FastBit. I would encourage you to give >>>>> FastBit a try. As for your specific example, >>>>> >>>>> SELECT IncidentBeginMonth, Priority, count(*) FROM events WHERE >>>>> IncidentBeginTime BETWEEN '2011-01-01 00:00:00' AND '2012-12-31 >>>>> 23:59:59' GROUP BY IncidentBeginMonth, Priority ORDER BY >>>>> IncidentBeginMonth, Priority >>>>> >>>>> Here are a couple things to note: >>>>> >>>>> - FastBit does not have explicit support for date and time, you will >>>>> need to convert that to something else >>>>> >>>>> - You can specify the "group-by" clause and the "order-by" clause as >>>>> is to the command line tool ibis. It should behave as you expect. >>>>> >>>>> Due to the presence of function 'count(*)' in the select clause, a >>>>> natural interpretation is that the user wants to count the number of >>>>> entries for each distinct combinations of IndidentBeginMonth and >>>>> Priority. In this case, the group by clause in redundant - it >>>>> specifies the same information already implied in the select clause. >>>>> The order by clause tell the system in what order to output the >>>>> results. The default is to order the columns as they appeared in the >>>>> select clause - your example illustrate this default behavior. >>>>> Therefore, the order by clause could be neglected as well. >>>>> >>>>> Hope this helps. >>>>> >>>>> John >>>>> >>>>> >>>>> >>>>> On 8/17/12 1:34 PM, Michael Beauregard wrote: >>>>>> Hi all, >>>>>> >>>>>> I'm building an analytical iOS application that currently uses sqlite >>>>>> to slice and dice about 2.5MB (30K rows) of data. I've been able to >>>>>> get performance to an acceptable level by relying upon many sql >>>>>> indices, one for each slice/dice combination. Moreover, the queries >>>>>> would still be too slow unless I ensure each query has its own >>>>>> "covering" index. This approach has resulted in an explosion in the db >>>>>> size due to the vast indexing required for performance. A 2.5MB >>>>>> dataset becomes 45MB when sufficiently indexed. This strategy has >>>>>> taken us pretty far, but is at its performance and scalability limits. >>>>>> >>>>>> My customer recently informed me that we now need to go well >>>>>> beyond (at least 10x) our initial scale requirement. Plus he >>>>>> apparently needs to add more ways to slice and dice the data, >>>>>> basically leading to further explosion of the already unacceptable sql >>>>>> index size. This essentially means that I'm not able to continue with >>>>>> my current architecture and I'm looking for other options. >>>>>> >>>>>> I stumbled upon FastBit today while researching various options and >>>>>> I'm really intrigued by what I see. However, I don't understand how I >>>>>> can express the required queries because of FastBit's implicit >>>>>> group-by functionality. Perhaps I'm just missing something obvious, >>>>>> but it seems like I need explicit control over the group-by clause. >>>>>> >>>>>> For instance, here is a query that determines the number of events >>>>>> that occurred in each month, broken down by month and priority of the >>>>>> event: >>>>>> >>>>>> SELECT IncidentBeginMonth, Priority, count(*) FROM events WHERE >>>>>> IncidentBeginTime BETWEEN '2011-01-01 00:00:00' AND '2012-12-31 >>>>>> 23:59:59' GROUP BY IncidentBeginMonth, Priority ORDER BY >>>>>> IncidentBeginMonth, Priority >>>>>> >>>>>> The result is displayed in a stacked bar chart with one bar per month >>>>>> and each bar broken down by event priority. This query executes in >>>>>> around 200ms on 30K rows on an iPad 3 which is fast enough, but only >>>>>> barely. >>>>>> >>>>>> I would really appreciate if someone could help me figure out how to >>>>>> produce an equivalent query in FastBit. I have already imported my >>>>>> data and have been trying various queries using the ibis command, but >>>>>> I'm kind of stuck. >>>>>> >>>>>> Thanks for taking the time to read and consider my question. >>>>>> >>>>>> Michael >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> FastBit-users mailing list >>>>>> [email protected] >>>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users >>>>>> _______________________________________________ FastBit-users mailing list [email protected] https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
