Hi, Michael, It turned out that there is a relatively straightforward way to support group_concat in FastBit. The new code is checked in as SVN revision 543. Please give it a try when get the chance.
Thanks for the suggestion. John On 8/19/12 9:57 PM, Michael Beauregard wrote: > John, Thanks for your help in getting me off the ground. > > It turned out that some columns were imported from csv incorrectly and > my initial experiment queries just happened to use the columns that > had import problems. Ultimately, this just caused confusing query > results at a time when I was trying to build my understanding of how > to use FastBit. Now that I have the csv import cleaned up, I am seeing > exactly what I expect for almost everything I need. > > I have one query that relies upon the non-standard 'group_concat' > aggregation function as documented at > http://www.sqlite.org/lang_aggfunc.html. It allows me to determine > which rows were used in each aggregated value which is required for > one particular feature of my application. Is it possible to get > similar information from FastBit? > > Thanks! > > Michael > > On Fri, Aug 17, 2012 at 3:19 PM, K. John Wu <[email protected]> wrote: >> Hi, Michael, >> >> The command line tool ibis always expects a where clause. In a pinch, >> you can try something like "where 1 = 1". Your query "SELECT >> IncidentBeginMonth, Priority, count(*)" needs to be written as "SELECT >> IncidentBeginMonth, Priority, count(*) WHERE 1 = 1". >> >> John >> >> >> >> On 8/17/12 3:15 PM, Michael Beauregard wrote: >>> Thanks for your quick reply. I plan to convert the IncidentBeginTime >>> to unix epoch timestamps in my source data anyway - which should also >>> work with FastBit. >>> >>> Based on your input, I've simplified my query to be: SELECT >>> IncidentBeginMonth, Priority, count(*) >>> >>> Executing this results in the following unhelpful error: >>> >>> tableSelect:: select(IncidentBeginMonth, Priority, count(*), ) failed >>> on table T-era >>> >>> I cranked up the logging (-v=10), but I don't see any details that >>> give me any clue as to what is wrong. >>> >>> Thanks again, >>> >>> Michael >>> >>> >>> On Fri, Aug 17, 2012 at 2:56 PM, K. John Wu <[email protected]> wrote: >>>> >>>> Hi, Michael, >>>> >>>> Thanks for your interest in FastBit. I would encourage you to give >>>> FastBit a try. As for your specific example, >>>> >>>> SELECT IncidentBeginMonth, Priority, count(*) FROM events WHERE >>>> IncidentBeginTime BETWEEN '2011-01-01 00:00:00' AND '2012-12-31 >>>> 23:59:59' GROUP BY IncidentBeginMonth, Priority ORDER BY >>>> IncidentBeginMonth, Priority >>>> >>>> Here are a couple things to note: >>>> >>>> - FastBit does not have explicit support for date and time, you will >>>> need to convert that to something else >>>> >>>> - You can specify the "group-by" clause and the "order-by" clause as >>>> is to the command line tool ibis. It should behave as you expect. >>>> >>>> Due to the presence of function 'count(*)' in the select clause, a >>>> natural interpretation is that the user wants to count the number of >>>> entries for each distinct combinations of IndidentBeginMonth and >>>> Priority. In this case, the group by clause in redundant - it >>>> specifies the same information already implied in the select clause. >>>> The order by clause tell the system in what order to output the >>>> results. The default is to order the columns as they appeared in the >>>> select clause - your example illustrate this default behavior. >>>> Therefore, the order by clause could be neglected as well. >>>> >>>> Hope this helps. >>>> >>>> John >>>> >>>> >>>> >>>> On 8/17/12 1:34 PM, Michael Beauregard wrote: >>>>> Hi all, >>>>> >>>>> I'm building an analytical iOS application that currently uses sqlite >>>>> to slice and dice about 2.5MB (30K rows) of data. I've been able to >>>>> get performance to an acceptable level by relying upon many sql >>>>> indices, one for each slice/dice combination. Moreover, the queries >>>>> would still be too slow unless I ensure each query has its own >>>>> "covering" index. This approach has resulted in an explosion in the db >>>>> size due to the vast indexing required for performance. A 2.5MB >>>>> dataset becomes 45MB when sufficiently indexed. This strategy has >>>>> taken us pretty far, but is at its performance and scalability limits. >>>>> >>>>> My customer recently informed me that we now need to go well >>>>> beyond (at least 10x) our initial scale requirement. Plus he >>>>> apparently needs to add more ways to slice and dice the data, >>>>> basically leading to further explosion of the already unacceptable sql >>>>> index size. This essentially means that I'm not able to continue with >>>>> my current architecture and I'm looking for other options. >>>>> >>>>> I stumbled upon FastBit today while researching various options and >>>>> I'm really intrigued by what I see. However, I don't understand how I >>>>> can express the required queries because of FastBit's implicit >>>>> group-by functionality. Perhaps I'm just missing something obvious, >>>>> but it seems like I need explicit control over the group-by clause. >>>>> >>>>> For instance, here is a query that determines the number of events >>>>> that occurred in each month, broken down by month and priority of the >>>>> event: >>>>> >>>>> SELECT IncidentBeginMonth, Priority, count(*) FROM events WHERE >>>>> IncidentBeginTime BETWEEN '2011-01-01 00:00:00' AND '2012-12-31 >>>>> 23:59:59' GROUP BY IncidentBeginMonth, Priority ORDER BY >>>>> IncidentBeginMonth, Priority >>>>> >>>>> The result is displayed in a stacked bar chart with one bar per month >>>>> and each bar broken down by event priority. This query executes in >>>>> around 200ms on 30K rows on an iPad 3 which is fast enough, but only >>>>> barely. >>>>> >>>>> I would really appreciate if someone could help me figure out how to >>>>> produce an equivalent query in FastBit. I have already imported my >>>>> data and have been trying various queries using the ibis command, but >>>>> I'm kind of stuck. >>>>> >>>>> Thanks for taking the time to read and consider my question. >>>>> >>>>> Michael >>>>> >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> FastBit-users mailing list >>>>> [email protected] >>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users >>>>> _______________________________________________ FastBit-users mailing list [email protected] https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
