Hi, Michael,

It turned out that there is a relatively straightforward way to
support group_concat in FastBit.  The new code is checked in as SVN
revision 543.  Please give it a try when you get the chance.

Thanks for the suggestion.

John
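
For readers unfamiliar with group_concat, the sketch below shows what
the function returns in SQLite, which is the behavior Michael's query
relies on.  It uses Python's standard sqlite3 module.  The id column
and the sample rows are assumptions made up for illustration.  It does
not show FastBit's own syntax for the new feature, which this thread
does not spell out.

```python
import sqlite3

def grouped_ids(rows):
    """Return {(month, priority): (count, set_of_ids)} via SQLite group_concat."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: only the two column names come from the thread.
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY, IncidentBeginMonth INTEGER, Priority INTEGER)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    cur = conn.execute(
        "SELECT IncidentBeginMonth, Priority, count(*), group_concat(id) "
        "FROM events GROUP BY IncidentBeginMonth, Priority "
        "ORDER BY IncidentBeginMonth, Priority"
    )
    result = {}
    for month, priority, n, ids in cur:
        # group_concat returns one comma-separated string per group; SQLite
        # does not guarantee the element order, so collect the ids in a set.
        result[(month, priority)] = (n, {int(i) for i in ids.split(",")})
    conn.close()
    return result

# Made-up sample data: (id, IncidentBeginMonth, Priority)
sample = [(1, 201101, 1), (2, 201101, 1), (3, 201101, 2), (4, 201102, 1)]
summary = grouped_ids(sample)
for key in sorted(summary):
    print(key, summary[key])
```

Each group carries both its count and the ids of the rows that went
into it, which is the "which rows were used" information asked about
below.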


On 8/19/12 9:57 PM, Michael Beauregard wrote:
> John, Thanks for your help in getting me off the ground.
> 
> It turned out that some columns were imported from csv incorrectly and
> my initial experiment queries just happened to use the columns that
> had import problems. Ultimately, this just caused confusing query
> results at a time when I was trying to build my understanding of how
> to use FastBit. Now that I have the csv import cleaned up, I am seeing
> exactly what I expect for almost everything I need.
> 
> I have one query that relies upon the non-standard 'group_concat'
> aggregation function as documented at
> http://www.sqlite.org/lang_aggfunc.html. It allows me to determine
> which rows were used in each aggregated value which is required for
> one particular feature of my application. Is it possible to get
> similar information from FastBit?
> 
> Thanks!
> 
> Michael
> 
> On Fri, Aug 17, 2012 at 3:19 PM, K. John Wu <[email protected]> wrote:
>> Hi, Michael,
>>
>> The command line tool ibis always expects a where clause.  In a pinch,
>> you can try something like "where 1 = 1".  Your query "SELECT
>> IncidentBeginMonth, Priority, count(*)" needs to be written as "SELECT
>> IncidentBeginMonth, Priority, count(*) WHERE 1 = 1".
>>
>> John
>>
>>
>>
>> On 8/17/12 3:15 PM, Michael Beauregard wrote:
>>> Thanks for your quick reply. I plan to convert the IncidentBeginTime
>>> to unix epoch timestamps in my source data anyway - which should also
>>> work with FastBit.
>>>
>>> Based on your input, I've simplified my query to be: SELECT
>>> IncidentBeginMonth, Priority, count(*)
>>>
>>> Executing this results in the following unhelpful error:
>>>
>>> tableSelect:: select(IncidentBeginMonth, Priority, count(*), ) failed
>>> on table T-era
>>>
>>> I cranked up the logging (-v=10), but I don't see any details that
>>> give me any clue as to what is wrong.
>>>
>>> Thanks again,
>>>
>>> Michael
>>>
>>>
>>> On Fri, Aug 17, 2012 at 2:56 PM, K. John Wu <[email protected]> wrote:
>>>>
>>>> Hi, Michael,
>>>>
>>>> Thanks for your interest in FastBit.  I would encourage you to give
>>>> FastBit a try.  As for your specific example,
>>>>
>>>> SELECT IncidentBeginMonth, Priority, count(*) FROM events WHERE
>>>> IncidentBeginTime BETWEEN '2011-01-01 00:00:00' AND '2012-12-31
>>>> 23:59:59'  GROUP BY IncidentBeginMonth, Priority ORDER BY
>>>> IncidentBeginMonth, Priority
>>>>
>>>> Here are a couple of things to note:
>>>>
>>>> - FastBit does not have explicit support for date and time, so you will
>>>> need to convert those values to something else
>>>>
>>>> - You can specify the "group-by" clause and the "order-by" clause as
>>>> is to the command line tool ibis.  It should behave as you expect.
>>>>
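
The date and time conversion in the first bullet can be done before
import.  A minimal sketch in Python, assuming the timestamps are in UTC
(the thread does not say which time zone the data uses), converts the
two bounds of the example query to Unix epoch seconds, the
representation Michael mentions in his reply:

```python
from datetime import datetime, timezone

def to_epoch(text):
    """Parse 'YYYY-MM-DD HH:MM:SS' as UTC (an assumption) into epoch seconds."""
    dt = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return int(dt.replace(tzinfo=timezone.utc).timestamp())

lo = to_epoch("2011-01-01 00:00:00")
hi = to_epoch("2012-12-31 23:59:59")

# Same BETWEEN form as the SQL in this thread, now with numeric bounds.
# Whether ibis accepts this exact spelling is an assumption, not confirmed here.
condition = f"IncidentBeginTime BETWEEN {lo} AND {hi}"
print(condition)
```

The same function would be applied to every IncidentBeginTime value in
the source data so that the column is stored as plain integers.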
>>>> Due to the presence of the function 'count(*)' in the select clause, a
>>>> natural interpretation is that the user wants to count the number of
>>>> entries for each distinct combination of IncidentBeginMonth and
>>>> Priority.  In this case, the group by clause is redundant - it
>>>> specifies the same information already implied in the select clause.
>>>> The order by clause tells the system in what order to output the
>>>> results.  The default is to order by the columns as they appear in the
>>>> select clause - your example illustrates this default behavior.
>>>> Therefore, the order by clause could be omitted as well.
>>>>
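
The implied grouping described above can be pictured with a few lines
of plain Python.  This is only an emulation of the semantics (count per
distinct combination of the selected columns, output ordered by those
columns), not FastBit code, and the sample rows are made up:

```python
from collections import Counter

def implied_group_count(rows):
    """Count rows per distinct (month, priority) pair, sorted by those columns."""
    counts = Counter(rows)
    # Sorting the (month, priority) keys mirrors the default output order:
    # by the columns as they appear in the select clause.
    return [(month, prio, n) for (month, prio), n in sorted(counts.items())]

# Made-up sample data: (IncidentBeginMonth, Priority)
rows = [(201102, 1), (201101, 2), (201101, 1), (201101, 1)]
table = implied_group_count(rows)
for line in table:
    print(line)
```

The result has one line per month and priority combination, which is
the same shape an explicit GROUP BY and ORDER BY on those two columns
would produce.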
>>>> Hope this helps.
>>>>
>>>> John
>>>>
>>>>
>>>>
>>>> On 8/17/12 1:34 PM, Michael Beauregard wrote:
>>>>> Hi all,
>>>>>
>>>>> I'm building an analytical iOS application that currently uses sqlite
>>>>> to slice and dice about 2.5MB (30K rows) of data. I've been able to
>>>>> get performance to an acceptable level by relying upon many sql
>>>>> indices, one for each slice/dice combination. Moreover, the queries
>>>>> would still be too slow unless I ensure each query has its own
>>>>> "covering" index. This approach has resulted in an explosion in the db
>>>>> size due to the vast indexing required for performance. A 2.5MB
>>>>> dataset becomes 45MB when sufficiently indexed. This strategy has
>>>>> taken us pretty far, but is at its performance and scalability limits.
>>>>>
>>>>> My customer recently informed me that we now need to go well
>>>>> beyond (at least 10x) our initial scale requirement. Plus he
>>>>> apparently needs to add more ways to slice and dice the data,
>>>>> basically leading to further explosion of the already unacceptable sql
>>>>> index size. This essentially means that I'm not able to continue with
>>>>> my current architecture and I'm looking for other options.
>>>>>
>>>>> I stumbled upon FastBit today while researching various options and
>>>>> I'm really intrigued by what I see. However, I don't understand how I
>>>>> can express the required queries because of FastBit's implicit
>>>>> group-by functionality. Perhaps I'm just missing something obvious,
>>>>> but it seems like I need explicit control over the group-by clause.
>>>>>
>>>>> For instance, here is a query that determines the number of events
>>>>> that occurred in each month, broken down by month and priority of the
>>>>> event:
>>>>>
>>>>> SELECT IncidentBeginMonth, Priority, count(*) FROM events WHERE
>>>>> IncidentBeginTime BETWEEN '2011-01-01 00:00:00' AND '2012-12-31
>>>>> 23:59:59'  GROUP BY IncidentBeginMonth, Priority ORDER BY
>>>>> IncidentBeginMonth, Priority
>>>>>
>>>>> The result is displayed in a stacked bar chart with one bar per month
>>>>> and each bar broken down by event priority. This query executes in
>>>>> around 200ms on 30K rows on an iPad 3 which is fast enough, but only
>>>>> barely.
>>>>>
>>>>> I would really appreciate it if someone could help me figure out how to
>>>>> produce an equivalent query in FastBit. I have already imported my
>>>>> data and have been trying various queries using the ibis command, but
>>>>> I'm kind of stuck.
>>>>>
>>>>> Thanks for taking the time to read and consider my question.
>>>>>
>>>>> Michael
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> FastBit-users mailing list
>>>>> [email protected]
>>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
>>>>>