Hi, Michael,

I understand your request, however, the current implementation in
FastBit deals with group by operations by assuming all values given to
them are valid (not NULL).  Clearly, this is a very limited option,
however, it covers a good number of use cases.

It seems to me that I will need quite a bit of time to add the support
for NULL values in group-by operations.  If you have a quick way to
get this done, please go ahead with modification or let me know how to
do it..

John


On 11/1/12 4:25 PM, Michael Beauregard wrote:
> Thanks for the quick response.
> 
> I was afraid I you were going to suggest that. I think I over
> simplified the query making it a poor example of what I intend to do.
> I need something more like:
> 
> SELECT a, b, AVG(c), AVG(d), AVG(e), AVG(f), AVG(g), AVG(h), AVG(i),
> AVG(j) WHERE a NOT IN (1, 2, 3) AND b NOT IN (3, 4, 5) AND <some other
> constraints>
> 
> and columns 'c' through 'j' can have NULL values at any point.
> 
> Splitting that into 8 queries:
> 
> SELECT a, b, AVG(c) WHERE c NOT NULL AND ...
> SELECT a, b, AVG(d) WHERE d NOT NULL AND ...
> ...
> 
> means that I'd have to somehow combine those results after the fact
> and I'd like to avoid that if possible.
> 
> Could I perform the main query without specifying 'c' through 'j'
> aggregations, then using the api to compute each average on the
> resulting table so that the non-aggregation part of the query doesn't
> have to be repeated? Like:
> 
> SELECT a, b WHERE a NOT IN (1, 2, 3) AND b NOT IN (3, 4, 5) AND <some
> other constraints>
> 
> Then use the resulting table instance to compute aggregations AVG('c')
> through AVG('j').
> 
> Or any other suggestions?
> 
> Michael
> 
> On Thu, Nov 1, 2012 at 4:08 PM, K. John Wu <[email protected]> wrote:
>> You will have to do AVG(a) and AVG(b) separately as follows
>>
>> SELECT AVG(a) WHERE a not null
>> SELECT AVG(b) where b not null
>>
>> In
>>
>> SELECT AVG(a), AVG(b) WHERE a not null
>>
>> there is an implicit 'b not null' also applied, which will give you
>> two rows instead of three as in the original example you give.
>>
>> John
>>
>>
>> On 11/1/12 3:57 PM, Michael Beauregard wrote:
>>> I see. I've noticed in past conversations that people generally deal
>>> with this by using a magic value instead of NULLs. Which works fine in
>>> simple cases like:
>>>
>>> SELECT AVG(a) WHERE a != <magic>
>>>
>>> However, I am in a situation where I need the query to return multiple
>>> aggregations such as:
>>>
>>> SELECT AVG(a), AVG(b) WHERE 1 = 1
>>>
>>> and I would like AVG(a) to average all non-magic values of 'a' and
>>> AVG(b) to average all non-magic values of 'b'. Any suggestions on now
>>> to go about doing this?
>>>
>>> Thanks,
>>>
>>> Michael
>>>
>>> On Thu, Nov 1, 2012 at 3:48 PM, K. John Wu <[email protected]> wrote:
>>>> Hi, Michael,
>>>>
>>>> The NULL values in CSV fiels are not ignored by ardea, but by the
>>>> query processing code.  Basically, FastBit query results can not
>>>> contain NULLs, so  'SELECT a WHERE 1 = 1' will return three rows while
>>>> 'SELECT b WHERE 1 = 1' will only return two.
>>>>
>>>> John
>>>>
>>>>
>>>> On 11/1/12 1:55 PM, Michael Beauregard wrote:
>>>>> Hey John,
>>>>>
>>>>> I'm a bit confused as to how NULLs in csv files are treated by ardea.
>>>>> Here is an experiment I just did:
>>>>>
>>>>> in "nulls.csv":
>>>>>
>>>>> 10,1
>>>>> 20,
>>>>> 30,3
>>>>>
>>>>> then load the data with ardea:
>>>>>
>>>>> ardea -d tmp/nulls_test -m a:int,b:int -t nulls.csv
>>>>>
>>>>> then query with ibis:
>>>>>
>>>>> ibis.sh -d tmp/nulls_test -q 'SELECT a, b WHERE 1 = 1' -o out.txt
>>>>>
>>>>> cat out.txt
>>>>> 10, 1
>>>>> 30, 3
>>>>>
>>>>> For some reason the csv row with a NULL in it is ignored by ardea.
>>>>> This is unexpected for me, but I'm wondering if this is working as
>>>>> designed or not. If this is expected behaviour, then I'm wondering how
>>>>> I can specify NULL values when loading data.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Michael
>>>>> _______________________________________________
>>>>> FastBit-users mailing list
>>>>> [email protected]
>>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
>>>>>
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users

Reply via email to