On Wednesday 23 December 2009 21:10:00 K. John Wu wrote:
> Hi, Andreas,

Hi John,

> The code for treating blank fields in CSV files as NULL values have
> been checked into the SVN repository.  For example, the last two
> fields of the first of the two following lines will be treated as NULL
> values when you tell ardea.cpp that they are numerical values,
> 
> school,     ,
> school,     1, 130000
> 
> For string-valued columns, there is some ambiguity as whether you
> intend to have a blank string or it is something missing.  For the
> moment, we treat a blank string as as the user really wanted a blank
> string -- i.e., not NULL.
> 
> You should be able to get the night snapshot at
> <https://codeforge.lbl.gov/snapshots.php?group_id=44>.
> Please let us know how it works out for you if you do have a chance to
> try it.

Thanks for the quick reply and sorry for the late answer ;) Been away the last 
weeks. I just gave it a try and i wonder what the intended result should be.

This is my testcase:

hans,1,2,3,,5
peter,1,,3,4,5
maus,1,2,3,,5
,1,2,3,4,5

This is how i loaded the data with ardea:

ardea -d null_test -m "a:text,b:int,c:int,d:int,e:int,f:int" -t nulls.csv

ls  null_test/
a  b  c  c.msk  d  e  e.msk  f  -part.txt

So that worked (i think i have read somewhere that the msk files are nullmask 
files).

However. When i am doing an AVG() calculation on c this is the result:

ibis -d null_test -n -q "SELECT AVG(c) WHERE 1=1"

tableSelect -- select(AVG(c), 1=1) on table T-null_test produced a table with 
1 row and 2 columns
tableSelect -- the result table (1 x 2) for "SELECT AVG(c) FROM T-null_test 
WHERE 1=1"
536870913.25, 4

This is the result for a COUNT:

c (with counts)
2,      3
2147483647,     1

Then i tried to let ibis handle ignore 2147483647 when doing the AVG:

SELECT AVG(c) WHERE c!=2147483647

536870913.25, 4

So it seems that ardea can now successfully load null values but there is no 
possibility to calculate against them using ibis. Is my observation correct?

What impact are the NULL values supposed to have on calculations in fastbit 
currently and in the future?

Unless i misunderstood something here i guess i will simply assign an 
arbitrary number to NULL values and handle them using adjusted WHERE 
statements.

> 
> John

Kind regards,

        Andreas Streichardt
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users

Reply via email to