I ran into this issue too, and it frustrated me for a while.
It occurs if the file format is DOS-based (CRLF line endings) as opposed to Unix-based.
Convert the file to Unix format and everything will be parsed correctly.
The ardea parser assumes that everything is Unix-based.
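
For anyone hitting the same thing, here is a minimal Python sketch of the conversion, assuming "DOS based" means CRLF line endings. The function names are made up for illustration and nothing here comes from FastBit itself; a tool such as dos2unix does the same job.

```python
def has_dos_line_endings(path, probe_bytes=1 << 20):
    """Return True if the first probe_bytes of the file contain a CRLF pair."""
    with open(path, "rb") as f:
        return b"\r\n" in f.read(probe_bytes)


def convert_to_unix(src_path, dst_path):
    """Copy src_path to dst_path, rewriting CRLF line endings as LF."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:  # binary iteration splits on b"\n"
            if line.endswith(b"\r\n"):
                line = line[:-2] + b"\n"
            dst.write(line)
```

Run the check first, and only if it returns True write a converted copy and point the -t option at that copy.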

----- Original Message -----

From: [email protected] 
To: [email protected] 
Sent: Wednesday, March 6, 2013 3:00:56 PM 
Subject: FastBit-users Digest, Vol 67, Issue 1 


Today's Topics: 

   1. TPCH tests for fastbit (amihay gonen) 
   2. Re: TPCH tests for fastbit (K. John Wu) 


---------------------------------------------------------------------- 

Message: 1 
Date: Wed, 6 Mar 2013 00:51:24 +0200 
From: amihay gonen <[email protected]> 
Subject: [FastBit-users] TPCH tests for fastbit 
To: [email protected] 
Message-ID: 
        <CAKb+SBWcQ=gMaOFqyOcNUg2rGTAsZyBr3xKcV8Nk=rpgotw...@mail.gmail.com> 
Content-Type: text/plain; charset="iso-8859-1" 

Hi,
I'm trying to set up an environment for testing TPCH queries on FastBit (if
anyone has those queries "translated" to ibis query format, that would be
great).

I started by trying to convert row data (1G rows) to column data using the
ardea tool.

On the big table, lineitem, the row data is about 780M and the column data
came to 1.8G. This seemed strange; looking into the directory where the
column files are, I see the following:

-rw-rw-r--. 1 agonen agonen 211M Mar  6 00:07 comment 
-rw-rw-r--. 1 agonen agonen  69M Mar  6 00:07 commitdate 
-rw-rw-r--. 1 agonen agonen  69M Mar  6 00:07 discount 
-rw-rw-r--. 1 agonen agonen  69M Mar  6 00:07 linenumber 
-rw-rw-r--. 1 agonen agonen 104M Mar  6 00:07 linestatus 
-rw-rw-r--. 1 agonen agonen  69M Mar  6 00:07 orderkey 
-rw-rw-r--. 1 agonen agonen  69M Mar  6 00:07 partkey 
-rw-rw-r--. 1 agonen agonen  69M Mar  6 00:07 price 
-rw-rw-r--. 1 agonen agonen  69M Mar  6 00:07 quantity 
-rw-rw-r--. 1 agonen agonen  69M Mar  6 00:07 receipdate 
-rw-rw-r--. 1 agonen agonen 958M Mar  6 00:07 returnflag  <--- too big for varchar(1)
-rw-rw-r--. 1 agonen agonen  69M Mar  6 00:07 shipdate 
-rw-rw-r--. 1 agonen agonen 256M Mar  6 00:07 shipinstuct 
-rw-rw-r--. 1 agonen agonen 202M Mar  6 00:07 shipmode 
-rw-rw-r--. 1 agonen agonen  69M Mar  6 00:07 suppkey 
-rw-rw-r--. 1 agonen agonen  69M Mar  6 00:07 tax 
-rw-rw-r--. 1 agonen agonen 1.6K Mar  6 00:07 -part.txt 

The returnflag file, for some reason, contains the entire record.

I'm using fastbit-ibis1.3.5, and the command to convert the CSV to column
format is:

~/Code/fastbit-ibis1.3.5/examples/ardea -d ~/Code/fastbit_col_data/lineitem.tbl -b \| -m "orderkey:int,partkey:int,suppkey:int,linenumber:int,quantity:float,price:float,discount:float,tax:float,returnflag:text,linestatus:key,shipdate:int,commitdate:int,receipdate:int,shipinstuct:text,shipmode:key,comment:text" -t ~/Code/rowdata/lineitem.tbl


The base file looks like this:

1|155190|7706|1|17|21168.23|0.04|0.02|N|O|1996-03-13|1996-02-12|1996-03-22|DELIVER IN PERSON|TRUCK|egular courts above the|
1|67310|7311|2|36|45983.16|0.09|0.06|N|O|1996-04-12|1996-02-28|1996-04-20|TAKE BACK RETURN|MAIL|ly final dependencies: slyly bold |
1|63700|3701|3|8|13309.60|0.10|0.02|N|O|1996-01-29|1996-03-05|1996-01-31|TAKE BACK RETURN|REG AIR|riously. regular, express dep|
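
To check what each column should hold, one can split a sample row by hand. The sketch below is plain Python and is not a description of what ardea does internally. The helper name split_row is made up, the column names are copied from the -m option above, and the row is the first sample line. It also compares the same row with Unix and with DOS line endings.

```python
COLUMNS = [
    "orderkey", "partkey", "suppkey", "linenumber", "quantity", "price",
    "discount", "tax", "returnflag", "linestatus", "shipdate", "commitdate",
    "receipdate", "shipinstuct", "shipmode", "comment",
]


def split_row(line, delimiter="|"):
    """Map one delimited line onto COLUMNS (hypothetical helper)."""
    fields = line.rstrip("\n").split(delimiter)
    # Each row ends with a trailing delimiter, so one extra field is left over.
    values, leftover = fields[:len(COLUMNS)], fields[len(COLUMNS):]
    return dict(zip(COLUMNS, values)), leftover


row = ("1|155190|7706|1|17|21168.23|0.04|0.02|N|O|1996-03-13|1996-02-12|"
       "1996-03-22|DELIVER IN PERSON|TRUCK|egular courts above the|")

record, leftover = split_row(row + "\n")            # Unix line ending
dos_record, dos_leftover = split_row(row + "\r\n")  # DOS line ending
```

In both cases record["returnflag"] is "N". With the Unix ending the leftover field is an empty string; with the DOS ending it holds a stray carriage return. Whether that stray character is what confuses ardea is not confirmed here.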


Any idea?

Thanks, Amihay

------------------------------ 

Message: 2 
Date: Tue, 05 Mar 2013 22:30:49 -0800 
From: "K. John Wu" <[email protected]> 
Subject: Re: [FastBit-users] TPCH tests for fastbit 
To: FastBit Users <[email protected]> 
Cc: amihay gonen <[email protected]> 
Message-ID: <[email protected]> 
Content-Type: text/plain; charset=ISO-8859-1 

Hi, Amihay,

I just took the three rows you included in the message and tried them
with your ardea command line.  Things seem to have completed
successfully on my MacBook.  In these three rows, the values of
returnflag are all 'N'.  Not sure what platform you are using.

Maybe your run of ardea encountered some errors.  If you have
captured the printout from this command line, would you mind sharing it
with me?

Not sure what you plan to do with the data files, but one thing you
should know is that there are many queries from TPCH that cannot be
handled by FastBit.  For example, FastBit does not deal with dates as
nicely as a DBMS.  FastBit does not do multi-table joins either.
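
On the date point, one possible workaround (an assumption for illustration, not something FastBit or ardea is confirmed to require or recommend) is to rewrite date strings such as 1996-03-13 as plain integers before loading, so that a column declared as int holds a value that still sorts chronologically. A minimal Python sketch, with made-up function names:

```python
from datetime import date


def date_to_int(text):
    """Turn 'YYYY-MM-DD' into the integer YYYYMMDD; raises ValueError on bad input."""
    d = date.fromisoformat(text.strip())
    return d.year * 10000 + d.month * 100 + d.day


def int_to_date(value):
    """Inverse of date_to_int."""
    return date(value // 10000, value // 100 % 100, value % 100)
```

Range conditions on such a column then become ordinary integer comparisons, but date arithmetic (for example, adding 90 days) still has to be done outside the query.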

John 




------------------------------ 

_______________________________________________ 
FastBit-users mailing list 
[email protected] 
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users 


End of FastBit-users Digest, Vol 67, Issue 1 
******************************************** 
