Re: (noob) performance of queries against csv files

2015-07-02 Thread Ted Dunning
Hey Larry, Drill transforms your CSV data into an internal, memory-resident format for processing, but it does not change the structure of your original data. If you want to convert your file to Parquet, you can do this: create table `foo.parquet` as select * from `foo.csv`; This will, however,

Re: (noob) performance of queries against csv files

2015-07-02 Thread Jason Altekruse
I would recommend a select with specific column aliases assigned, and casts where appropriate: create table parquet_users as select cast(columns[0] as int) as user_id, columns[1] as username, cast(columns[2] as timestamp) as registration_date from `users.csv1`; On Thu, Jul 2, 2015 at 2:46 PM,
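For anyone trying this, a minimal sketch of the session setup around a CTAS like the one above, and a quick check of the result. It assumes a writable workspace named `dfs.tmp` (an assumption; substitute whatever writable workspace your storage plugin defines) and that the table `parquet_users` has already been created there.

```sql
-- Assumption: dfs.tmp is a writable workspace in your Drill configuration.
USE dfs.tmp;

-- Make sure CTAS writes Parquet (store.format is a Drill session option).
ALTER SESSION SET `store.format` = 'parquet';

-- After running the CREATE TABLE ... AS SELECT, query the typed columns by name.
SELECT user_id, username, registration_date
FROM parquet_users
LIMIT 5;
```

Because the columns were cast and aliased at conversion time, later queries can refer to them by name rather than by position in `columns`.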

Re: (noob) performance of queries against csv files

2015-07-02 Thread Larry White
So the solution is to use select, but with the columns specifically defined. Is that right? On Thu, Jul 2, 2015 at 4:48 PM, Jason Altekruse altekruseja...@gmail.com wrote: Just one additional note here, I would strongly advise against converting CSV files using a select * query out of a CSV. The
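A small sketch that contrasts the two query shapes being discussed. It assumes Drill's default text reader settings, under which a headerless CSV row is exposed as a single array column named `columns`; the file name `users.csv1` is taken from the earlier message, and the column meanings are whatever that example assumed.

```sql
-- With select *, each row comes back as one array-valued column, `columns`.
SELECT *
FROM `users.csv1`
LIMIT 1;

-- With explicit projection, each field becomes its own named, typed column.
SELECT CAST(columns[0] AS INT) AS user_id,
       columns[1]              AS username
FROM `users.csv1`
LIMIT 1;
```

Running both against a small sample file before converting shows what shape a CTAS built on each query would write out.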