Abdel,
I sent the file to your email address, but I have already found the problem:
The drillbit runs on a Linux box. When I check the file
file test.csv
the output is:
test.csv: ASCII text, with CRLF line terminators
In this case the query:
select col1,col2,col3 from dfs.datatransfer.`test.csv`
does not work.
If I do a:
dos2unix test.csv
The query does work properly!
So Drill does not properly recognize a CRLF line break, which is standard on
Windows systems.
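A possible explanation, as a rough sketch only: this is plain Python, not Drill's actual reader code, and the function name split_on_lf is made up for illustration. If a reader splits records on LF only, the CR stays attached to the last header field, so the column is effectively named "col3\r" and a lookup of col3 fails. That would also fit the observation from my first mail that a trailing comma after col3 makes the query work.

```python
def split_on_lf(data: bytes) -> list:
    """Hypothetical reader that treats only LF as the record terminator."""
    lines = data.decode("ascii").split("\n")
    # Split every non-empty record on commas; a CR before the LF is NOT removed.
    return [line.split(",") for line in lines if line]


# Same content as test.csv, written with CRLF line terminators.
crlf_file = b"col1,col2,col3\r\ngeercken,uwe,22\r\nkarlson,peter,33\r\n"

header = split_on_lf(crlf_file)[0]
print(header)            # ['col1', 'col2', 'col3\r']
print("col3" in header)  # False: the last name carries the stray CR

# With a trailing comma in the header, the CR ends up in its own extra field
# and "col3" is clean again.
header_with_comma = split_on_lf(b"col1,col2,col3,\r\n")[0]
print("col3" in header_with_comma)  # True
```

Whether this is really what happens inside Drill is an assumption on my side; it only shows that the symptoms are consistent with a CR left on the last header name.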
Just for the sake of it, if I do the opposite:
unix2dos test.csv
again it does not work.
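In case the file and dos2unix tools are not at hand, roughly the same check and conversion can be sketched in Python. The helper names line_terminator and to_unix are made up for this example, and the sample bytes stand in for the real test.csv.

```python
def line_terminator(data: bytes) -> str:
    """Classify the line endings in data as 'CRLF', 'LF', 'mixed' or 'none'."""
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf  # LFs that are not part of a CRLF pair
    if crlf and lf_only:
        return "mixed"
    if crlf:
        return "CRLF"
    if lf_only:
        return "LF"
    return "none"


def to_unix(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF, similar to what dos2unix does."""
    return data.replace(b"\r\n", b"\n")


sample = b"col1,col2,col3\r\ngeercken,uwe,22\r\nkarlson,peter,33\r\n"
print(line_terminator(sample))           # CRLF
print(line_terminator(to_unix(sample)))  # LF
```

For a real file, read and write it in binary mode (open(path, "rb") and open(path, "wb")) so that Python does not translate the line endings itself.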
Should I file a bug?
Regards,
Uwe
-----Original Message-----
From: Abdel Hakim Deneche [mailto:[email protected]]
Sent: Tuesday, 24 November 2015 18:12
To: user
Subject: Re: Bug in Drill 1.3 CSV - please confirm
Hi Uwe,
I couldn't reproduce the issue using the 1.3 release. Can you send the dummy
test file you created to my email address? (You can't send it to an Apache
mailing list.)
Thanks
On Tue, Nov 24, 2015 at 3:03 AM, Geercken, Uwe <[email protected]>
wrote:
> I have downloaded 1.3 and made a quick test of the new extractHeader
> feature for text files.
>
> So I updated the storage details and created a dummy test file:
>
> col1,col2,col3
> geercken,uwe,22
> karlson,peter,33
>
>
> When I query the data with this: select * from
> dfs.datatransfer.`test.csv` - it works.
>
> When I query the data with this: select col1,col2 from
> dfs.datatransfer.`test.csv` - it works.
>
> When I query the data with this: select col1,col2,col3 from
> dfs.datatransfer.`test.csv` - it gives me an exception:
>
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> ArrayIndexOutOfBoundsException: -1 Fragment 0:0
>
>
> I figured out that if I add a comma (,) after "col3" in the header, it
> works. So apparently the process does not recognize the last column of the
> header.
>
> If I set extractHeader to false and add skipFirstLine instead and do this:
> select columns[0], columns[1], columns[2] from
> dfs.datatransfer.`test.csv`
> - then it works. So the problem seems to be only the header row.
>
>
> I verified the same problem with other files, but can somebody please
> cross-check before I file a Jira?
>
> Thanks,
>
> Uwe
>
--
Abdelhakim Deneche
Software Engineer
<http://www.mapr.com/>