Uwe,

I filed a bug for this already:

https://issues.apache.org/jira/browse/DRILL-3726
It is possibly a duplicate of:
https://issues.apache.org/jira/browse/DRILL-3149
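
In case it helps with triage: I have not looked at Drill's reader code, so
the following is only a guess at the mechanism, not a description of what
Drill does. It is a small Python sketch (all function names are made up for
illustration) of how a header line split on LF alone would keep the CR on
the last column name, so that a lookup for "col3" fails with -1 while
"col1" and "col2" are still found. That would match the
ArrayIndexOutOfBoundsException: -1 and the trailing-comma workaround Uwe
describes below.

```python
# Hypothetical illustration only; this is NOT Drill's actual header parsing.

def split_header(raw: bytes) -> list[str]:
    """Naively take the first line by splitting on LF only, keeping any CR."""
    first_line = raw.split(b"\n", 1)[0]
    return first_line.decode("ascii").split(",")


def column_index(names: list[str], wanted: str) -> int:
    """Return the position of `wanted` in the header, or -1 if it is absent."""
    return names.index(wanted) if wanted in names else -1


def to_unix(raw: bytes) -> bytes:
    """Convert CRLF line endings to LF, as dos2unix does for this file."""
    return raw.replace(b"\r\n", b"\n")


# The test file from the thread, with CRLF line terminators.
crlf_file = b"col1,col2,col3\r\ngeercken,uwe,22\r\nkarlson,peter,33\r\n"

dos_names = split_header(crlf_file)            # last name keeps the CR
unix_names = split_header(to_unix(crlf_file))  # clean names after conversion

# Same file, but with a trailing comma after col3 in the header.
comma_names = split_header(crlf_file.replace(b"col3\r\n", b"col3,\r\n", 1))

print(dos_names)
print(column_index(dos_names, "col3"))    # not found
print(column_index(unix_names, "col3"))   # found
print(column_index(comma_names, "col3"))  # found; the CR lands in an extra column
```

If this guess is right, converting the file to LF line endings before
querying, as Uwe did with dos2unix, is the simplest workaround until the
bug is fixed.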

Thank you,
Edmon

On Wed, Nov 25, 2015 at 3:09 AM, Geercken, Uwe <[email protected]>
wrote:

> Abdel,
>
> I sent the file to your email address, but I have already found the problem:
>
> The drillbit runs on a Linux box. When I check the file
>
>         file test.csv
>
> the output is:
>
>         test.csv: ASCII text, with CRLF line terminators
>
> In this case the query:
>
>         select col1,col2,col3  from dfs.datatransfer.`test.csv`
>
> does not work.
>
> If I do a:
>
>         dos2unix test.csv
>
> The query does work properly!
>
> So Drill does not properly recognize CRLF line breaks, which are standard
> on Windows systems.
>
> Just for the sake of it, if I do the opposite:
>
>         unix2dos test.csv
>
> again it does not work.
>
> Should I file a bug?
>
> Regards,
>
> Uwe
>
>
> -----Original Message-----
> From: Abdel Hakim Deneche [mailto:[email protected]]
> Sent: Tuesday, 24 November 2015 18:12
> To: user
> Subject: Re: Bug in Drill 1.3 CSV - please confirm
>
> Hi Uwe,
>
> I couldn't reproduce the issue using the 1.3 release. Can you send the
> dummy test file you created to my email address? (You can't send it to an
> Apache mailing list.)
>
> Thanks
>
> On Tue, Nov 24, 2015 at 3:03 AM, Geercken, Uwe <[email protected]>
> wrote:
>
> > I have downloaded 1.3 and made a quick test of the new extractHeader
> > feature for text files.
> >
> > So I updated the storage details and created a dummy test file:
> >
> > col1,col2,col3
> > geercken,uwe,22
> > karlson,peter,33
> >
> >
> > When I query the data with this: select *  from
> > dfs.datatransfer.`test.csv` - it works.
> >
> > When I query the data with this: select col1,col2  from
> > dfs.datatransfer.`test.csv` - it works.
> >
> > When I query the data with this: select col1,col2,col3  from
> > dfs.datatransfer.`test.csv` - it gives me an exception:
> >
> > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> > ArrayIndexOutOfBoundsException: -1 Fragment 0:0
> >
> >
> > I figured out that if I add a comma (,) after "col3" in the header, it
> > works. So apparently the process does not recognize the last column of
> > the header.
> >
> > If I set extractHeader to false, add skipFirstLine instead, and do this:
> > select columns[0], columns[1], columns[2]  from
> > dfs.datatransfer.`test.csv`
> > - then it works. So the problem seems to be only the header row.
> >
> >
> > I verified the same problem with other files, but can somebody please
> > cross-check before I file a Jira?
> >
> > Thanks,
> >
> > Uwe
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   <http://www.mapr.com/>
>
>
