I won't address your specific question about the delimited file reader,
but the 1.2 build of Drill supports parsing the Apache httpd log format.
You only have to supply the format pattern that was used to create the
logs, and it will parse the records properly.
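
For reference, the four-column split expected in the quoted message can be
checked outside Drill. This is only a minimal sketch in Python using the
standard csv module with a space delimiter and double-quote quoting. The
helper name split_log_line is made up for illustration, the sample line is
the one from the quoted message, and none of this says anything about how
Drill's text reader is implemented.

```python
import csv
import io


def split_log_line(line):
    """Split one space-delimited log line, treating "..." as a single field.

    Hypothetical helper for illustration only; it is not part of Drill.
    """
    reader = csv.reader(io.StringIO(line), delimiter=" ", quotechar='"')
    return next(reader)


# Sample line copied from the quoted message (address already masked there).
sample = 'XXX.XXX.XXX.XXX 200 "GET / ... etc" "USER AGENT"'
fields = split_log_line(sample)
print(len(fields), fields)
```

If a generic reader like this yields four fields for the same line, the
input itself is well-formed and the difference lies in the reader
configuration or behavior.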

On Thu, Sep 17, 2015 at 8:43 AM, Mathieu Agneray <[email protected]>
wrote:

> Hi,
>
> I'm having an issue with a Drill file format.
> I have a CSV file (apache2 web server logs) that uses a space as the
> delimiter and double quotes around text fields.
> So I have configured my csv file format like this:
>
> "csv": {
>   "type": "text",
>   "extensions": [
>     "csv"
>   ],
>   "escape": "\\",
>   "comment": "\u0000",
>   "delimiter": " "
> }
>
> and it doesn't work well.
>
> A line looks like this:
> XXX.XXX.XXX.XXX 200 "GET / ... etc" "USER AGENT"
>
> Instead of giving me (4 columns):
> ["XXX.XXX.XXX.XXX", "200", "GET / ... etc", "USER AGENT"]
>
> I get this response (3 columns):
> ["XXX.XXX.XXX.XXX", "200", "GET / ... etc\" \"USER AGENT\""]
>
> But if I edit the file and the configuration to use a comma delimiter,
> it works fine.
> Is there a problem in the code for the space delimiter?
>
> Thanks
>
> Mathieu Agneray
>



-- 
*Jim Scott*
Director, Enterprise Strategy & Architecture
+1 (347) 746-9281
@kingmesal <https://twitter.com/kingmesal>
