[ 
https://issues.apache.org/jira/browse/DRILL-3423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649391#comment-14649391
 ] 

Jacques Nadeau commented on DRILL-3423:
---------------------------------------

Q1:  The main reason is that Drill is targeting analysts rather than 
developers.  We are very focused on separating data definition from 
business rules.  The user should have to provide no more information than is 
necessary to interact with a new data source.  In the case of an Apache HTTPD 
log, the only thing that is needed is a format string.  From there, a user can 
use the SQL interface to create alternative views, etc. (things that support 
their particular business needs).  The future goal is to make more formats 
self-describing, either directly (as we have already done with Parquet) or 
indirectly using what we call a .drill file.  This is the same pattern that we 
use for JSON, Avro, HBase, etc.  It allows non-technical users to interact with 
new data quickly and easily.  (Note that this also works better in Drill 
because we have first-class capabilities around complex data and the JSON 
document model.)
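
To illustrate the "format string is all you need" idea, here is a rough, 
self-contained sketch in Python.  It is not the httpdlog-parser API and not 
how the Drill plugin is implemented; the directive table, the field names, and 
the helpers compile_log_format and parse_line are all hypothetical and cover 
only a handful of LogFormat directives.

```python
import re

# ASSUMPTION: a tiny, hand-written subset of Apache LogFormat directives.
# The field names on the right are invented for this sketch.
DIRECTIVES = {
    "%h": ("remote_host", r'\S+'),
    "%l": ("remote_logname", r'\S+'),
    "%u": ("remote_user", r'\S+'),
    "%t": ("request_time", r'\[[^\]]+\]'),
    "%r": ("request_line", r'[^"]*'),
    "%>s": ("status", r'\d{3}'),
    "%b": ("response_bytes", r'\d+|-'),
}


def compile_log_format(log_format):
    """Turn a LogFormat string into a regex with one named group per field."""
    # Try longer directives first so "%>s" is matched as a single token.
    tokens = sorted(DIRECTIVES, key=len, reverse=True)
    token_re = re.compile("|".join(re.escape(t) for t in tokens))

    parts = []
    pos = 0
    for match in token_re.finditer(log_format):
        # Literal text between directives must appear verbatim in the line.
        parts.append(re.escape(log_format[pos:match.start()]))
        name, field_re = DIRECTIVES[match.group()]
        parts.append("(?P<%s>%s)" % (name, field_re))
        pos = match.end()
    parts.append(re.escape(log_format[pos:]))
    return re.compile("^" + "".join(parts) + "$")


def parse_line(compiled, line):
    """Return a dict of field name to raw string, or None if no match."""
    match = compiled.match(line)
    return match.groupdict() if match else None
```

With the common log format string '%h %l %u %t "%r" %>s %b', every line 
becomes a flat record of named fields, which is the kind of structure a user 
could then reshape with SQL views without writing any parsing code.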

Q2: This has to do with the most efficient way to write into Drill and the fact 
that we want to manage the write path to provide a clean and consistent 
complex data model for the underlying format.

> Add New HTTPD format plugin
> ---------------------------
>
>                 Key: DRILL-3423
>                 URL: https://issues.apache.org/jira/browse/DRILL-3423
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Storage - Other
>            Reporter: Jacques Nadeau
>            Assignee: Jacques Nadeau
>             Fix For: 1.2.0
>
>
> Add an HTTPD logparser based format plugin.  The author has been kind enough 
> to move the logparser project to be released under the Apache License.  It 
> can be found here:
> <dependency>
>     <groupId>nl.basjes.parse.httpdlog</groupId>
>     <artifactId>httpdlog-parser</artifactId>
>     <version>2.0</version>
> </dependency>
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
