cgivre commented on pull request #2112:
URL: https://github.com/apache/drill/pull/2112#issuecomment-727583315


   @nielsbasjes 
   Thanks for the quick review!  I have a few more things to tweak before it's 
ready for the next round of review, but with the refactoring I've noticed a 
very significant improvement in query performance on log files on my machine.  
Did you see any difference?
   
   I have a question for you regarding the file extension.  Right now, Drill 
uses the file extension to determine which format plugin to use for parsing the 
file(s).  Another option Drill has is `defaultInputFormat`, which is set per 
workspace.  I'd imagine that in a real-world situation, web server logs would 
be kept in the directory or series of directories where they are generated.  
In that case, you could define a workspace and set its `defaultInputFormat` to 
`httpd`, which would tell Drill to use the HTTPD plugin even when the files 
have no extension.
   
   ```json
          "weblogs": {
            "location": "<path to logs>",
            "writable": false,
            "defaultInputFormat": "httpd"
          }
   ```
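
   For context, here is a rough sketch of where that workspace entry would sit 
in a file-based storage plugin configuration, alongside an `httpd` entry under 
`formats`.  The `connection` value and the `logFormat` string (Apache's common 
log format) are illustrative assumptions, not values taken from this PR, and 
the exact set of format options may differ once the refactoring lands.

   ```json
   {
     "type": "file",
     "connection": "file:///",
     "workspaces": {
       "weblogs": {
         "location": "<path to logs>",
         "writable": false,
         "defaultInputFormat": "httpd"
       }
     },
     "formats": {
       "httpd": {
         "type": "httpd",
         "logFormat": "%h %l %u %t \"%r\" %>s %b"
       }
     }
   }
   ```

   Assuming this configuration is saved under the plugin name `dfs`, a file 
with no extension in that directory could then be referenced as 
`` dfs.weblogs.`access_log` `` (a hypothetical file name).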
   
   With that said, I do like the idea of allowing users to define a filename 
pattern that would be associated with a particular file type, though I think 
that might be out of scope for this PR.
   

