nielsbasjes commented on pull request #2112: URL: https://github.com/apache/drill/pull/2112#issuecomment-728153873
@cgivre NOTE: Regarding the doubled output like "Firefox 35Firefox 35": this is a bug in my parser that causes the setter to be called twice under certain conditions (the reset is incomplete if you change the settings later). I'm working on a patch, and a new version should be available this week.

1. A time field like `request_receive_time` (there is no such thing as `request_referer_time`) will be exactly as you find it in the log file, as a "raw string".
2. The original "raw string" is always parsed if you ask for a "dissection" of it ( https://github.com/nielsbasjes/logparser/blob/master/httpdlog/httpdlog-parser/src/main/java/nl/basjes/parse/httpdlog/dissectors/TimeStampDissector.java#L418 ), and the output for these fields is then created from the resulting instance in the fully predictable ISO format ( https://github.com/nielsbasjes/logparser/blob/master/httpdlog/httpdlog-parser/src/main/java/nl/basjes/parse/httpdlog/dissectors/TimeStampDissector.java#L491 ).
3. Yes, there is overhead in figuring out which fields CAN be extracted, so the extra 2 seconds during setup make sense. At runtime (i.e. hammering through the actual data), however, this Dissector only affects the processing time if you are actually asking for a field that comes from it.
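To illustrate the difference between the "raw string" and the dissected ISO output described in points 1 and 2, here is a minimal sketch using only the Python standard library. It is not the logparser API: the function names `raw_to_iso` and `raw_to_iso_utc` are hypothetical, and the timestamp layout is assumed to be the common Apache access-log form (`31/Dec/2012:23:49:40 +0100`), which may differ from what a given LogFormat produces.

```python
from datetime import datetime, timezone

# Assumption: the raw timestamp uses the common Apache access-log layout,
# e.g. "31/Dec/2012:23:49:40 +0100". Note that %b (month abbreviation) is
# locale dependent; this sketch assumes English month names.
APACHE_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def raw_to_iso(raw: str) -> str:
    """Parse a raw access-log timestamp and render it as ISO 8601.

    Hypothetical helper for illustration only; surrounding square
    brackets, if present, are stripped before parsing.
    """
    parsed = datetime.strptime(raw.strip("[]"), APACHE_TIME_FORMAT)
    return parsed.isoformat()


def raw_to_iso_utc(raw: str) -> str:
    """Same as raw_to_iso, but normalised to UTC before formatting."""
    parsed = datetime.strptime(raw.strip("[]"), APACHE_TIME_FORMAT)
    return parsed.astimezone(timezone.utc).isoformat()
```

Under these assumptions, `raw_to_iso("[31/Dec/2012:23:49:40 +0100]")` returns `2012-12-31T23:49:40+01:00`, whereas the raw field would stay exactly as it appears in the log line. The parsing cost is only paid when a derived field is actually requested, which mirrors the runtime behaviour described in point 3.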
