nielsbasjes edited a comment on pull request #2122: URL: https://github.com/apache/drill/pull/2122#issuecomment-741637555
Casts: The `Casts` are the possible types the logparser can use to return the elements it finds. So if it lists `LONG` as one of the options, you can specify a setter that receives a `Long`. This is a property that differs per field, yet the old implementation in Drill used a single value for all fields. As long as that value is `STRING` everything works, but when I sorted the fields the value changed and it all fell apart.

Remapping: The logparser constructs a tree of dissectors based upon types like `STRING`, `TIME.EPOCH`, `IP` and `HTTP.USERAGENT`. A specific Dissector is capable of parsing a certain input type into one or more pieces, which will have different types than the input. Now in my example the `request.firstline.uri.query.ua` is a `STRING`, for which no Dissector exists. In the logparser you can remap this to a different type like `HTTP.USERAGENT`. If you specify this, the UserAgentDissector will automatically be attached and will dissect the value further into the parts it can produce. What I see in the Drill code is that there is an option to add a '#' (REMAPPING_FLAG) followed by a new type, which is then used to call the remapping code of the logparser. Yet I have not found any tests or documentation that show how this must be used. I have an idea on how to make this possible.

Config option: If you enable the useragent parser and then do not ask for any fields from it, the following happens.
1) An instance of UserAgentDissector is created to see what can be parsed (takes about 1 second).
2) If no fields are then requested from it, this instance of the UserAgentDissector is simply removed from the parsing again.
So during the actual parsing run there is no performance impact, only during the startup phase. I'll improve this.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: [email protected]
