[
https://issues.apache.org/jira/browse/DRILL-3423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14991779#comment-14991779
]
Jim Scott commented on DRILL-3423:
----------------------------------
Jacques,
I'm not sure I follow this comment: "We should also avoid the use of dot
delimiters being automatically generated by Drill."
Am I correct that your concern is specifically with the configuration of the
plugin and the mapping of the field names?
Here is the problem I have with creating the mappings in the configuration:
1. There are WAY more ways the parser can parse a field than are logical for us
to create mappings for (e.g. a time field will yield both a timezone-based
result and a UTC-based one).
2. By providing a mapping within the Drill plugin, we have to expose every
default for anything that may show up in the log parser (e.g. if a new feature
shows up in the log parser, we wouldn't be able to expose it until we make a
change in the plugin).
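To illustrate point 2, here is a minimal sketch of deriving column names directly from whatever paths the parser reports, instead of maintaining a hand-written mapping. It is not the plugin's code. The class `FieldNameSketch`, the helpers `toColumnName` and `buildColumns`, and the choice of underscores in place of dots are all assumptions made for illustration; only the `IP:connection...` paths come from this thread.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FieldNameSketch {
    // Hypothetical helper: drops the type prefix ("IP:") and flattens the
    // remaining dotted path. Using '_' as the separator is an assumption,
    // not a convention settled in this discussion.
    static String toColumnName(String parserPath) {
        int colon = parserPath.indexOf(':');
        String name = colon >= 0 ? parserPath.substring(colon + 1) : parserPath;
        return name.replace('.', '_');
    }

    // Builds column name -> parser path for every path handed in, so a path
    // added by a newer parser version needs no change here.
    static Map<String, String> buildColumns(List<String> parserPaths) {
        Map<String, String> columns = new LinkedHashMap<>();
        for (String path : parserPaths) {
            columns.put(toColumnName(path), path);
        }
        return columns;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
            "IP:connection.client.ip",
            "IP:connection.client.peerip",
            "IP:connection.server.ip");
        for (Map.Entry<String, String> e : buildColumns(paths).entrySet()) {
            System.out.println(e.getKey() + " <- " + e.getValue());
        }
    }
}
```

With this shape, the list of paths would come from the parser at run time rather than from the plugin configuration.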
Regarding wildcard maps of data, I can just as easily remove the :map from the
end of the field name. I'm indifferent, really. I put it on there to make it
blatantly obvious.
As for creating maps like this example:
{code:java}
case "IP:connection.client.ip":
  add(parser, path, writer.rootAsMap().map("client").varChar("ip"));
  break;
case "IP:connection.client.peerip":
  add(parser, path, writer.rootAsMap().map("client").varChar("peer_ip"));
  break;
case "IP:connection.server.ip":
  add(parser, path, writer.rootAsMap().map("server").varChar("ip"));
{code}
This model makes it extremely difficult to support mapping of data types: it
assumes that those fields are varChar and nothing else. Also, based on the life
cycle of creating maps within Drill, I don't think this is the most logical
approach to take. Putting the technical details aside, as a user I don't know
that I benefit from nesting the data into maps. While I understand from a data
structure perspective why someone might want to do this, from a query
perspective I think it makes querying the data more difficult.
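As a rough illustration of the data type concern, the sketch below picks a column type from the type prefix of the parser path rather than hard-coding varChar per path. It is only a sketch under stated assumptions. The `ColumnType` enum is a hypothetical stand-in for Drill's writers, the prefix-to-type table is illustrative, and the `BYTES:` and `TIME.EPOCH:` paths are assumed examples that do not appear in this thread.

```java
class TypedFieldSketch {
    // Hypothetical stand-in for the kinds of column writers available.
    enum ColumnType { VARCHAR, BIGINT }

    // Returns the type prefix of a parser path,
    // e.g. "IP" for "IP:connection.client.ip"; empty if there is no prefix.
    static String typePrefix(String parserPath) {
        int colon = parserPath.indexOf(':');
        return colon >= 0 ? parserPath.substring(0, colon) : "";
    }

    // Illustrative table only: which prefixes are numeric is an assumption.
    // Anything unrecognized falls back to VARCHAR.
    static ColumnType columnTypeFor(String parserPath) {
        switch (typePrefix(parserPath)) {
            case "BYTES":
            case "TIME.EPOCH":
                return ColumnType.BIGINT;
            default:
                return ColumnType.VARCHAR;
        }
    }

    public static void main(String[] args) {
        String[] paths = {
            "IP:connection.client.ip",               // from the thread
            "BYTES:response.body.bytes",             // assumed example path
            "TIME.EPOCH:request.receive.time.epoch"  // assumed example path
        };
        for (String path : paths) {
            System.out.println(path + " -> " + columnTypeFor(path));
        }
    }
}
```

Keying the decision on the prefix keeps the type choice in one place, whereas a per-path switch has to repeat it for every field.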
> Add New HTTPD format plugin
> ---------------------------
>
> Key: DRILL-3423
> URL: https://issues.apache.org/jira/browse/DRILL-3423
> Project: Apache Drill
> Issue Type: New Feature
> Components: Storage - Other
> Reporter: Jacques Nadeau
> Assignee: Jim Scott
> Fix For: 1.4.0
>
>
> Add an HTTPD logparser based format plugin. The author has been kind enough
> to move the logparser project to be released under the Apache License. Can
> find it here:
> <dependency>
> <groupId>nl.basjes.parse.httpdlog</groupId>
> <artifactId>httpdlog-parser</artifactId>
> <version>2.0</version>
> </dependency>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)