Hi Joi,

the problem you've described isn't caused by the extractors, but by the dynamic index mapping in Elasticsearch. Graylog doesn't define a schema for log messages but lets Elasticsearch try to come up with a sensible mapping (http://www.elastic.co/guide/en/elasticsearch/reference/1.4/mapping-dynamic-mapping.html) instead.
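A toy illustration of that behaviour may help. It is not Elasticsearch code: `ToyIndex` is a hypothetical stand-in that fixes a field's type from the first value it sees, and it assumes the first `port` value reached the index as a number.

```python
class ToyIndex:
    """Toy stand-in for an index with dynamic mapping (not Elasticsearch code)."""

    def __init__(self):
        self.mapping = {}  # field name -> Python type chosen on first sight
        self.docs = []

    def index(self, doc):
        coerced = {}
        for field, value in doc.items():
            # The first value seen for a field fixes that field's type.
            field_type = self.mapping.setdefault(field, type(value))
            try:
                # Later values must be convertible to the fixed type.
                coerced[field] = field_type(value)
            except ValueError as exc:
                raise ValueError(f"failed to parse [{field}]: {exc}") from exc
        self.docs.append(coerced)


old = ToyIndex()
old.index({"port": 7})         # first value is a number, so "port" becomes numeric
old.index({"port": "12"})      # digits-only strings can still be coerced
try:
    old.index({"port": "1:37"})
except ValueError as err:
    print(err)                 # the colon makes the conversion fail

fresh = ToyIndex()             # a new index starts with an empty mapping
fresh.index({"port": "1:37"})  # first value is a string, so "port" stays a string
fresh.index({"port": "7"})
```

In this sketch the regex or converter that produced the value plays no part; only the type already recorded for the field decides whether a value is accepted.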
This means that if an attribute only contained numbers at first, Elasticsearch will assume that the attribute will always be a number. A later value such as "1:37" can then no longer be indexed into that field. You can either cycle the indices (which will create a new index with a fresh dynamic mapping) or apply index templates as described in https://github.com/Graylog2/graylog2-server/issues/903.

Cheers,
Jochen

On Tuesday, 14 April 2015 18:04:56 UTC+2, Joi Owen wrote:
>
> I'm trying to extract a port name from a log message such as this one
> (copied from my rsyslog permanent archive before it was transferred on
> into graylog 1.0.1)
>
> *2015-04-13T22:42:19-05:00 10.146.156.20 INFO: Port 1:37 link up, 100Mbps FULL duplex*
>
> I want to extract the port name, which in this line is "*1:37*" but
> nothing, absolutely nothing I've tried has worked. I have no problem
> extracting that field from lines like:
>
> *2015-04-13T11:06:16-05:00 10.144.24.91 INFO: Port 7 link up, 100Mbps FULL duplex*
>
> I've tried "Port (\d+)", "Port (\S+)", "Port ([\d\:]+)", "Port (\d+:\d+)",
> "Port (\d*:?\d+)" and even "Port (.+) link", all with and without ^.+ and
> .+$ endings, and nothing works. I can always get the port out when it's
> just digits, but as soon as the input contains a colon, it refuses to
> match. I've spent two hours trying trick after trick and nothing has
> worked. I've been writing regexp in perl for decades so I'm pretty
> confident of my basic understanding of regexps. I've studied the Java
> documentation as well and don't see any reason why this continues to fail.
>
> What really, really is bugging me is that *ALL of those patterns worked
> fine in the extractor editor test page*, but once I save the extractor
> and go try to use it, it fails. I'm selecting actual messages out of the
> input and loading the messages up to test against.
>
> The only thing I can think of is that something about the underlying java
> is puking on the ":" in the content being matched, and it's causing the
> test to fail.
>
> Just for grins, I looked at the indexer page, and I see bunches of this:
>
> *MapperParsingException[failed to parse [port]]; nested: NumberFormatException[For input string: "1:3"];*
>
> But I have specifically told this extractor to NOT convert the thing to a
> number. I even tried forcing in a 'lowercase' converter, but that didn't
> help, either. It appears that the extractor is insisting on converting the
> field to a number before creating it, despite what I told it to do with the
> converter settings.
>
> I've searched through the group posts here and found the ones where
> variable white space was an issue; I've checked against the original
> content (see above) and that isn't the issue. (I tried using \s+, a space,
> etc, and those made no difference, either.)
>
> Can anyone show me a pattern that will properly return a match for *1:37*?
> And have it properly set the new field?
>
> Here's a copy/paste of the extractor as it exists right now, it's giving
> me port fields with values only when the values are one or more digits.
> None of them with : are getting set.
>
> Trying to extract data from *message* into *port*, leaving the original
> intact.
> Configuration:
>
>    - regex_value: ^.+INFO:\s+Port\s+(\S+)\s.+$
>
> Converters
>
>    - uppercase
>
> Any suggestions would be most welcome.
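
As a quick sanity check outside Graylog, the regex_value from the extractor quoted above can be run against both sample lines from the original post. This is only a sketch: it uses Python's `re` module rather than the Java regex engine Graylog uses, on the assumption that the two agree for these simple constructs, and `extract_port` is a hypothetical helper name.

```python
import re

# regex_value copied from the extractor configuration in the original post.
PATTERN = re.compile(r"^.+INFO:\s+Port\s+(\S+)\s.+$")

# The two sample log lines from the original post.
LINES = [
    "2015-04-13T22:42:19-05:00 10.146.156.20 INFO: Port 1:37 link up, 100Mbps FULL duplex",
    "2015-04-13T11:06:16-05:00 10.144.24.91 INFO: Port 7 link up, 100Mbps FULL duplex",
]


def extract_port(line):
    """Return the captured port name, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.group(1) if match else None


for line in LINES:
    print(extract_port(line))
```

Both lines yield a capture here, including the one with the colon, which is consistent with the extractor test page succeeding and the failure happening later, when the value is indexed.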
