Hi Joi,

The problem you've described isn't caused by the extractors, but by the 
dynamic index mapping in Elasticsearch. Graylog doesn't define a schema for 
log messages; instead it lets Elasticsearch try to come up with a sensible 
mapping 
(http://www.elastic.co/guide/en/elasticsearch/reference/1.4/mapping-dynamic-mapping.html).
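
As a rough illustration (Python's re module standing in for the Java regex 
engine, which treats this particular pattern the same way), the extractor 
regex from your message does capture both port styles. The 
fits_numeric_mapping helper is a hypothetical stand-in for a field that has 
been mapped as a number; it is not Elasticsearch code.

```python
import re

# The regex_value from the extractor configuration quoted below.
PATTERN = re.compile(r"^.+INFO:\s+Port\s+(\S+)\s.+$")

# The two sample messages from the original post.
MESSAGES = [
    "2015-04-13T22:42:19-05:00 10.146.156.20 INFO: Port 1:37 link up, 100Mbps FULL duplex",
    "2015-04-13T11:06:16-05:00 10.144.24.91 INFO: Port 7 link up, 100Mbps FULL duplex",
]


def extract_port(message):
    """Return the captured port name, or None if the pattern does not match."""
    match = PATTERN.match(message)
    return match.group(1) if match else None


def fits_numeric_mapping(value):
    """Hypothetical stand-in for a numeric field: only integers parse."""
    try:
        int(value)
        return True
    except ValueError:
        return False


ports = [extract_port(m) for m in MESSAGES]
for port in ports:
    # Extraction succeeds for both; only the numeric parse rejects "1:37".
    print(port, fits_numeric_mapping(port))
```

Both messages yield a port value, so the failure happens later, when the 
value is indexed into a field that already has a numeric mapping.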

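This means that if a field only contained numbers when it first appeared in 
an index, Elasticsearch maps it as a numeric type and expects every later 
value to be a number as well. A value such as "1:37" then fails at indexing 
time, which is the MapperParsingException you found on the indexer page. You 
can try to cycle indices (which will create a new index with a fresh 
dynamic mapping) or apply index templates as described in 
https://github.com/Graylog2/graylog2-server/issues/903.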
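
A minimal sketch of what such an index template could look like, built here 
as a Python dict and printed as JSON. The template name is made up, and the 
index pattern "graylog2_*" and mapping type "message" are assumptions based 
on a default Graylog 1.x setup with Elasticsearch 1.x; check them against 
your own index names before using anything like this.

```python
import json

# Hypothetical template name, chosen only for this example.
TEMPLATE_NAME = "graylog-port-as-string"

template_body = {
    # Assumption: indices use the default "graylog2" prefix.
    "template": "graylog2_*",
    "mappings": {
        # Assumption: Graylog stores messages under the "message" type.
        "message": {
            "properties": {
                # Map "port" as an unanalyzed string so "1:37" and "7" both fit.
                "port": {"type": "string", "index": "not_analyzed"}
            }
        }
    },
}

# Print the request line and body that would be sent to Elasticsearch.
print("PUT /_template/" + TEMPLATE_NAME)
print(json.dumps(template_body, indent=2))
```

Index templates only apply to indices created after the template is stored, 
so the existing index keeps its numeric mapping until you cycle to a new one.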


Cheers,
Jochen


On Tuesday, 14 April 2015 18:04:56 UTC+2, Joi Owen wrote:
>
> I'm trying to extract a port name from a log message such as this one 
> (copied from my rsyslog permanent archive before it was transferred on into 
> graylog 1.0.1)
>
> *2015-04-13T22:42:19-05:00 10.146.156.20 INFO: Port 1:37 link up, 100Mbps 
> FULL duplex*
>
>
> I want to extract the port name, which in this line is "*1:37*" but 
> nothing, absolutely nothing I've tried has worked.  I have no problem 
> extracting that field from lines like:
>
> *2015-04-13T11:06:16-05:00 10.144.24.91 INFO: Port 7 link up, 100Mbps FULL 
> duplex*
>
>
> I've tried "Port (\d+)", "Port (\S+)", "Port ([\d\:]+)", "Port (\d+:\d+)", 
> "Port (\d*:?\d+)" and even "Port (.+) link", all with and without ^.+ and 
> .+$ endings, and nothing works.  I can always get the port out when it's 
> just digits, but as soon as the input contains a colon, it refuses to 
> match.  I've spent two hours trying trick after trick and nothing has 
> worked.  I've been writing regexp in perl for decades so I'm pretty 
> confident of my basic understanding of regexps.  I've studied the Java 
> documentation as well and don't see any reason why this continues to fail.
>
> What really, really is bugging me is that *ALL of those patterns worked 
> fine in the extractor editor test page*, but once I save the extractor 
> and go try to use it, it fails.  I'm selecting actual messages out of the 
> input and loading the messages up to test against.
>
> The only thing I can think of is that something about the underlying java 
> is puking on the ":" in the content being matched, and it's causing the 
> test to fail.  
>
> Just for grins, I looked at the indexer page, and I see bunches of this:
>
> *MapperParsingException[failed to parse [port]]; nested: 
> NumberFormatException[For input string: "1:3"];*
>
>
> But I have specifically told this extractor to NOT convert the thing to a 
> number.  I even tried forcing in a 'lowercase' converter, but that didn't 
> help, either.  It appears that the extractor is insisting on converting the 
> field to a number before creating it, despite what I told it to do with the 
> converter settings.
>
> I've searched through the group posts here and found the ones where 
> variable white space was an issue; I've checked against the original 
> content (see above) and that isn't the issue.  (I tried using \s+, a space, 
> etc, and those made no difference, either.)
>
> Can anyone show me a pattern that will properly return a match for *1:37*? 
>  And have it properly set the new field?
>
> Here's a copy/paste of the extractor as it exists right now, it's giving 
> me port fields with values only when the values are one or more digits. 
>  None of them with : are getting set.
>
> Trying to extract data from *message* into *port*, leaving the original 
> intact.
> Configuration:
>
>
>    - regex_value: ^.+INFO:\s+Port\s+(\S+)\s.+$
>    
> Converters
>
>
>    - uppercase
>    
>
> Any suggestions would be most welcome.
>
>
