Hi Jochen,
Here's what my "graylog-internal" template currently looks like (as seen
via the Elasticsearch API):
{
  "graylog-internal" : {
    "order" : 0,
    "template" : "graylog2_*",
    "settings" : { },
    "mappings" : {
      "message" : {
        "_source" : {
          "compress" : true,
          "enabled" : true
        },
        "dynamic_templates" : [ {
          "internal_fields" : {
            "mapping" : {
              "index" : "not_analyzed",
              "doc_values" : true
            },
            "match" : "gl2_*"
          }
        }, {
          "store_generic" : {
            "mapping" : {
              "index" : "not_analyzed"
            },
            "match" : "*"
          }
        } ],
        "_ttl" : {
          "enabled" : true
        },
        "properties" : {
          "message" : {
            "index" : "analyzed",
            "analyzer" : "whitespace",
            "type" : "string"
          },
          "timestamp" : {
            "format" : "yyyy-MM-dd HH:mm:ss.SSS",
            "doc_values" : true,
            "type" : "date"
          },
          "source" : {
            "index" : "analyzed",
            "analyzer" : "analyzer_keyword",
            "type" : "string"
          },
          "full_message" : {
            "index" : "analyzed",
            "analyzer" : "whitespace",
            "type" : "string"
          }
        }
      }
    },
    "aliases" : { }
  }
}
Here's what my graylog2_3 index currently looks like (as seen via the
Elasticsearch API):
{
  "graylog2_3" : {
    "aliases" : {
      "graylog2_deflector" : { }
    },
    "mappings" : {
      "message" : {
        "dynamic_templates" : [ {
          "internal_fields" : {
            "mapping" : {
              "index" : "not_analyzed",
              "doc_values" : true
            },
            "match" : "gl2_*"
          }
        }, {
          "store_generic" : {
            "mapping" : {
              "index" : "not_analyzed"
            },
            "match" : "*"
          }
        } ],
        "_ttl" : {
          "enabled" : true
        },
        "_source" : {
          "compress" : true
        },
        "properties" : {
          "full_message" : {
            "type" : "string",
            "analyzer" : "whitespace"
          },
          "gl2_remote_ip" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "gl2_remote_port" : {
            "type" : "long",
            "doc_values" : true
          },
          "gl2_source_collector" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "gl2_source_collector_input" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "gl2_source_input" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "gl2_source_node" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "level" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "message" : {
            "type" : "string",
            "analyzer" : "whitespace"
          },
          "source" : {
            "type" : "string",
            "analyzer" : "analyzer_keyword"
          },
          "source_file" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "timestamp" : {
            "type" : "date",
            "doc_values" : true,
            "format" : "yyyy-MM-dd HH:mm:ss.SSS"
          },
          "version" : {
            "type" : "string",
            "index" : "not_analyzed"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1462197971182",
        "uuid" : "ylBuS8y3SBKRYMyLuMWApg",
        "analysis" : {
          "analyzer" : {
            "analyzer_keyword" : {
              "filter" : "lowercase",
              "tokenizer" : "keyword"
            }
          }
        },
        "number_of_replicas" : "0",
        "number_of_shards" : "4",
        "version" : {
          "created" : "1070399"
        }
      }
    },
    "warmers" : { }
  }
}
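For reference, here is a minimal sketch of how the per-field analyzers could be read out of a mapping like the one above. The `properties` dict is a hand-copied excerpt of the graylog2_3 mapping, and `field_analyzers` is a hypothetical helper name, not part of Graylog or Elasticsearch.

```python
# Excerpt of the "properties" section of the graylog2_3 mapping shown above
# (copied by hand; only a few fields are included).
properties = {
    "full_message": {"type": "string", "analyzer": "whitespace"},
    "level": {"type": "string", "index": "not_analyzed"},
    "message": {"type": "string", "analyzer": "whitespace"},
    "source": {"type": "string", "analyzer": "analyzer_keyword"},
    "timestamp": {"type": "date", "doc_values": True,
                  "format": "yyyy-MM-dd HH:mm:ss.SSS"},
}


def field_analyzers(props):
    """Return {field: analyzer} for every field that declares an analyzer."""
    return {name: spec["analyzer"]
            for name, spec in props.items()
            if "analyzer" in spec}


analyzers = field_analyzers(properties)
for name in sorted(analyzers):
    print(name, "->", analyzers[name])
```

On this excerpt, "message" and "full_message" both report the whitespace analyzer, which suggests the template was applied to the new index.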
After cycling the deflector so that it points to the new index, graylog2_3,
I proceeded to delete my old indices.
Using the Graylog API browser, I tried to tokenize a random string (This is
a $test:[to.see.if graylog() work$.):
http://vtor-lx-tomcat-d01:12900/messages/graylog2_3/analyze?string=This%20is%20a%20%24test%3A%5Bto.see.if%20graylog()%20work%24%5D.&pretty=true
{
"tokens" : [ "this", "is", "a", "test", "to.see.if", "graylog", "work" ]
}
This matches Elasticsearch itself: tokenizing the same string directly
against the same index gives the same result:
curl 'vtor-lx-tomcat-d01:9200/graylog2_3/_analyze?pretty=true' -d 'This is a $test:[to.see.if graylog() work$.'
{
  "tokens" : [ {
    "token" : "this",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "is",
    "start_offset" : 5,
    "end_offset" : 7,
    "type" : "<ALPHANUM>",
    "position" : 2
  }, {
    "token" : "a",
    "start_offset" : 8,
    "end_offset" : 9,
    "type" : "<ALPHANUM>",
    "position" : 3
  }, {
    "token" : "test",
    "start_offset" : 11,
    "end_offset" : 15,
    "type" : "<ALPHANUM>",
    "position" : 4
  }, {
    "token" : "to.see.if",
    "start_offset" : 17,
    "end_offset" : 26,
    "type" : "<ALPHANUM>",
    "position" : 5
  }, {
    "token" : "graylog",
    "start_offset" : 27,
    "end_offset" : 34,
    "type" : "<ALPHANUM>",
    "position" : 6
  }, {
    "token" : "work",
    "start_offset" : 37,
    "end_offset" : 41,
    "type" : "<ALPHANUM>",
    "position" : 7
  } ]
}
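One possible explanation, assuming the Elasticsearch 1.x `_analyze` behaviour applies here: a request against an index that names neither an `analyzer` nor a `field` is analyzed with the index's default analyzer, not with the analyzer configured for a particular field in the mapping. Passing `field=message` should ask Elasticsearch to use that field's analyzer instead. The sketch below only builds such a URL and sends nothing; `analyze_url` is a hypothetical helper, and the host and index names are the ones used in this thread.

```python
from urllib.parse import urlencode


def analyze_url(host, index, field=None, analyzer=None):
    """Build an _analyze URL for an index (hypothetical helper, no request sent).

    With neither field nor analyzer set, the index default analyzer is
    expected to be used; `field` selects the analyzer mapped to that field.
    """
    params = {}
    if field:
        params["field"] = field
    if analyzer:
        params["analyzer"] = analyzer
    params["pretty"] = "true"
    return "http://%s/%s/_analyze?%s" % (host, index, urlencode(params))


url = analyze_url("vtor-lx-tomcat-d01:9200", "graylog2_3", field="message")
print(url)
```

The resulting URL can be passed to curl with the same `-d '...'` body as above to see how the "message" field itself would tokenize the string.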
However, when I leave out the index and name the whitespace analyzer
explicitly, I get the result that I am looking for:
curl 'vtor-lx-tomcat-d01:9200/_analyze?analyzer=whitespace&pretty=true' -d 'This is a $test:[to.see.if graylog() work$.'
{
  "tokens" : [ {
    "token" : "This",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "is",
    "start_offset" : 5,
    "end_offset" : 7,
    "type" : "word",
    "position" : 2
  }, {
    "token" : "a",
    "start_offset" : 8,
    "end_offset" : 9,
    "type" : "word",
    "position" : 3
  }, {
    "token" : "$test:[to.see.if",
    "start_offset" : 10,
    "end_offset" : 26,
    "type" : "word",
    "position" : 4
  }, {
    "token" : "graylog()",
    "start_offset" : 27,
    "end_offset" : 36,
    "type" : "word",
    "position" : 5
  }, {
    "token" : "work$.",
    "start_offset" : 37,
    "end_offset" : 43,
    "type" : "word",
    "position" : 6
  } ]
}
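The difference between the two outputs above can be illustrated offline with a small sketch. `whitespace_tokens` splits on whitespace only, as the whitespace analyzer does. `standard_like_tokens` is only a rough stand-in for the "standard" analyzer (the real one follows Unicode text segmentation rules), written so that it reproduces the tokens seen in this thread. Both function names are made up for this example.

```python
import re

TEXT = "This is a $test:[to.see.if graylog() work$."


def whitespace_tokens(text):
    """Split on whitespace only; case and punctuation are kept."""
    return text.split()


def standard_like_tokens(text):
    """Rough approximation of the "standard" analyzer for this one string:
    lowercase, keep runs of letters/digits (dots allowed between them),
    and drop all other punctuation. Not the real Unicode segmentation rules."""
    return re.findall(r"[a-z0-9]+(?:\.[a-z0-9]+)*", text.lower())


print(whitespace_tokens(TEXT))
print(standard_like_tokens(TEXT))
```

The first list keeps "$test:[to.see.if", "graylog()" and "work$." intact, while the second drops the symbols and lowercases everything, matching the index-level result.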
I feel like I am really close to an answer here. It appears that there is
something wrong with my index mapping/settings.
Sincerely,
On Tuesday, May 3, 2016 at 3:51:49 AM UTC-4, Jochen Schalanda wrote:
>
> Hi Dilip,
>
> are you 100% sure that the message is in a new index, that the index
> template/mapping was properly applied (see
> https://www.elastic.co/guide/en/elasticsearch/reference/1.7/indices-get-mapping.html),
> and that it is the "message" field you were looking for (and not
> "full_message" or another field)?
>
> Cheers,
> Jochen
>
> On Monday, 2 May 2016 18:57:40 UTC+2, Dilip Muthukrishnan wrote:
>>
>> Hi Jochen,
>>
>> Thanks for your reply. I'm using graylog-1.3.4 (server). I removed and
>> added an updated version of the "graylog-internal" template and then cycled
>> the deflector through the web interface. The new index mapping reflects
>> the changes:
>>
>> "message" : {
>> "type" : "string",
>> "analyzer" : "whitespace"
>> }
>>
>>
>> However, it doesn't appear to be reflected in the search. This message
>> is from the latest index but based on this tokenization, it appears to
>> still be using the old "standard analyzer":
>>
>> 02.05.2016 12:47:33.488 *ERROR* [Shell Script Executor Thread for cpu.sh] com.day.crx.core.CRXSessionImpl session# 144563 opened (103)
>> java.lang.Exception: Stack Trace
>>   at com.day.crx.core.CRXSessionImpl$Tracker.open(CRXSessionImpl.java:212)
>>   at com.day.crx.core.CRXSessionImpl$Tracker.<init>(CRXSessionImpl.java:205)
>>   at com.day.crx.core.CRXSessionImpl.<init>(CRXSessionImpl.java:179)
>>   at com.day.crx.core.CRXRepositoryImpl.createSessionInstance(CRXRepositoryImpl.java:911)
>>   at org.apache.jackrabbit.core.RepositoryImpl.createSession(RepositoryImpl.java:959)
>>   at org.apache.jackrabbit.core.SessionFactory.createAdminSession(SessionFactory.java:42)
>>   at com.day.crx.sling.server.impl.SlingRepositoryWrapper.loginAdministrative(SlingRepositoryWrapper.java:76)
>>   at com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl.extractScript(ShellScriptExecutorImpl.java:161)
>>   at com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl.execute(ShellScriptExecutorImpl.java:114)
>>   at com.adobe.granite.monitoring.impl.ScriptMBean.invoke(ScriptMBean.java:99)
>>   at com.adobe.granite.monitoring.impl.ScriptMBean.invoke(ScriptMBean.java:158)
>>   at com.adobe.granite.monitoring.impl.ScriptConfigImpl$ExecutionThread.run(ScriptConfigImpl.java:208)
>>   at java.lang.Thread.run(Thread.java:662)
>>
>>
>> Field terms: 02.05.2016124733.488errorshellscriptexecutorthreadforcpu.sh
>> com.day.crx.core.crxsessionimplsession144563opened103java.lang.exception
>> stacktraceattracker.opencrxsessionimpl.java212trackerinit205179
>> com.day.crx.core.crxrepositoryimpl.createsessioninstance
>> crxrepositoryimpl.java911
>> org.apache.jackrabbit.core.repositoryimpl.createsession
>> repositoryimpl.java959
>> org.apache.jackrabbit.core.sessionfactory.createadminsession
>> sessionfactory.java42
>> com.day.crx.sling.server.impl.slingrepositorywrapper.loginadministrative
>> slingrepositorywrapper.java76
>> com.adobe.granite.monitoring.impl.shellscriptexecutorimpl.extractscript
>> shellscriptexecutorimpl.java161
>> com.adobe.granite.monitoring.impl.shellscriptexecutorimpl.execute114
>> com.adobe.granite.monitoring.impl.scriptmbean.invokescriptmbean.java99158
>> com.adobe.granite.monitoring.impl.scriptconfigimplexecutionthread.run
>> scriptconfigimpl.java208java.lang.thread.runthread.java662
>>
>> As you can see, it has been stripped of various characters like colons
>> and parentheses.
>>
>>
>> On Monday, May 2, 2016 at 12:36:38 PM UTC-4, Jochen Schalanda wrote:
>>>
>>> Hi Dilip,
>>>
>>> the index mapping of Graylog is applied by means of an index
>>> template. In Graylog 2.0.0, the index template will be updated
>>> automatically, but in older versions you'll have to remove the index
>>> template yourself for it to be recreated by Graylog.
>>>
>>> See
>>> https://www.elastic.co/guide/en/elasticsearch/reference/1.7/indices-templates.html
>>> for details.
>>>
>>> Cheers,
>>> Jochen
>>>
>>> On Thursday, 28 April 2016 21:42:23 UTC+2, Dilip Muthukrishnan wrote:
>>>>
>>>> I'm trying to change the analyzer from "standard" to "whitespace".
>>>> I've set the following property in my Graylog server configuration:
>>>>
>>>> elasticsearch_analyzer = whitespace
>>>>
>>>> It states that my change will be applied to new indices so I manually
>>>> cycled the deflector so that it is now pointing to graylog2_1 (previously
>>>> graylog2_0). However, the new index still uses the "standard" analyzer
>>>> based on the mapping in Elasticsearch:
>>>>
>>>> "message" : {
>>>> "type" : "string",
>>>> "analyzer" : "standard"
>>>> },
>>>>
>>>>
>>>> How do I change the analyzer?
>>>>
>>>>