Hi Jochen,
I'm still stuck on this one. Any help would be appreciated. Thanks.
Sincerely,
Dilip M.
On Tuesday, May 3, 2016 at 9:32:37 AM UTC-4, Dilip Muthukrishnan wrote:
>
> Hi Jochen,
>
> Here's what my "graylog-internal" template currently looks like (as seen
> via the Elasticsearch API):
>
> {
> "graylog-internal" : {
> "order" : 0,
> "template" : "graylog2_*",
> "settings" : { },
> "mappings" : {
> "message" : {
> "_source" : {
> "compress" : true,
> "enabled" : true
> },
> "dynamic_templates" : [ {
> "internal_fields" : {
> "mapping" : {
> "index" : "not_analyzed",
> "doc_values" : true
> },
> "match" : "gl2_*"
> }
> }, {
> "store_generic" : {
> "mapping" : {
> "index" : "not_analyzed"
> },
> "match" : "*"
> }
> } ],
> "_ttl" : {
> "enabled" : true
> },
> "properties" : {
> "message" : {
> "index" : "analyzed",
> "analyzer" : "whitespace",
> "type" : "string"
> },
> "timestamp" : {
> "format" : "yyyy-MM-dd HH:mm:ss.SSS",
> "doc_values" : true,
> "type" : "date"
> },
> "source" : {
> "index" : "analyzed",
> "analyzer" : "analyzer_keyword",
> "type" : "string"
> },
> "full_message" : {
> "index" : "analyzed",
> "analyzer" : "whitespace",
> "type" : "string"
> }
> }
> }
> },
> "aliases" : { }
> }
> }
>
>
> Here's what my graylog2_3 index currently looks like (as seen via the
> Elasticsearch API):
>
> {
> "graylog2_3" : {
> "aliases" : {
> "graylog2_deflector" : { }
> },
> "mappings" : {
> "message" : {
> "dynamic_templates" : [ {
> "internal_fields" : {
> "mapping" : {
> "index" : "not_analyzed",
> "doc_values" : true
> },
> "match" : "gl2_*"
> }
> }, {
> "store_generic" : {
> "mapping" : {
> "index" : "not_analyzed"
> },
> "match" : "*"
> }
> } ],
> "_ttl" : {
> "enabled" : true
> },
> "_source" : {
> "compress" : true
> },
> "properties" : {
> "full_message" : {
> "type" : "string",
> "analyzer" : "whitespace"
> },
> "gl2_remote_ip" : {
> "type" : "string",
> "index" : "not_analyzed",
> "doc_values" : true
> },
> "gl2_remote_port" : {
> "type" : "long",
> "doc_values" : true
> },
> "gl2_source_collector" : {
> "type" : "string",
> "index" : "not_analyzed",
> "doc_values" : true
> },
> "gl2_source_collector_input" : {
> "type" : "string",
> "index" : "not_analyzed",
> "doc_values" : true
> },
> "gl2_source_input" : {
> "type" : "string",
> "index" : "not_analyzed",
> "doc_values" : true
> },
> "gl2_source_node" : {
> "type" : "string",
> "index" : "not_analyzed",
> "doc_values" : true
> },
> "level" : {
> "type" : "string",
> "index" : "not_analyzed"
> },
> "message" : {
> "type" : "string",
> "analyzer" : "whitespace"
> },
> "source" : {
> "type" : "string",
> "analyzer" : "analyzer_keyword"
> },
> "source_file" : {
> "type" : "string",
> "index" : "not_analyzed"
> },
> "timestamp" : {
> "type" : "date",
> "doc_values" : true,
> "format" : "yyyy-MM-dd HH:mm:ss.SSS"
> },
> "version" : {
> "type" : "string",
> "index" : "not_analyzed"
> }
> }
> }
> },
> "settings" : {
> "index" : {
> "creation_date" : "1462197971182",
> "uuid" : "ylBuS8y3SBKRYMyLuMWApg",
> "analysis" : {
> "analyzer" : {
> "analyzer_keyword" : {
> "filter" : "lowercase",
> "tokenizer" : "keyword"
> }
> }
> },
> "number_of_replicas" : "0",
> "number_of_shards" : "4",
> "version" : {
> "created" : "1070399"
> }
> }
> },
> "warmers" : { }
> }
> }
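One way to check whether the template actually reached the new index is to compare the analyzer configured per field in the template against the index mapping. The following is a minimal Python sketch under stated assumptions: the two dicts are abbreviated by hand from the `properties` sections quoted above, and the helper names `field_analyzers` and `analyzer_mismatches` are made up for illustration, not part of Graylog or Elasticsearch.

```python
# Abbreviated copies of the "properties" sections quoted above
# (template "graylog-internal" and index "graylog2_3").
template_props = {
    "message": {"index": "analyzed", "analyzer": "whitespace", "type": "string"},
    "full_message": {"index": "analyzed", "analyzer": "whitespace", "type": "string"},
    "source": {"index": "analyzed", "analyzer": "analyzer_keyword", "type": "string"},
}
index_props = {
    "message": {"type": "string", "analyzer": "whitespace"},
    "full_message": {"type": "string", "analyzer": "whitespace"},
    "source": {"type": "string", "analyzer": "analyzer_keyword"},
}


def field_analyzers(props):
    """Map each field name to its configured analyzer (None if unset)."""
    return {name: spec.get("analyzer") for name, spec in props.items()}


def analyzer_mismatches(template, index):
    """Return template fields whose analyzer differs in the index mapping."""
    t, i = field_analyzers(template), field_analyzers(index)
    return {f: (t[f], i.get(f)) for f in t if t[f] != i.get(f)}


mismatches = analyzer_mismatches(template_props, index_props)
print(mismatches)
```

An empty result would suggest that, for these three fields, the mapping of graylog2_3 matches the template.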
>
>
> After cycling the deflector so that it points to the new index,
> graylog2_3, I proceeded to delete my old indices.
>
> Using the Graylog API browser, I tried to tokenize a random string (This
> is a $test:[to.see.if graylog() work$.):
>
>
> http://vtor-lx-tomcat-d01:12900/messages/graylog2_3/analyze?string=This%20is%20a%20%24test%3A%5Bto.see.if%20graylog()%20work%24%5D.&pretty=true
>
> {
> "tokens" : [ "this", "is", "a", "test", "to.see.if", "graylog", "work" ]
> }
>
>
> This makes sense because if I attempt to tokenize the same string via
> Elasticsearch (using the same index), I get the same result:
>
> curl 'vtor-lx-tomcat-d01:9200/graylog2_3/_analyze?pretty=true' -d 'This is a $test:[to.see.if graylog() work$.'
>
> {
> "tokens" : [ {
> "token" : "this",
> "start_offset" : 0,
> "end_offset" : 4,
> "type" : "<ALPHANUM>",
> "position" : 1
> }, {
> "token" : "is",
> "start_offset" : 5,
> "end_offset" : 7,
> "type" : "<ALPHANUM>",
> "position" : 2
> }, {
> "token" : "a",
> "start_offset" : 8,
> "end_offset" : 9,
> "type" : "<ALPHANUM>",
> "position" : 3
> }, {
> "token" : "test",
> "start_offset" : 11,
> "end_offset" : 15,
> "type" : "<ALPHANUM>",
> "position" : 4
> }, {
> "token" : "to.see.if",
> "start_offset" : 17,
> "end_offset" : 26,
> "type" : "<ALPHANUM>",
> "position" : 5
> }, {
> "token" : "graylog",
> "start_offset" : 27,
> "end_offset" : 34,
> "type" : "<ALPHANUM>",
> "position" : 6
> }, {
> "token" : "work",
> "start_offset" : 37,
> "end_offset" : 41,
> "type" : "<ALPHANUM>",
> "position" : 7
> } ]
> }
>
> However, without specifying the index in Elasticsearch, I get the result
> that I am looking for:
>
> curl 'vtor-lx-tomcat-d01:9200/_analyze?analyzer=whitespace&pretty=true' -d 'This is a $test:[to.see.if graylog() work$.'
>
> {
> "tokens" : [ {
> "token" : "This",
> "start_offset" : 0,
> "end_offset" : 4,
> "type" : "word",
> "position" : 1
> }, {
> "token" : "is",
> "start_offset" : 5,
> "end_offset" : 7,
> "type" : "word",
> "position" : 2
> }, {
> "token" : "a",
> "start_offset" : 8,
> "end_offset" : 9,
> "type" : "word",
> "position" : 3
> }, {
> "token" : "$test:[to.see.if",
> "start_offset" : 10,
> "end_offset" : 26,
> "type" : "word",
> "position" : 4
> }, {
> "token" : "graylog()",
> "start_offset" : 27,
> "end_offset" : 36,
> "type" : "word",
> "position" : 5
> }, {
> "token" : "work$.",
> "start_offset" : 37,
> "end_offset" : 43,
> "type" : "word",
> "position" : 6
> } ]
> }
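For reference, the token list above can be reproduced offline. This is a minimal Python sketch that only approximates the whitespace analyzer by splitting on runs of whitespace; it is not Elasticsearch's implementation, and the function name `whitespace_tokens` is made up for illustration.

```python
import re


def whitespace_tokens(text):
    """Split on runs of whitespace, keeping character offsets for each token."""
    return [
        {"token": m.group(), "start_offset": m.start(), "end_offset": m.end()}
        for m in re.finditer(r"\S+", text)
    ]


sample = "This is a $test:[to.see.if graylog() work$."
tokens = whitespace_tokens(sample)
for t in tokens:
    print(t)
```

Case and punctuation are left untouched here, which is the behaviour expected for the "message" field once the whitespace analyzer is really in effect.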
>
> I feel like I am really close to an answer here. It appears that there is
> something wrong with my index mapping/settings.
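One thing that may be worth ruling out: according to the Elasticsearch 1.x documentation, `_analyze` run against an index without a `field` or `analyzer` parameter uses the index's default analyzer, not the analyzer mapped to a particular field, while the `field` parameter selects the analyzer from that field's mapping. The Python sketch below only builds the two URLs so they can be compared side by side; it sends nothing, and `analyze_url` is a hypothetical helper. Whether the Graylog `/messages/{index}/analyze` endpoint behaves the same way is not confirmed here.

```python
from urllib.parse import urlencode

ES_BASE = "http://vtor-lx-tomcat-d01:9200"  # host taken from the commands above


def analyze_url(index, field=None, analyzer=None):
    """Build an _analyze URL for an index, optionally pinned to a field or analyzer."""
    params = {}
    if field:
        params["field"] = field
    if analyzer:
        params["analyzer"] = analyzer
    params["pretty"] = "true"
    return f"{ES_BASE}/{index}/_analyze?{urlencode(params)}"


default_url = analyze_url("graylog2_3")                 # index default analyzer
field_url = analyze_url("graylog2_3", field="message")  # analyzer mapped to "message"
print(default_url)
print(field_url)
```

If the second URL returns the whitespace-style tokens, the mapping is probably fine and only the test request was using a different analyzer.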
>
> Sincerely,
>
> On Tuesday, May 3, 2016 at 3:51:49 AM UTC-4, Jochen Schalanda wrote:
>>
>> Hi Dilip,
>>
>> are you 100% sure that the message is in a new index, that the index
>> template/mapping was properly applied (see
>> https://www.elastic.co/guide/en/elasticsearch/reference/1.7/indices-get-mapping.html),
>> and that it is the "message" field you were looking for (and not
>> "full_message" or another field)?
>>
>> Cheers,
>> Jochen
>>
>> On Monday, 2 May 2016 18:57:40 UTC+2, Dilip Muthukrishnan wrote:
>>>
>>> Hi Jochen,
>>>
>>> Thanks for your reply. I'm using graylog-1.3.4 (server). I removed and
>>> added an updated version of the "graylog-internal" template and then cycled
>>> the deflector through the web interface. The new index mapping reflects
>>> the changes:
>>>
>>> "message" : {
>>> "type" : "string",
>>> "analyzer" : "whitespace"
>>> }
>>>
>>>
>>> However, it doesn't appear to be reflected in the search. This message
>>> is from the latest index but based on this tokenization, it appears to
>>> still be using the old "standard analyzer":
>>>
>>> 02.05.2016 12:47:33.488 *ERROR* [Shell Script Executor Thread for
>>> cpu.sh] com.day.crx.core.CRXSessionImpl session# 144563 opened (103)
>>> java.lang.Exception: Stack Trace
>>> at com.day.crx.core.CRXSessionImpl$Tracker.open(CRXSessionImpl.java:212)
>>> at com.day.crx.core.CRXSessionImpl$Tracker.<init>(CRXSessionImpl.java:205)
>>> at com.day.crx.core.CRXSessionImpl.<init>(CRXSessionImpl.java:179)
>>> at com.day.crx.core.CRXRepositoryImpl.createSessionInstance(CRXRepositoryImpl.java:911)
>>> at org.apache.jackrabbit.core.RepositoryImpl.createSession(RepositoryImpl.java:959)
>>> at org.apache.jackrabbit.core.SessionFactory.createAdminSession(SessionFactory.java:42)
>>> at com.day.crx.sling.server.impl.SlingRepositoryWrapper.loginAdministrative(SlingRepositoryWrapper.java:76)
>>> at com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl.extractScript(ShellScriptExecutorImpl.java:161)
>>> at com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl.execute(ShellScriptExecutorImpl.java:114)
>>> at com.adobe.granite.monitoring.impl.ScriptMBean.invoke(ScriptMBean.java:99)
>>> at com.adobe.granite.monitoring.impl.ScriptMBean.invoke(ScriptMBean.java:158)
>>> at com.adobe.granite.monitoring.impl.ScriptConfigImpl$ExecutionThread.run(ScriptConfigImpl.java:208)
>>> at java.lang.Thread.run(Thread.java:662)
>>>
>>>
>>> Field terms: 02.05.2016124733.488errorshellscriptexecutorthreadforcpu.sh
>>> com.day.crx.core.crxsessionimplsession144563opened103java.lang.exception
>>> stacktraceattracker.opencrxsessionimpl.java212trackerinit205179
>>> com.day.crx.core.crxrepositoryimpl.createsessioninstance
>>> crxrepositoryimpl.java911
>>> org.apache.jackrabbit.core.repositoryimpl.createsession
>>> repositoryimpl.java959
>>> org.apache.jackrabbit.core.sessionfactory.createadminsession
>>> sessionfactory.java42
>>> com.day.crx.sling.server.impl.slingrepositorywrapper.loginadministrative
>>> slingrepositorywrapper.java76
>>> com.adobe.granite.monitoring.impl.shellscriptexecutorimpl.extractscript
>>> shellscriptexecutorimpl.java161
>>> com.adobe.granite.monitoring.impl.shellscriptexecutorimpl.execute114
>>> com.adobe.granite.monitoring.impl.scriptmbean.invokescriptmbean.java99
>>> 158com.adobe.granite.monitoring.impl.scriptconfigimplexecutionthread.run
>>> scriptconfigimpl.java208java.lang.thread.runthread.java662
>>>
>>> As you can see, it has been stripped of various characters like colons
>>> and parentheses.
>>>
>>>
>>> On Monday, May 2, 2016 at 12:36:38 PM UTC-4, Jochen Schalanda wrote:
>>>>
>>>> Hi Dilip,
>>>>
>>>> the index mapping of Graylog is applied by means of an index
>>>> template. In Graylog 2.0.0, the index template will be updated
>>>> automatically, but in older versions you'll have to remove the index
>>>> template yourself so that Graylog recreates it.
>>>>
>>>> See
>>>> https://www.elastic.co/guide/en/elasticsearch/reference/1.7/indices-templates.html
>>>>
>>>> for details.
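Removing the template goes through the Elasticsearch index templates API (`DELETE /_template/<name>`). The following is a minimal Python sketch that prepares such a request without sending it; the base URL `http://localhost:9200` is an assumption, the template name "graylog-internal" is the one used elsewhere in this thread, and `delete_template_request` is a hypothetical helper.

```python
from urllib.request import Request


def delete_template_request(base_url, name):
    """Prepare (but do not send) a DELETE request for an index template."""
    return Request(f"{base_url}/_template/{name}", method="DELETE")


# Assumed host and port; adjust to the actual Elasticsearch node.
req = delete_template_request("http://localhost:9200", "graylog-internal")
print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`. Indices that already exist keep their old mapping, so the deflector still has to be cycled afterwards.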
>>>>
>>>> Cheers,
>>>> Jochen
>>>>
>>>> On Thursday, 28 April 2016 21:42:23 UTC+2, Dilip Muthukrishnan wrote:
>>>>>
>>>>> I'm trying to change the analyzer from "standard" to "whitespace".
>>>>> I've set the following property in my Graylog server configuration:
>>>>>
>>>>> elasticsearch_analyzer = whitespace
>>>>>
>>>>> It states that my change will be applied to new indices so I manually
>>>>> cycled the deflector so that it is now pointing to graylog2_1 (previously
>>>>> graylog2_0). However, the new index still uses the "standard" analyzer
>>>>> based on the mapping in Elasticsearch:
>>>>>
>>>>> "message" : {
>>>>> "type" : "string",
>>>>> "analyzer" : "standard"
>>>>> },
>>>>>
>>>>>
>>>>> How do I change the analyzer?
>>>>>
>>>>>