Hi Jochen,

Here's what my "graylog-internal" template currently looks like (as seen 
via the Elasticsearch API):

{
  "graylog-internal" : {
    "order" : 0,
    "template" : "graylog2_*",
    "settings" : { },
    "mappings" : {
      "message" : {
        "_source" : {
          "compress" : true,
          "enabled" : true
        },
        "dynamic_templates" : [ {
          "internal_fields" : {
            "mapping" : {
              "index" : "not_analyzed",
              "doc_values" : true
            },
            "match" : "gl2_*"
          }
        }, {
          "store_generic" : {
            "mapping" : {
              "index" : "not_analyzed"
            },
            "match" : "*"
          }
        } ],
        "_ttl" : {
          "enabled" : true
        },
        "properties" : {
          "message" : {
            "index" : "analyzed",
            "analyzer" : "whitespace",
            "type" : "string"
          },
          "timestamp" : {
            "format" : "yyyy-MM-dd HH:mm:ss.SSS",
            "doc_values" : true,
            "type" : "date"
          },
          "source" : {
            "index" : "analyzed",
            "analyzer" : "analyzer_keyword",
            "type" : "string"
          },
          "full_message" : {
            "index" : "analyzed",
            "analyzer" : "whitespace",
            "type" : "string"
          }
        }
      }
    },
    "aliases" : { }
  }
}


Here's what my graylog2_3 index currently looks like (as seen via the 
Elasticsearch API):

{
  "graylog2_3" : {
    "aliases" : {
      "graylog2_deflector" : { }
    },
    "mappings" : {
      "message" : {
        "dynamic_templates" : [ {
          "internal_fields" : {
            "mapping" : {
              "index" : "not_analyzed",
              "doc_values" : true
            },
            "match" : "gl2_*"
          }
        }, {
          "store_generic" : {
            "mapping" : {
              "index" : "not_analyzed"
            },
            "match" : "*"
          }
        } ],
        "_ttl" : {
          "enabled" : true
        },
        "_source" : {
          "compress" : true
        },
        "properties" : {
          "full_message" : {
            "type" : "string",
            "analyzer" : "whitespace"
          },
          "gl2_remote_ip" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "gl2_remote_port" : {
            "type" : "long",
            "doc_values" : true
          },
          "gl2_source_collector" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "gl2_source_collector_input" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "gl2_source_input" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "gl2_source_node" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "level" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "message" : {
            "type" : "string",
            "analyzer" : "whitespace"
          },
          "source" : {
            "type" : "string",
            "analyzer" : "analyzer_keyword"
          },
          "source_file" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "timestamp" : {
            "type" : "date",
            "doc_values" : true,
            "format" : "yyyy-MM-dd HH:mm:ss.SSS"
          },
          "version" : {
            "type" : "string",
            "index" : "not_analyzed"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1462197971182",
        "uuid" : "ylBuS8y3SBKRYMyLuMWApg",
        "analysis" : {
          "analyzer" : {
            "analyzer_keyword" : {
              "filter" : "lowercase",
              "tokenizer" : "keyword"
            }
          }
        },
        "number_of_replicas" : "0",
        "number_of_shards" : "4",
        "version" : {
          "created" : "1070399"
        }
      }
    },
    "warmers" : { }
  }
}
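
To compare the two dumps field by field, a small script can list the analyzer configured for each mapped field. This is only a sketch: the `properties` dict is a hand-trimmed copy of the "properties" section of the graylog2_3 output above, and `field_analyzers` is a hypothetical helper name, not part of Graylog or Elasticsearch.

```python
# Hand-trimmed copy of mappings.message.properties from the graylog2_3 dump above.
properties = {
    "full_message": {"type": "string", "analyzer": "whitespace"},
    "level": {"type": "string", "index": "not_analyzed"},
    "message": {"type": "string", "analyzer": "whitespace"},
    "source": {"type": "string", "analyzer": "analyzer_keyword"},
    "timestamp": {"type": "date", "doc_values": True,
                  "format": "yyyy-MM-dd HH:mm:ss.SSS"},
}


def field_analyzers(props):
    """Return {field: analyzer} for every field that names an analyzer."""
    return {name: spec["analyzer"]
            for name, spec in sorted(props.items())
            if "analyzer" in spec}


if __name__ == "__main__":
    for field, analyzer in field_analyzers(properties).items():
        print(f"{field}: {analyzer}")
```

Fields that name no analyzer (the not_analyzed strings and the date field) are skipped, so only "full_message", "message" and "source" are listed.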


After cycling the deflector so that it points to the new index, graylog2_3, 
I proceeded to delete my old indices.

Using the Graylog API browser, I tried to tokenize a random string (This is 
a $test:[to.see.if graylog() work$.):

http://vtor-lx-tomcat-d01:12900/messages/graylog2_3/analyze?string=This%20is%20a%20%24test%3A%5Bto.see.if%20graylog()%20work%24%5D.&pretty=true

{
  "tokens" : [ "this", "is", "a", "test", "to.see.if", "graylog", "work" ]
}


This makes sense because if I attempt to tokenize the same string via 
Elasticsearch (using the same index), I get the same result:

curl 'vtor-lx-tomcat-d01:9200/graylog2_3/_analyze?pretty=true' -d 'This is a $test:[to.see.if graylog() work$.'

{
  "tokens" : [ {
    "token" : "this",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "is",
    "start_offset" : 5,
    "end_offset" : 7,
    "type" : "<ALPHANUM>",
    "position" : 2
  }, {
    "token" : "a",
    "start_offset" : 8,
    "end_offset" : 9,
    "type" : "<ALPHANUM>",
    "position" : 3
  }, {
    "token" : "test",
    "start_offset" : 11,
    "end_offset" : 15,
    "type" : "<ALPHANUM>",
    "position" : 4
  }, {
    "token" : "to.see.if",
    "start_offset" : 17,
    "end_offset" : 26,
    "type" : "<ALPHANUM>",
    "position" : 5
  }, {
    "token" : "graylog",
    "start_offset" : 27,
    "end_offset" : 34,
    "type" : "<ALPHANUM>",
    "position" : 6
  }, {
    "token" : "work",
    "start_offset" : 37,
    "end_offset" : 41,
    "type" : "<ALPHANUM>",
    "position" : 7
  } ]
}

However, if I omit the index and specify the whitespace analyzer explicitly 
in Elasticsearch, I get the result that I am looking for:

curl 'vtor-lx-tomcat-d01:9200/_analyze?analyzer=whitespace&pretty=true' -d 'This is a $test:[to.see.if graylog() work$.'

{
  "tokens" : [ {
    "token" : "This",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "is",
    "start_offset" : 5,
    "end_offset" : 7,
    "type" : "word",
    "position" : 2
  }, {
    "token" : "a",
    "start_offset" : 8,
    "end_offset" : 9,
    "type" : "word",
    "position" : 3
  }, {
    "token" : "$test:[to.see.if",
    "start_offset" : 10,
    "end_offset" : 26,
    "type" : "word",
    "position" : 4
  }, {
    "token" : "graylog()",
    "start_offset" : 27,
    "end_offset" : 36,
    "type" : "word",
    "position" : 5
  }, {
    "token" : "work$.",
    "start_offset" : 37,
    "end_offset" : 43,
    "type" : "word",
    "position" : 6
  } ]
}
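
The token boundaries in this output are what a plain split on whitespace produces. A quick local check, as a sketch only: it mimics that behavior with a regular expression and does not call Elasticsearch, and `whitespace_tokens` is a hypothetical helper name.

```python
import re

# Same test string as in the curl commands above.
TEXT = "This is a $test:[to.see.if graylog() work$."


def whitespace_tokens(text):
    """Split on runs of whitespace, keeping start/end character offsets."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\S+", text)]


if __name__ == "__main__":
    for token, start, end in whitespace_tokens(TEXT):
        print(token, start, end)
```

The tokens and offsets match the whitespace analyzer output above: case and punctuation are preserved, and "$test:[to.see.if" stays a single term.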

I feel like I am really close to an answer here.  It appears that there is 
something wrong with my index mapping/settings.
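
One way to narrow this down might be to analyze the string with the analyzer of the "message" field rather than with whatever the index uses by default. As far as I understand, the Elasticsearch 1.x analyze API accepts a `field` parameter for this. The sketch below only builds the request URLs and does not send them; `analyze_url` is a hypothetical helper, and the host and index names are taken from the commands above.

```python
from urllib.parse import urlencode


def analyze_url(host, index=None, field=None, analyzer=None):
    """Build an Elasticsearch _analyze URL (request is not sent)."""
    path = f"/{index}/_analyze" if index else "/_analyze"
    params = {}
    if field:
        params["field"] = field        # use the analyzer mapped to this field
    if analyzer:
        params["analyzer"] = analyzer  # or name an analyzer explicitly
    params["pretty"] = "true"
    return f"http://{host}{path}?{urlencode(params)}"


if __name__ == "__main__":
    host = "vtor-lx-tomcat-d01:9200"
    print(analyze_url(host, index="graylog2_3", field="message"))
    print(analyze_url(host, analyzer="whitespace"))
```

If the field-specific request returns whitespace-style tokens, the mapping is being applied and the index-level call without a field is simply using a different analyzer.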

Sincerely,

On Tuesday, May 3, 2016 at 3:51:49 AM UTC-4, Jochen Schalanda wrote:
>
> Hi Dilip,
>
> are you 100% sure that the message is in a new index, that the index 
> template/mapping was properly applied (see 
> https://www.elastic.co/guide/en/elasticsearch/reference/1.7/indices-get-mapping.html),
>  
> and that it is the "message" field you were looking for (and not 
> "full_message" or another field)?
>
> Cheers,
> Jochen
>
> On Monday, 2 May 2016 18:57:40 UTC+2, Dilip Muthukrishnan wrote:
>>
>> Hi Jochen,
>>
>> Thanks for your reply.  I'm using graylog-1.3.4 (server).  I removed and 
>> added an updated version of the "graylog-internal" template and then cycled 
>> the deflector through the web interface.  The new index mapping reflects 
>> the changes:
>>
>> "message" : {
>>    "type" : "string",
>>    "analyzer" : "whitespace"
>> }
>>
>>
>> However, it doesn't appear to be reflected in the search.  This message 
>> is from the latest index but based on this tokenization, it appears to 
>> still be using the old "standard analyzer":
>>
>> 02.05.2016 12:47:33.488 *ERROR* [Shell Script Executor Thread for cpu.sh] 
>> com.day.crx.core.CRXSessionImpl session# 144563 opened (103) 
>> java.lang.Exception: Stack Trace at 
>> com.day.crx.core.CRXSessionImpl$Tracker.open(CRXSessionImpl.java:212) at 
>> com.day.crx.core.CRXSessionImpl$Tracker.<init>(CRXSessionImpl.java:205) at 
>> com.day.crx.core.CRXSessionImpl.<init>(CRXSessionImpl.java:179) at 
>> com.day.crx.core.CRXRepositoryImpl.createSessionInstance(CRXRepositoryImpl.java:911)
>>  
>> at 
>> org.apache.jackrabbit.core.RepositoryImpl.createSession(RepositoryImpl.java:959)
>>  
>> at 
>> org.apache.jackrabbit.core.SessionFactory.createAdminSession(SessionFactory.java:42)
>>  
>> at 
>> com.day.crx.sling.server.impl.SlingRepositoryWrapper.loginAdministrative(SlingRepositoryWrapper.java:76)
>>  
>> at 
>> com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl.extractScript(ShellScriptExecutorImpl.java:161)
>>  
>> at 
>> com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl.execute(ShellScriptExecutorImpl.java:114)
>>  
>> at 
>> com.adobe.granite.monitoring.impl.ScriptMBean.invoke(ScriptMBean.java:99) 
>> at 
>> com.adobe.granite.monitoring.impl.ScriptMBean.invoke(ScriptMBean.java:158) 
>> at 
>> com.adobe.granite.monitoring.impl.ScriptConfigImpl$ExecutionThread.run(ScriptConfigImpl.java:208)
>>  
>> at java.lang.Thread.run(Thread.java:662)
>>
>>
>> Field terms: 02.05.2016124733.488errorshellscriptexecutorthreadforcpu.sh
>> com.day.crx.core.crxsessionimplsession144563opened103java.lang.exception
>> stacktraceattracker.opencrxsessionimpl.java212trackerinit205179
>> com.day.crx.core.crxrepositoryimpl.createsessioninstance
>> crxrepositoryimpl.java911
>> org.apache.jackrabbit.core.repositoryimpl.createsession
>> repositoryimpl.java959
>> org.apache.jackrabbit.core.sessionfactory.createadminsession
>> sessionfactory.java42
>> com.day.crx.sling.server.impl.slingrepositorywrapper.loginadministrative
>> slingrepositorywrapper.java76
>> com.adobe.granite.monitoring.impl.shellscriptexecutorimpl.extractscript
>> shellscriptexecutorimpl.java161
>> com.adobe.granite.monitoring.impl.shellscriptexecutorimpl.execute114
>> com.adobe.granite.monitoring.impl.scriptmbean.invokescriptmbean.java99158
>> com.adobe.granite.monitoring.impl.scriptconfigimplexecutionthread.run
>> scriptconfigimpl.java208java.lang.thread.runthread.java662
>>
>> As you can see, it has been stripped of various characters like colons 
>> and parentheses.
>>
>>
>> On Monday, May 2, 2016 at 12:36:38 PM UTC-4, Jochen Schalanda wrote:
>>>
>>> Hi Dilip,
>>>
>>> the index mapping of Graylog is applied by the means of an index 
>>> template. In Graylog 2.0.0, the index template will automatically be 
>>> updated but in older versions you'll have to remove the index template 
>>> yourself for it to be recreated by Graylog.
>>>
>>> See 
>>> https://www.elastic.co/guide/en/elasticsearch/reference/1.7/indices-templates.html
>>>  
>>> for details.
>>>
>>> Cheers,
>>> Jochen
>>>
>>> On Thursday, 28 April 2016 21:42:23 UTC+2, Dilip Muthukrishnan wrote:
>>>>
>>>> I'm trying to change the analyzer from "standard" to "whitespace". 
>>>>  I've set the following property in my Graylog server configuration:
>>>>
>>>> elasticsearch_analyzer = whitespace
>>>>
>>>> It states that my change will be applied to new indices so I manually 
>>>> cycled the deflector so that it is now pointing to graylog2_1 (previously 
>>>> graylog2_0).  However, the new index still uses the "standard" analyzer 
>>>> based on the mapping in Elasticsearch:
>>>>
>>>> "message" : {
>>>>             "type" : "string",
>>>>             "analyzer" : "standard"
>>>>           },
>>>>
>>>>
>>>> How do I change the analyzer?
>>>>
>>>>
