Thanks, I'll have a look. For me the default template created each field
as a multi-field, with the regular analysed field and an additional "raw"
un-analysed field. I'm extracting quite a lot of fields from the different
log types, which is something I was doing in Splunk before trying
Elasticsearch.
"Alert_Level" : {
  "type" : "multi_field",
  "fields" : {
    "Alert_Level" : {
      "type" : "string",
      "omit_norms" : true
    },
    "raw" : {
      "type" : "string",
      "index" : "not_analyzed",
      "omit_norms" : true,
      "index_options" : "docs",
      "include_in_all" : false,
      "ignore_above" : 256
    }
  }
},
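The "raw" sub-field is the one you'd point a topN query at, since its values aren't tokenized into words. A minimal sketch of a terms facet against it, assuming a daily logstash index and the facets API of this Elasticsearch era (host, index name and field are placeholders for your setup):

```shell
# Count the top alert levels using the un-analysed "raw" sub-field,
# so "Alert_Level" values are counted whole rather than word-by-word.
curl -XPOST 'http://localhost:9200/logstash-2014.03.19/_search?pretty' -d '{
  "size": 0,
  "facets": {
    "top_alert_levels": {
      "terms": { "field": "Alert_Level.raw", "size": 10 }
    }
  }
}'
```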
I created a new default template in elasticsearch:
curl -XPUT 'http://localhost:9200/_template/template_logstash/' -d '{
  "template": "logstash-*",
  "settings": {
    "index.store.compress.stored": true
  },
  "mappings": {
    "_default_": {
      "_source": { "compress": "true" },
      "_all" : {
        "enabled" : false
      }
    }
  }
}'
This has applied, but the compression doesn't seem to do much. I'm at the
point where I might only be able to store a limited amount of data in
elasticsearch :(
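One way to check whether the settings are actually shrinking anything is to compare the on-disk store size before and after the change; a sketch using the indices stats API, assuming the default host and port:

```shell
# Report store size (and other stats) for each daily index, so the
# effect of compression on new indices can be compared against old ones.
curl -XGET 'http://localhost:9200/logstash-*/_stats?pretty'
```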
Chris
On Wednesday, March 19, 2014 7:37:41 PM UTC, Joshua Garnett wrote:
>
> Chris,
>
> Yeah, digging into the templates was another big win for me. For instance,
> if you try to do a topN query on signature with the default template, you
> end up with words like "the" and "and" as your top hits. Setting signature
> to not_analyzed ensures the field isn't tokenized. Below is my template.
>
> --Josh
>
> Logstash settings:
>
> output {
>   elasticsearch {
>     host => "10.0.0.1"
>     cluster => "mycluster"
>     index => "logstash-ossec-%{+YYYY.MM.dd}"
>     index_type => "ossec"
>     template_name => "template-ossec"
>     template => "/etc/logstash/elasticsearch_template.json"
>     template_overwrite => true
>   }
> }
>
> elasticsearch_template.json
>
> {
>   "template":"logstash-ossec-*",
>   "settings":{
>     "index.analysis.analyzer.default.stopwords":"_none_",
>     "index.refresh_interval":"5s",
>     "index.analysis.analyzer.default.type":"standard"
>   },
>   "mappings":{
>     "ossec":{
>       "properties":{
>         "@fields.hostname":{ "type":"string", "index":"not_analyzed" },
>         "@fields.product":{ "type":"string", "index":"not_analyzed" },
>         "@message":{ "type":"string", "index":"not_analyzed" },
>         "@timestamp":{ "type":"date" },
>         "@version":{ "type":"string", "index":"not_analyzed" },
>         "acct":{ "type":"string", "index":"not_analyzed" },
>         "ossec_group":{ "type":"string", "index":"not_analyzed" },
>         "ossec_server":{ "type":"string", "index":"not_analyzed" },
>         "raw_message":{ "type":"string", "index":"analyzed" },
>         "reporting_ip":{ "type":"string", "index":"not_analyzed" },
>         "reporting_source":{ "type":"string", "index":"analyzed" },
>         "rule_number":{ "type":"integer" },
>         "severity":{ "type":"integer" },
>         "signature":{ "type":"string", "index":"not_analyzed" },
>         "src_ip":{ "type":"string", "index":"not_analyzed" },
>         "geoip":{
>           "type":"object",
>           "dynamic":true,
>           "path":"full",
>           "properties":{
>             "location":{ "type":"geo_point" }
>           }
>         }
>       },
>       "_all":{
>         "enabled":true
>       }
>     }
>   }
> }
>
>
> On Wed, Mar 19, 2014 at 10:54 AM, Chris H <[email protected]> wrote:
>
>> Hi, Joshua.
>>
>> I'm using a very similar technique. Are you applying a mapping template,
>> or using the default? I'm using the default automatic templates, because
>> frankly I don't fully understand templates. What this means though is that
>> my daily indexes are larger than the uncompressed alerts.log, between 2-4GB
>> per day, and I'm rapidly running out of disk space. I gather that this can
>> be optimised by enabling compression and excluding the _source and _all
>> fields through the mapping template, but I'm not sure exactly how this
>> works. Just wondered if you've come across the same problem.
>>
>> Thanks.
>>
>>
>> On Saturday, March 8, 2014 10:02:35 PM UTC, Joshua Garnett wrote:
>>>
>>> All,
>>>
>>> I'll probably write a blog post on this, but I wanted to share some work
>>> I've done today.
>>> http://vichargrave.com/ossec-log-management-with-elasticsearch/ shows how
>>> to use OSSEC's syslog output to route messages to Elasticsearch. The
>>> problem with this method is that it uses UDP, which is by definition
>>> unreliable even when sending packets to a local process. Garbage
>>> collections and other system events can cause packets to be lost. I've
>>> found it tends to cap out at around 1,500 messages per minute.
>>>
>>> To address this issue I've put together a logstash config that will read
>>> the alerts from /var/ossec/logs/alerts/alerts.log. On top of solving
>>> the reliability issue, it also fixes issues with multi-lines being lost,
>>> and adds geoip lookups for the src_ip. I tested it against approximately
>>> 1GB of alerts (3M events).
>>>
>>> input {
>>>   file {
>>>     type => "ossec"
>>>     path => "/var/ossec/logs/alerts/alerts.log"
>>>     sincedb_path => "/opt/logstash/"
>>>     codec => multiline {
>>>       pattern => "^\*\*"
>>>       negate => true
>>>       what => "previous"
>>>     }
>>>   }
>>> }
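For anyone unsure what the multiline codec is doing here: with pattern "^\*\*", negate => true and what => "previous", any line that does not start with ** is appended to the previous event, so each ** Alert header begins a new document. A rough Python illustration of that grouping logic (not logstash code, just a sketch):

```python
import re

def group_events(lines):
    """Group raw log lines into events, multiline-codec style: a line
    starting with "**" opens a new event, anything else is glued onto
    the previous event with a newline."""
    events = []
    for line in lines:
        if re.match(r"^\*\*", line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events
```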
>>>
>>> filter {
>>>   if [type] == "ossec" {
>>>     # Parse the header of the alert
>>>     grok {
>>>       # Matches 2014 Mar 08 00:57:49 (some.server.com) 10.1.2.3->ossec
>>>       # (?m) fixes issues with multi-lines, see https://logstash.jira.com/browse/LOGSTASH-509
>>>       match => ["message", "(?m)\*\* Alert %{DATA:timestamp_seconds}:%{SPACE}%{WORD}?%{SPACE}\- %{DATA:ossec_group}\n%{YEAR} %{SYSLOGTIMESTAMP:syslog_timestamp} \(%{DATA:reporting_host}\) %{IP:reporting_ip}\-\>%{DATA:reporting_source}\nRule: %{NONNEGINT:rule_number} \(level %{NONNEGINT:severity}\) \-\> '%{DATA:signature}'\n%{GREEDYDATA:remaining_message}"]
>>>
>>>       # Matches 2014 Mar 08 00:00:00 ossec-server01->/var/log/auth.log
>>>       match => ["message", "(?m)\*\* Alert %{DATA:timestamp_seconds}:%{SPACE}%{WORD}?%{SPACE}\- %{DATA:ossec_group}\n%{YEAR} %{SYSLOGTIMESTAMP:syslog_timestamp} %{DATA:reporting_host}\-\>%{DATA:reporting_source}\nRule: %{NONNEGINT:rule_number} \(level %{NONNEGINT:severity}\) \-\> '%{DATA:signature}'\n%{GREEDYDATA:remaining_message}"]
>>>     }
>>>
>>>     # Attempt to parse additional data from the alert
>>>     grok {
>>>       match => ["remaining_message", "(?m)(Src IP: %{IP:src_ip}%{SPACE})?(Src Port: %{NONNEGINT:src_port}%{SPACE})?(Dst IP: %{IP:dst_ip}%{SPACE})?(Dst Port: %{NONNEGINT:dst_port}%{SPACE})?(User: %{USER:acct}%{SPACE})?%{GREEDYDATA:real_message}"]
>>>     }
>>>
>>>     geoip {
>>>       source => "src_ip"
>>>     }
>>>
>>>     mutate {
>>>       convert => [ "severity", "integer" ]
>>>       replace => [ "@message", "%{real_message}" ]
>>>       replace => [ "@fields.hostname", "%{reporting_host}" ]
>>>       add_field => [ "@fields.product", "ossec" ]
>>>       add_field => [ "raw_message", "%{message}" ]
>>>       add_field => [ "ossec_server", "%{host}" ]
>>>       remove_field => [ "type", "syslog_program", "syslog_timestamp", "reporting_host", "message", "timestamp_seconds", "real_message", "remaining_message", "path", "host", "tags" ]
>>>     }
>>>   }
>>> }
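As a sanity check on the first grok match, here is a rough Python translation of the header pattern, run against the first sample alert shown further down. The named groups mirror the grok field names, but the simplified expressions here are only approximations of the real grok definitions (%{DATA}, %{SYSLOGTIMESTAMP}, etc. are more permissive):

```python
import re

# Simplified Python equivalent of the first grok "match" above.
# (?s) lets ".*?" cross newlines, similar to grok's (?m) usage here.
HEADER = re.compile(
    r"(?s)\*\* Alert (?P<timestamp_seconds>.*?):\s+\w*\s*\- (?P<ossec_group>.*?)\n"
    r"\d{4} (?P<syslog_timestamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "
    r"\((?P<reporting_host>.*?)\) (?P<reporting_ip>\d+(?:\.\d+){3})->(?P<reporting_source>.*?)\n"
    r"Rule: (?P<rule_number>\d+) \(level (?P<severity>\d+)\) -> "
    r"'(?P<signature>.*?)'\n(?P<remaining_message>.*)"
)

# The sshd example alert from later in this message.
sample = (
    "** Alert 1394240459.2305861: - syslog,sshd,invalid_login,authentication_failed,\n"
    "2014 Mar 08 01:00:59 (someserver.somedomain.com) 10.1.2.3->/var/log/auth.log\n"
    "Rule: 5710 (level 5) -> 'Attempt to login using a non-existent user'\n"
    "Src IP: 112.65.211.164\n"
    "Mar  8 01:00:59 someserver sshd[22874]: Invalid user oracle from 112.65.211.164\n"
)

fields = HEADER.match(sample).groupdict()
```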
>>>
>>> output {
>>>   elasticsearch {
>>>     host => "10.0.0.1"
>>>     cluster => "mycluster"
>>>   }
>>> }
>>>
>>> Here are a few examples of the output this generates.
>>>
>>> {
>>>   "@timestamp":"2014-03-08T20:34:08.847Z",
>>>   "@version":"1",
>>>   "ossec_group":"syslog,sshd,invalid_login,authentication_failed,",
>>>   "reporting_ip":"10.1.2.3",
>>>   "reporting_source":"/var/log/auth.log",
>>>   "rule_number":"5710",
>>>   "severity":5,
>>>   "signature":"Attempt to login using a non-existent user",
>>>   "src_ip":"112.65.211.164",
>>>   "geoip":{
>>>     "ip":"112.65.211.164",
>>>     "country_code2":"CN",
>>>     "country_code3":"CHN",
>>>     "country_name":"China",
>>>     "continent_code":"AS",
>>>     "region_name":"23",
>>>     "city_name":"Shanghai",
>>>     "latitude":31.045600000000007,
>>>     "longitude":121.3997,
>>>     "timezone":"Asia/Shanghai",
>>>     "real_region_name":"Shanghai",
>>>     "location":[ 121.3997, 31.045600000000007 ]
>>>   },
>>>   "@message":"Mar 8 01:00:59 someserver sshd[22874]: Invalid user oracle from 112.65.211.164\n",
>>>   "@fields.hostname":"someserver.somedomain.com",
>>>   "@fields.product":"ossec",
>>>   "raw_message":"** Alert 1394240459.2305861: - syslog,sshd,invalid_login,authentication_failed,\n2014 Mar 08 01:00:59 (someserver.somedomain.com) 10.1.2.3->/var/log/auth.log\nRule: 5710 (level 5) -> 'Attempt to login using a non-existent user'\nSrc IP: 112.65.211.164\nMar 8 01:00:59 someserver sshd[22874]: Invalid user oracle from 112.65.211.164\n",
>>>   "ossec_server":"ossec-server.somedomain.com"
>>> }
>>>
>>> and
>>>
>>> {
>>>   "@timestamp":"2014-03-08T21:15:23.278Z",
>>>   "@version":"1",
>>>   "ossec_group":"syslog,sudo",
>>>   "reporting_source":"/var/log/auth.log",
>>>   "rule_number":"5402",
>>>   "severity":3,
>>>   "signature":"Successful sudo to ROOT executed",
>>>   "acct":"nagios",
>>>   "@message":"Mar 8 00:00:03 ossec-server sudo: nagios : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/lib/some/command",
>>>   "@fields.hostname":"ossec-server",
>>>   "@fields.product":"ossec",
>>>   "raw_message":"** Alert 1394236804.1451: - syslog,sudo\n2014 Mar 08 00:00:04 ossec-server->/var/log/auth.log\nRule: 5402 (level 3) -> 'Successful sudo to ROOT executed'\nUser: nagios\nMar 8 00:00:03 ossec-server sudo: nagios : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/lib/some/command",
>>>   "ossec_server":"ossec-server.somedomain.com"
>>> }
>>>
>>> If you combine the above with a custom Elasticsearch template, you can
>>> put together some really nice Kibana dashboards.
>>>
>>>
>>> --Josh
>>>
>>>
--
---
You received this message because you are subscribed to the Google Groups
"ossec-list" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.