Re: [rsyslog] liblognorm rule for nginx logs

2017-06-13 Thread Bob Gregory
Hi Luv,

We use the following rules:

rule=http:%remote_addr:word% %ident:word% %auth:word% [%timestamp:char-to:]%] "%method:word% %request:word% HTTP/%httpversion:float%" %status:number% %bytes_sent:number% "%referrer:char-to:"%" "%agent:char-to:"%"%blob:rest%

rule=http:%remote_addr:word% %ident:word% %auth:word% [%timestamp:char-to:]%] "%method:word% %request:word% HTTP/%httpversion:float%" %status:number% %bytes_sent:number% "%referrer:char-to:"%" "%agent:char-to:"%"

rule=http: %remote_addr:word% %ident:word% %auth:word% [%timestamp:char-to:]%] "%method:word% %request:word% HTTP/%httpversion:float%" %status:number% %bytes_sent:number% "%referrer:char-to:"%" "%agent:char-to:"%"%blob:rest%

rule=http: %remote_addr:word% %ident:word% %auth:word% [%timestamp:char-to:]%] "%method:word% %request:word% HTTP/%httpversion:float%" %status:number% %bytes_sent:number% "%referrer:char-to:"%" "%agent:char-to:"%"


Our nginx access log format looks like this:

log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
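If you want to check the field boundaries outside rsyslog, here is a rough Python sketch that mirrors the first rule as a regular expression. The regex and the parse_access_line helper are my own illustration and have nothing to do with how liblognorm works internally; the sample line is the one from the question below.

```python
import re

# Regex counterpart of the first rule above. Group names follow the rule's
# field names; the pattern itself is an illustrative assumption.
ACCESS_RE = re.compile(
    r'(?P<remote_addr>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]*)\] '
    r'"(?P<method>\S+) (?P<request>\S+) HTTP/(?P<httpversion>\d+\.\d+)" '
    r'(?P<status>\d+) (?P<bytes_sent>\d+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"(?P<blob>.*)'
)


def parse_access_line(line):
    """Return a dict of fields, or None when the line does not match."""
    match = ACCESS_RE.match(line.strip())
    return match.groupdict() if match else None


sample = ('127.0.0.1 - kibanaadmin [13/Jun/2017:14:18:17 +0530] '
          '"GET /ui/favicons/favicon-32x32.png HTTP/1.1" 304 0 "-" '
          '"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) '
          'Gecko/20100101 Firefox/53.0"')
fields = parse_access_line(sample)
print(fields["timestamp"], fields["status"], fields["request"])
```

The timestamp is captured as "everything up to the closing bracket", which is the same idea as char-to:] in the rules, so the date format itself never has to be parsed.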


On Tue, 13 Jun 2017 at 10:49 Luv via rsyslog wrote:

> I am sending logs to elasticsearch via rsyslog. For the parsing of those
> logs, I am using liblognorm rule.
>
> I want to create fields of nginx logs,
>
> here is a log entry,
>
> 127.0.0.1 - kibanaadmin [13/Jun/2017:14:18:17 +0530] "GET
> /ui/favicons/favicon-32x32.png HTTP/1.1" 304 0 "-" "Mozilla/5.0 (X11;
> Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0"
>
>
> Here is the pattern file,
>
> version=2
>
> rule=:%clientip:ipv4% - %user:word% [%timestamp:char-to:]%] %auth:word%
> "%verb:alpha% %request:word%" %response:number% %bytes:number%
> "%referrer:word"%" "%agent:char-to:{"extradata":"("}"
>
> The reason for parsefailure is I believe due to the date-time format.
>
> Can somebody help in creating a rule for parsing nginx logs ?
>
>
>
>
> --
> View this message in context:
> http://rsyslog-users.1305293.n2.nabble.com/liblognorm-rule-for-nginx-logs-tp7592454.html
> Sent from the rsyslog-users mailing list archive at Nabble.com.
> ___
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> DON'T LIKE THAT.
>


Re: [rsyslog] Monitoring with rsyslog and riemann

2017-05-29 Thread Bob Gregory
Rumours of my death have been greatly exaggerated. I've been stuck writing
Javascript for a while, but I'm back to the REK stack and omriemann.

I got dynstats working pretty quickly by copy/pasting the impstats code.
I'm not proud.

http://io.made.com/blog/shipping-dynstats-metrics-with-omriemann/

Next up I want to implement custom json metrics, so that when a developer
logs something like:

{
  "_riemann_metric": {
    "service": "api.http_response_time",
    "fields": {
      "endpoint": "login"
    }
  }
}

we can parse it out and forward it into our metric stack.
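To make the idea concrete, here is a small Python sketch of the extraction step. It is only an illustration of the intent, not omriemann code: the extract_riemann_metric name is made up, and it assumes "fields" is a JSON object.

```python
import json


def extract_riemann_metric(log_line):
    """Return the "_riemann_metric" object from a JSON log line, or None."""
    try:
        record = json.loads(log_line)
    except ValueError:
        return None  # not JSON, so there is nothing to forward
    if not isinstance(record, dict):
        return None
    metric = record.get("_riemann_metric")
    return metric if isinstance(metric, dict) else None


line = ('{"_riemann_metric": {"service": "api.http_response_time", '
        '"fields": {"endpoint": "login"}}}')
print(extract_riemann_metric(line))
```

Lines that are not JSON, or that carry no "_riemann_metric" key, simply yield None and would be left to the normal logging path.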

That's the missing feature that will let me completely replace Logstash
with Rsyslog in our logging stack.

-- B


On Fri, 30 Dec 2016 at 20:52 David Lang <da...@lang.hm> wrote:

> On Fri, 30 Dec 2016, Bob Gregory wrote:
>
> > The omriemann module is slowly coming together. I want to support a
> > richer json format for advanced use cases, but impstats works, and I can
> > send events to riemann with arbitrary message properties like so:
> >
> > action(type="omriemann"
> >  # global and local properties resolve
> >  host="hostname"
> >  # and so do json properties
> >  metric="!my_metric_field"
> >  # if we can't resolve a property for a config value, we treat it as a literal.
> >  ttl="90" service="my-metric")
> >
> > I wrote a quick blog entry covering one simple use case - mostolog's log
> > explosion
> > https://io.made.com/blog/monitoring-with-riemann-and-rsyslog-part-1/
> >
> > All feedback is welcomed.
>
> also take a look at what dyn_stats generate, I expect that you would end up
> using that a lot.
>
> David Lang


Re: [rsyslog] (Omriemann) Segfault in ISOBJ_TYPE_assert ?

2016-12-23 Thread Bob Gregory
:facepalm:

That's it, thanks Rainer - merry Xmas, and I'll hopefully have a prototype
ready before our festive hangovers have worn off.

 -- B

On Fri, 23 Dec 2016 at 16:20 Rainer Gerhards <rgerha...@hq.adiscon.com>
wrote:

> 2016-12-23 16:14 GMT+01:00 Bob Gregory <bob.greg...@made.com>:
>
> > Hey all,
> >
> > I'm working on the riemann module, and I've hit a problem.
> >
> > I'm trying to use MsgGetProp to fetch data from the incoming message, in
> > the same way as ommongodb, but I'm getting a segfault during the obj_type
> > check.
> >
> >
> > BEGINdoAction_NoStrings
> > CODESTARTdoAction
> >   uchar* propVal;
> >   msgPropDescr_t propinfo;
> >
> >   short unsigned fieldfree;
> >   rs_size_t fieldlen;
> >
> >   pthread_mutex_lock();
> >
> >   propinfo.id = PROP_HOSTNAME;
> >
> >   dbgprintf("ho ho. reproduce\n");
> >   // explodes here.
> >   propVal = MsgGetProp(pMsgData, NULL, , , ,
> > NULL);
> >
> >   dbgprintf("got it\n");
> >   finalize_it:
> > pthread_mutex_unlock();
> > ENDdoAction
> >
> >
> > Narrowing it down further:
> >
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib64/libthread_db.so.1".
> > Core was generated by `rsyslogd -dn -f test-conf/rsyslog-segfault.conf'.
> > Program terminated with signal 11, Segmentation fault.
> > #0  0x0043b263 in GetString (pThis=0x69d3ac ,
> > ppsz=0x7fdf7ef3b898, plen=0x7fdf7ef3b894) at prop.c:108
> > 108 ISOBJ_TYPE_assert(pThis, prop);
> >
> >
> > (gdb) backtrace
> > #0  0x0043b263 in GetString (pThis=0x69d3ac ,
> > ppsz=0x7fdf7ef3b898, plen=0x7fdf7ef3b894) at prop.c:108
> > #1  0x0041a454 in getHOSTNAME (pM=0x7fdf7ef3b9f0) at msg.c:2489
> > #2  0x0041c7b8 in MsgGetProp (pMsg=pMsg@entry=0x7fdf7ef3b9f0,
> > pTpe=pTpe@entry=0x0, pProp=pProp@entry=0x7fdf7ef3b9b0,
> > pPropLen=pPropLen@entry=0x7fdf7ef3b9ac,
> > pbMustBeFreed=pbMustBeFreed@entry=0x7fdf7ef3b9aa,
> > ttNow=ttNow@entry=0x0)
> > at msg.c:3408
> > #3  0x7fdf828c97f9 in doAction (pMsgData=0x7fdf7ef3b9f0,
> > pWrkrData=0x7fdf70002080) at omriemann.c:179
> > #4  0x00440c54 in actionCallDoAction (pWti=0x8644d0,
> > iparams=, pThis=0x861d30) at ../action.c:1131
> >
> >
> > Any idea how I can start debugging this? It seems like the prop_t that's
> > setup by resolvedns is not properly initialized?
> >
>
> I am almost out for holiday break. I tried to check the code and also think
> this has something to do with the hostname property. However, I wonder why
> this occurs. Does the same problem occur if you write a file? I ask because
> the property is obviously the same. This should be one of the best-tested
> areas of rsyslog, so I am really puzzled...
>
> oh wait, I see you do not dereference pMsgData. That's an array, and you
> need to access the proper message pointers. This is what ommongodb does:
>
> doc = getDefaultBSON(*(smsg_t**)pMsgData);
>
> Rainer


[rsyslog] (Omriemann) Segfault in ISOBJ_TYPE_assert ?

2016-12-23 Thread Bob Gregory
Hey all,

I'm working on the riemann module, and I've hit a problem.

I'm trying to use MsgGetProp to fetch data from the incoming message, in
the same way as ommongodb, but I'm getting a segfault during the obj_type
check.


BEGINdoAction_NoStrings
CODESTARTdoAction
  uchar* propVal;
  msgPropDescr_t propinfo;

  short unsigned fieldfree;
  rs_size_t fieldlen;

  pthread_mutex_lock();

  propinfo.id = PROP_HOSTNAME;

  dbgprintf("ho ho. reproduce\n");
  // explodes here.
  propVal = MsgGetProp(pMsgData, NULL, , , ,
NULL);

  dbgprintf("got it\n");
  finalize_it:
pthread_mutex_unlock();
ENDdoAction


Narrowing it down further:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `rsyslogd -dn -f test-conf/rsyslog-segfault.conf'.
Program terminated with signal 11, Segmentation fault.
#0  0x0043b263 in GetString (pThis=0x69d3ac ,
ppsz=0x7fdf7ef3b898, plen=0x7fdf7ef3b894) at prop.c:108
108 ISOBJ_TYPE_assert(pThis, prop);


(gdb) backtrace
#0  0x0043b263 in GetString (pThis=0x69d3ac ,
ppsz=0x7fdf7ef3b898, plen=0x7fdf7ef3b894) at prop.c:108
#1  0x0041a454 in getHOSTNAME (pM=0x7fdf7ef3b9f0) at msg.c:2489
#2  0x0041c7b8 in MsgGetProp (pMsg=pMsg@entry=0x7fdf7ef3b9f0,
pTpe=pTpe@entry=0x0, pProp=pProp@entry=0x7fdf7ef3b9b0,
pPropLen=pPropLen@entry=0x7fdf7ef3b9ac,
pbMustBeFreed=pbMustBeFreed@entry=0x7fdf7ef3b9aa,
ttNow=ttNow@entry=0x0)
at msg.c:3408
#3  0x7fdf828c97f9 in doAction (pMsgData=0x7fdf7ef3b9f0,
pWrkrData=0x7fdf70002080) at omriemann.c:179
#4  0x00440c54 in actionCallDoAction (pWti=0x8644d0,
iparams=, pThis=0x861d30) at ../action.c:1131


Any idea how I can start debugging this? It seems like the prop_t that's
setup by resolvedns is not properly initialized?

 -- Bob.


Re: [rsyslog] omriemann configuration (was musing on ERK stack)

2016-12-05 Thread Bob Gregory
So internally, Riemann just creates a Clojure hashmap for each event, e.g.
{host: blah, metric: foo, ttl: 60, ... }.

It holds a snapshot of recent events in memory, and it indexes certain
fields - host, service etc.

You can add whatever additional attributes you like, because riemann will
just add them to the map. They operate a little slower than the built-in
fields, but you can work with them in the same way:
http://riemann.io/howto.html#custom-event-attributes
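
As a rough picture of what that means, here is a Python sketch that treats an event as a plain map and separates the built-in fields from custom attributes. The BUILTIN_FIELDS set is my reading of the usual Riemann event fields and split_event is a made-up helper, so treat both as assumptions.

```python
# Commonly documented Riemann event fields (an assumption for this sketch);
# any other key is treated as a custom attribute.
BUILTIN_FIELDS = {"host", "service", "state", "time", "description",
                  "tags", "metric", "ttl"}


def split_event(event):
    """Split an event map into (built-in fields, custom attributes)."""
    builtin = {k: v for k, v in event.items() if k in BUILTIN_FIELDS}
    custom = {k: v for k, v in event.items() if k not in BUILTIN_FIELDS}
    return builtin, custom


builtin, custom = split_event(
    {"host": "blah", "metric": 1, "ttl": 60, "endpoint": "login"})
print(builtin, custom)
```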

On Mon, 5 Dec 2016 at 12:29 David Lang <da...@lang.hm> wrote:

> I still have a question about what the attributes are. They weren't
> mentioned
> in the video you posted.
>
> David Lang
>
> On Mon, 5 Dec 2016, Bob Gregory wrote:
>
> > Date: Mon, 05 Dec 2016 12:26:17 +
> > From: Bob Gregory <bob.greg...@made.com>
> > Reply-To: rsyslog-users <rsyslog@lists.adiscon.com>
> > To: rsyslog-users <rsyslog@lists.adiscon.com>
> > Subject: Re: [rsyslog] omriemann configuration (was musing on ERK stack)
> >
> >> what about status? is it normally/commonly left blank?
> >
> > It depends on the use case, for something like cpu usage, the state would
> > be blank; likewise for rsyslog message throughput. I would expect to see
> a
> > state for something like monitoring redis operations, or HTTP calls,
> where
> > the metric represents the latency of an operation, and the state is "ok"
> if
> > the operation succeeded, otherwise "error".
> >
> > This is really my point, that most of the fields are left empty in most
> > cases - you're right that there's a lot of flexibility in how to
> represent
> > an event, and it's really down to an end-user to understand how they want
> > to format their data.
> >
> >> as long as the syntax checker for the module can report a config error
> if
> > you
> >> don't have at least one
> >
> > Works for me.
> >
> > -- Bob.
> >
> >
> > On Mon, 5 Dec 2016 at 12:20 David Lang <da...@lang.hm> wrote:
> >
> >> On Mon, 5 Dec 2016, Bob Gregory wrote:
> >>
> >>> Yo yo, David.
> >>>
> >>> I think you're convincing me, at least on the service/programname. That
> >>> means we can default all of the required host, service, timestamp
> >> fields. I
> >>> also like the simpler approach of using the fractional part to decide
> >> which
> >>> kind of metric we're sending. That's a better user-experience.
> >>>
> >>> I still feel reasonably strongly that we oughtn't to default the other
> >>> fields, since the usual case of riemann is for them to be absent.
> >>>
> >>> Are you satisfied with the host, service, timestamp fields having
> >> defaults?
> >>> That means that the following
> >>>
> >>> action(type="omriemann" metric="1") will send {host:$hostname, time:
> >>> $timereported,   service: $programname}
> >>>
> >>> While it's not going to be very useful, it's at least something you can
> >>> dump to console on the Riemann host and see that data are flowing.
> >>
> >> what about status? is it normally/commonly left blank?
> >>
> >> as long as the syntax checker for the module can report a config error
> if
> >> you
> >> don't have at least one of description, metric, status defined in an
> >> action()
> >> call (with an error message that will make it obvious to the admin why
> >> they are
> >> getting the error)
> >>
> >> David Lang
> >>
> >>>
> >>>
> >>> On Mon, 5 Dec 2016 at 11:39 David Lang <da...@lang.hm> wrote:
> >>>
> >>> On Mon, 5 Dec 2016, Bob Gregory wrote:
> >>>
> >>>> @ David Lang, moving omriemann discussion back over here.
> >>>>
> >>>>> we need to try and come up with a reasonable default value for
> >>> parameters.
> >>>>
> >>>> I think I disagree with that. Most of the fields aren't required, and
> we
> >>>> shouldn't send them unless configured otherwise. The intention isn't
> >> that
> >>>> all logs will go to riemann, but only a small subset of logs, after
> >> being
> >>>> substantially transformed.
> >>>
> >>> the biggest thing we need to do is make sure that whatever the user
> >>> attempts to
> >>> send should not stall the feed, so it will either need to be discarde

Re: [rsyslog] omriemann configuration (was musing on ERK stack)

2016-12-05 Thread Bob Gregory
> what about status? is it normally/commonly left blank?

It depends on the use case, for something like cpu usage, the state would
be blank; likewise for rsyslog message throughput. I would expect to see a
state for something like monitoring redis operations, or HTTP calls, where
the metric represents the latency of an operation, and the state is "ok" if
the operation succeeded, otherwise "error".

This is really my point, that most of the fields are left empty in most
cases - you're right that there's a lot of flexibility in how to represent
an event, and it's really down to an end-user to understand how they want
to format their data.

> as long as the syntax checker for the module can report a config error if
> you don't have at least one

Works for me.
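
For what it's worth, the check could be as simple as the Python sketch below. This is not the module's code (that would be C); check_action_params and the wording of the error are hypothetical, and the three parameter names are the ones David listed.

```python
def check_action_params(params):
    """Return an error message if the action is misconfigured, else None."""
    required_any = ("description", "metric", "status")
    # A parameter counts as set when it is present and not an empty string.
    if any(params.get(name) not in (None, "") for name in required_any):
        return None
    return ("omriemann: action() needs at least one of "
            + ", ".join(required_any))


print(check_action_params({"type": "omriemann", "host": "$hostname"}))
```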

 -- Bob.


On Mon, 5 Dec 2016 at 12:20 David Lang <da...@lang.hm> wrote:

> On Mon, 5 Dec 2016, Bob Gregory wrote:
>
> > Yo yo, David.
> >
> > I think you're convincing me, at least on the service/programname. That
> > means we can default all of the required host, service, timestamp
> fields. I
> > also like the simpler approach of using the fractional part to decide
> which
> > kind of metric we're sending. That's a better user-experience.
> >
> > I still feel reasonably strongly that we oughtn't to default the other
> > fields, since the usual case of riemann is for them to be absent.
> >
> > Are you satisfied with the host, service, timestamp fields having
> defaults?
> > That means that the following
> >
> > action(type="omriemann" metric="1") will send {host:$hostname, time:
> > $timereported,   service: $programname}
> >
> > While it's not going to be very useful, it's at least something you can
> > dump to console on the Riemann host and see that data are flowing.
>
> what about status? is it normally/commonly left blank?
>
> as long as the syntax checker for the module can report a config error if
> you
> don't have at least one of description, metric, status defined in an
> action()
> call (with an error message that will make it obvious to the admin why
> they are
> getting the error)
>
> David Lang
>
> >
> >
> > On Mon, 5 Dec 2016 at 11:39 David Lang <da...@lang.hm> wrote:
> >
> > On Mon, 5 Dec 2016, Bob Gregory wrote:
> >
> >> @ David Lang, moving omriemann discussion back over here.
> >>
> >>> we need to try and come up with a reasonable default value for
> > parameters.
> >>
> >> I think I disagree with that. Most of the fields aren't required, and we
> >> shouldn't send them unless configured otherwise. The intention isn't
> that
> >> all logs will go to riemann, but only a small subset of logs, after
> being
> >> substantially transformed.
> >
> > the biggest thing we need to do is make sure that whatever the user
> > attempts to
> > send should not stall the feed, so it will either need to be discarded
> > (IMHO a
> > bad idea) or 'fixed up' if it's not valid.
> >
> >> * Description is an unusual field to include - I definitely wouldn't
> >> include the entire log message as a default.
> >> * The programname makes little sense as a service. IF I see that "nginx"
> > or
> >> "rsyslog" is oscillating between 20 and 57 on a graph, what does that
> tell
> >> me?
> >
> > the number of lines of rsyslog data you are getting if nothing else
> (which
> > may be something you want to monitor :-)
> >
> > again, I'm trying to make the defaults do something sane if nothing is
> > configured. It's better to have someone do a trivial configuration and
> get
> > flooded with data than to have them have to get a lot of things right
> before
> > anything shows up.
> >
> >> * TTL can be defaulted by Riemann. We shouldn't set it to 0. I'm not
> >> actually sure what happens with a TTL of 0, I'd guess the event
> > immediately
> >> expires, which would be problematic for many cases.
> >
> > Ok, as long as Riemann is designed to survive when no TTL is provided.
> >
> >> * The tags as used by rsyslog are unlikely to map meaningfully to the
> tags
> >> used by riemann because they have very different use cases. I  mostly
> use
> >> tags in rsyslog to tell me whether my logs are json, or HTTP access
> logs,
> >> or PHP exceptions etc so that I know how to handle the output of
> > mmnormalize
> >> - that's not useful data in my monitoring stack.
> >
> > again, I'm looking to set a default that has a chance of working, if 

Re: [rsyslog] omriemann configuration (was musing on ERK stack)

2016-12-05 Thread Bob Gregory
Yo yo, David.

I think you're convincing me, at least on the service/programname. That
means we can default all of the required host, service, timestamp fields. I
also like the simpler approach of using the fractional part to decide which
kind of metric we're sending. That's a better user-experience.
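
One way to read "using the fractional part" is sketched below in Python: a value that parses as an integer goes out as metric_sint64, anything else numeric goes out as metric_d. The metric_field helper is hypothetical and the choice of metric_d over metric_f is my assumption, not a decision we have made.

```python
def metric_field(value):
    """Pick a protobuf metric field for a metric given as text.

    Returns a (field_name, number) pair. Raises ValueError when the value
    is not numeric at all.
    """
    text = str(value).strip()
    try:
        return "metric_sint64", int(text)
    except ValueError:
        # Not an integer literal, so fall back to a floating point metric.
        return "metric_d", float(text)


print(metric_field("1"), metric_field("98.7"))
```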

I still feel reasonably strongly that we oughtn't to default the other
fields, since the usual case of riemann is for them to be absent.

Are you satisfied with the host, service, timestamp fields having defaults?
That means that the following

action(type="omriemann" metric="1") will send {host:$hostname, time:
$timereported,   service: $programname}

While it's not going to be very useful, it's at least something you can
dump to console on the Riemann host and see that data are flowing.



On Mon, 5 Dec 2016 at 11:39 David Lang <da...@lang.hm> wrote:

On Mon, 5 Dec 2016, Bob Gregory wrote:

> @ David Lang, moving omriemann discussion back over here.
>
>> we need to try and come up with a reasonable default value for parameters.
>
> I think I disagree with that. Most of the fields aren't required, and we
> shouldn't send them unless configured otherwise. The intention isn't that
> all logs will go to riemann, but only a small subset of logs, after being
> substantially transformed.

the biggest thing we need to do is make sure that whatever the user attempts
to send should not stall the feed, so it will either need to be discarded
(IMHO a bad idea) or 'fixed up' if it's not valid.

> * Description is an unusual field to include - I definitely wouldn't
> include the entire log message as a default.
> * The programname makes little sense as a service. If I see that "nginx"
> or "rsyslog" is oscillating between 20 and 57 on a graph, what does that
> tell me?

the number of lines of rsyslog data you are getting if nothing else (which
may be something you want to monitor :-)

again, I'm trying to make the defaults do something sane if nothing is
configured. It's better to have someone do a trivial configuration and get
flooded with data than to have them have to get a lot of things right before
anything shows up.

> * TTL can be defaulted by Riemann. We shouldn't set it to 0. I'm not
> actually sure what happens with a TTL of 0, I'd guess the event
> immediately expires, which would be problematic for many cases.

Ok, as long as Riemann is designed to survive when no TTL is provided.

> * The tags as used by rsyslog are unlikely to map meaningfully to the tags
> used by riemann because they have very different use cases. I mostly use
> tags in rsyslog to tell me whether my logs are json, or HTTP access logs,
> or PHP exceptions etc so that I know how to handle the output of
> mmnormalize - that's not useful data in my monitoring stack.

again, I'm looking to set a default that has a chance of working, if you are
always setting the tag, the default doesn't matter.

I set the tags to contain a lot more info, is this a connection or
disconnection, is this a login or failed login, etc.

> It turns out, on a re-reading, that the metric isn't required either -
> it's absolutely valid to send the event {host: localhost, service:
> "openvpn", status: "up | down" } for example.

ok, makes sense.

> Given that we can't make reasonable guesses about what the user intends, I
> think the sensible approach is to _not send_ any field for which we don't
> have a value specified, with the exception of the source host and the
> timestamp which have obviously sane defaults.

Here I disagree, not sending anything is likely to generate support requests
of "I configured omriemann and got a bunch of blank events, what's wrong".
I'd rather send extra data by default so that the people experimenting can
at least see stuff show up. It's much easier to then tinker with what's
showing up than to have to figure out what things you have to put in to get
anything to show up.

> Other than that, I think we're in agreement. I particularly like the idea
> of allowing metric to be a json object, that definitely simplifies the
> impstats case.
>
>> how do you signal metric types?
>
> I don't - that's down to upstream collectors. Riemann doesn't care what
> the metric represents, it's just a number. It has no concept of the type
> of a metric, it just cares about the service name and host and allows you
> to do interesting things with the data stream. We use Collectd at Made,
> and that sends metrics with gauge/counter/derive in the service name.

ok, that's interesting. Every other monitoring tool I've used wants it
specified as it's passed in. In many cases, this can be configured to
different things on different servers for the same metric.

> There is an expectation that you would only use a single field from
metric,
> metric_f, metric

Re: [rsyslog] omriemann configuration (was musing on ERK stack)

2016-12-05 Thread Bob Gregory
@ David Lang, moving omriemann discussion back over here.

> we need to try and come up with a reasonable default value for parameters.

I think I disagree with that. Most of the fields aren't required, and we
shouldn't send them unless configured otherwise. The intention isn't that
all logs will go to riemann, but only a small subset of logs, after being
substantially transformed.

* Description is an unusual field to include - I definitely wouldn't
include the entire log message as a default.
* The programname makes little sense as a service. If I see that "nginx" or
"rsyslog" is oscillating between 20 and 57 on a graph, what does that tell
me?
* TTL can be defaulted by Riemann. We shouldn't set it to 0. I'm not
actually sure what happens with a TTL of 0, I'd guess the event immediately
expires, which would be problematic for many cases.
* The tags as used by rsyslog are unlikely to map meaningfully to the tags
used by riemann because they have very different use cases. I  mostly use
tags in rsyslog to tell me whether my logs are json, or HTTP access logs,
or PHP exceptions etc so that I know how to handle the output of mmnormalize
 - that's not useful data in my monitoring stack.

It turns out, on a re-reading, that the metric isn't required either - it's
absolutely valid to send the event {host: localhost, service: "openvpn",
status: "up | down" } for example. Given that we can't make reasonable
guesses about what the user intends, I think the sensible approach is to
_not send_ any field for which we don't have a value specified, with the
exception of the source host and the timestamp which have obviously sane
defaults.

Other than that, I think we're in agreement. I particularly like the idea
of allowing metric to be a json object, that definitely simplifies the
impstats case.

> there is only one set of metrics per event (sint64 metric_sint64, double
> metric_d, float metric_f), which do you use (or do you use multiple of
> them?). Is there an expectation that you only use one?

> how do you signal metric types?

I don't - that's down to upstream collectors. Riemann doesn't care what the
metric represents, it's just a number. It has no concept of the type of a
metric, it just cares about the service name and host and allows you to do
interesting things with the data stream. We use Collectd at Made, and that
sends metrics with gauge/counter/derive in the service name.

There is an expectation that you would only use a single field from metric,
metric_f, metric_s64 etc. I've never actually tried sending more than one
metric to riemann in a message, because clients tend to explicitly forbid
it. Internally, all those protobuf fields map down to the same metric
field, so I'm unsure whether riemann will reject the message or select one
of the fields in an unspecified precedence. Deciding which of those protobuf
fields to send is an open problem. Any thoughts on that?

Presumably we'll need to allow the user to specify "metric_f",
"metric_sint64" etc in their config, which means we need to be able to
select the one that was actually set to a valid value in the incoming
message. That implies that it is not an error to send an empty value to the
module where a metric was expected, eg.

action(type="omriemann"
  serverhost="riemann.internal"
  serverport=""
  host="$hostname"
  metric_sint64="$!metrics!metric_int"
  state="$!metrics!state"
  metric_f="$!metrics!metric_float"
  service="$!metrics!service")

would behave like this:

{ "metrics": {"service": "my-service", "metric_int": 1}}
  -> {time: 123, host: localhost, metric_sint64: 1, service: my-service}
{ "metrics": {"service": "my-service", "metric_float": 98.7}}
  -> {time: 123, host: localhost, metric_f: 98.7, service: my-service}
{ "metrics": {"service": "my-service", "state": "red"}}
  -> {time: 123, host: localhost, state: red, service: my-service}
{ "metrics": {"metric_int": {"service-a": 1}, "metric_float": {"service-b": 3.2}}}
  -> [ {time: 123, host: localhost, metric_sint64: 1, service: service-a},
       {time: 123, host: localhost, metric_f: 3.2, service: service-b} ]

which I guess is okay.
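
To pin the behaviour down, here is a Python sketch of the mapping described above. It is a thought experiment rather than module code: to_events and FIELD_MAP are hypothetical names, the time and host defaults are just the placeholder values from the examples, and a message that sets both metrics as plain numbers is passed through unresolved.

```python
# Assumed mapping from JSON keys to protobuf metric fields, following the
# example config above.
FIELD_MAP = (("metric_int", "metric_sint64"), ("metric_float", "metric_f"))


def to_events(record, host="localhost", time=123):
    """Map a {"metrics": {...}} record to a list of Riemann-style events."""
    metrics = record.get("metrics", {})
    base = {"time": time, "host": host}

    # Per-service form, e.g. {"metric_int": {"service-a": 1}}:
    # emit one event for each service name.
    events = []
    for key, field in FIELD_MAP:
        value = metrics.get(key)
        if isinstance(value, dict):
            for service, number in value.items():
                events.append({**base, "service": service, field: number})
    if events:
        return events

    # Single-event form: copy over only the fields that are present.
    event = dict(base)
    for name in ("service", "state"):
        if name in metrics:
            event[name] = metrics[name]
    for key, field in FIELD_MAP:
        if key in metrics:
            event[field] = metrics[key]
    return [event]


print(to_events({"metrics": {"service": "my-service", "state": "red"}}))
```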

I have no idea what we would do if some anarchist sent us the json object {
"metrics": { "service": "trololol", "metric_f": 0.2, "metric_int": 27 }}.
We can either reject the message and log an error, or specify some
precedence order. I don't have a strong feeling either way; it's probably
ok if we reformat your integer as a double, but I generally like software
to fail fast instead of limping along doing so

Re: [rsyslog] omriemann Re: Are we building an ERK stack?

2016-12-04 Thread Bob Gregory
Hi David,

It's probably best if you _don't_ try to map syslog fields into
riemann fields because the two technologies are accomplishing different
things. Riemann is for processing metrics - numerical data about the state
of our systems, while syslog is about logs - narrative textual data about
our systems.

Service, tags, etc will need to be configured by the end-user; we shouldn't
be guessing what they might be based on our understanding of the log
message.

The reason I would need a Riemann output is that I have three use cases
where I forward data in logs to Riemann from logstash -

1) Logstash's heartbeat (so I can measure latency on my processing pipeline)
2) ERROR and CRITICAL logs so I can alert on them
3) Metrics encoded into json logs by applications.

Service is the "Thing under measurement". The closest analogue would be
programname, but one program might have many services. For example: "http
response time ms", "Bytes read", "Active users", "messages received". Each
of the keys in the key/value messages raised by impstats is a single
service.
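
As an illustration, the Python sketch below turns one impstats-style record into one event per counter. The sample record, the impstats_to_events helper and the "name/key" service naming are all assumptions made for the example, not a description of any existing module.

```python
def impstats_to_events(stats, host):
    """Turn one impstats-style key/value record into one event per counter."""
    prefix = stats.get("name", "unknown")
    events = []
    for key, value in stats.items():
        if key in ("name", "origin"):
            continue  # identifying fields, not measurements
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue  # only numeric values can be metrics
        events.append({"host": host,
                       "service": f"{prefix}/{key}",
                       "metric": value})
    return events


# Hypothetical record in the style of impstats JSON output.
sample_stats = {"name": "main Q", "origin": "core.queue",
                "size": 3, "enqueued": 10}
print(impstats_to_events(sample_stats, "logserver01"))
```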

Tags are used to aggregate and filter services, they're arbitrary bits of
data; eg. "Message type", "User account type", "ec2 instance type", "site
map area". Our biggest use case for them is in asynchronous processing
pipelines, where we use them to tag the messages we're processing so that
we can see overall throughput and latency, but drill down when we have to.

The metric is the actual measurement, it's a number.

The closest analogue to severity is the "state", which is an arbitrary
string. Usually people use the statuses "ok", "warning", "error" etc. but
it's entirely arbitrary. They're mostly used to trigger state changes in
Riemann.

Description is a narrative description of an event. We only use these in a
single use-case, which is that we forward all logs of ERROR level and
higher to riemann so that it can count them, and send us roll-up emails
every hour, or trigger pagerduty. In this use-case, we set the description
to the incoming log message.

Lastly, the TTL is used to control how long a message should be held
in-memory by Riemann. It can be used to keep a snapshot of current state.
We use it for heartbeats - when an event's TTL expires, if we haven't
received another of the same event, we can raise an alert.
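
The heartbeat idea can be sketched in a few lines of Python. This is a toy model of the behaviour, not Riemann's implementation: the Index class and its methods are invented for the example, and time is passed in explicitly to keep it deterministic.

```python
class Index:
    """Keeps the latest event per (host, service) and reports expired ones."""

    def __init__(self):
        self._events = {}

    def update(self, event, now):
        """Store the event and reset its expiry deadline to now + ttl."""
        key = (event["host"], event["service"])
        self._events[key] = (event, now + event["ttl"])

    def expire(self, now):
        """Remove and return events whose TTL elapsed without a refresh."""
        expired = [ev for ev, deadline in self._events.values()
                   if deadline <= now]
        for ev in expired:
            del self._events[(ev["host"], ev["service"])]
        return expired


index = Index()
beat = {"host": "web01", "service": "heartbeat", "ttl": 60}
index.update(beat, now=0)
print(index.expire(now=30))   # refreshed recently enough, nothing to report
print(index.expire(now=61))   # no new heartbeat arrived, so this one expires
```

An alert would hang off whatever expire() returns.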

Hope that makes more sense - if you're interested in learning more about
Riemann, there's a great introductory video on the site. http://riemann.io/

The only fields that are required are the host, the service, and the metric.

 -- Bob



On Mon, 5 Dec 2016 at 00:06 David Lang  wrote:

On Mon, 5 Dec 2016, Dave Cottlehuber wrote:

> https://github.com/algernon/riemann-c-client may be of interest to use
> it directly -- it's been dropped into collectd as a library now as well,
> and is ported to Debian & FreeBSD already, that I know of. The protobuf
> wire format is
> https://github.com/algernon/riemann-c-client/blob/master/lib/riemann/proto/riemann.proto
> if that's helpful.

it is.

> What I've found useful with collectd and riemann was to be able to set
> specific custom tags per instance (rsyslog server in our case) which
> makes the sorting in riemann very easy prior to parsing any specific
> message output. Mainly source & instance type:

it looks like the protobuf allows a lot of options in terms of how to store
the data.

We can make educated guesses as to what makes sense from the riemann point of
view, but they will only be guesses

as far as tags go, tagging it as being from rsyslog is an obvious item, and
if we have tags from mmnormalize, they should go here. What else?

should service be the programname or the facility?

where would facility/severity be stored? is severity == metric?

what sort of stuff normally goes in the description field?

for the attributes, one obvious one is the message, but beyond that it's
less clear. Given that rsyslog internally tracks things as JSON, I think
putting each json object as an attribute makes sense, but attributes can't be
nested. Internally to rsyslog, we deal with nested objects by flattening them
and separating the tiers with a ! (i.e. {foo:{bar:baz}} == foo!bar:baz), is this
reasonable from a riemann point of view? should we use a different character
instead?
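
For illustration, the flattening described above could be sketched in Python
as follows. The function name and the choice to stringify every leaf value
are assumptions made for this example; only the "!" separator comes from the
message itself.

```python
def flatten(obj, sep="!", prefix=""):
    """Flatten nested dicts into one level, joining key tiers with `sep`."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the path so far.
            flat.update(flatten(value, sep, path))
        else:
            # Assumption: attributes are plain strings, so stringify leaves.
            flat[path] = str(value)
    return flat


attributes = flatten({"foo": {"bar": "baz"}, "status": 200})
# attributes == {"foo!bar": "baz", "status": "200"}
```

Swapping the separator is a one-argument change, e.g. `flatten(obj, sep=".")`,
if "!" turns out to be awkward on the receiving side.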

David Lang
___
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
LIKE THAT.

Re: [rsyslog] REK stack

2016-12-02 Thread Bob Gregory
Big +1, because "erk" sounds like the noise you make when somebody stands
on your toe at a formal social event; Rek Project makes us sound like
dangerous anarchists, or possibly a dub-techno outfit.

On Fri, 2 Dec 2016 at 10:49 Rainer Gerhards 
wrote:

Hi all,

I start a new thread as the other one has a million of different topics now
;-)

Just a short note: I think we should finally call this project "REK
stack" vs. ERK and other ideas. This seems to be consensus, is logical
(rsyslog-ES-Kibana, in right order) and as Brian pointed out there
already is prior art ;-).

Violent objections please here. I have updated the rsyslog github REK
project:

https://github.com/rsyslog/rsyslog/projects/1

Rainer


Re: [rsyslog] Are we building an ERK stack?

2016-12-02 Thread Bob Gregory
I'm not sure that's true in the general case.

Of the errors I've had with our elk stack, upward of 95% have been caused
by type errors (json field should be an int but is an object); some small
handful have failed because a message was truncated somewhere along the
line; a smaller number have failed because somebody hand-crafted json and
forgot about a trailing comma or quote.
Overwhelmingly, the data aren't corrupted: they were invalid at source in a
way that would still allow them to be read as plain Unicode strings.

Obviously I accept that given enough data, I'll see more interesting
failure modes that need more thought, but reading from the errorfile and
pushing to a separate error index would work very well in our environment.
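
As a rough sketch of the "single blob" idea, the Python below wraps a rejected
message in a small JSON document suitable for a separate error index. The
function and field names are hypothetical, and reading the rsyslog errorfile
is left out; the point is only that undecodable bytes can be replaced rather
than rejected.

```python
import json


def to_error_doc(raw, reason):
    """Wrap a rejected raw message as a JSON document with one text blob."""
    # errors="replace" swaps undecodable bytes for U+FFFD, so the result is
    # always a valid Unicode string even if the input was not valid UTF-8.
    text = raw.decode("utf-8", errors="replace")
    # json.dumps takes care of escaping quotes, backslashes and control bytes.
    return json.dumps({"message": text, "error": reason})


# A hand-crafted json log with a trailing comma, plus one junk byte.
doc = to_error_doc(b'{"status": "ok",}\xff', "trailing comma")
```

Because the whole message is stored as one string field, the mapping for the
error index never has to agree with the fields inside the original log.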

On Fri, 2 Dec 2016, 08:43 David Lang, <da...@lang.hm> wrote:

On Fri, 2 Dec 2016, Bob Gregory wrote:

> You may well be able to insert the rejected log into a different index.
> Most of our failed logs are down to a mismatch between the mapping config
> and the fields in json logs.
>
> An error index that treats the whole message as a single blob should work
> fine.

what bytes would need to be escaped?

what if it's invalid unicode junk, etc.

almost by definition we are talking about corrupt data.

David Lang


Re: [rsyslog] Are we building an ERK stack?

2016-12-02 Thread Bob Gregory
You may well be able to insert the rejected log into a different index.
Most of our failed logs are down to a mismatch between the mapping config
and the fields in json logs.

An error index that treats the whole message as a single blob should work
fine.

On Fri, 2 Dec 2016, 08:35 mosto...@gmail.com,  wrote:

> On 01/12/16 at 23:08, David Lang wrote:
> > On Thu, 1 Dec 2016, mosto...@gmail.com wrote:
> >
> > I think that you are going to end up with some grief, if the message
> > could not be inserted into ES for some reason, I think the odds are
> > good that you will find that rawmsg can't be inserted either.
> After sending the email I thought the same...
>
> > I would keep the errorfile as a file and look at it periodically. I
> > expect that when you first start things up, you will run into a number
> > of errors, but once you work your way through them, the error rate will
> > be low.
> >
> > Set your monitoring system to monitor the size of the errorfile, and
> > if it starts growing significantly, generate an alert.
> Would love to have a more unattended/XXth century way, if anyone knows.

Re: [rsyslog] omriemann configuration

2016-12-02 Thread Bob Gregory
The problem there is that I'd need to reformat the json output of impstats
in order for it to fit this module. I might be tempted to add a separate
output format to impstats for that case, though, because it seems perverse
to make people do that templating themselves.

We can amortize that work if we also support a statsd output, which seems
like a logical next step.

On Fri, 2 Dec 2016 at 08:12 Rainer Gerhards <rgerha...@hq.adiscon.com>
wrote:

> I need to think a bit before casting a ballot, but
>
> a) json blob sounds great
> b) sounds useful for impstats -- impstats can generate json
>
> Rainer
>
> 2016-12-02 8:41 GMT+01:00 Bob Gregory <bob.greg...@made.com>:
> > Evening all,
> >
> > I've mostly finished my last personal project, so my thoughts are turning
> > to omriemann.
> >
> > I'm trying to work out how we might configure the module. Riemann
> > requires that we send a protobuf encoded message containing a few pre-set
> > fields, plus whatever additional fields we feel like forwarding.
> >
> > host: localhost
> > service: cpu-load-average/1m
> > state: ok
> > time: 1480661786
> > description: "everything is perfectly fine"
> > tags: ["laptop", "personal"]
> > metric: 0.58
> > ttl: 120
> > my-custom-field: 27
> >
> > This makes it unusual for an rsyslog module: usually rsyslog is happy to
> > ship arbitrary strings to a destination and only cares about the
> > _framing_ of your data: omelasticsearch, ommysql, omkafka, omrelp etc. all
> > accept some number of static parameters, plus a free-form template for the
> > actual message.
> >
> > Omriemann, in order to be useful, will need to impose some structure on
> > the message itself.
> >
> > How do people think we should configure the module so that people have
> > flexibility over the host, metric value, metric name, and tags on a
> > per-message basis?
> >
> > I guess the simplest thing that could possibly work is defining a simple
> > message format, eg. `host=foo; metric_f=0.6;
> > service=rsyslog.impstats/utime; timestamp=1480661786` that messages need
> > to conform to. We can then parse out the key/value pairs in the module and
> > encode them to protobuf.
> >
> > Alternatively, we could set up the structure of the message in the config
> > itself, like this:
> >
> > action(
> >    type="omriemann"
> >    host="$hostname"
> >    metric="$!metric.value"
> >    service="$!metric.name")
> >
> > That seems more user-friendly, but rules out using custom fields. I guess
> > I'd have to create a new template per-field during module begin.
> >
> > On a related note, I think I remember seeing some discussion of
> > conversion functions recently. Some of the fields need to be valid
> > integers, floats, unix timestamps etc. What's the best way of parsing
> > those out?


Re: [rsyslog] omriemann configuration

2016-12-02 Thread Bob Gregory
For almost all of the parameters to the module, they _must_ vary by
message. The only exceptions are things like TLS settings, or the remote
host endpoint. Everything else is structured data about an event that
happened elsewhere. Most fields can be omitted if there's no parameter set
- it's unusual that we set a description on a metric for example. Really we
only require host/metric/service - I think we should error if you try to
send an event that doesn't contain these three fields at least.
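
A minimal Python sketch of that check, assuming an event is represented as a
plain dict (the real module would be C; the function names here are
hypothetical):

```python
REQUIRED_FIELDS = ("host", "service", "metric")


def missing_fields(event):
    """Return the required field names that are absent or empty."""
    # A metric of 0 is a legitimate value, so only None and "" count as unset.
    return [name for name in REQUIRED_FIELDS if event.get(name) in (None, "")]


def validate(event):
    """Raise ValueError if any of host, service or metric is missing."""
    missing = missing_fields(event)
    if missing:
        raise ValueError(
            "event is missing required fields: " + ", ".join(missing)
        )
    return event


ok = validate({"host": "localhost", "service": "cpu-load-average/1m", "metric": 0})
```

Optional fields such as description, state or ttl are simply left out of the
dict when no parameter is set for them.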

I'm absolutely happy with a json blob for setting custom fields; you're
right to question their flexibility - they're just string key/value pairs
appended to the end of the protobuf message, so a json blob is perfect.

Thanks for the second opinion. I prefer the structured approach anyway.

On Fri, 2 Dec 2016 at 07:50 David Lang <da...@lang.hm> wrote:

> On Fri, 2 Dec 2016, Bob Gregory wrote:
>
> > Evening all,
> >
> > I've mostly finished my last personal project, so my thoughts are turning
> > to omriemann.
> >
> > I'm trying to work out how we might configure the module. Riemann
> > requires that we send a protobuf encoded message containing a few pre-set
> > fields, plus whatever additional fields we feel like forwarding.
> >
> > host: localhost
> > service: cpu-load-average/1m
> > state: ok
> > time: 1480661786
> > description: "everything is perfectly fine"
> > tags: ["laptop", "personal"]
> > metric: 0.58
> > ttl: 120
> > my-custom-field: 27
> >
> > This makes it unusual for an rsyslog module: usually rsyslog is happy to
> > ship arbitrary strings to a destination and only cares about the
> > _framing_ of your data: omelasticsearch, ommysql, omkafka, omrelp etc. all
> > accept some number of static parameters, plus a free-form template for the
> > actual message.
> >
> > Omriemann, in order to be useful, will need to impose some structure on
> > the message itself.
> >
> > How do people think we should configure the module so that people have
> > flexibility over the host, metric value, metric name, and tags on a
> > per-message basis?
>
> use a parameter to pass the variable name to use for the field, and have a
> default if they aren't set.
>
> Also, think hard about the need to set them on a per-message basis.
>
> > I guess the simplest thing that could possibly work is defining a simple
> > message format, eg. `host=foo; metric_f=0.6;
> > service=rsyslog.impstats/utime; timestamp=1480661786` that messages need
> > to conform to. We can then parse out the key/value pairs in the module and
> > encode them to protobuf.
>
> no, that way lies madness (I did something very similar in the first
> iteration of omudpspoof, but in my defense that was before we had the
> action() call)
>
> > Alternatively, we could set up the structure of the message in the config
> > itself, like this:
> >
> > action(
> >   type="omriemann"
> >   host="$hostname"
> >   metric="$!metric.value"
> >   service="$!metric.name")
> >
> > That seems more user-friendly, but rules out using custom fields. I guess
> > I'd have to create a new template per-field during module begin.
>
> this is the right approach for the fixed fields. For defining custom
> fields, can you accept a JSON structure and do the right thing?
>
> Given that the protobuf needs to be pre-defined and exist on both sides,
> how much flexibility do you really have?
>
> > On a related note, I think I remember seeing some discussion of
> > conversion functions recently. Some of the fields need to be valid
> > integers, floats, unix timestamps etc. What's the best way of parsing
> > those out?
>
> you will be passed strings [1] and need to validate them (and figure out
> what to do if you are passed garbage)
>
> [1] timestamps are a possible exception to this.
>
> David Lang


[rsyslog] omriemann configuration

2016-12-01 Thread Bob Gregory
Evening all,

I've mostly finished my last personal project, so my thoughts are turning
to omriemann.

I'm trying to work out how we might configure the module. Riemann requires
that we send a protobuf encoded message containing a few pre-set fields,
plus whatever additional fields we feel like forwarding.

host: localhost
service: cpu-load-average/1m
state: ok
time: 1480661786
description: "everything is perfectly fine"
tags: ["laptop", "personal"]
metric: 0.58
ttl: 120
my-custom-field: 27

This makes it unusual for an rsyslog module: usually rsyslog is happy to
ship arbitrary strings to a destination and only cares about the _framing_
of your data: omelasticsearch, ommysql, omkafka, omrelp etc. all accept
some number of static parameters, plus a free-form template for the actual
message.

Omriemann, in order to be useful, will need to impose some structure on the
message itself.

How do people think we should configure the module so that people have
flexibility over the host, metric value, metric name, and tags on a
per-message basis?

I guess the simplest thing that could possibly work is defining a simple
message format, eg. `host=foo; metric_f=0.6;
service=rsyslog.impstats/utime; timestamp=1480661786` that messages need to
conform to. We can then parse out the key/value pairs in the module and
encode them to protobuf.
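
To show what parsing that format would involve, here is a minimal Python
sketch. It assumes pairs are separated by ";" and keys from values by the
first "=", which is only what the example line suggests; the function name is
hypothetical.

```python
def parse_kv(message):
    """Parse 'key=value; key=value' text into a dict of strings."""
    fields = {}
    for part in message.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing ';' or doubled separators
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"not a key=value pair: {part!r}")
        fields[key.strip()] = value.strip()
    return fields


line = "host=foo; metric_f=0.6; service=rsyslog.impstats/utime; timestamp=1480661786"
event = parse_kv(line)
```

Note the weakness of such a format: a value that itself contains ";" (a
description, for instance) would be split in the wrong place unless an
escaping rule is added.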

Alternatively, we could set up the structure of the message in the config
itself, like this:

action(
   type="omriemann"
   host="$hostname"
   metric="$!metric.value"
   service="$!metric.name")

That seems more user-friendly, but rules out using custom fields. I guess
I'd have to create a new template per-field during module begin.

On a related note, I think I remember seeing some discussion of conversion
functions recently. Some of the fields need to be valid integers, floats, unix
timestamps etc. What's the best way of parsing those out?
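
One possible shape for that conversion step, sketched in Python under the
assumption that every field arrives as a string and that garbage should be
reported as None rather than raising (both are choices made for this example,
not rsyslog behaviour):

```python
def convert(text, kind):
    """Convert a string field to 'int' or 'float'; return None for garbage."""
    try:
        if kind == "int":
            return int(text.strip())
        if kind == "float":
            value = float(text.strip())
            # float() accepts "nan" and "inf"; treat those as garbage here.
            if value != value or value in (float("inf"), float("-inf")):
                return None
            return value
    except ValueError:
        return None
    raise ValueError(f"unknown kind: {kind!r}")


ttl = convert("120", "int")
metric = convert("0.58", "float")
timestamp = convert("1480661786", "int")  # unix timestamps are plain integers
bad = convert("not-a-number", "float")    # None
```

What to do with a None result (drop the event, send it without that field, or
log an error) is a separate policy decision.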


Re: [rsyslog] making config changes to a running rsyslog

2016-11-24 Thread Bob Gregory
My biggest use case for SIGHUP tends to be updating addresses of things in
a cloud environment. For rsyslog I just use DNS.

On Thu, 24 Nov 2016, 19:31 David Lang,  wrote:

On Thu, 24 Nov 2016, mosto...@gmail.com wrote:

>> what are people's thoughts on these ideas?
>
> Notice there can be multiple reload scenarios:
> - reload rsyslog config (new modules, inputs, rulesets, actions...)

for now I am explicitly not trying to support the reload of the entire
config. That really is a hard problem and effectively the same thing as a
restart (and what happens to log messages if the new config doesn't have the
same queues as the old config?)

> - add new inputs

and remove inputs so that things can be unmounted

> - modify a template
> - resize a queue

I don't see these as being that useful. And resizing a queue can be rather
complicated. growing an array means you have to have the address space for
it to grow into, or you have to copy the contents of the array. shrinking an
array doesn't actually give you more available ram, it just results in unused
ram.

> The simplest approach I can imagine is to signal HUP to reload: when
> signal is received, everything is stopped, reloaded and resumed. It may be
> faster than restart, cause modules are already loaded or objects (templates,
> inputs...) still in memory.

rsyslog used to do that, but it resulted in log messages being lost at
every hup (especially in high-traffic, complex config use-cases), that's why
the HUP was changed to just be a log rotation thing, closing all outputs.

the problem with using HUP like this is that you don't know when the
process is complete. you don't know when the system actually delivers the
signal to the process, let alone when the process finishes closing
everything. A synchronous API for this sort of thing would be useful.

> There's a lot of space for improvement: unload unneeded modules, restart
> only modified objects, rollover updates...but TBH I don't know if I would
> go to such API.

what would you mean by 'restarting' something?

I think you would be adding a huge number of complex mechanisms that would
never be used, and therefore would be likely to develop hidden bugs.

David Lang



Re: [rsyslog] Are we building an ERK stack?

2016-11-24 Thread Bob Gregory
https://io.made.com/blog/rek-it/

I wrote this up earlier.

On Wed, 23 Nov 2016 at 19:38 mosto...@gmail.com  wrote:

> Working, spamming mail list and writing on wiki at the same time. A
> lovely afternoon...
>
> Please, add your lines: https://github.com/rsyslog/rsyslog/wiki


Re: [rsyslog] omriemann Re: Are we building an ERK stack?

2016-11-23 Thread Bob Gregory
I can easily enough knock together an omriemann - it's protobuf over TCP or
UDP.  TCP allows for message ack.
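
For the TCP case, a stream needs some way to mark where one protobuf message
ends and the next begins. The Python sketch below assumes a 4-byte big-endian
length prefix in front of each serialized message; that detail is not stated
in this thread and should be checked against the Riemann protocol
documentation. The payload is treated as opaque bytes, so no protobuf library
is needed here, and the function names are hypothetical.

```python
import struct


def frame(payload):
    """Prefix an already-serialized message with its length (assumed format)."""
    return struct.pack(">I", len(payload)) + payload


def unframe(buffer):
    """Split one complete message off the front of a receive buffer.

    Returns (message, remaining_bytes), or (None, buffer) if the buffer does
    not yet hold a whole message.
    """
    if len(buffer) < 4:
        return None, buffer
    (length,) = struct.unpack(">I", buffer[:4])
    if len(buffer) < 4 + length:
        return None, buffer
    return buffer[4:4 + length], buffer[4 + length:]


wire = frame(b"abc")
message, rest = unframe(wire + b"\x00")
```

Over UDP each datagram carries exactly one message, so no such prefix would
be needed, at the cost of losing the acknowledgement.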

There are a couple of C clients that are useful as prior art, and I've
worked with a bunch of clients in python, haskell and golang.

On Wed, 23 Nov 2016 at 18:18 David Lang <da...@lang.hm> wrote:

> On Wed, 23 Nov 2016, Bob Gregory wrote:
>
> > For that, I'd like to see better support for GeoIP tagging, a Riemann
> > output plugin, some better guidance on "failed message queues", etc. etc.
> > etc.
>
> With a bit of digging, I can't find where Riemann defines what the
> over-the-wire format is that you would need to deliver logs to it.
>
> I see hints that it uses protobuf to serialize things, and has an
> application-level ack mechanism similar to what we have in relp, but the
> levels of indirection are stacked high, and the API documentation only
> points you at the function definitions.
>
> David Lang


Re: [rsyslog] Feedback request: minimal log shipper project

2016-11-23 Thread Bob Gregory
+1 for keeping things simple. It seems like we want a couple of new
modules, and some decent documentation in the form of wikis or blog posts.
We have most of the functionality already.

I'd also be interested to hear from some of the sematext people on the list.

On Wed, 23 Nov 2016, 16:27 Rainer Gerhards, 
wrote:

> 2016-11-23 17:21 GMT+01:00 mosto...@gmail.com :
> >
> >> That's a permission issue: We need to be much more restrictive
> >> (security) with who has permissions to the code than to the doc. Thus
> >> we have two repos. I'd prefer a single one, too, but that's not
> >> possible.
> >
> > Understood...did this happen in real life or just on paper? :P
> > I mean: if there are reviewers, I wouldn't care.
>
> sorry, I don't understand what you mean
>
> >
> >>
> >> Is gdocs really that visible? Does anyone agree on it? I even think
> >> some corp folks cannot access it (at least I've seen that when working
> >> with consulting customers). If we do that move, we need bold support
> >> from the community. I personally am skeptic. Besides, I'd prefer LaTex
> >> ;-)
> >
> >
> > I don't know latex yet (but I want to start someday...what about NOW?),
> > but google docs is easy to set up for a bunch of people, permissions can
> > be easily managed, and it will allow a fast-editing doc, as brainstorming
> > for the project.
> > Once it is solid, we can switch to github
> >
> > Another option: https://www.sharelatex.com/
>
> I guess that's not for general consumption. There is some learning
> curve to LaTex ;-)
> >
> >
> >>
> >> I am not connected to them. Given the fact that they paid big $ for
> >> logstash, I don't think they would be overly enthusiastic... But I may
> >> be wrong ;-)
> >
> > I can spam them to know what they think...it seems they try to fill the
> > gap with Beats, but maybe they didn't.
>
> I have no issue if you try...
>
> Rainer


[rsyslog] Are we building an ERK stack?

2016-11-23 Thread Bob Gregory
There've been a few discussions over the last few days that are all
pointing in the same direction:

* Is it better to use Rsyslog's omelasticsearch rather than pushing to
logstash?
* Should we have a minimal log shipper component as distinct from rsyslog's
processing capabilities?
* Ought we to have an imhiredis module?

Really what we're talking about is replacing Logstash (and the various
beats) with rsyslog. I'm perfectly happy with that, Logstash is a
resource-expensive and fickle beast that spoils my otherwise pristine log
pipeline, but I do think the community ought to think about whether this is
the direction they want to take.

For my part, I'm quite happy to help build an imhiredis (and imkafka?)
module but only if I can actually dogfood it, which means replacing
Logstash in our own environment.

For that, I'd like to see better support for GeoIP tagging, a Riemann
output plugin, some better guidance on "failed message queues", etc. etc.
etc.

Are we jointly interested in building the REK stack and, if so, can we
start to work out the feature set we're missing, and the documentation we'd
need for this to work? I'm a little concerned that if we tackle the usecase
piece-meal, we'll end up with lots of disjointed parts that don't really
solve the problem: logstash is not an adequate logstash.
___
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.


[rsyslog] libfastjson license

2016-02-05 Thread Bob Gregory
Hi all,

Really quick question: I'm building some packages for libfastjson and
lognorm on Arch, and I just wanted to check the licensing for libfastjson.

A quick grep shows me some MIT code and some GPLv2 - are there any other
licenses I should tag the package with?

-- 



*Bob Gregory*

Application Architect

MADE.COM <http://www.made.com/>

Skype: flinkywistypomm



Made.com Design Limited is a company registered in England and Wales.

Registered number: 07101408 | Registered office: 100 Charing Cross Road,
London WC2H 0HG


Re: [rsyslog] Mapping log-level names to severity numbers

2016-02-04 Thread Bob Gregory
Hi Dave,

It's the latter. Currently docker is just spraying logs out onto disk, in
both plain text and json format, and there's no logrotate. Instead, we want
just the json logs to go through rsyslog. We'll forward INFO level
application logs to Elasticsearch via Redis, and put a human-readable
version of logs into the journal.

Marking the journal entries with the appropriate syslog severity makes it
easy to query and filter.

The lookup_table functionality actually works better than my proposed
property replacer, because it's simple to modify the lookup if requirements
evolve.
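
As an illustration of the mapping itself, the Python sketch below turns log4j
level names into syslog severity numbers and also emits the data as a JSON
table. The specific level-to-number choices (for example FATAL as 1) are
assumptions made for this example, not the mapping we actually deployed, and
the JSON layout follows my understanding of the rsyslog lookup table file
format, which should be checked against the lookup table documentation.

```python
import json

# Syslog severities run from 0 (Emergency) to 7 (Debug).
# The choices below are illustrative assumptions.
SEVERITY = {
    "DEBUG": 7,
    "INFO": 6,
    "WARN": 4,
    "ERROR": 3,
    "CRITICAL": 2,
    "FATAL": 1,
}


def severity_for(level, default=6):
    """Map a textual level to a severity number, falling back to Info."""
    return SEVERITY.get(level.strip().upper(), default)


# The same data written out as a table file that could be adapted for
# rsyslog's lookup_table (layout assumed, see the note above).
table = {
    "version": 1,
    "nomatch": "6",
    "type": "string",
    "table": [{"index": name, "value": str(num)} for name, num in SEVERITY.items()],
}
lookup_json = json.dumps(table, indent=2)
```

Keeping the mapping as data rather than code is what makes it easy to change
later: adding a new level is one more entry in the table.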

 -- B

On 4 February 2016 at 15:15, Dave Caplinger <davecaplin...@solutionary.com>
wrote:

> Hi Bob,
>
> I'm curious to better understand your objective:
>
> > [some docker container] logs contain a textual severity level based on
> > the log4j levels:
> > DEBUG, INFO, WARN, ERROR, CRITICAL, FATAL.
> >
> > The docker syslog integration dumps all the stdout of a container into
> > syslog with a severity of LOG_INFO, and stderr with LOG_ERR.
> >
> > I'd like to parse the incoming json and map the level names to syslog
> > severity numbers.
>
> Is it correct that you're trying to maintain the distinction between the
> LOG_INFO and LOG_ERR streams coming out of the docker containers?  If
> that's *all* you're trying to achieve, would just adding another property
> to your JSON output to store the info/err value be sufficient?
>
> Or... do you have existing downstream log processing that depends on the
> syslog severity values, so mapping the log4j {DEBUG, INFO, WARN, ERROR,
> CRITICAL, FATAL} text values onto syslog {Debug, Info, Notice, Warning,
> Error, Critical, Alert, Emergency} severities is the critical part?
>
> (Or some other scenario I haven't understood yet ...?)
>
> --
> Dave Caplinger | Director, Technical Product Management
> Solutionary — An NTT Group Security Company
>
> > On Feb 4, 2016, at 7:16 AM, Bob Gregory <bob.greg...@made.com> wrote:
> >
> > Hi David,
> >
> > We ran logstash-forwarder in a separate container, and shared volumes
> > between app containers and a forwarding container. That's problematic as
> > we move toward a clustered environment, because it means running multiple
> > instances of logstash forwarder, or doing something peculiar with user
> > permissions. Instead we'd like to delegate the log routing and filtering
> > to the host OS via Docker's log driver.
> >
> >
> > On 4 February 2016 at 11:40, David Lang <da...@lang.hm> wrote:
> >
> >> On Thu, 4 Feb 2016, Bob Gregory wrote:
> >>
> >>>> can you syslog over the network to localhost? rsyslog can be pretty
> >>>> lightweight if you set queue sizes smaller than the default 500K
> >>>> messages. If you were running logstash-forwarder, rsyslog should be
> >>>> lighter.
> >>>
> >>> That's an interesting idea, but it would mean running a syslog daemon
> >>> in each container. Generally, we stick to a single foreground process
> >>> per container, so there's no init system for managing daemonised
> >>> services, but that might change in the future.
> >>
> >> how were you running the logstash forwarder then?
> >>
> >>
> >> David Lang
> >> ___
> >> rsyslog mailing list
> >> http://lists.adiscon.net/mailman/listinfo/rsyslog
> >> http://www.rsyslog.com/professional-services/
> >> What's up with rsyslog? Follow https://twitter.com/rgerhards
> >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> >> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> >> DON'T LIKE THAT.
> >>

Re: [rsyslog] Mapping log-level names to severity numbers

2016-02-04 Thread Bob Gregory
Hi David,

We ran logstash-forwarder in a separate container, and shared volumes
between app containers and a forwarding container. That's problematic as we
move toward a clustered environment, because it means running multiple
instances of logstash forwarder, or doing something peculiar with user
permissions. Instead we'd like to delegate the log routing and filtering to
the host OS via Docker's log driver.


On 4 February 2016 at 11:40, David Lang <da...@lang.hm> wrote:

> On Thu, 4 Feb 2016, Bob Gregory wrote:
>
>>> can you syslog over the network to localhost? rsyslog can be pretty
>>> lightweight if you set queue sizes smaller than the default 500K messages.
>>> If you were running logstash-forwarder, rsyslog should be lighter.
>>
>> That's an interesting idea, but it would mean running a syslog daemon in
>> each container. Generally, we stick to a single foreground process per
>> container, so there's no init system for managing daemonised services, but
>> that might change in the future.
>
> how were you running the logstash forwarder then?
>
>
> David Lang



-- 



*Bob Gregory*

Application Architect

MADE.COM <http://www.made.com/>

Skype: flinkywistypomm



Made.com Design Limited is a company registered in England and Wales.

Registered number: 07101408 | Registered office: 100 Charing Cross Road,
London WC2H 0HG


Re: [rsyslog] Mapping log-level names to severity numbers

2016-02-04 Thread Bob Gregory
Thanks Dave,

1. I wasn't aware that mmnormalize is now json capable. I might kick the
tyres on that, but the lookup works for me thus far.
2. I'm currently building from master because I've got a PR open for
version 8.17
3. IIRC master is currently building against json-c - is that true? In
either case, where do I find more info on libjsonfast? Google tells me
nothing.

 - B

On 4 February 2016 at 17:02, David Lang <da...@lang.hm> wrote:

> On Thu, 4 Feb 2016, Bob Gregory wrote:
>
>> Hi Dave,
>>
>> It's the latter. Currently docker is just spraying logs out onto disk, in
>> both plain text and json format, and there's no logrotate. Instead, we
>> want just the json logs to go through rsyslog. We'll forward INFO level
>> application logs to Elasticsearch via Redis, and put a human-readable
>> version of logs into the journal.
>>
>> Marking the journal entries with the appropriate syslog severity makes it
>> easy to query and filter.
>>
>> The lookup_table functionality actually works better than my proposed
>> property replacer, because it's simple to modify the lookup if
>> requirements evolve.
>
> a couple comments
>
> 1. using mmnormalize and the latest liblognorm (with the version=2
> ruleset), rsyslog can parse raw json, it doesn't need the @cee token any
> longer and can parse logs that are a mix of json and non-json data.
>
> 2. the table_lookup code that is in the released versions of rsyslog is
> very limited and has some known bugs. It was a prototype from work that was
> discussed and was going to be sponsored, but the company initiating the
> work fell through. Yesterday a full implementation was merged into the
> master tree for release in 8.17. You really will want to be using that
> version for anything beyond a proof of concept.
>
> 3. we have found some nasty bugs in the json-c library and as a result
> have forked it to libjsonfast, 8.16 will optionally use it if it's
> available, 8.17 will require it.
>
> and 8.17 (or a daily build version of it) will pull in the latest
> liblognorm and libjsonfast.
>
> This is one of those cases where you will really want to be on the very
> latest version.
>
>
> David Lang





Re: [rsyslog] Mapping log-level names to severity numbers

2016-02-03 Thread Bob Gregory
To be clear, there are several simple solutions - I could just require all
applications to log a syslog severity number in their json output - but
a) that would be no fun,
b) that might mean writing mapping code in each app,
 and c) I thought this might be a problem others had encountered for which
I could proffer a simple fix.

Yours playfully,

 -- Bob

On 3 February 2016 at 20:57, Bob Gregory <bob.greg...@made.com> wrote:

> Hi David and Peter,
>
> The applications we're deploying are running inside docker containers.
> They have no direct access to a syslog socket. While it's possible to
> bind-mount a syslog socket into a container, the considered best practice
> is to log direct to stdout. We log JSON because we have some requirements
> for capturing metadata and reporting machine-readable metrics.
>
> Until recently we were capturing that stdout into files on disk, and then
> using logstash-forwarder to ship it to Elasticsearch. We're now rebuilding
> our logging stack, and are planning on using rsyslogd to ship and transform
> logs.
>
> Usually we only ship INFO level and above to the central logging server so
> that we don't drown in noise, so we would like to store DEBUG logs in the
> local journal for troubleshooting in case of crisis.
>
> The proposed solution looks like this:
>
> Docker Container logs to stdout -> Docker Daemon captures stdout and
> directs to syslog -> Rsyslog places human readable text into the journal
> and ships the raw json to redis for later processing.
>
>
> The problem is that the docker daemon is only capturing *raw text on
> stdout*, and is not able to discern the appropriate log level. Anything
> logged to stdout is tagged with LOG_INFO, anything on stderr ships with
> LOG_ERR. I have free rein over the message content, but no useful control
> of the headers. I've considered trying to extend the syslog driver in
> Docker, but realistically that means implementing the kind of introspection
> and transformation at which rsyslog already excels.
>
>
> Yours,
>
>  -- B
>
> On 3 February 2016 at 20:24, David Lang <da...@lang.hm> wrote:
>
>> On Wed, 3 Feb 2016, Bob Gregory wrote:
>>
>>> Hi all,
>>>
>>> I'm using rsyslogd as the syslog daemon on a machine running Docker. I've
>>> configured docker to use the syslog logging driver and am able to parse
>>> the json logs written to stdout by my applications.
>>>
>>> These logs contain a textual severity level based on the log4j levels:
>>> DEBUG, INFO, WARN, ERROR, CRITICAL, FATAL.
>>>
>>> The docker syslog integration dumps all the stdout of a container into
>>> syslog with a severity of LOG_INFO, and stderr with LOG_ERR.
>>>
>>> I'd like to parse the incoming json and map the level names to syslog
>>> severity numbers.
>>>
>>> I can see some related functionality in msg.c, but nothing that's exposed
>>> to end users, so I'm considering writing a new pair of property
>>> replacers: one to map numbers from standard error level or severity
>>> names; another to map severity levels to their names:
>>>
>>> template(name="my-magic-template") {
>>>     property(name="$!level" severity.fromname="1")
>>>     property(name="$!levelno" severity.toname="1")
>>> }
>>>
>>> template(name="my-other-template" string="%level::severity-from-name%
>>> %levelno::severity-to-name%")
>>>
>>> Has anyone got any better ideas? I'd like to continue logging from
>>> containers to stdout, and to continue using the log-level names, because
>>> the php/python/java logging libs support that out-of-the-box and it's one
>>> less thing for devs to worry about.
>>
>> a properly formatted log message is going to contain the
>> facility/severity information in the header of the message as a numeric
>> value that rsyslog parses for you.
>>
>> is there a way to get this from the docker stuff as syslog messages
>> rather than just raw json? Ideally you get JSON as the body of the syslog
>> message, so you have the header formatted properly and then have the
>> message details in JSON for easy parsing
>>
>> log a few messages with the template RSYSLOG_DebugFormat you may find
>> that this is done properly and you don't have to fight it.
>>
>> David Lang

Re: [rsyslog] Mapping log-level names to severity numbers

2016-02-03 Thread Bob Gregory
Hi David and Peter,

The applications we're deploying are running inside docker containers. They
have no direct access to a syslog socket. While it's possible to bind-mount
a syslog socket into a container, the considered best practice is to log
direct to stdout. We log JSON because we have some requirements for
capturing metadata and reporting machine-readable metrics.

Until recently we were capturing that stdout into files on disk, and then
using logstash-forwarder to ship it to Elasticsearch. We're now rebuilding
our logging stack, and are planning on using rsyslogd to ship and transform
logs.

Usually we only ship INFO level and above to the central logging server so
that we don't drown in noise, so we would like to store DEBUG logs in the
local journal for troubleshooting in case of crisis.
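
As a rough illustration of that routing rule (everything goes to the local journal, only INFO and above is shipped centrally), here is a small Python sketch. It is not rsyslog configuration; the "level" field name, the destination labels "journal" and "central", and the level-to-severity numbers are assumptions for the example.

```python
import json

# Assumed mapping of log4j-style names onto syslog severity numbers.
SEVERITY = {"DEBUG": 7, "INFO": 6, "WARN": 4, "ERROR": 3, "CRITICAL": 2, "FATAL": 1}
INFO = 6


def route(raw_line):
    """Return the list of destinations for one JSON log line."""
    record = json.loads(raw_line)
    # Unknown or missing levels are treated as INFO in this sketch.
    severity = SEVERITY.get(str(record.get("level", "")).upper(), INFO)
    destinations = ["journal"]      # everything is kept locally
    if severity <= INFO:            # a lower number means more severe
        destinations.append("central")
    return destinations


print(route('{"level": "DEBUG", "msg": "cache miss"}'))
print(route('{"level": "ERROR", "msg": "upstream timed out"}'))
```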

The proposed solution looks like this:

Docker Container logs to stdout -> Docker Daemon captures stdout and
directs to syslog -> Rsyslog places human readable text into the journal
and ships the raw json to redis for later processing.


The problem is that the docker daemon is only capturing *raw text on
stdout*, and is not able to discern the appropriate log level. Anything
logged to stdout is tagged with LOG_INFO, anything on stderr ships with
LOG_ERR. I have free rein over the message content, but no useful control
of the headers. I've considered trying to extend the syslog driver in
Docker, but realistically that means implementing the kind of introspection
and transformation at which rsyslog already excels.
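
For reference, the header value in question is the syslog PRI, which packs facility and severity into one number (facility * 8 + severity). A minimal Python sketch of the arithmetic follows; the choice of the daemon facility is an assumption for the example, and override_severity is a hypothetical helper showing what "fixing up" the severity from the message body would amount to.

```python
# Severity and facility numbers from the standard syslog definitions.
LOG_ERR, LOG_INFO, LOG_DEBUG = 3, 6, 7
LOG_DAEMON = 3  # assumed facility for this example


def encode_pri(facility, severity):
    """Combine facility and severity into a PRI value."""
    return facility * 8 + severity


def decode_pri(pri):
    """Split a PRI value into (facility, severity)."""
    return divmod(pri, 8)


def override_severity(pri, new_severity):
    """Keep the facility but replace the severity (hypothetical helper)."""
    facility, _ = decode_pri(pri)
    return encode_pri(facility, new_severity)


stdout_pri = encode_pri(LOG_DAEMON, LOG_INFO)   # what a stdout line would carry
stderr_pri = encode_pri(LOG_DAEMON, LOG_ERR)    # what a stderr line would carry
print(stdout_pri, stderr_pri)
print(override_severity(stdout_pri, LOG_DEBUG))
```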


Yours,

 -- B

On 3 February 2016 at 20:24, David Lang <da...@lang.hm> wrote:

> On Wed, 3 Feb 2016, Bob Gregory wrote:
>
>> Hi all,
>>
>> I'm using rsyslogd as the syslog daemon on a machine running Docker. I've
>> configured docker to use the syslog logging driver and am able to parse
>> the json logs written to stdout by my applications.
>>
>> These logs contain a textual severity level based on the log4j levels:
>> DEBUG, INFO, WARN, ERROR, CRITICAL, FATAL.
>>
>> The docker syslog integration dumps all the stdout of a container into
>> syslog with a severity of LOG_INFO, and stderr with LOG_ERR.
>>
>> I'd like to parse the incoming json and map the level names to syslog
>> severity numbers.
>>
>> I can see some related functionality in msg.c, but nothing that's exposed
>> to end users, so I'm considering writing a new pair of property replacers:
>> one to map numbers from standard error level or severity names; another to
>> map severity levels to their names:
>>
>> template(name="my-magic-template") {
>>     property(name="$!level" severity.fromname="1")
>>     property(name="$!levelno" severity.toname="1")
>> }
>>
>> template(name="my-other-template" string="%level::severity-from-name%
>> %levelno::severity-to-name%")
>>
>> Has anyone got any better ideas? I'd like to continue logging from
>> containers to stdout, and to continue using the log-level names, because
>> the php/python/java logging libs support that out-of-the-box and it's one
>> less thing for devs to worry about.
>
> a properly formatted log message is going to contain the facility/severity
> information in the header of the message as a numeric value that rsyslog
> parses for you.
>
> is there a way to get this from the docker stuff as syslog messages rather
> than just raw json? Ideally you get JSON as the body of the syslog message,
> so you have the header formatted properly and then have the message details
> in JSON for easy parsing
>
> log a few messages with the template RSYSLOG_DebugFormat you may find that
> this is done properly and you don't have to fight it.
>
> David Lang





Re: [rsyslog] Mapping log-level names to severity numbers

2016-02-03 Thread Bob Gregory
Hi David,

Excellent points all. I'll take a look at the table lookup, and reconsider
whether we mount a syslog socket into containers. Just for clarity:

> if you are just catching stdout and configuring the app to write in JSON,
> then you can't depend on the messages having anything in them (and can't
> get messages from unmodified software/libraries that are trying to log to
> /dev/log)

In practice, that's not an issue for two reasons: firstly, when running an
application in a container, it's standard practice to run it in the
foreground and send output to stdout. The only app we run that doesn't
support that pattern is postfix. Secondly, I'm generally only interested in
capturing *our* application logs as json. Things like nginx, HAProxy, or
OpenVPN continue to log in plain text for troubleshooting, but we only ship
aggregate metrics to a remote server. It's simple to add a syslog severity
as a <1-7> message prefix, but that'll clash with the @cee cookie we use
for json messages.
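
To make the clash concrete: a consumer that expects the cookie at the very start of the line no longer finds it once a severity prefix is prepended. The Python sketch below shows one hypothetical way a pre-parser could tolerate both; it is not how rsyslog itself handles these lines, and parse_line and the accepted 0-7 digit range are assumptions for the example.

```python
import json
import re

PREFIX = re.compile(r"^<([0-7])>")  # optional single-digit severity prefix
CEE_COOKIE = "@cee:"


def parse_line(line):
    """Return (severity or None, parsed JSON or None) for one stdout line."""
    severity = None
    match = PREFIX.match(line)
    if match:
        severity = int(match.group(1))
        line = line[match.end():]   # strip the prefix before looking for the cookie
    line = line.lstrip()
    payload = None
    if line.startswith(CEE_COOKIE):
        payload = json.loads(line[len(CEE_COOKIE):])
    return severity, payload


print(parse_line('<3>@cee: {"msg": "boom"}'))
print(parse_line('@cee:{"msg": "ok"}'))
print(parse_line('plain text line'))
```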

> can you syslog over the network to localhost? rsyslog can be pretty
> lightweight if you set queue sizes smaller than the default 500K messages.
> If you were running logstash-forwarder, rsyslog should be lighter.

That's an interesting idea, but it would mean running a syslog daemon in
each container. Generally, we stick to a single foreground process per
container, so there's no init system for managing daemonised services, but
that might change in the future.

> If your docker containers aren't dynamic, rsyslog can open a log socket
> in each one (restart rsyslog with the appropriate config after the
> containers are created). I've got an enhancement request in to let us
> add/remove such sockets on the fly

The docker containers are *very much* dynamic, but that mightn't be a
problem.

> you really are better off using the syslog interface (with a JSON
> structured message body) than writing to stdout and intercepting.

Understood and largely agreed - I'm trying to find a reasonable balance
between standard Docker practice, sympathetic use of rsyslog, and easy
plug-and-play for developers. Thanks for your comments.

 -- B

On 4 February 2016 at 02:16, David Lang <da...@lang.hm> wrote:

> the table_lookup function that's being updated for 8.17 would let you do
> arbitrary lookups including this.
>
> but looking at your problem
>
> if you are just catching stdout and configuring the app to write in JSON,
> then you can't depend on the messages having anything in them (and can't
> get messages from unmodified software/libraries that are trying to log to
> /dev/log)
>
> can you syslog over the network to localhost? rsyslog can be pretty
> lightweight if you set queue sizes smaller than the default 500K messages.
> If you were running logstash-forwarder, rsyslog should be lighter.
>
> If your docker containers aren't dynamic, rsyslog can open a log socket in
> each one (restart rsyslog with the appropriate config after the containers
> are created). I've got an enhancement request in to let us add/remove such
> sockets on the fly
>
> you really are better off using the syslog interface (with a JSON
> structured message body) than writing to stdout and intercepting.
>
> David Lang
>
>
> On Wed, 3 Feb 2016, Bob Gregory wrote:
>
>> Date: Wed, 3 Feb 2016 21:05:50 +
>> From: Bob Gregory <bob.greg...@made.com>
>> Reply-To: rsyslog-users <rsyslog@lists.adiscon.com>
>> To: rsyslog-users <rsyslog@lists.adiscon.com>
>> Subject: Re: [rsyslog] Mapping log-level names to severity numbers
>>
>> To be clear, there are several simple solutions - I could just require all
>> applications to log a syslog severity number in their json output - but
>> a) that would be no fun,
>> b) that might mean writing mapping code in each app,
>> and c) I thought this might be a problem others had encountered for which
>> I could proffer a simple fix.
>>
>> Yours playfully,
>>
>> -- Bob
>>
>> On 3 February 2016 at 20:57, Bob Gregory <bob.greg...@made.com> wrote:
>>
>>> Hi David and Peter,
>>>
>>> The applications we're deploying are running inside docker containers.
>>> They have no direct access to a syslog socket. While it's possible to
>>> bind-mount a syslog socket into a container, the considered best practice
>>> is to log direct to stdout. We log JSON because we have some requirements
>>> for capturing metadata and reporting machine-readable metrics.
>>>
>>> Until recently we were capturing that stdout into files on disk, and then
>>> using logstash-forwarder to ship it to Elasticsearch. We're now
>>> rebuilding our logging stack, and are planning on using rsyslogd to ship
>>> and transform

[rsyslog] Mapping log-level names to severity numbers

2016-02-03 Thread Bob Gregory
Hi all,

I'm using rsyslogd as the syslog daemon on a machine running Docker. I've
configured docker to use the syslog logging driver and am able to parse the
json logs written to stdout by my applications.

These logs contain a textual severity level based on the log4j levels:
DEBUG, INFO, WARN, ERROR, CRITICAL, FATAL.

The docker syslog integration dumps all the stdout of a container into
syslog with a severity of LOG_INFO, and stderr with LOG_ERR.

I'd like to parse the incoming json and map the level names to syslog
severity numbers.

I can see some related functionality in msg.c, but nothing that's exposed
to end users, so I'm considering writing a new pair of property replacers:
one to map numbers from standard error level or severity names; another to
map severity levels to their names:

template(name="my-magic-template") {
    property(name="$!level" severity.fromname="1")
    property(name="$!levelno" severity.toname="1")
}

template(name="my-other-template" string="%level::severity-from-name%
%levelno::severity-to-name%")

Has anyone got any better ideas? I'd like to continue logging from
containers to stdout, and to continue using the log-level names, because
the php/python/java logging libs support that out-of-the-box and it's one
less thing for devs to worry about.
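
For clarity, the behaviour I have in mind for the two proposed replacers is roughly the following, sketched in Python rather than rsyslog C. The placement of CRITICAL and FATAL on the syslog scale is an assumption (other choices are defensible), and the function names are hypothetical.

```python
# Conventional keywords for the eight syslog severity numbers.
SYSLOG_NAMES = {
    0: "emerg", 1: "alert", 2: "crit", 3: "err",
    4: "warning", 5: "notice", 6: "info", 7: "debug",
}

# Assumed mapping of log4j-style level names onto syslog severities.
LEVEL_TO_SEVERITY = {
    "DEBUG": 7, "INFO": 6, "WARN": 4, "ERROR": 3, "CRITICAL": 2, "FATAL": 1,
}


def severity_from_name(name):
    """Map a level name such as 'WARN' to its syslog severity number."""
    key = name.strip().upper()
    if key not in LEVEL_TO_SEVERITY:
        raise ValueError("unknown level name: %r" % name)
    return LEVEL_TO_SEVERITY[key]


def severity_to_name(number):
    """Map a syslog severity number (0-7) back to its keyword."""
    return SYSLOG_NAMES[number]


print(severity_from_name("warn"))
print(severity_to_name(severity_from_name("ERROR")))
```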


