Re: [rsyslog] omriemann configuration (was musing on ERK stack)
On Mon, 5 Dec 2016, Bob Gregory wrote:

> action(type="omriemann" metric="1") will send {host:$hostname, time:
> $timereported, service: $programname}

One point of clarification: rsyslog doesn't currently have a config parameter
type that allows a parameter to sometimes be a constant and sometimes be a
variable name.

So either you would need to do

  set $.mymetric="1";
  action(type="omriemann" metric="$.mymetric")

or you would need to introduce the capability to have a parameter that can be
either a constant or a variable.

Allowing a parameter to be something more complex (a function, or including
math) is a step even further.

David Lang
___
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
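To make the "constant or variable" idea concrete, here is a minimal Python sketch of the lookup such a parameter type would need. It is not rsyslog code: the function name `resolve_param` and the rule "a value starting with `$` names a variable" are assumptions made purely for illustration.

```python
def resolve_param(value, variables):
    """Return a config value, treating "$..." as a variable reference.

    Hypothetical convention (not rsyslog's actual behaviour): a value that
    starts with "$" is looked up in the message's variables, anything else
    is used as a literal constant.
    """
    if value.startswith("$"):
        if value not in variables:
            raise KeyError(f"unknown variable: {value}")
        return variables[value]
    return value


# Variables as they might exist for one message (illustrative values only).
message_vars = {"$.mymetric": "1", "$hostname": "web01"}

print(resolve_param("1", message_vars))           # literal constant
print(resolve_param("$.mymetric", message_vars))  # variable lookup
```

With something like this in place, metric="1" and metric="$.mymetric" could share one parameter, at the cost of making a literal that begins with "$" impossible to express without some escaping rule.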
Re: [rsyslog] omriemann configuration (was musing on ERK stack)
So internally, Riemann just creates a Clojure hashmap for each event
{host: blah, metric: foo, ttl: 60, ... }. It holds a snapshot of recent events
in memory, and it indexes certain fields: host, service, etc.

You can add whatever additional attributes you like, because Riemann will just
add them to the map. They operate a little slower than the built-in fields, but
you can work with them in the same way:
http://riemann.io/howto.html#custom-event-attributes

On Mon, 5 Dec 2016 at 12:29 David Lang wrote:

> I still have a question about what the attributes are. They weren't mentioned
> in the video you posted.
>
> David Lang
Re: [rsyslog] omriemann configuration (was musing on ERK stack)
I still have a question about what the attributes are. They weren't mentioned
in the video you posted.

David Lang
Re: [rsyslog] omriemann configuration (was musing on ERK stack)
> what about status? is it normally/commonly left blank?

It depends on the use case. For something like CPU usage, the state would be
blank; likewise for rsyslog message throughput. I would expect to see a state
for something like monitoring redis operations, or HTTP calls, where the metric
represents the latency of an operation, and the state is "ok" if the operation
succeeded, otherwise "error".

This is really my point, that most of the fields are left empty in most cases.
You're right that there's a lot of flexibility in how to represent an event,
and it's really down to an end-user to understand how they want to format their
data.

> as long as the syntax checker for the module can report a config error if you
> don't have at least one

Works for me.

-- Bob.
Re: [rsyslog] omriemann configuration (was musing on ERK stack)
On Mon, 5 Dec 2016, Bob Gregory wrote:

> Yo yo, David.
>
> I think you're convincing me, at least on the service/programname. That
> means we can default all of the required host, service, timestamp fields. I
> also like the simpler approach of using the fractional part to decide which
> kind of metric we're sending. That's a better user-experience.
>
> I still feel reasonably strongly that we oughtn't to default the other
> fields, since the usual case of riemann is for them to be absent.
>
> Are you satisfied with the host, service, timestamp fields having defaults?
> That means that the following
>
> action(type="omriemann" metric="1") will send {host:$hostname, time:
> $timereported, service: $programname}
>
> While it's not going to be very useful, it's at least something you can
> dump to console on the Riemann host and see that data are flowing.

what about status? is it normally/commonly left blank?

as long as the syntax checker for the module can report a config error if you
don't have at least one of description, metric, status defined in an action()
call (with an error message that will make it obvious to the admin why they are
getting the error)

David Lang
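The config check David asks for (at least one of description, metric, status must be set in an action() call) can be sketched in a few lines of Python. This is only an illustration of the rule, not module code: `check_action_config` and the wording of the error message are assumptions, and the field names are the ones used in the thread.

```python
# Fields of which at least one must be configured, as named in the thread.
PAYLOAD_FIELDS = ("description", "metric", "status")


def check_action_config(params):
    """Return an error string for an unusable action(), or None if it is fine."""
    if any(params.get(name) for name in PAYLOAD_FIELDS):
        return None
    return (
        "omriemann: action() must set at least one of "
        + ", ".join(PAYLOAD_FIELDS)
        + "; otherwise every event sent to Riemann would carry no data"
    )


print(check_action_config({"type": "omriemann", "metric": "1"}))
print(check_action_config({"type": "omriemann"}))
```

Returning a message that names the missing fields is the part that matters for the admin: the error should say why the action was rejected, not only that it was.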
Re: [rsyslog] omriemann configuration (was musing on ERK stack)
Yo yo, David.

I think you're convincing me, at least on the service/programname. That means
we can default all of the required host, service, timestamp fields. I also like
the simpler approach of using the fractional part to decide which kind of
metric we're sending. That's a better user-experience.

I still feel reasonably strongly that we oughtn't to default the other fields,
since the usual case of riemann is for them to be absent.

Are you satisfied with the host, service, timestamp fields having defaults?
That means that the following

action(type="omriemann" metric="1") will send {host:$hostname, time:
$timereported, service: $programname}

While it's not going to be very useful, it's at least something you can dump to
console on the Riemann host and see that data are flowing.
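As a rough Python sketch of the compromise proposed in this message: host, service and time get defaults taken from the log message, and every other field is sent only when it was explicitly configured. The function name `build_event` and the dictionary shapes are hypothetical; the property names hostname, programname and timereported are the rsyslog properties mentioned in the thread.

```python
def build_event(msg, configured):
    """Build a Riemann-style event dict from a log message and action config.

    msg        -- message properties (hostname, programname, timereported)
    configured -- fields the user set in action(); None or "" means unset
    """
    # The three fields that always have a sensible default.
    event = {
        "host": msg["hostname"],
        "service": msg["programname"],
        "time": msg["timereported"],
    }
    # Everything else is sent only if a value was actually configured.
    # A configured host/service/time overrides the default above.
    for field, value in configured.items():
        if value is not None and value != "":
            event[field] = value
    return event


msg = {"hostname": "web01", "programname": "nginx", "timereported": 1480940777}
print(build_event(msg, {"metric": 1, "ttl": None, "description": ""}))
```

Note that ttl is simply left out when unset, which leaves the default to Riemann as suggested earlier in the thread.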
Re: [rsyslog] omriemann configuration (was musing on ERK stack)
On Mon, 5 Dec 2016, Bob Gregory wrote:

> @ David Lang, moving omriemann discussion back over here.
>
>> we need to try and come up with a reasonable default value for parameters.
>
> I think I disagree with that. Most of the fields aren't required, and we
> shouldn't send them unless configured otherwise. The intention isn't that
> all logs will go to riemann, but only a small subset of logs, after being
> substantially transformed.

the biggest thing we need to do is make sure that whatever the user attempts to
send should not stall the feed, so it will either need to be discarded (IMHO a
bad idea) or 'fixed up' if it's not valid.

> * Description is an unusual field to include - I definitely wouldn't
> include the entire log message as a default.
> * The programname makes little sense as a service. If I see that "nginx" or
> "rsyslog" is oscillating between 20 and 57 on a graph, what does that tell
> me?

the number of lines of rsyslog data you are getting if nothing else (which
may be something you want to monitor :-)

again, I'm trying to make the defaults do something sane if nothing is
configured. It's better to have someone do a trivial configuration and get
flooded with data than to have them have to get a lot of things right before
anything shows up.

> * TTL can be defaulted by Riemann. We shouldn't set it to 0. I'm not
> actually sure what happens with a TTL of 0, I'd guess the event immediately
> expires, which would be problematic for many cases.

Ok, as long as Riemann is designed to survive when no TTL is provided.

> * The tags as used by rsyslog are unlikely to map meaningfully to the tags
> used by riemann because they have very different use cases. I mostly use
> tags in rsyslog to tell me whether my logs are json, or HTTP access logs,
> or PHP exceptions etc so that I know how to handle the output of mmnormalize
> - that's not useful data in my monitoring stack.

again, I'm looking to set a default that has a chance of working, if you are
always setting the tag, the default doesn't matter.

I set the tags to contain a lot more info, is this a connection or
disconnection, is this a login or failed login, etc.

> It turns out, on a re-reading, that the metric isn't required either - it's
> absolutely valid to send the event {host: localhost, service: "openvpn",
> status: "up | down" } for example.

ok, makes sense.

> Given that we can't make reasonable guesses about what the user intends, I
> think the sensible approach is to _not send_ any field for which we don't have
> a value specified, with the exception of the source host and the timestamp
> which have obviously sane defaults.

Here I disagree, not sending anything is likely to generate support requests of
"I configured omriemann and got a bunch of blank events, what's wrong". I'd
rather send extra data by default so that the people experimenting can at least
see stuff show up. It's much easier to then tinker with what's showing up than
to have to figure out what things you have to put in to get anything to show
up.

> Other than that, I think we're in agreement. I particularly like the idea
> of allowing metric to be a json object, that definitely simplifies the
> impstats case.
>
>> how do you signal metric types?
>
> I don't - that's down to upstream collectors. Riemann doesn't care what the
> metric represents, it's just a number. It has no concept of the type of a
> metric, it just cares about the service name and host and allows you to do
> interesting things with the data stream. We use Collectd at Made, and that
> sends metrics with gauge/counter/derive in the service name.

ok, that's interesting. Every other monitoring tool I've used wants it
specified as it's passed in. In many cases, this can be configured to different
things on different servers for the same metric.

> There is an expectation that you would only use a single field from metric,
> metric_f, metric_s64 etc. I've never actually tried sending more than one
> metric to riemann in a message, because clients tend to explicitly forbid
> it. Internally, all those protobuf fields map down to the same metric
> field, so I'm unsure whether riemann will reject the message or select one
> of the fields in an unspecified precedence.
>
> Deciding which of those protobuf fields to send is an open problem. Any
> thoughts on that?

we don't want the user to have to specify the specific type, that's too likely
to lead to someone picking the wrong type. Everything is a string in rsyslog to
start with, so I'd say that the metric should use the s64 if there is no
fractional portion and double if there is.

> I have no idea what we would do if some anarchist sent us the json object
> { "metrics": { "service": "trololol", "metric_f": 0.2, "metric_int": 27 }}.
> We can either reject the message and log an error, or specify some
> precedence order. I don't have a strong feeling either way; it's probably ok
> if we reformat your integer as a double, but I generally like software to
> fail fast instead of limping along doing something un
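David's rule (s64 when the string has no fractional portion, double when it does) is small enough to sketch directly. The Python below is an illustration only: `encode_metric` is a made-up name, and treating "parses as an integer" as "has no fractional portion" is an assumption. The field names metric_sint64 and metric_d are the ones quoted in the thread.

```python
def encode_metric(text):
    """Map a metric string to a (protobuf_field_name, value) pair.

    Strings that parse as an integer go to metric_sint64; anything else
    that parses as a number goes to metric_d. Non-numeric input raises
    ValueError, so bad data is reported instead of silently coerced.
    """
    text = text.strip()
    try:
        return ("metric_sint64", int(text))
    except ValueError:
        # Not an integer literal: fall back to a double.
        return ("metric_d", float(text))


for raw in ("27", "0.58", "-3", "1.0"):
    print(raw, "->", encode_metric(raw))
```

Under this reading "1.0" is sent as a double because the string carries a decimal point, even though its value is whole; the module could equally decide to normalise such values to s64.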
Re: [rsyslog] omriemann configuration (was musing on ERK stack)
@ David Lang, moving omriemann discussion back over here. > we need to try and come up with a reasonable default value for parameters. I think I disagree with that. Most of the fields aren't required, and we shouldn't send them unless configured otherwise. The intention isn't that all logs will go to riemann, but only a small subset of logs, after being substantially transformed. * Description is an unusual field to include - I definitely wouldn't include the entire log message as a default. * The programname makes little sense as a service. IF I see that "nginx" or "rsyslog" is oscillating between 20 and 57 on a graph, what does that tell me? * TTL can be defaulted by Riemann. We shouldn't set it to 0. I'm not actually sure what happens with a TTL of 0, I'd guess the event immediately expires, which would be problematic for many cases. * The tags as used by rsyslog are unlikely to map meaningfully to the tags used by riemann because they have very different use cases. I mostly use tags in rsyslog to tell me whether my logs are json, or HTTP access logs, or PHP exceptions etc so that I know how to handle the output of mmnormalize - that's not useful data in my monitoring stack. It turns out, on a re-reading, that the metric isn't required either - it's absolutely valid to send the event {host: localhost, service: "openvpn", status: "up | down" } for example. Given that we can't make reasonable guesses about what the user intends, I think the sensible approach is to _not send_ any field for which we don't have a value specified, with the exception of the source host and the timestamp which have obviously sane defaults. Other than that, I think we're in agreement. I particularly like the idea of allowing metric to be a json object, that definitely simplifies the impstats case. > there is only one set of metrics per event (sint64 metric_sint64, double > metric_d, float metric_f), which do you use (or do you use multiple of them?). 
> Is there an expectation that you only use one? > how do you signal metric types? I don't - that's down to upstream collectors. Riemann doesn't care what the metric represents, it's just a number. It has no concept of the type of a metric, it just cares about the service name and host and allows you to do interesting things with the data stream. We use Collectd at Made, and that sends metrics with gauge/counter/derive in the service name. There is an expectation that you would only use a single field from metric, metric_f, metric_s64 etc. I've never actually tried sending more than one metric to riemann in a message, because clients tend to explicitly forbid it. Internally, all those protobuf fields map down to the same metric field, so I'm unsure whether riemann will reject the message or select one of the fields in an unspecified precedence. Deciding which of those protobuf fields to send is a open problem. Any thoughts on that? Presumably we'll need to allow the user to specify "metric_f", "metric_sint64" etc in their config, which means we need to be able to select the one that was actually set to a valid value in the incoming message. That implies that it is not an error to send an empty value to the module where a metric was expected, eg. 
action(type="omriemann"
       serverhost="riemann.internal"
       serverport=""
       host="$hostname"
       metric_sint64="$!metrics!metric_int"
       state="$!metrics!state"
       metric_f="$!metrics!metric_float"
       service="$!metrics!service")

would behave like this:

{ "metrics": {"service": "my-service", "metric_int": 1}}
  -> {time: 123, host: localhost, metric_sint64: 1, service: my-service}

{ "metrics": {"service": "my-service", "metric_float": 98.7}}
  -> {time: 123, host: localhost, metric_f: 98.7, service: my-service}

{ "metrics": {"service": "my-service", "state": "red"}}
  -> {time: 123, host: localhost, state: red, service: my-service}

{ "metrics": { "metric_int": { "service-a": 1}, "metric_float": { "service-b": 3.2 } }}
  -> [ {time: 123, host: localhost, metric_sint64: 1, service: service-a},
       {time: 123, host: localhost, service: service-b, metric_f: 3.2} ]

which I guess is okay.

I have no idea what we would do if some anarchist sent us the json object { "metrics": { "service": "trololol", "metric_f": 0.2, "metric_int": 27 }}. We can either reject the message and log an error, or specify some precedence order. I don't have a strong feeling either way; it's probably ok if we reformat your integer as a double, but I generally like software to fail fast instead of limping along doing something unexpected.

On Mon, 5 Dec 2016 at 07:32 Bob Gregory wrote:
> Will do, thanks! I suspect the next step will be an open pull request, and
> I'll invite people to have a play with it and tell me what needs to happen
> next.
>
> -- B
>
> On Sun, 4 Dec 2016 at 23:10 Dave Cottlehuber wrote:
> > Hi Bob, I'm a riemann user and this sounds very interesting. I am behind
> > on list reading atm but if you want any f
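A minimal sketch of the fail-fast option discussed in this message, written in Python purely for illustration (the module itself would not be Python). The function name `build_event` and the input keys `metric_int` / `metric_float` are assumptions taken from the example config, not an agreed design. It omits any field that has no value and rejects an input that sets more than one metric field.

```python
# Hypothetical sketch: map a parsed "metrics" object onto Riemann event fields.
# Key names follow the example action() in this message; nothing here is a
# real rsyslog API.

# input key -> protobuf metric field, as wired up in the example config
METRIC_FIELDS = {"metric_int": "metric_sint64", "metric_float": "metric_f"}


def build_event(metrics, host, time):
    """Return an event dict, omitting unset fields; fail fast on ambiguity."""
    present = [key for key in METRIC_FIELDS if metrics.get(key) is not None]
    if len(present) > 1:
        raise ValueError("more than one metric field set: " + ", ".join(present))

    # host and time are the only fields with obviously sane defaults
    event = {"time": time, "host": host}
    for key in ("service", "state"):
        if metrics.get(key) is not None:
            event[key] = metrics[key]
    for key in present:
        event[METRIC_FIELDS[key]] = metrics[key]
    return event
```

Choosing a precedence order instead would only change the `len(present) > 1` branch; the rest of the mapping stays the same.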
Re: [rsyslog] omriemann configuration
2016-12-02 9:30 GMT+01:00 Bob Gregory :
> The problem there is that I'd need to reformat the json output of impstats
> in order for it to fit this module. I might be tempted to add a separate
> output format to impstats for that case, though, because it seems perverse
> to make people do that templating themselves.

+1 for additional format. Should work out of the box. Pls open issue tracker (and I don't mind if you create a PR for it).

> We can amortize that work if we also support a statsd output, which seems
> like a logical next step.

+1, also worth an issue

Side-note: I like the issues as they can cleanly show what work is supposed to happen. I'll align all of them to the REK project in github.

Rainer
Re: [rsyslog] omriemann configuration
The problem there is that I'd need to reformat the json output of impstats in order for it to fit this module. I might be tempted to add a separate output format to impstats for that case, though, because it seems perverse to make people do that templating themselves.

We can amortize that work if we also support a statsd output, which seems like a logical next step.

On Fri, 2 Dec 2016 at 08:12 Rainer Gerhards wrote:
> I need to think a bit before casting a ballot, but
>
> a) json blob sounds great
> b) sounds useful for impstats -- impstats can generate json
>
> Rainer
Re: [rsyslog] omriemann configuration
I need to think a bit before casting a ballot, but

a) json blob sounds great
b) sounds useful for impstats -- impstats can generate json

Rainer
Re: [rsyslog] omriemann configuration
Almost all of the parameters to the module _must_ vary by message. The only exceptions are things like TLS settings, or the remote host endpoint. Everything else is structured data about an event that happened elsewhere.

Most fields can be omitted if there's no parameter set - it's unusual that we set a description on a metric for example. Really we only require host/metric/service - I think we should error if you try to send an event that doesn't contain at least these three fields.

I'm absolutely happy with a json blob for setting custom fields; you're right to question their flexibility - they're just string key/value pairs appended to the end of the protobuf message, so a json blob is perfect.

Thanks for the second opinion. I prefer the structured approach anyway.
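A rough sketch of the rule proposed in this message, in Python for illustration only: reject an event that lacks host, metric or service, and flatten a JSON blob of custom fields into string key/value pairs. The names `REQUIRED`, `check_event` and `custom_attributes` are hypothetical, and the required set is simply the one named in this message.

```python
import json

REQUIRED = ("host", "metric", "service")  # the three fields named in this message


def check_event(event):
    """Raise ValueError if any required field is missing or empty."""
    missing = [name for name in REQUIRED if event.get(name) in (None, "")]
    if missing:
        raise ValueError("event is missing required fields: " + ", ".join(missing))
    return event


def custom_attributes(blob):
    """Turn a JSON object into a list of (key, value) string pairs."""
    data = json.loads(blob)
    if not isinstance(data, dict):
        raise ValueError("custom fields must be a JSON object")
    # custom fields travel as plain strings, so stringify every value
    return [(str(key), str(value)) for key, value in data.items()]
```

A metric of 0 passes the check, since only a missing or empty value counts as unset.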
Re: [rsyslog] omriemann configuration
On Fri, 2 Dec 2016, Bob Gregory wrote:

> Evening all,
>
> I've mostly finished my last personal project, so my thoughts are turning
> to omriemann.
>
> I'm trying to work out how we might configure the module. Riemann requires
> that we send a protobuf encoded message containing a few pre-set fields,
> plus whatever additional fields we feel like forwarding.
>
> host: localhost
> service: cpu-load-average/1m
> state: ok
> time: 1480661786
> description: "everything is perfectly fine"
> tags: ["laptop", "personal"]
> metric: 0.58
> ttl: 120
> my-custom-field: 27
>
> This makes it unusual for an rsyslog module: usually rsyslog is happy to
> ship arbitrary strings to a destination and only cares about the _framing_
> of your data: omelasticsearch, ommysql, omkafka, omrelp etc. all accept
> some number of static parameters, plus a free-form template for the actual
> message.
>
> Omriemann, in order to be useful, will need to impose some structure on the
> message itself.
>
> How do people think we should configure the module so that people have
> flexibility over the host, metric value, metric name, and tags on a
> per-message basis?

use a parameter to pass the variable name to use for the field, and have a default if they aren't set.

Also, think hard about the need to set them on a per-message basis.

> I guess the simplest thing that could possibly work is defining a simple
> message format, eg. `host=foo; metric_f=0.6;
> service=rsyslog.impstats/utime; timestamp=1480661786` that messages need to
> conform to. We can then parse out the key/value pairs in the module and
> encode them to protobuf.

no, that way lies madness (I did something very similar in the first iteration of omudpspoof, but in my defense that was before we had the action() call)

> Alternatively, we could set up the structure of the message in the config
> itself, like this:
>
> action(
>   type="omriemann"
>   host="$hostname"
>   metric="$!metric.value"
>   service="$!metric.name")
>
> That seems more user-friendly, but rules out using custom fields. I guess
> I'd have to create a new template per-field during module begin.

this is the right approach for the fixed fields. For defining custom fields, can you accept a JSON structure and do the right thing?

Given that the protobuf needs to be pre-defined and exist on both sides, how much flexibility do you really have?

> On a related note, I think I remember seeing some discussion of conversion
> functions recently. Some of the fields need to be valid integers, floats,
> unix timestamps etc. What's the best way of parsing those out?

you will be passed strings [1] and need to validate them (and figure out what to do if you are passed garbage)

[1] timestamps are a possible exception to this.

David Lang
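To make the validation point concrete, here is a small Python sketch (illustrative only; the function names are hypothetical and the module itself would do this in C). Each parser takes the string the module was handed and either returns a typed value or raises, leaving the caller to decide what to do with garbage. Rejecting NaN and infinity is an assumption, not something the thread settles.

```python
import math


def parse_sint64(text):
    """Parse a base-10 integer that fits in a signed 64-bit field."""
    value = int(text.strip())  # raises ValueError on garbage such as "12abc"
    if not -(2 ** 63) <= value < 2 ** 63:
        raise ValueError("integer out of sint64 range: " + text)
    return value


def parse_metric(text):
    """Parse a finite floating-point metric."""
    value = float(text.strip())  # raises ValueError on garbage
    if not math.isfinite(value):
        raise ValueError("metric must be finite: " + text)
    return value


def parse_unix_time(text):
    """Parse a non-negative unix timestamp in whole seconds."""
    value = parse_sint64(text)
    if value < 0:
        raise ValueError("timestamp must not be negative: " + text)
    return value
```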
[rsyslog] omriemann configuration
Evening all,

I've mostly finished my last personal project, so my thoughts are turning to omriemann.

I'm trying to work out how we might configure the module. Riemann requires that we send a protobuf encoded message containing a few pre-set fields, plus whatever additional fields we feel like forwarding.

host: localhost
service: cpu-load-average/1m
state: ok
time: 1480661786
description: "everything is perfectly fine"
tags: ["laptop", "personal"]
metric: 0.58
ttl: 120
my-custom-field: 27

This makes it unusual for an rsyslog module: usually rsyslog is happy to ship arbitrary strings to a destination and only cares about the _framing_ of your data: omelasticsearch, ommysql, omkafka, omrelp etc. all accept some number of static parameters, plus a free-form template for the actual message.

Omriemann, in order to be useful, will need to impose some structure on the message itself.

How do people think we should configure the module so that people have flexibility over the host, metric value, metric name, and tags on a per-message basis?

I guess the simplest thing that could possibly work is defining a simple message format, eg. `host=foo; metric_f=0.6; service=rsyslog.impstats/utime; timestamp=1480661786` that messages need to conform to. We can then parse out the key/value pairs in the module and encode them to protobuf.

Alternatively, we could set up the structure of the message in the config itself, like this:

action(
  type="omriemann"
  host="$hostname"
  metric="$!metric.value"
  service="$!metric.name")

That seems more user-friendly, but rules out using custom fields. I guess I'd have to create a new template per-field during module begin.

On a related note, I think I remember seeing some discussion of conversion functions recently. Some of the fields need to be valid integers, floats, unix timestamps etc. What's the best way of parsing those out?
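For illustration, the simple key/value format floated in this message could be parsed along these lines. This is a Python sketch under assumptions: `parse_kv` is a hypothetical name, and the format is taken to be `key=value` pairs separated by `;` with no quoting or escaping.

```python
def parse_kv(line):
    """Split 'k1=v1; k2=v2' into a dict of strings; reject malformed pairs."""
    fields = {}
    for pair in line.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing ';'
        key, sep, value = pair.partition("=")
        key = key.strip()
        if not sep or not key:
            raise ValueError("expected key=value, got: " + pair)
        if key in fields:
            raise ValueError("duplicate key: " + key)
        fields[key] = value.strip()
    return fields
```

A value containing `;` would be split incorrectly by this sketch, which hints at why an ad-hoc format needs escaping rules before it can carry free text such as a description.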