@ David Lang, moving omriemann discussion back over here.

> we need to try and come up with a reasonable default value for parameters.

I think I disagree with that. Most of the fields aren't required, and we
shouldn't send them unless configured otherwise. The intention isn't that
all logs will go to riemann, but only a small subset of logs, after being
substantially transformed.

* Description is an unusual field to include - I definitely wouldn't
include the entire log message as a default.
* The programname makes little sense as a service. If I see that "nginx" or
"rsyslog" is oscillating between 20 and 57 on a graph, what does that tell
me?
* TTL can be defaulted by Riemann. We shouldn't set it to 0. I'm not
actually sure what happens with a TTL of 0, I'd guess the event immediately
expires, which would be problematic for many cases.
* The tags as used by rsyslog are unlikely to map meaningfully to the tags
used by riemann because they have very different use cases. I mostly use
tags in rsyslog to tell me whether my logs are JSON, HTTP access logs,
PHP exceptions etc. so that I know how to handle the output of mmnormalize;
that's not useful data in my monitoring stack.

It turns out, on a re-reading, that the metric isn't required either - it's
absolutely valid to send the event {host: localhost, service: "openvpn",
state: "up | down" } for example. Given that we can't make reasonable
guesses about what the user intends, I think the sensible approach is to
_not send_ any field for which we don't have a value specified, with the
exception of the source host and the timestamp which have obviously sane
defaults.
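
To make that rule concrete, here is a rough Python sketch (illustration only,
not the module's code). The function name build_event and the list of optional
fields are my own assumptions; the point is just that only host and time are
defaulted and everything else is dropped when it has no value.

```python
import socket
import time

# Field names taken from the Riemann event examples in this thread.
OPTIONAL_FIELDS = ("service", "state", "description", "tags", "metric", "ttl")


def build_event(configured, now=None, hostname=None):
    """Return an event dict that contains only fields with a value.

    `configured` maps field names to values resolved from the message;
    None or an empty string means "no value", so the field is omitted.
    (build_event is a hypothetical helper, not an omriemann API.)
    """
    event = {
        # Host and time are the only fields with obviously sane defaults.
        "host": configured.get("host") or hostname or socket.gethostname(),
        "time": configured.get("time")
        or int(now if now is not None else time.time()),
    }
    for name in OPTIONAL_FIELDS:
        value = configured.get(name)
        if value is None or value == "":
            continue  # never invent a default (e.g. ttl=0) for unset fields
        event[name] = value
    return event
```

With this shape, an unset ttl never reaches Riemann, so Riemann's own default
applies.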

Other than that, I think we're in agreement. I particularly like the idea
of allowing metric to be a json object, that definitely simplifies the
impstats case.

> there is only one set of metrics per event (sint64 metric_sint64, double
> metric_d, float metric_f), which do you use (or do you use multiple of
> them?).
> Is there an expectation that you only use one?

> how do you signal metric types?

I don't - that's down to upstream collectors. Riemann doesn't care what the
metric represents, it's just a number. It has no concept of the type of a
metric, it just cares about the service name and host and allows you to do
interesting things with the data stream. We use Collectd at Made, and that
sends metrics with gauge/counter/derive in the service name.

There is an expectation that you would only set one of the metric fields
(metric_d, metric_f, metric_sint64). I've never actually tried sending more
than one metric to riemann in a message, because clients tend to explicitly
forbid it. Internally, all those protobuf fields map down to the same metric
field, so I'm unsure whether riemann will reject the message or select one
of the fields in an unspecified precedence. Deciding which of those protobuf
fields to send is an open problem. Any thoughts on that?

Presumably we'll need to allow the user to specify "metric_f",
"metric_sint64" etc in their config, which means we need to be able to
select the one that was actually set to a valid value in the incoming
message. That implies that it is not an error to send an empty value to the
module where a metric was expected, e.g.

action(type="omriemann"
          serverhost="riemann.internal"
          serverport="5555"
          host="$hostname"
          metric_sint64="$!metrics!metric_int"
          state="$!metrics!state"
          metric_f="$!metrics!metric_float"
          service="$!metrics!service")

would behave like this:

{ "metrics": { "service": "my-service", "metric_int": 1 } }
  -> { time: 123, host: localhost, metric_sint64: 1, service: my-service }

{ "metrics": { "service": "my-service", "metric_float": 98.7 } }
  -> { time: 123, host: localhost, metric_f: 98.7, service: my-service }

{ "metrics": { "service": "my-service", "state": "red" } }
  -> { time: 123, host: localhost, state: red, service: my-service }

{ "metrics": { "metric_int": { "service-a": 1 },
               "metric_float": { "service-b": 3.2 } } }
  -> [ { time: 123, host: localhost, metric_sint64: 1, service: service-a },
       { time: 123, host: localhost, metric_f: 3.2, service: service-b } ]

which I guess is okay.
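
For anyone who prefers code to arrows, a small Python sketch of the same
translation follows. It is only a model of the proposed behaviour: MAPPING
mirrors the action() parameters above, and the names lookup and to_events are
made up for this example. What happens when a message mixes scalar and
object-valued metrics is not decided anywhere in this thread; the sketch
simply lets the object-valued ones win.

```python
# Hypothetical mirror of the action() parameters: event field -> JSON path.
MAPPING = {
    "metric_sint64": ("metrics", "metric_int"),
    "metric_f": ("metrics", "metric_float"),
    "state": ("metrics", "state"),
    "service": ("metrics", "service"),
}
METRIC_FIELDS = ("metric_sint64", "metric_f")


def lookup(message, path):
    """Walk `path` through nested dicts; return None if any step is missing."""
    node = message
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def to_events(message, host="localhost", now=123):
    """Translate one parsed JSON message into a list of event dicts."""
    resolved = {f: lookup(message, p) for f, p in MAPPING.items()}
    # Missing values are not an error; the field is simply left out.
    resolved = {f: v for f, v in resolved.items() if v is not None}
    base = {"time": now, "host": host}

    # Object-valued metrics fan out: each key becomes its own service.
    fanned = [
        dict(base, service=service, **{field: value})
        for field in METRIC_FIELDS
        if isinstance(resolved.get(field), dict)
        for service, value in resolved[field].items()
    ]
    if fanned:
        return fanned
    # Scalar case: one event carrying whichever fields were present.
    return [dict(base, **resolved)]
```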

I have no idea what we would do if some anarchist sent us the json object {
"metrics": { "service": "trololol", "metric_float": 0.2, "metric_int": 27 }},
which sets both metric_f and metric_sint64 under the config above.
We can either reject the message and log an error, or specify some
precedence order. I don't have a strong feeling either way; it's probably
ok if we reformat your integer as a double, but I generally like software
to fail fast instead of limping along doing something unexpected.
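
The two options can be sketched in a few lines of Python. This is an
illustration under my own assumptions, not a proposal for the module's API:
select_metric and AmbiguousMetricError are invented names, and the metric
field names come from the protobuf fields quoted earlier.

```python
METRIC_FIELDS = ("metric_sint64", "metric_d", "metric_f")


class AmbiguousMetricError(ValueError):
    """Raised when a message sets more than one metric field."""


def select_metric(event, precedence=None):
    """Return (field, value) for the single metric in `event`, or None.

    With precedence=None, more than one metric is an error (fail fast).
    Otherwise `precedence` is an ordered tuple of field names and the
    first one that is present wins.
    """
    present = [f for f in METRIC_FIELDS if event.get(f) is not None]
    if not present:
        return None  # the metric is optional, so this is not an error
    if len(present) == 1:
        return present[0], event[present[0]]
    if precedence is None:
        raise AmbiguousMetricError(
            "multiple metric fields set: " + ", ".join(present)
        )
    for field in precedence:
        if field in present:
            return field, event[field]
    raise AmbiguousMetricError("no field from the precedence list is set")
```

Failing fast is the default here; a precedence order has to be asked for
explicitly, so nobody gets it by accident.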

On Mon, 5 Dec 2016 at 07:32 Bob Gregory <bob.greg...@made.com> wrote:

Will do, thanks!

I suspect the next step will be an open pull request, and I'll invite
people to have a play with it and tell me what needs to happen next.

 -- B

On Sun, 4 Dec 2016 at 23:10 Dave Cottlehuber <d...@skunkwerks.at> wrote:

Hi Bob,

I'm a riemann user and this sounds very interesting. I am behind on list
reading atm but if you want any feedback on the discussion just @ reply
me or contact directly and I'll pop in to the discussion.

Sounds great :-)

A+
Dave

On Fri, 2 Dec 2016, at 08:41, Bob Gregory wrote:
> Evening all,
>
> I've mostly finished my last personal project, so my thoughts are turning
> to omriemann.
>
> I'm trying to work out how we might configure the module. Riemann requires
> that we send a protobuf encoded message containing a few pre-set fields,
> plus whatever additional fields we feel like forwarding.
>
> host: localhost
> service: cpu-load-average/1m
> state: ok
> time: 1480661786
> description: "everything is perfectly fine"
> tags: ["laptop", "personal"]
> metric: 0.58
> ttl: 120
> my-custom-field: 27
>
> This makes it unusual for an rsyslog module: usually rsyslog is happy to
> ship arbitrary strings to a destination and only cares about the _framing_
> of your data: omelasticsearch, ommysql, omkafka, omrelp etc. all accept
> some number of static parameters, plus a free-form template for the actual
> message.
>
> Omriemann, in order to be useful, will need to impose some structure on the
> message itself.
>
> How do people think we should configure the module so that people have
> flexibility over the host, metric value, metric name, and tags on a
> per-message basis?
>
> I guess the simplest thing that could possibly work is defining a simple
> message format that messages need to conform to, e.g.
> `host=foo; metric_f=0.6; service=rsyslog.impstats/utime; timestamp=1480661786`.
> We can then parse out the key/value pairs in the module and encode them to
> protobuf.
>
> Alternatively, we could set up the structure of the message in the config
> itself, like this:
>
> action(
>    type="omriemann"
>    host="$hostname"
>    metric="$!metric.value"
>    service="$!metric.name")
>
> That seems more user-friendly, but rules out using custom fields. I guess
> I'd have to create a new template per-field during module begin.
>
> On a related note, I think I remember seeing some discussion of conversion
> functions recently. Some of the fields need to be valid integers, floats,
> unix timestamps etc. What's the best way of parsing those out?
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> DON'T LIKE THAT.


--
  Dave Cottlehuber
  +43 67 67 22 44 78
  Managing Director
  Skunkwerks, GmbH
  http://skunkwerks.at/
  ATU70126204
  Firmenbuch 410811i