Yo yo, David.

I think you're convincing me, at least on the service/programname. That
means we can default all of the required host, service, timestamp fields. I
also like the simpler approach of using the fractional part to decide which
kind of metric we're sending. That's a better user experience.
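
A minimal sketch of that fractional-part rule, in Python purely for
illustration. The function name pick_metric_field is hypothetical. The field
names metric_sint64 and metric_d are the ones I believe the Riemann protobuf
uses. Treating "27.0" as a double because the text contains a decimal point
is an assumption, not something we've agreed on.

```python
import math

# Range of a signed 64-bit integer.
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def pick_metric_field(raw):
    """Map a metric string to a (field_name, value) pair.

    Hypothetical helper: strings that parse as an integer and fit in
    64 bits go to metric_sint64, everything else numeric goes to metric_d.
    """
    text = raw.strip()
    try:
        number = int(text)
        if INT64_MIN <= number <= INT64_MAX:
            return "metric_sint64", number
    except ValueError:
        pass  # not a plain integer, fall through to the double case
    value = float(text)  # raises ValueError if the text is not numeric
    if not math.isfinite(value):
        raise ValueError("metric must be a finite number: %r" % raw)
    return "metric_d", value
```

The user never names a type; the shape of the string decides it.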

I still feel reasonably strongly that we oughtn't to default the other
fields, since the usual case in Riemann is for them to be absent.

Are you satisfied with the host, service, timestamp fields having defaults?
That means that the following

action(type="omriemann" metric="1")

will send

{host: $hostname, time: $timereported, service: $programname}

While it's not going to be very useful, it's at least something you can
dump to console on the Riemann host and see that data are flowing.
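
Roughly, the defaulting I have in mind looks like the sketch below. It is
Python for illustration only; build_event and the dict-shaped msg and config
arguments are hypothetical, not omriemann's real interface. Host, service
and time fall back to message properties, and every other field is sent only
if it was configured. Carrying the configured metric through alongside the
three defaulted fields is my assumption.

```python
# Fields that are only sent when the action explicitly configures them.
OPTIONAL_FIELDS = ("description", "ttl", "tags", "state", "metric")


def build_event(msg, config):
    """Build a Riemann-style event dict.

    msg    -- rsyslog-like message properties (hypothetical dict shape)
    config -- parameters given on the action() line (hypothetical dict shape)
    """
    event = {
        "host": config.get("host", msg["hostname"]),
        "service": config.get("service", msg["programname"]),
        "time": config.get("time", msg["timereported"]),
    }
    for field in OPTIONAL_FIELDS:
        if field in config:
            event[field] = config[field]
    return event


if __name__ == "__main__":
    msg = {"hostname": "web1", "programname": "nginx", "timereported": 1480937940}
    print(build_event(msg, {"metric": "1"}))
```

With no optional parameters set, the event carries nothing beyond host,
service and time.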



On Mon, 5 Dec 2016 at 11:39 David Lang <da...@lang.hm> wrote:

On Mon, 5 Dec 2016, Bob Gregory wrote:

> @ David Lang, moving omriemann discussion back over here.
>
>> we need to try and come up with a reasonable default value for
>> parameters.
>
> I think I disagree with that. Most of the fields aren't required, and we
> shouldn't send them unless configured otherwise. The intention isn't that
> all logs will go to riemann, but only a small subset of logs, after being
> substantially transformed.

the biggest thing we need to do is make sure that whatever the user attempts
to send should not stall the feed, so it will either need to be discarded
(IMHO a bad idea) or 'fixed up' if it's not valid.

> * Description is an unusual field to include - I definitely wouldn't
> include the entire log message as a default.
> * The programname makes little sense as a service. If I see that "nginx" or
> "rsyslog" is oscillating between 20 and 57 on a graph, what does that tell
> me?

the number of lines of rsyslog data you are getting if nothing else (which
may be something you want to monitor :-)

again, I'm trying to make the defaults do something sane if nothing is
configured. It's better to have someone do a trivial configuration and get
flooded with data than to have them have to get a lot of things right before
anything shows up.

> * TTL can be defaulted by Riemann. We shouldn't set it to 0. I'm not
> actually sure what happens with a TTL of 0, I'd guess the event immediately
> expires, which would be problematic for many cases.

Ok, as long as Riemann is designed to survive when no TTL is provided.

> * The tags as used by rsyslog are unlikely to map meaningfully to the tags
> used by riemann because they have very different use cases. I mostly use
> tags in rsyslog to tell me whether my logs are json, or HTTP access logs,
> or PHP exceptions etc so that I know how to handle the output of
> mmnormalize - that's not useful data in my monitoring stack.

again, I'm looking to set a default that has a chance of working, if you are
always setting the tag, the default doesn't matter.

I set the tags to contain a lot more info, is this a connection or
disconnection, is this a login or failed login, etc.

> It turns out, on a re-reading, that the metric isn't required either - it's
> absolutely valid to send the event {host: localhost, service: "openvpn",
> status: "up | down" } for example.

ok, makes sense.

> Given that we can't make reasonable guesses about what the user intends, I
> think the sensible approach is to _not send_ any field for which we don't
> have a value specified, with the exception of the source host and the
> timestamp which have obviously sane defaults.

Here I disagree, not sending anything is likely to generate support requests
of "I configured omriemann and got a bunch of blank events, what's wrong". I'd
rather send extra data by default so that the people experimenting can at
least see stuff show up. It's much easier to then tinker with what's showing
up than to have to figure out what things you have to put in to get anything
to show up.

> Other than that, I think we're in agreement. I particularly like the idea
> of allowing metric to be a json object, that definitely simplifies the
> impstats case.
>
>> how do you signal metric types?
>
> I don't - that's down to upstream collectors. Riemann doesn't care what the
> metric represents, it's just a number. It has no concept of the type of a
> metric, it just cares about the service name and host and allows you to do
> interesting things with the data stream. We use Collectd at Made, and that
> sends metrics with gauge/counter/derive in the service name.

ok, that's interesting. Every other monitoring tool I've used wants it
specified as it's passed in. In many cases, this can be configured to
different things on different servers for the same metric.

> There is an expectation that you would only use a single field from metric,
> metric_f, metric_s64 etc. I've never actually tried sending more than one
> metric to riemann in a message, because clients tend to explicitly forbid
> it. Internally, all those protobuf fields map down to the same metric
> field, so I'm unsure whether riemann will reject the message or select one
> of the fields in an unspecified precedence. Deciding which of those protobuf
> fields to send is an open problem. Any thoughts on that?

we don't want the user to have to specify the specific type, that's too
likely to lead to someone picking the wrong type. Everything is a string in
rsyslog to start with, so I'd say that the metric should use the s64 if there
is no fractional portion and double if there is.

> I have no idea what we would do if some anarchist sent us the json object
> { "metrics": { "service": "trololol", "metric_f": 0.2, "metric_int": 27 }}.
> We can either reject the message and log an error, or specify some
> precedence order. I don't have a strong feeling either way; it's probably
> ok if we reformat your integer as a double, but I generally like software
> to fail fast instead of limping along doing something unexpected.

That's one reason for not specifying them individually :-)

But rather than the JSON looking like:

{ "metrics": { "service": "trololol", "metric_f": 0.2, "metric_int": 27 }}

I would have it be something more like:

{ "service": "trololol", "metrics": {"foo":"0.2", "bar":"27"}}

then you would do something like

action(type="omriemann" host="$hostname" service="$!service_" description=""
metrics="$!metrics" target="ip to riemann server" port="port of server")

host could be left out and let it default to $hostname

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
LIKE THAT.