Re: [heka] Heka Prometheus Plugin

Johannes Ziemke Mon, 15 Jun 2015 15:51:01 -0700

Hi David,

Sorry for not responding earlier; I have been quite busy recently.

On Tue, Jun 2, 2015 at 9:35 PM, David Birdsong <[email protected]>
wrote:

>
> I'll start by explaining what caused me to set out on this plugin. I want
> Prometheus to be able to scrape a growing list of prescribed Influxdb
> queries. The original thought was to code up a simple proxy that could
> translate a Prometheus scrape request into an Influxdb query. I quickly
> realized that capturing all the query types ahead of time would be hard and
> that I'd probably have to recompile often. Since data massaging is all I
> need, I realized I wanted a scripting engine that could run inside the
> proxy. I've only dabbled in cgo and don't have time to dive in deeper to
> learn how to build a Lua sandbox in a go program, but Heka's got this fully
> baked and I've hacked on Heka enough to be productive more quickly.
>

> With Heka, it's easy to compose an input spec for an output plugin which
> can then drive that output plugin's logic. The best way to massage messages
> in Heka is in sandbox decoders or, if any aggregation is needed, in
> filters. The phase doesn't matter to the output plugin so long as the Heka
> message includes the necessary field to drive the output plugin properly.
>
> For us at imgix, this means we can route a lot of data to Prometheus, not
> just Influxdb queries. We aggregate many log types into counters and
> histograms in Heka currently. With a small amount of additional Lua and a
> sensible input message spec, these metrics can now be made available to
> Prometheus.
>

That sounds similar to what I'm trying to do, but I guess I just lack the
Heka-fu. I have log lines which I want to count, sliced by field names. As
a result, I want counters reflecting the number of requests for which field
X = Y, for example the number of GET requests for hostname example.com. I
haven't used the influxdb plugins, but I assume they do something
similar. Besides that, some log lines might also contain metrics like
request size or duration, which I would like to expose to Prometheus as
summaries.
How do you do this aggregation into counters and histograms in Heka? Do you
have an example? Our inputs and decoders parse our log files, so we have
everything nicely in Heka messages, but how do we aggregate this and how do
we define labels?
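
A minimal sketch of the kind of per-field counting asked about above, written
in plain Go rather than as a Heka sandbox filter. The Message type, the
CounterVec type, and the metric name http_requests_total are made up for
illustration; none of them come from Heka or from the Prometheus client
library. Each distinct combination of label values gets its own running total.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Message is a hypothetical stand-in for a decoded Heka message:
// just the fields a decoder extracted from one log line.
type Message map[string]string

// CounterVec counts messages per unique combination of label values.
type CounterVec struct {
	name   string
	labels []string           // field names used as labels
	counts map[string]float64 // rendered label set -> running total
}

func NewCounterVec(name string, labels ...string) *CounterVec {
	return &CounterVec{name: name, labels: labels, counts: map[string]float64{}}
}

// key renders the label set in declaration order, e.g. method="GET",host="a".
// %q uses Go string quoting, which only approximates Prometheus label escaping.
func (c *CounterVec) key(m Message) string {
	pairs := make([]string, 0, len(c.labels))
	for _, l := range c.labels {
		pairs = append(pairs, fmt.Sprintf("%s=%q", l, m[l]))
	}
	return strings.Join(pairs, ",")
}

// Inc adds one to the counter for the message's label combination.
func (c *CounterVec) Inc(m Message) { c.counts[c.key(m)]++ }

// Get returns the running total for the message's label combination.
func (c *CounterVec) Get(m Message) float64 { return c.counts[c.key(m)] }

// Expose renders every series as "name{labels} value", sorted for stable output.
func (c *CounterVec) Expose() string {
	keys := make([]string, 0, len(c.counts))
	for k := range c.counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%s{%s} %g\n", c.name, k, c.counts[k])
	}
	return b.String()
}

func main() {
	requests := NewCounterVec("http_requests_total", "method", "host")
	requests.Inc(Message{"method": "GET", "host": "example.com"})
	requests.Inc(Message{"method": "GET", "host": "example.com"})
	requests.Inc(Message{"method": "POST", "host": "example.com"})
	fmt.Print(requests.Expose())
}
```

The totals only ever grow, so whatever exposes them can hand the current
values to Prometheus on each scrape without keeping any per-interval state.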

> I set out to provide both Counters and Gauges originally, but found that
> only Gauges made sense to me--at least for now. Both Influxdb and Heka
> filters are better suited to perform aggregations than a Heka output
> plugin--we only need Prometheus to see the snapshot.
>

Prometheus should see the raw counters, so IMO counters would be the
most interesting/common use case. Using a Heka filter to do the counting
makes sense, but for things like log line counting you want an
always-increasing counter and to do the rate conversion etc. in Prometheus:
every time Prometheus scrapes, it should see the current total number of
log lines.
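
As a rough illustration of why an ever-increasing total is enough, the sketch
below derives a per-second rate from two scrapes of such a counter. The Sample
type and PerSecondRate function are hypothetical, and the reset handling is a
deliberate simplification, not a reimplementation of what Prometheus does
internally.

```go
package main

import "fmt"

// Sample is one scraped value of a counter at a point in time.
type Sample struct {
	T float64 // scrape time in seconds
	V float64 // counter total seen at that time
}

// PerSecondRate derives a per-second rate from two scrapes of an
// ever-increasing counter. A drop in value is treated as a restart
// from zero, so only the new total is counted (a simplification).
func PerSecondRate(prev, cur Sample) float64 {
	dt := cur.T - prev.T
	if dt <= 0 {
		return 0 // no usable time window
	}
	delta := cur.V - prev.V
	if delta < 0 {
		delta = cur.V // counter was reset between the two scrapes
	}
	return delta / dt
}

func main() {
	// 300 new log lines over a 15 second scrape interval.
	fmt.Println(PerSecondRate(Sample{T: 0, V: 1000}, Sample{T: 15, V: 1300}))
}
```

With a gauge that only carries a precomputed rate, a missed scrape loses that
interval; with a running total, the next scrape still covers it.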

> A Heka filter or an Influxdb query can emit the values we want Prometheus
> to see at any given time interval (often 15s I think?) Sending raw values
> to an output plugin to perform increments or histogram calculations in Heka
> is sort of an anti-pattern IMO.
>

You are probably right. As I said, I probably just lack the knowledge of how
to do things like counting log lines 'per field'.

> Does Describe only get called once on the prometheus client?
>

IIRC it's possible that it's called multiple times. Either way, it should
always return the same metrics. But if you don't know about some metrics in
advance, it's fine to not send them. That's what a lot of exporters are
doing:
https://github.com/prometheus/graphite_exporter/blob/master/main.go#L172

> Is there a way to trigger Describe to get called periodically or on
> demand? If not, I'll probably find a StoppableListener implementation,
> instantiate a new prometheus client, and create a new http server as new
> metrics are discovered. Think there's a better way?
>

What you're doing right now looks pretty good: Just return your static
metrics in Describe() and use NewConstMetric to return metrics dynamically.
Do you have any ideas on how to implement histograms and summaries? I assume
this gets nasty since you need to persist the quantile buckets between
messages:
https://godoc.org/github.com/prometheus/client_golang/prometheus#MustNewConstSummary
This is what ultimately led me to process the raw messages in the output.
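
A minimal sketch of the state that would have to survive between messages for
a histogram, using only the Go standard library. The Histogram type and the
bucket bounds are assumptions made for illustration; this is not the
client_golang implementation, and it covers fixed buckets only, not the
streaming quantiles a summary needs.

```go
package main

import (
	"fmt"
	"sort"
)

// Histogram holds what must persist between messages: per-bucket
// counts plus the running count and sum of all observations.
type Histogram struct {
	bounds []float64 // bucket upper bounds, ascending
	counts []uint64  // counts[i]: observations in (bounds[i-1], bounds[i]]
	count  uint64    // all observations, including those above the last bound
	sum    float64   // sum of all observed values
}

// NewHistogram copies and sorts the given upper bounds.
func NewHistogram(bounds ...float64) *Histogram {
	b := append([]float64(nil), bounds...)
	sort.Float64s(b)
	return &Histogram{bounds: b, counts: make([]uint64, len(b))}
}

// Observe records one value, e.g. a request duration from one message.
func (h *Histogram) Observe(v float64) {
	i := sort.SearchFloat64s(h.bounds, v) // index of first bound >= v
	if i < len(h.counts) {
		h.counts[i]++
	}
	h.count++
	h.sum += v
}

// Cumulative returns "less than or equal" counts, one per upper bound.
func (h *Histogram) Cumulative() []uint64 {
	out := make([]uint64, len(h.counts))
	var running uint64
	for i, c := range h.counts {
		running += c
		out[i] = running
	}
	return out
}

func main() {
	h := NewHistogram(0.25, 0.5, 1)
	for _, d := range []float64{0.125, 0.25, 0.5, 2} {
		h.Observe(d)
	}
	fmt.Println(h.Cumulative(), h.count, h.sum)
}
```

Whether this state lives in a filter or in the output plugin, one such value is
needed per label combination, which is where the bookkeeping gets unpleasant.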

> On Tue, Jun 2, 2015 at 10:29 AM, Johannes Ziemke <
> [email protected]> wrote:
>
>> Hi,
>>
>> just read your code and it seems like it supports only gauges. Do you
>> plan to add support for counters as well? I'm actually most interested
>> in counters since the most common thing I'll need to do is count various
>> kinds of log lines.
>> And how do you plan to get data for your gauges? Do you have some custom
>> input providing messages with full labels and everything or do you do some
>> mapping in between?
>>
>> On Tue, Jun 2, 2015 at 5:45 PM, Johannes Ziemke <
>> [email protected]> wrote:
>>
>>> Hi David,
>>>
>>> I just saw that you're working on a prometheus plugin for heka as well.
>>> Here is my first take: https://github.com/docker-infra/heka-prometheus
>>>
>>> Maybe we should join efforts and work on this together or at least chat
>>> about it. Looks like it's not trivial to get this right, and I don't have
>>> much experience with heka but lots with prometheus, so it might be helpful :)
>>>
>>> I saw you in #heka and I'm there as well (nick: fish). Or come by at
>>> #prometheus on freenode.
>>>
>>
>>
>

_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka
