Re: Total Volume metrics of Kafka

2019-01-17 Thread Gabriele Paggi
On Thu, 17 Jan 2019 at 00:44, Peter Bukowinski wrote:

> On each broker, we have a process (scheduled with cron) that polls the
> kafka jmx api every 60 seconds. It sends the metrics data to graphite (
> https://graphiteapp.org). We have graphite configured as a data source
> for grafana (https://grafana.com) and use it to build various dashboards
> to present the metrics we’re interested in.
>
> There are various jmx-to-graphite tools available. We use one written in
> house, but this one looks like it’ll do the job:
> https://github.com/logzio/jmx2graphite


Hi Peter,

We use this reporter (https://github.com/damienclaveau/kafka-graphite),
which we add to the classpath and configure in Kafka's server.properties:


kafka.metrics.reporters=com.criteo.kafka.KafkaGraphiteMetricsReporter
kafka.metrics.polling.interval.secs=60
kafka.graphite.metrics.reporter.enabled=true
kafka.graphite.metrics.host=carbon.service.consul
kafka.graphite.metrics.port=2003
kafka.graphite.metrics.group={{ grains['fqdn']|replace(".","_") }}

kafka.graphite.metrics.exclude.regex=(kafka.network.*|kafka.log.*|kafka.cluster.*(InSyncReplicasCount|ReplicasCount|UnderReplicated))
kafka.graphite.dimension.enabled.meanRate=false
kafka.graphite.dimension.enabled.rate1m=false
kafka.graphite.dimension.enabled.rate5m=false
kafka.graphite.dimension.enabled.rate15m=false
kafka.graphite.dimension.enabled.min=false
kafka.graphite.dimension.enabled.max=false
kafka.graphite.dimension.enabled.mean=false
kafka.graphite.dimension.enabled.sum=false
kafka.graphite.dimension.enabled.stddev=false
kafka.graphite.dimension.enabled.median=false
kafka.graphite.dimension.enabled.p75=false
kafka.graphite.dimension.enabled.p95=false
kafka.graphite.dimension.enabled.p98=false
kafka.graphite.dimension.enabled.p99=false
kafka.graphite.dimension.enabled.p999=false

That spares you from having to run a cron job and a JMX bridge. It also
works with Kafka 1.x and 2.x.
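
For the original question (total bytes over a period), a cumulative counter is easier to work with than a rate. The sketch below assumes that the reporter still emits a cumulative "count" value for metrics such as BytesInPerSec once the other dimensions are disabled; that is an assumption about the reporter's output, not something confirmed in this thread. The function name and the sample values are hypothetical.

```python
from typing import Sequence


def volume_from_counter(counts: Sequence[float]) -> float:
    """Total increase across consecutive samples of a cumulative counter.

    A drop between two samples is treated as a counter reset (for example
    a broker restart), in which case the new value is counted from zero.
    This reset handling is an assumption, not behaviour taken from the thread.
    """
    total = 0.0
    for prev, curr in zip(counts, counts[1:]):
        if curr >= prev:
            total += curr - prev
        else:
            # Counter restarted: everything in `curr` was ingested after the reset.
            total += curr
    return total


if __name__ == "__main__":
    # Hypothetical counter samples taken 60 seconds apart, with one reset.
    print(volume_from_counter([100, 250, 400, 50, 120]))
```

Bytes ingested between the last sample before a reset and the reset itself are lost with this approach, so the result is a lower bound.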

-- 
Gabriele


Re: Total Volume metrics of Kafka

2019-01-16 Thread Peter Bukowinski
On each broker, we have a process (scheduled with cron) that polls the kafka 
jmx api every 60 seconds. It sends the metrics data to graphite 
(https://graphiteapp.org). We have graphite configured as a data source for 
grafana (https://grafana.com) and use it to build various dashboards to present 
the metrics we’re interested in.

There are various jmx-to-graphite tools available. We use one written in house, 
but this one looks like it’ll do the job: https://github.com/logzio/jmx2graphite
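
As a rough illustration of the "send to graphite" half of such a poller, here is a minimal sketch that formats samples in Graphite's plaintext protocol (one "path value timestamp" line per metric) and writes them to a carbon listener over TCP. It does not read JMX; how the values are obtained is left out. The metric path, the host name, and the function names are made up for the example, and 2003 is the usual carbon plaintext port.

```python
import socket
import time
from typing import Iterable, Optional


def format_metric(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Return one line of Graphite's plaintext protocol: 'path value timestamp'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"


def send_metrics(lines: Iterable[str], host: str, port: int = 2003,
                 timeout: float = 5.0) -> None:
    """Send already-formatted metric lines to a carbon plaintext listener."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


# Example call (hypothetical host and metric path), to be run from cron:
#   line = format_metric("kafka.broker1.BytesInPerSec.rate1m", 1234.5)
#   send_metrics([line], "graphite.example.com")
```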


> On Jan 16, 2019, at 2:15 PM, Amitav Mohanty wrote:
> 
> Peter,
> 
> Thanks for the inputs. I am interested in aggregate bytes published into a
> topic. The approach of a metrics collector along with a graphing tool seems
> appealing. I can get the volume ingested over arbitrary periods of time, which
> is exactly what I am looking for. Can you please point me to a metrics
> collector that I can use? Is it a sort of cron job that records the rate
> every minute or every 15 minutes?
> 
> Regards,
> Amitav
> 
> On Thu, Jan 17, 2019 at 3:23 AM Peter Bukowinski wrote:
> 
>> Amitav,
>> 
>> When you say total volume, do you want a topic’s size on disk, taking into
>> account replication and retention, or do you want the aggregate bytes
>> published into a topic? If you have a metrics collector and a graphing tool
>> such as grafana, you can transform the rate metrics to a byte sum by
>> applying an integral function, but those will always grow and not take into
>> account deletion after the retention period.
>> 
>> If you want metrics on how much space a topic occupies on disk, I’d
>> suggest using collectd and this plugin:
>> https://github.com/HubSpot/collectd-kafka-disk
>> 
>> —
>> Peter
>> 
>>> On Jan 16, 2019, at 1:12 PM, Amitav Mohanty wrote:
>>> 
>>> Hi
>>> 
>>> I am interested in getting the total volume of data that a topic ingested
>>> over a period of time. Does Kafka collect any such metrics? I checked the
>>> JMX console but I only found rate metrics.
>>> 
>>> Regards,
>>> Amitav
>> 
>> 



Re: Total Volume metrics of Kafka

2019-01-16 Thread Amitav Mohanty
Peter,

Thanks for the inputs. I am interested in aggregate bytes published into a
topic. The approach of a metrics collector along with a graphing tool seems
appealing. I can get the volume ingested over arbitrary periods of time, which
is exactly what I am looking for. Can you please point me to a metrics
collector that I can use? Is it a sort of cron job that records the rate
every minute or every 15 minutes?

Regards,
Amitav

On Thu, Jan 17, 2019 at 3:23 AM Peter Bukowinski wrote:

> Amitav,
>
> When you say total volume, do you want a topic’s size on disk, taking into
> account replication and retention, or do you want the aggregate bytes
> published into a topic? If you have a metrics collector and a graphing tool
> such as grafana, you can transform the rate metrics to a byte sum by
> applying an integral function, but those will always grow and not take into
> account deletion after the retention period.
>
> If you want metrics on how much space a topic occupies on disk, I’d
> suggest using collectd and this plugin:
> https://github.com/HubSpot/collectd-kafka-disk
>
> —
> Peter
>
> > On Jan 16, 2019, at 1:12 PM, Amitav Mohanty wrote:
> >
> > Hi
> >
> > I am interested in getting the total volume of data that a topic ingested
> > over a period of time. Does Kafka collect any such metrics? I checked the
> > JMX console but I only found rate metrics.
> >
> > Regards,
> > Amitav
>
>


Re: Total Volume metrics of Kafka

2019-01-16 Thread Peter Bukowinski
Amitav,

When you say total volume, do you want a topic’s size on disk, taking into 
account replication and retention, or do you want the aggregate bytes published 
into a topic? If you have a metrics collector and a graphing tool such as 
grafana, you can transform the rate metrics to a byte sum by applying an 
integral function, but those will always grow and not take into account 
deletion after the retention period.
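
To make the integral idea concrete, here is a minimal sketch that turns a series of rate samples into an approximate byte total using the trapezoidal rule. It is an illustration only, not the function Grafana or Graphite applies; the function name and the sample values are hypothetical, and it assumes each sample is a (timestamp in seconds, bytes per second) pair sorted by time.

```python
from typing import Sequence, Tuple


def bytes_from_rate(samples: Sequence[Tuple[float, float]]) -> float:
    """Approximate total bytes from (timestamp_seconds, bytes_per_second) samples.

    Each pair of neighbouring samples contributes the average of the two
    rates multiplied by the elapsed time (trapezoidal rule).
    """
    total = 0.0
    for (t0, r0), (t1, r1) in zip(samples, samples[1:]):
        if t1 < t0:
            raise ValueError("samples must be sorted by timestamp")
        total += (r0 + r1) / 2.0 * (t1 - t0)
    return total


if __name__ == "__main__":
    # Hypothetical samples taken 60 seconds apart.
    print(bytes_from_rate([(0, 1000.0), (60, 1000.0), (120, 3000.0)]))
```

The accuracy depends on the polling interval: bursts shorter than the interval are smoothed by whichever rate window (for example a one-minute rate) the broker reports.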

If you want metrics on how much space a topic occupies on disk, I’d suggest 
using collectd and this plugin: https://github.com/HubSpot/collectd-kafka-disk

—
Peter

> On Jan 16, 2019, at 1:12 PM, Amitav Mohanty wrote:
> 
> Hi
> 
> I am interested in getting the total volume of data that a topic ingested
> over a period of time. Does Kafka collect any such metrics? I checked the
> JMX console but I only found rate metrics.
> 
> Regards,
> Amitav