[foreman-dev] Re: Foreman instrumenting analysis

Lukas Zapletal Tue, 07 Nov 2017 01:58:26 -0800

Any other ideas for telemetry protocols?

If there are none, I will rebase my telemetry patch back to the
original version based on statsd.


LZ

On Tue, Oct 31, 2017 at 8:33 PM, Lukas Zapletal <[email protected]> wrote:
> Hello,
>
> I am looking for an application instrumentation protocol for the
> Foreman Rails application that fulfills the following requirements:
>
> The protocol must work with a multi-process server like Passenger.
> The protocol can be easily integrated into Foreman Tasks and Smart Proxy.
> The protocol or agent must support aggregation of time-based data
> (quantiles, average).
> The protocol must integrate with top three open-source monitoring frameworks.
>
> Let me summarize my findings so far. I am looking for advice or
> comments on this topic. I already worked on some prototypes, but
> before I commit to some final solution, I want to be sure I will not
> miss something I don't know about.
>
> Before you send comments, please keep in mind I am not searching for
> a monitoring solution to integrate with. I want an application
> instrumentation library (or protocol) to be able to export
> measurements (or telemetry data if you like) from Rails (like the
> number of requests processed, SQL queries, time spent in db or view,
> time spent rendering a template or calling a backend system).
>
>
> Prometheus
>
>
> Flexible text-based protocol (alternatively protobuf) with HTTP
> REST-like communication. It was designed to be pull-based, meaning
> that an agent makes HTTP calls to the web application, which holds
> all metrics until they are scraped. It was built for the Prometheus
> monitoring framework (Apache-licensed), created initially by
> SoundCloud. The server and most agents are written in Go, and it can
> run without an external database or export into 3rd party storage
> backends.
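
To make the text-based format concrete, here is a minimal sketch in
plain Ruby of what a scrape response for one counter could look like.
It does not use the client_ruby API; the helper name, metric name and
label are made up for illustration, and label value escaping is left
out.

```ruby
# Render one counter in the Prometheus text exposition format.
# render_counter is a hypothetical helper, not part of any library.
def render_counter(name, help, samples)
  lines = ["# HELP #{name} #{help}", "# TYPE #{name} counter"]
  samples.each do |labels, value|
    label_text = labels.map { |key, val| "#{key}=\"#{val}\"" }.join(",")
    lines << (label_text.empty? ? "#{name} #{value}" : "#{name}{#{label_text}} #{value}")
  end
  lines.join("\n") + "\n"
end

# Illustrative data only: the metric and label names are assumptions.
puts render_counter(
  "foreman_http_requests_total",
  "Number of processed HTTP requests.",
  { { controller: "hosts" } => 42, { controller: "reports" } => 7 }
)
```

The scraper would fetch this text over HTTP, which is why the process
serving it has to know the totals for all workers.
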
>
>
> It looks great, but it has a major problem - the Ruby client library
> (called client_ruby) does not support multi-process web servers at
> all. There are some workarounds, but they rely on local temp files or
> shared memory and show rather poor benchmark results (see the links
> down below).
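
The multi-process problem can be illustrated without any metrics
library. The sketch below uses a global variable as a stand-in for an
in-process registry and a single forked worker; it needs a platform
where fork is available, and all names are hypothetical.

```ruby
# Stand-in for an in-process metrics registry: a plain global counter.
$requests = 0

# Fork one worker, let it count a request, and report what each side sees.
def fork_counter_demo
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    $requests += 1          # changes only the child's copy of memory
    writer.puts($requests)
    writer.close
  end
  writer.close
  Process.wait(pid)
  child_value = reader.gets.to_i
  reader.close
  [child_value, $requests]
end

child_value, parent_value = fork_counter_demo
puts "worker counted #{child_value}, parent still sees #{parent_value}"
# prints "worker counted 1, parent still sees 0"
```

Whichever process answers the scrape only knows its own numbers, so
some shared store (files, shared memory) or an external aggregator is
needed.
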
>
>
> There is a possibility to push metrics into a separate component
> called PushGateway, but this was created for things like cron jobs or
> rake tasks. Doing multiple HTTP requests for each metric per single
> app request is unlikely to perform well. In the README, the authors
> note that this should be considered a "temporary solution".
>
>
> Although Prometheus seems to have a vibrant community, the Ruby
> library development pace has slowed down as SoundCloud "does not use
> many Ruby apps anymore". But it is still a good option to have.
>
>
> https://prometheus.io
> https://prometheus.io/docs/instrumenting/pushing/
> https://github.com/prometheus/client_ruby
> https://github.com/prometheus/client_ruby/issues/9
> https://github.com/prometheus/client_ruby/commits/multiprocess
>
>
> OpenTSDB
>
>
> OpenTSDB consists of a Time Series Daemon (TSD) as well as a set of
> command line utilities. Interaction with OpenTSDB is primarily
> achieved by running one or more of the TSDs. Each TSD is independent.
> There is no master, no shared state so you can run as many TSDs as
> required to handle any load you throw at it. Each TSD uses the open
> source database Hadoop/HBase or hosted Google Bigtable service to
> store and retrieve time-series data.
>
>
> It uses a push mechanism via a REST JSON API with an alternative
> "telnet-like" text endpoint. Although it does have some agents, it is
> used more as a storage backend than as an end-to-end monitoring
> solution.
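
For reference, a data point on the "telnet-like" endpoint is a single
"put" line with a metric name, a Unix timestamp, a value and at least
one tag. The sketch below only formats such a line; the helper name,
metric name and tag are made up, and sending the line to a TSD is left
out.

```ruby
# Format one data point for the OpenTSDB telnet-style "put" command.
# opentsdb_put is a hypothetical helper, not part of any client library.
def opentsdb_put(metric, timestamp, value, tags)
  raise ArgumentError, "at least one tag is required" if tags.empty?
  tag_text = tags.map { |key, val| "#{key}=#{val}" }.join(" ")
  "put #{metric} #{timestamp} #{value} #{tag_text}"
end

# Illustrative values only.
puts opentsdb_put("foreman.requests", 1510048706, 42, { host: "foreman01" })
# prints "put foreman.requests 1510048706 42 host=foreman01"
```
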
>
>
> http://opentsdb.net/overview.html
>
>
> Statsd
>
>
> The main idea behind this instrumentation protocol is simple: get the
> measurement out of the application as fast as possible using a UDP
> datagram. A collector agent usually runs locally; it aggregates the
> measurements and relays them to the target backend system. The
> vanilla version does not support tagging, but there are extensions
> or mappings that add it.
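
A minimal sketch of the wire format, using only the Ruby standard
library rather than a statsd client gem: each measurement is one line
of the form "name:value|type" sent as a UDP datagram. The helper
names and metric names are made up; port 8125 is the one statsd
daemons commonly listen on, but treat it as an assumption for your
setup.

```ruby
require "socket"

# Build one statsd line: "<name>:<value>|<type>".
# Common types: "c" (counter), "ms" (timer), "g" (gauge).
def statsd_line(name, value, type)
  "#{name}:#{value}|#{type}"
end

# Send a single datagram to the local collector agent.
# Host and port defaults are assumptions, not Foreman settings.
def statsd_send(line, host = "127.0.0.1", port = 8125)
  socket = UDPSocket.new
  socket.send(line, 0, host, port)
ensure
  socket&.close
end

statsd_send(statsd_line("foreman.requests", 1, "c"))
statsd_send(statsd_line("foreman.db_time", 12, "ms"))
```

Nothing is kept in the Ruby process; counting and computing quantiles
is left to the agent that receives the datagrams.
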
>
>
> Almost all monitoring platforms have some kind of
> agent/importer/exporter that talks via statsd. The original statsd
> daemon was written in Perl years ago, then it was re-popularized by
> a Node.js implementation, but there are many alternative agents, of
> which the most promising is statsite, which is very easy to extend.
>
>
> This protocol is my favourite because it plays well with multiprocess
> Ruby servers or other Foreman components (all can just send UDP
> packets to localhost) and it also takes all aggregation and storage
> of temporary data out of the Ruby application. It also reduces the
> chance of regressions in our codebase to a bare minimum - in the
> worst case the aggregating agent can fail, but UDP packets will
> simply get lost without interrupting the application. The best Ruby
> client library seems to be statsd-instrument, actively maintained by
> Shopify; it is very small and has no runtime dependencies.
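
The "cannot interrupt the application" property can be sketched as a
best-effort timing wrapper. This is plain Ruby and not the
statsd-instrument API; the helper name, metric name, host and port
are assumptions.

```ruby
require "socket"

# Hypothetical helper: time a block and report it as a statsd timer.
# Socket errors are swallowed so instrumentation never breaks the caller.
def measure(name, host: "127.0.0.1", port: 8125)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).round
  begin
    socket = UDPSocket.new
    socket.send("#{name}:#{elapsed_ms}|ms", 0, host, port)
  rescue SystemCallError, SocketError
    # Telemetry is best effort: drop the measurement and carry on.
  ensure
    socket&.close
  end
  result
end

# The block's return value is passed through whether or not an agent listens.
answer = measure("foreman.example_block") { 41 + 1 }
puts answer
```
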
>
>
> https://github.com/etsy/statsd/blob/master/docs/metric_types.md
> https://github.com/Shopify/statsd-instrument
> https://github.com/prometheus/statsd_exporter
> https://github.com/statsite/statsite
> https://codeascraft.com/2011/02/15/measure-anything-measure-everything/
>
>
> New Relic, Instrumental, DataDog, Rollbar
>
>
> All are paid services. Some clients are open-source (Instrumental is
> MIT-licensed), but usually with a poorly documented protocol and
> worse integration with other monitoring solutions. There are plenty
> of similar offerings; I might have missed some here.
>
>
> https://newrelic.com
> https://instrumentalapp.com
> https://instrumentalapp.com/docs/tcp-collector
>
>
> Zabbix, Nagios, Icinga
>
>
> These are more "alerting" systems (a system or service is down) and
> they all support application instrumentation to some degree, but it is
> not the core of what they do. I have seen them referred to as "legacy
> monitoring systems", but I think they are still very relevant. They
> are not a good fit for my use case at all, though.
>
>
> Conclusion
>
>
> To me, statsd looks like the most open and flexible protocol. This
> will give our users the most flexibility for further integration -
> there are plenty of generic agents which can relay data to backend
> systems.
>
>
> Comments?
>
> --
> Later,
>   Lukas @lzap Zapletal



-- 
Later,
  Lukas @lzap Zapletal
