Mixed opinion. It is very useful; that's the good part. I run it on a cluster with 40 nodes. We run around 5k jobs per day and collect metrics every 5 minutes using cAdvisor, which translates to around 51 metrics per scrape. That adds up to a couple million reads per day. Long story short, we had to set up jobs to trim the fat by removing metrics we don't need. Even then, the AWS m4.2xlarge instance is barely enough. So it is painful to make it work, but useful, yes. Otherwise we wouldn't do all the work to keep it. Good luck!
----------------------------------------
From: "Lee Porte" <[email protected]>
Sent: Tuesday, August 16, 2016 7:04 AM
To: [email protected]
Subject: Re: Anybody using Prometheus in Production

We have it running on about 10 services at the moment across a pair of servers. We are triggering alerting from these too. Based upon our initial results, we'll be rolling it out to more in the not too distant future.

On 15 Aug 2016 9:56 p.m., "Vaibhav Khanduja" <[email protected]> wrote:

Great! ... Can you please share the scale at which you are using it? Numbers of servers/services?

Thanks,

On Mon, Aug 15, 2016 at 1:49 PM, Lee Porte <[email protected]> wrote:

Yes, we are. We're finding it's working well for us.

On 15 Aug 2016 9:48 p.m., "Vaibhav Khanduja" <[email protected]> wrote:

https://prometheus.io/ ?

