On Thu, Feb 27, 2020, at 07:36, Riyan Shaik wrote:
>
> I believe the downside or a corner case with the prometheus instances behind
> LB is, if one of the instances goes down, then you may have gaps in your
> graphs and you can't backfill (not supported by prom) the data once that
> instance comes back up.

We're using Prometheus for three things: gathering statistics on long-term usage changes (e.g. to know when a cluster needs to be scaled out), short-term analysis of performance (e.g. why the heck did our message rate drop last night? Oh right, the SAN disk latency went up by a factor of 10), and alerting (messages to the Labs aren't flowing).

In all three cases, short drop-outs in statistics gathering simply haven't been a problem, except that I've had to smooth a few alerts to prevent them from resolving and refiring on a missed scrape (e.g. changing "up != 1" to "avg_over_time(up[1m]) < 0.9").

In short, I'm happy trading off occasional gaps in my data for the simplicity of Prometheus.

> Out of curiosity, i'd like to know why have you decided to go with VM over
> Thanos ? Would really to your PoV and experiences with VM ?

My personal experience was that setting up VictoriaMetrics as an aggregator for use with Grafana was incredibly simple, while setting up Thanos was just-less-simple-enough that I never succeeded. (It's not that Thanos is really that difficult, but I work on a small, understaffed team and there are never enough spare minutes :).

--
Harald
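
For readers who want to try the smoothed alert mentioned above, here is a minimal sketch of how it might look in a Prometheus rule file. Only the expression comes from the message; the group name, alert name, label and annotation text are assumptions made up for illustration.

```yaml
groups:
  - name: example-availability        # hypothetical group name
    rules:
      - alert: TargetDown             # hypothetical alert name
        # Compares the average of `up` over the last minute against 0.9,
        # instead of looking only at the most recent sample ("up != 1").
        expr: avg_over_time(up[1m]) < 0.9
        labels:
          severity: page              # assumed label, adjust to your routing
        annotations:
          summary: "{{ $labels.instance }} of job {{ $labels.job }} is failing scrapes"
```

How many failed scrapes this tolerates depends on the scrape interval, since that determines how many samples fall inside the 1m window; the window and threshold likely need tuning per setup.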

