The maximum useful scrape interval for Prometheus is 2 minutes (120 seconds).  
This is because Prometheus treats timeseries which are older than 5 minutes 
as "stale", i.e. gone away.  Using a 2 minute scrape interval means that 
even if you miss one scrape, the timeseries won't go stale.
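As a sketch, you'd set that interval globally in prometheus.yml (the job
name and target here are made up for illustration):

```yaml
# prometheus.yml -- illustrative global scrape settings
global:
  scrape_interval: 2m    # at most one missed scrape before the 5m staleness cutoff

scrape_configs:
  - job_name: "node"                       # hypothetical job
    static_configs:
      - targets: ["node-exporter:9100"]    # placeholder target
```

You can also override scrape_interval per job, but keeping everything at 2m
or faster avoids surprises with staleness.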

When you say "we'll need to make use of the push gateway", you should 
reconsider this.  Whatever the problem is, the push gateway is almost 
always NOT the right solution.  Pushgateway is just a cache that stores the 
last pushed value; it exists for one-shot scripts which vanish after 
execution but want to stash a value away to be scraped later (e.g. cron 
jobs).  If you describe your use case, someone can help you find a better 
solution.

Despite what the docs say, people have reported using Prometheus for 
long-term storage successfully.  There is also a whole range of 
remote_write options you can use.  Thanos with an object store will certainly 
give you very reliable persistent long-term storage, and will do 
downsampling too (although this is mainly to improve performance of queries 
over long time ranges, rather than to reduce the storage size).
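For the remote_write route, the config is just a URL plus optional queue
tuning; this is a sketch with a placeholder endpoint (it could point at
Thanos Receive, Mimir, VictoriaMetrics, etc.):

```yaml
# prometheus.yml -- remote_write sketch; the URL is a placeholder
remote_write:
  - url: "http://thanos-receive.example.com:19291/api/v1/receive"
    queue_config:
      max_samples_per_send: 2000   # tune to your ingestion rate
```

Note that the classic Thanos sidecar deployment instead uploads TSDB blocks
straight to the object store, with no remote_write involved.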

If you're only graphing over 24 hour periods, and alerting over the last N 
samples, native prometheus should be fine: the alerting rules will mainly 
be looking at the "head" chunks which remain in RAM.
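A minimal sketch of such a rule, looking only at the last few minutes of
data (the metric name and threshold are invented for illustration):

```yaml
# rules.yml -- illustrative alerting rule over recent samples only
groups:
  - name: example
    rules:
      - alert: HighErrorRate
        expr: rate(http_errors_total[10m]) > 0.05   # hypothetical metric
        for: 5m
        labels:
          severity: warning
```

Because the query range is short, it's served from the in-memory head block
rather than from older blocks on disk.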

However, 8 million series for a single prometheus instance might be pushing 
it a bit too much.  I would be more comfortable with sharding across a few 
smaller instances than having one monster instance.
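One common way to shard is hashmod relabelling, so each instance keeps a
deterministic slice of the targets.  A sketch for shard 0 of 4 (targets are
placeholders):

```yaml
# prometheus.yml -- shard 0 of 4 via hashmod relabelling (sketch)
scrape_configs:
  - job_name: "node"
    relabel_configs:
      - source_labels: [__address__]
        modulus: 4
        target_label: __tmp_shard
        action: hashmod
      - source_labels: [__tmp_shard]
        regex: "0"           # this instance keeps only shard 0
        action: keep
    static_configs:
      - targets: ["host-a:9100", "host-b:9100"]   # placeholder targets
```

Each of the 4 instances runs the same config with a different regex, and you
can then aggregate across them with federation or a Thanos querier.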

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/8f3740cb-7d13-4acb-be12-551bf6e8fa8ao%40googlegroups.com.
