The maximum useful scrape interval for Prometheus is 2 minutes (120 seconds). This is because Prometheus treats timeseries which are older than 5 minutes as "stale", i.e. gone away. With a 2 minute scrape interval, even if you miss one scrape there will still be a sample within the last 4 minutes, so the timeseries won't go stale.
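As a minimal sketch, the interval can be set globally (or per job) in prometheus.yml — the job name and target here are placeholders:

```yaml
global:
  scrape_interval: 2m   # one missed scrape still leaves a sample < 5m old

scrape_configs:
  - job_name: 'node'    # hypothetical job
    static_configs:
      - targets: ['node1.example.com:9100']   # placeholder target
```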
When you say "we'll need to make use of the push gateway", you should reconsider this. Whatever the problem is, the Pushgateway is almost always NOT the right solution. It is just a cache that stores the last pushed value; it exists for one-shot scripts which vanish after execution but want to stash a value away to be scraped later (e.g. cron jobs). If you describe your use case, someone can help you find a better solution.

Despite what the docs say, people have reported using Prometheus for long-term storage successfully. There is also a whole range of options available via remote_write. Thanos with an object store will certainly give you very reliable, persistent long-term storage, and will do downsampling too (although this is mainly to improve the performance of queries over long time ranges, rather than to reduce the storage size).

If you're only graphing over 24 hour periods, and alerting over the last N samples, native Prometheus should be fine: the alerting rules will mainly be looking at the "head" chunks, which remain in RAM. However, 8 million series for a single Prometheus instance might be pushing it a bit too much. I would be more comfortable sharding across a few smaller instances than having one monster instance.
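As a rough sketch of the sharding approach (the job name, targets, and shard count of 4 are all illustrative), each Prometheus instance can keep just its own slice of the targets using hashmod relabelling:

```yaml
scrape_configs:
  - job_name: 'node'    # hypothetical job
    static_configs:
      - targets: ['host1:9100', 'host2:9100']   # placeholder targets
    relabel_configs:
      # Hash each target address into one of 4 buckets...
      - source_labels: [__address__]
        modulus: 4
        target_label: __tmp_shard
        action: hashmod
      # ...and keep only the bucket assigned to this instance (shard 0 here).
      - source_labels: [__tmp_shard]
        regex: '0'
        action: keep
```

Each of the 4 instances runs the same config with a different value in the `regex`, so every target is scraped by exactly one instance.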

