On Saturday, October 10, 2020 at 2:32:14 PM UTC+5:30 [email protected] wrote:

4.6TB for 50 days seems like a lot. How many metrics and how many samples per second are you collecting? Just estimating based on the data, it sounds like you might have more than 10 million series and 600-700k samples per second. This might be the time to start thinking about sharding. You can check for sure with these queries:

    prometheus_tsdb_head_series
    rate(prometheus_tsdb_head_samples_appended_total[1h])

>>>> Hi Ben, my time series count hasn't touched 10 million yet; it's around 5.5 million as of now. But my sampling rate is quite steep, sitting at approximately 643,522 samples per second. Since my time series are still manageable by a single Prometheus instance, I am avoiding sharding for now because it would complicate the entire setup. What are your thoughts on this?
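If sharding does become necessary later, one common approach is hashmod-based target sharding: each Prometheus instance scrapes only the targets whose address hashes into its shard. A minimal sketch, assuming two shards and an illustrative job name (`node`); the label and shard values are placeholders:

```yaml
# Hypothetical sketch: run one Prometheus per shard with an identical
# config, changing only the `regex` below (0 on the first server, 1 on
# the second). `modulus` is the total number of shards.
scrape_configs:
  - job_name: 'node'
    relabel_configs:
      - source_labels: [__address__]
        modulus: 2
        target_label: __tmp_shard
        action: hashmod
      - source_labels: [__tmp_shard]
        regex: '0'        # this instance keeps only shard 0
        action: keep
```

The trade-off, as noted above, is operational complexity: queries spanning shards then need a global view (e.g. federation or Thanos Query).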
For handling HA clustering and sharding, I recommend looking into Thanos. It can be added to your existing Prometheus setup and rolled out incrementally.

>>>> Yes, I looked at Thanos, but my only concern is that Thanos uses object storage for long-term retention, which adds latency when querying old data. That is why I am inclined towards VictoriaMetrics. What's your view on going with VictoriaMetrics?

> d) If we do use 2 separate disks for the 2 instances, how will we manage the config files?

If you don't have any configuration management, I recommend using https://github.com/cloudalchemy/ansible-prometheus. It's very easy to get going.

>>>> Thanks. I'll check it out.

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/6cfb736c-ba38-4e8d-8468-cdc84f2971f2n%40googlegroups.com.
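P.S. Following up on the Thanos point above: wiring the sidecar (or store gateway) to object storage is just a small bucket config passed via `--objstore.config-file`. A hypothetical S3 example; bucket name, endpoint, and credentials are placeholders:

```yaml
# Hypothetical Thanos object-storage config (e.g. bucket.yml).
# All values below are illustrative, not real settings.
type: S3
config:
  bucket: "prometheus-long-term"
  endpoint: "s3.us-east-1.amazonaws.com"
  access_key: "PLACEHOLDER"
  secret_key: "PLACEHOLDER"
```

On the latency concern: the Thanos store gateway keeps index data cached locally, so query latency against old blocks is often better than raw object-storage round-trips would suggest, though benchmarking against VictoriaMetrics for your own query patterns is the only way to know for sure.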

