On Mon, Nov 30, 2020 at 1:08 PM Aliaksandr Valialkin <[email protected]> wrote:
> On Sun, Nov 29, 2020 at 3:10 PM Ben Kochie <[email protected]> wrote:
>
>> On Sun, Nov 29, 2020 at 11:51 AM Aliaksandr Valialkin <[email protected]> wrote:
>>
>>> On Fri, Nov 27, 2020 at 11:11 AM Ben Kochie <[email protected]> wrote:
>>>
>>>>>> Or else are there any other ways by which we can solve this issue?
>>>>>
>>>>> Use something other than federation. remote_write is able to buffer up data locally if the endpoint is down.
>>>>>
>>>>> Prometheus itself can't accept remote_write requests, so you'd have to write to some other system <https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage> which can. I suggest VictoriaMetrics, as it's simple to run and has a very Prometheus-like API, which can be queried as if it were a Prometheus instance.
>>>>
>>>> I recommend Thanos, as it scales better and with less effort than VictoriaMetrics. It also uses the PromQL code directly, so you will get the same results as Prometheus, not an emulation of PromQL.
>>>
>>> Could you share more details on why you think that VictoriaMetrics has scalability issues and is harder to set up and operate than Thanos? VictoriaMetrics users have quite the opposite opinion. See https://victoriametrics.github.io/CaseStudies.html and https://medium.com/faun/comparing-thanos-to-victoriametrics-cluster-b193bea1683.
>>
>> Thanos uses object storage, which avoids the need for manual sharding of TSDB storage. Today I have 100TiB of data stored in object storage buckets. I make no changes to scale these buckets up or down.
>
> VictoriaMetrics stores data on persistent disks. Every replicated durable persistent disk in GCP <https://cloud.google.com/persistent-disk> can scale up to 64TB <https://cloud.google.com/compute/docs/disks/add-persistent-disk#resize_pd> without the need to stop VictoriaMetrics, i.e. without downtime.
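[Editor's note: to make the remote_write suggestion earlier in the thread concrete, here is a minimal sketch of a Prometheus configuration fragment that ships samples to a remote endpoint such as a VictoriaMetrics instance. The endpoint URL and the queue tuning values are illustrative assumptions, not values taken from this thread.]

```yaml
# prometheus.yml (fragment) -- hypothetical endpoint address
remote_write:
  - url: http://victoriametrics.example.com:8428/api/v1/write
    queue_config:
      # Samples are read back from the local WAL, so a temporarily
      # unreachable endpoint does not lose data -- it is retried.
      capacity: 10000
      max_shards: 30
      max_samples_per_send: 1000
```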
> Given that VictoriaMetrics compresses real-world data much better than Prometheus <https://valyala.medium.com/prometheus-vs-victoriametrics-benchmark-on-node-exporter-metrics-4ca29c75590f>, a single-node VictoriaMetrics can substitute for the whole Thanos cluster for your workload (in theory, of course - just give it a try in order to verify this statement :) ). The cluster version of VictoriaMetrics <https://victoriametrics.github.io/Cluster-VictoriaMetrics.html> can scale to petabytes. For example, a cluster with one petabyte of capacity can be built from 16 vmstorage nodes with a 64TB persistent disk per node. That's why VictoriaMetrics in production usually has lower infrastructure costs than Thanos.

* GCP persistent disk costs double that of object storage, and is zone-local only.
* Cost is four times as much if you want regional replication.
* GCP persistent disks don't have multi-regional replication (GCS does by default).
* Object storage versioning makes for easy lifecycle management for disaster recovery.
* Plus you have to maintain some percentage of unused filesystem space to avoid running out of room.
* You can't shrink persistent disks.
* And we're back to manual labor being required to scale.

Storing on persistent disks is a major reason why we don't just use Prometheus for TSDB storage: the instance-level SPoF, the cost of persistent disks compared to object storage, and the toil involved. No thanks, we're moving away from old-school architectures.

> --
> Best Regards,
>
> Aliaksandr Valialkin, CTO VictoriaMetrics
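[Editor's note: as a sanity check on the quoted capacity example, the arithmetic for 16 vmstorage nodes with a 64TB disk each works out to roughly one petabyte. A quick illustrative calculation, not a sizing recommendation:]

```python
# Capacity math for the quoted example: 16 vmstorage nodes,
# each with a 64 TB persistent disk (decimal units, as GCP advertises).
nodes = 16
disk_tb = 64

total_tb = nodes * disk_tb   # 1024 TB of raw disk across the cluster
total_pb = total_tb / 1000   # ~1.02 PB in decimal units

print(total_tb)  # 1024
print(total_pb)  # 1.024
```

Note this is raw disk capacity; replication and the free-space headroom mentioned above reduce the usable amount.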

