Re: [prometheus-users] Preventing data loss from poor network communication

Mathieu Tétreault Mon, 15 Jun 2020 05:30:25 -0700

Alright, I'll look into it.

Just in case we don't have the resources required to run Prometheus and the
Thanos sidecar on the metrics server:

Would there be any issues with using the Pushgateway to cache the metrics
while the network is down? I understand that it would be more complicated to
implement, but are there other drawbacks? I'll do some testing this week, but
I was wondering if there was anything I was missing.
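
One point that may matter for the testing mentioned above: as far as its
documentation describes, the Pushgateway keeps only the most recently pushed
value for each metric in a group and rejects pushes that carry timestamps, so
it would not by itself hold a history of samples taken during an outage. The
toy Python model below only illustrates that last-value behaviour;
`LastValueGateway` and the metric names are hypothetical, and nothing here
talks to a real Pushgateway.

```python
class LastValueGateway:
    """Toy stand-in (hypothetical) for a gateway that keeps one value per metric."""

    def __init__(self):
        self._groups = {}

    def push(self, job, metric, value):
        # A new push overwrites the previous value for the same job/metric.
        self._groups[(job, metric)] = value

    def scrape(self):
        # A scrape sees only whatever is stored at that moment.
        return dict(self._groups)


gateway = LastValueGateway()

# Three readings taken while the link to Prometheus is down.
for reading in (10.0, 12.5, 11.0):
    gateway.push("metrics_server", "queue_depth", reading)

# When the link comes back, only the final reading is still available.
print(gateway.scrape())
```

In this model the first two readings are gone by the time the scrape happens,
which is the gap a plain "push during the outage" approach would leave.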

Thanks for your help, it is really appreciated.

Cheers,

Mathieu

On Sun, Jun 14, 2020 at 7:32 AM Stuart Clark <[email protected]>
wrote:

> What you'd generally do is look at using federation or one of the global
> storage systems like VictoriaMetrics, Thanos or Cortex.
>
> You'd have a Prometheus server in each location, and then central systems
> for global views and alerts.
>
> On 14 June 2020 12:19:43 BST, "Mathieu Tétreault" <
> [email protected]> wrote:
>>
>> I will have to double-check. At first glance, the metrics servers didn't
>> have enough resources available to run Prometheus alongside their
>> application.
>> That's the main reason why I started to investigate a watchdog
>> setup and the Pushgateway.
>>
>> My understanding is that it would also prevent Grafana from
>> displaying the data properly from time to time, since it sometimes won't be
>> able to query the metrics server. That issue would be less visible if we
>> had a global Prometheus instance that stores all the data.
>>
>> Cheers,
>>
>> Mathieu
>>
>> On Sat, Jun 13, 2020 at 8:25 AM Stuart Clark <[email protected]>
>> wrote:
>>
>>> On 12/06/2020 19:45, Mathieu Tétreault wrote:
>>> > We plan on using Prometheus to fetch data from multiple servers, and
>>> > the link between the metrics servers and the Prometheus servers is
>>> > known to be unreliable. The instability can last a couple
>>> > of minutes and there is nothing we can do about it.
>>> >
>>> > Most of the time prometheus will be able to fetch the metrics.
>>> > However, when prometheus is unable to pull the data the metrics server
>>> > will need to be able to cache them until the connection is back.
>>> >
>>> > Since most of the time the connection will be up, I was thinking about
>>> > setting up a watchdog that is reset by each metric pull. When the
>>> > watchdog triggers, the data would be cached until the Pushgateway is pulled.
>>> >
>>> > If anyone has any advice on that, it would be appreciated.
>>> >
>>>
>>> Is it possible to run the Prometheus server on the other end of the link?
>>>
>>> In general it is advised to run Prometheus servers as close as possible
>>> to the things being monitored. For example a server per datacenter
>>> instead of a single global server, etc.
>>>
>>>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
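
The watchdog idea from the original question (a timer reset by every scrape,
with local caching once it expires) could be sketched roughly as follows in
Python. This is only a minimal sketch under assumptions: `ScrapeWatchdog`, the
30-second timeout and the buffer size are hypothetical, and the thread does
not settle how buffered samples would reach Prometheus afterwards.

```python
import time
from collections import deque


class ScrapeWatchdog:
    """Hypothetical helper: buffers samples once scrapes stop arriving."""

    def __init__(self, timeout_s, max_samples=1000, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock                      # injectable for testing
        self._last_scrape = clock()
        self.buffer = deque(maxlen=max_samples)  # oldest samples dropped first

    def on_scrape(self):
        # Call this from the handler that serves the metrics endpoint.
        self._last_scrape = self._clock()

    def expired(self):
        return self._clock() - self._last_scrape > self.timeout_s

    def record(self, name, value):
        # Buffer the sample only while the watchdog has expired.
        if not self.expired():
            return False
        self.buffer.append((time.time(), name, value))
        return True


# Usage with a fake clock so the behaviour is easy to follow.
now = [0.0]
dog = ScrapeWatchdog(timeout_s=30, clock=lambda: now[0])

dog.record("queue_depth", 10.0)  # link is up: nothing is buffered
now[0] = 45.0                    # 45 seconds pass without a scrape
dog.record("queue_depth", 12.5)  # watchdog has expired: sample is buffered
dog.on_scrape()                  # a scrape arrives and resets the watchdog
```

The bounded `deque` is a deliberate choice here: on a server that is already
short on resources, an outage longer than expected drops the oldest samples
instead of growing memory without limit.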

