Re: [prometheus-users] Preventing data loss from poor network communication

Aliaksandr Valialkin Fri, 19 Jun 2020 05:33:15 -0700

Hi Mathieu!

What kind of resources are available on the metrics server? vmagent
<https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md>
could probably be placed on each metrics server to reliably collect data
and then send it to centralized storage once the connection is available.
This is one of the main use cases for vmagent; see
https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md#iot-and-edge-monitoring
for details.
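
To illustrate the store-and-forward idea described above (collect samples
locally while the link is down, then deliver them in order once it is back),
here is a minimal Python sketch. It is not vmagent's implementation: the
class name StoreAndForwardBuffer, the send callback and the max_samples
limit are all hypothetical, and a real agent would normally persist the
queue to disk rather than keep it in memory.

```python
import itertools
import time
from collections import deque
from typing import Callable, Deque, List, Optional, Tuple

# (metric name, value, unix timestamp) -- an illustrative sample shape.
Sample = Tuple[str, float, float]


class StoreAndForwardBuffer:
    """Hypothetical helper: queue samples locally, flush when the link is up."""

    def __init__(self, send: Callable[[List[Sample]], bool],
                 max_samples: int = 100_000) -> None:
        # `send` is an assumed callback that returns True on successful delivery.
        self._send = send
        # Bounded queue: once full, the oldest samples are dropped first.
        self._pending: Deque[Sample] = deque(maxlen=max_samples)

    def __len__(self) -> int:
        return len(self._pending)

    def record(self, name: str, value: float,
               timestamp: Optional[float] = None) -> None:
        """Store a sample with its collection time so it can be sent later."""
        ts = time.time() if timestamp is None else timestamp
        self._pending.append((name, value, ts))

    def flush(self, batch_size: int = 500) -> int:
        """Try to deliver pending samples in order; return how many were sent."""
        sent = 0
        while self._pending:
            batch = list(itertools.islice(self._pending, batch_size))
            try:
                ok = self._send(batch)
            except OSError:
                # Treat network errors as "link is down"; keep the data queued.
                ok = False
            if not ok:
                break
            # Remove samples only after the receiver has accepted them.
            for _ in batch:
                self._pending.popleft()
            sent += len(batch)
        return sent
```

The application would call record() on every collection cycle and flush()
periodically. While the link is down flush() returns 0 and the samples stay
queued with their original timestamps.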


On Mon, Jun 15, 2020 at 3:30 PM Mathieu Tétreault <
[email protected]> wrote:

> Alright, I'll look into it.
>
> Just in case we don't have the resources required to run prometheus and
> the thanos sidecar on the metrics server: would there be any issues with
> using the pushgateway to cache the metrics while the network is down? I
> understand that it would be more complicated to implement, but other than
> that? I'll do some testing this week, but I was wondering if there was
> anything that I was missing.
>
> Thanks for your help, it is really appreciated.
>
> Cheers,
>
> Mathieu
>
> On Sun, Jun 14, 2020 at 7:32 AM Stuart Clark <[email protected]>
> wrote:
>
>> What you'd generally do is look at using federation or one of the global
>> storage systems like VictoriaMetrics, Thanos or Cortex.
>>
>> You'd have a Prometheus server in each location, and then central systems
>> for global views and alerts.
>>
>> On 14 June 2020 12:19:43 BST, "Mathieu Tétreault" <
>> [email protected]> wrote:
>>>
>>> I will have to double check, but at first glance the metrics servers
>>> didn't have enough resources available to run prometheus alongside their
>>> application.
>>> That's the main reason why I started to investigate setting up a
>>> watchdog and the pushgateway.
>>>
>>> My understanding is that it will also prevent grafana from displaying the
>>> data properly from time to time, since sometimes it won't be able to query
>>> the metrics server. That issue would be less visible if we had a global
>>> prometheus instance that stores all the data.
>>>
>>> Cheers,
>>>
>>> Mathieu
>>>
>>> On Sat, Jun 13, 2020 at 8:25 AM Stuart Clark <[email protected]>
>>> wrote:
>>>
>>>> On 12/06/2020 19:45, Mathieu Tétreault wrote:
>>>> > We plan on using prometheus to fetch data from multiple servers, and
>>>> > the link between the metrics server and the prometheus servers is
>>>> > known for not being that reliable. The instability can last a couple
>>>> > of minutes and there is nothing we can do about it.
>>>> >
>>>> > Most of the time prometheus will be able to fetch the metrics.
>>>> > However, when prometheus is unable to pull the data, the metrics
>>>> > server will need to be able to cache them until the connection is back.
>>>> >
>>>> > Since most of the time the connection will be up, I was thinking about
>>>> > setting up a watchdog refreshed by the metric pull. When the watchdog
>>>> > triggers, then cache the data until the pushgateway is pulled.
>>>> >
>>>> > If anyone has any advice on that, it'd be appreciated.
>>>> >
>>>>
>>>> Is it possible to run the Prometheus server on the other end of the
>>>> link?
>>>>
>>>> In general it is advised to run Prometheus servers as close as possible
>>>> to the things being monitored. For example a server per datacenter
>>>> instead of a single global server, etc.
>>>>
>>>>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/CAO%2BPXKMdJCKuBJqZp0TOthyAr6okKrgJH3cNMSLGSqUjzYBgKg%40mail.gmail.com
> <https://groups.google.com/d/msgid/prometheus-users/CAO%2BPXKMdJCKuBJqZp0TOthyAr6okKrgJH3cNMSLGSqUjzYBgKg%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

