I have a similar case. I would like to use remote-write to collect metrics from 
multiple namespaces/clusters, however federation seems much more reliable to 
me. A federation endpoint is just another scrape target: in case of a 
network failure (or any other failure) I will get an alert that the federation 
endpoint is down. With remote write I risk being left blind. I see 
no clear mechanism to be sure I'm actually receiving the metrics =/
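
To illustrate what I mean by "just another scrape target", here is a rough sketch of the federation side. The job name, target hostname, match[] selector and alert name are made up for the example, so treat it as an assumption-laden illustration rather than a recommended setup:

```yaml
# prometheus.yml on the "global" instance (hypothetical job name and target)
scrape_configs:
  - job_name: "federate"
    honor_labels: true          # keep the labels set by the source instance
    metrics_path: "/federate"
    params:
      "match[]":
        - '{job=~".+"}'         # example selector; narrow this in practice
    static_configs:
      - targets: ["prometheus-ns1.example.internal:9090"]
---
# rules file on the "global" instance (hypothetical alert name)
groups:
  - name: federation
    rules:
      - alert: FederationEndpointDown
        expr: up{job="federate"} == 0   # "up" is 0 when the scrape fails
        for: 5m
```

Because the global instance does the scraping itself, a failed scrape shows up in its own "up" series, which is the signal I don't see an obvious equivalent for on the receiving side of remote write.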

What are the possible solutions in this case?

On Wednesday, 20 July 2022 at 18:16:29 UTC+2 [email protected] wrote:

> @Stuart: I agree with most of the points you make :-) I see remote-write as 
> the most appropriate way to forward metrics for my deployment use case.
>
> Using federation is not good in terms of interface standardization, HA of 
> the monitoring stack, and feature support.
>
> For the above case, I have functions and a dedicated set of engineers who 
> own such workloads and query the individual instances, while the global 
> instance is used for centralized monitoring.
>
> I was looking at this closed bug 
> <https://github.com/prometheus/prometheus/issues/5666>, raised on 
> Prometheus in the summer of 2019. To my understanding, there were 
> performance issues with remote-write, but most of them have been resolved 
> and the community sees remote-write as performing better than federation. 
> Am I thinking correctly?
>
> Could you clarify the performance comparison between remote-write and 
> federation?
>
> /Teja
>
> On Tuesday, July 19, 2022 at 5:02:11 PM UTC+2 Stuart Clark wrote:
>
>> On 19/07/2022 13:24, tejaswini vadlamudi wrote: 
>> > @Ben: You make a good point, but getting Thanos or Cortex into the 
>> > picture could be a way forward after some time. For now, do you think 
>> > it is good enough to use remote-write instead of federation? From a 
>> > performance and resource consumption POV, do you see remote-write as 
>> > the way forward? 
>> > 
>> With remote write you could use agent mode, so you don't have to have 
>> local storage other than for the destination instance. 
>>
>> However again it depends what you are trying to achieve and why you have 
>> suggested having four instances. Are you wanting to query all four 
>> instances or only the "global" one? Are you wanting to copy all data to 
>> the "global" instance or only some metrics? Every data point, or only at 
>> a lower frequency? 
>>
>> If you are intending to copy all data (both metrics & data points) that 
>> leans towards remote write as federation works differently. But in that 
>> case there doesn't seem to be any advantage in having the extra three 
>> instances at all (unless you are intending on doing local querying, 
>> alerting or recording rules) - so I'd just have a single instance that 
>> scrapes all namespaces. 
>>
>> Alternatively if you are needing to have separate instances with local 
>> storage/querying then I'd probably not look to copy all the data to the 
>> "global" instance (which just doubles storage and memory usage) and 
>> either use remote write for a much smaller subset of metrics, federation 
>> with a slower scrape rate/reduced set of metrics, or as Ben suggested 
>> something like Thanos (other options exist as well) to do away with the 
>> fourth instance entirely and distribute the queries to the individual 
>> instances instead. 
>>
>> Maybe if you could explain a bit about what the design is hoping to 
>> achieve it would help us advise better? 
>>
>> -- 
>> Stuart Clark 
>>
>>
