Another query to try: topk(10, scrape_samples_scraped)
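A variation on that (just a sketch, not from the original message): comparing the current sample count per target against the count from a day earlier can highlight which targets grew the most, e.g.

    topk(10, scrape_samples_scraped - scrape_samples_scraped offset 1d)

where the 1d offset and the limit of 10 are arbitrary and can be adjusted. (A further drill-down sketch follows the quoted thread below.)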
On Friday, 28 July 2023 at 09:53:00 UTC+1 Ben Kochie wrote:

> That's 7 billion metrics, which would require approximately 30-50 TiB of
> RAM.
>
> On Thu, Jul 27, 2023 at 5:50 PM Brian Candler <[email protected]> wrote:
>
>> As Stuart says, that looks correct, assuming your metrics don't have any
>> labels other than the ones you've excluded. You'd save a lot of typing
>> just by doing:
>>
>> sum(scrape_samples_scraped)
>>
>> which is expected to return a single value, with no labels (as it's
>> summed across all timeseries of this metric).
>>
>> The value 7,525,871,918 does seem quite high - what was it before? You
>> can set an evaluation time for this query in the PromQL browser, or graph
>> this expression over time, to see historical values.
>>
>> You could also look at
>>
>> count(scrape_samples_scraped)
>>
>> or more simply
>>
>> count(up)
>>
>> and see if that has jumped up: it would imply that lots more targets
>> have been added (e.g. more pods are being monitored).
>>
>> If not, then as well as Stuart's suggestion of graphing
>> "scrape_samples_scraped" by itself to see if one particular target is
>> generating way more metrics than usual, you could try different summary
>> variants like
>>
>> sum by (instance,job) (scrape_samples_scraped)
>> sum by (clusterName) (scrape_samples_scraped)
>> ... etc
>>
>> and see if there's a spike in any of these. This may help you drill down
>> to the offending item(s).
>>
>> On Thursday, 27 July 2023 at 15:51:24 UTC+1 Uvais Ibrahim wrote:
>>
>>> Hi Brian,
>>>
>>> This is the query that I have used:
>>>
>>> sum(scrape_samples_scraped) without (app, app_kubernetes_io_managed_by,
>>> clusterName, release, environment, instance, job, k8s_cluster,
>>> kubernetes_name, kubernetes_namespace, ou, app_kubernetes_io_component,
>>> app_kubernetes_io_name, app_kubernetes_io_version,
>>> kustomize_toolkit_fluxcd_io_name, kustomize_toolkit_fluxcd_io_namespace,
>>> application, name, role, app_kubernetes_io_instance,
>>> app_kubernetes_io_part_of, control_plane, beta_kubernetes_io_arch,
>>> beta_kubernetes_io_instance_type, beta_kubernetes_io_os,
>>> failure_domain_beta_kubernetes_io_region,
>>> failure_domain_beta_kubernetes_io_zone, kubernetes_io_arch,
>>> kubernetes_io_hostname, kubernetes_io_os,
>>> node_kubernetes_io_instance_type, nodegroup,
>>> topology_kubernetes_io_region, topology_kubernetes_io_zone, chart,
>>> heritage, revised, transit, component, namespace, pod_name,
>>> pod_template_hash, security_istio_io_tlsMode,
>>> service_istio_io_canonical_name, service_istio_io_canonical_revision,
>>> k8s_app, kubernetes_io_cluster_service, kubernetes_io_name,
>>> route_reflector)
>>>
>>> which simply excludes every label, but I am still getting a result like
>>> this:
>>>
>>> {} 7525871918
>>>
>>> It shouldn't return any results, right?
>>>
>>> Prometheus version: 2.36.2
>>>
>>> By increased traffic I meant that the Prometheus servers have been
>>> getting high traffic from a specific point in time. Currently Prometheus
>>> is getting 13 million packets, whereas earlier it was around 2 to 3
>>> million packets on average. And the Prometheus endpoint is not public.
>>>
>>> On Thursday, July 27, 2023 at 6:06:10 PM UTC+5:30 Brian Candler wrote:
>>>
>>>> scrape_samples_scraped always has the labels which prometheus itself
>>>> adds (i.e. job and instance).
>>>>
>>>> Extraordinary claims require extraordinary evidence. Are you saying
>>>> that the PromQL query scrape_samples_scraped{job="",instance=""}
>>>> returns a result? If so, what's the number? What do you mean by "with
>>>> increased size" - increased as compared to what? And what version of
>>>> prometheus are you running?
>>>>
>>>> In any case, what you see with scrape_samples_scraped may be
>>>> completely unrelated to the "high traffic" issue. Is your prometheus
>>>> server exposed to the Internet? Maybe someone is accessing it
>>>> remotely. Even if not, you can use packet capture to work out where
>>>> the traffic is going to and from.
>>>>
>>>> A tool like https://www.sniffnet.net/ may be helpful.
>>>>
>>>> On Thursday, 27 July 2023 at 13:14:25 UTC+1 Uvais Ibrahim wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Since last night, my Prometheus EC2 servers have been getting
>>>>> unusually high traffic. When I was checking in Prometheus I can see
>>>>> the metric scrape_samples_scraped with an increased size but without
>>>>> any labels. What could be the reason?
>>>>>
>>>>> Thanks,
>>>>> Uvais Ibrahim
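Pulling together the drill-down suggestions quoted above, a minimal sketch (again not from the original thread, and assuming the standard job/instance labels plus a Prometheus server that scrapes itself):

    sort_desc(sum by (job) (scrape_samples_scraped))

lists the scrape jobs contributing the most samples, largest first, while

    rate(prometheus_tsdb_head_samples_appended_total[5m])

graphed over the period in question shows whether the ingestion rate itself jumped at the same time as the network traffic.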

