[prometheus-users] Re: what insecure_skip_verify will do

2024-05-15 Thread Alexander Wilke
It will skip the certificate check, so the certificate may be valid or invalid
and is always trusted.
The connection is still encrypted.
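
For illustration, a minimal sketch of where this option sits in a Prometheus scrape_config (job name and target are placeholders, not from the thread):

```yaml
scrape_configs:
  - job_name: example            # placeholder job name
    scheme: https
    static_configs:
      - targets: ["example.internal:9100"]   # placeholder target
    tls_config:
      # TLS is still negotiated, so traffic stays encrypted,
      # but the server certificate is accepted without verification
      # (no protection against man-in-the-middle).
      insecure_skip_verify: true
```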

Sameer Modak schrieb am Mittwoch, 15. Mai 2024 um 17:04:07 UTC+2:

> Hello Team,
>
> If i set  insecure_skip_verify: true will my data be unsecured. Will it 
> be non ssl??
>



Re: [prometheus-users] snmp generator.yml fails error err="cannot find oid '1.22610.2.4.1.2.2' to walk

2024-05-04 Thread Alexander Wilke
riod: 1: 1-second sampling, 2: 
5-second sampling,
3: 10-second sampling, 4: 30-second sampling, 5: 60-second 
sampling. - 1.3.6.1.4.1.22610.2.4.1.3.6.1.2'
  indexes:
  - labelname: axSysCpuIndexInUsage
type: gauge
  - labelname: axSysCpuUsagePeriodIndex
type: gauge
- name: axSysCpuUsageValueAtPeriod
  oid: 1.3.6.1.4.1.22610.2.4.1.3.6.1.3
  type: gauge
  help: The CPU usage value at given period, 1-sec, 5-sec, 10-sec, 
30-sec, and
60-sec. - 1.3.6.1.4.1.22610.2.4.1.3.6.1.3
  indexes:
  - labelname: axSysCpuIndexInUsage
type: gauge
  - labelname: axSysCpuUsagePeriodIndex
type: gauge
max_repetitions: 100
retries: 0
timeout: 30s
root@ubiquiti:/opt/prometheus/snmp_exporter/generator/mibs_a10#


Mehran Saeed schrieb am Samstag, 4. Mai 2024 um 11:13:09 UTC+2:

> yeah tried without oids but still the same result. 
>
> just to double check. 
> The correct way is to clone the generator repo, add the custom mib to the 
> mibs dir and run the make generate cmd to generate the executable generator 
> file and then run ./generator generate to run the generator. 
>
> not sure what I am doing wrong here. 
>
>
> On Fri, May 3, 2024 at 6:45 PM Alexander Wilke  
> wrote:
>
>> Try this instead of oid
>> axSysMemory
>> axSysCpu
>>
>>
>>
>>
>> Mehran Saeed schrieb am Mittwoch, 1. Mai 2024 um 12:42:06 UTC+2:
>>
>>> Just wondering if this format correct for the generator.yml as in is 
>>> this how mibs or defined 
>>>
>>> auths:
>>>   public_v1:
>>> version: 1
>>>   public_v2:
>>> version: 2
>>>
>>>
>>>   prometheus_v3:
>>> username: user
>>> password: pwd
>>> auth_protocol: SHA
>>> priv_protocol: AES
>>> security_level: authPriv
>>> priv_password: pwd
>>> version: 3
>>>
>>> modules:
>>>   
>>>   a10:
>>> walk:
>>>   - 1.3.6.1.4.1.22610.2.4.1.2.1
>>>
>>>
>>> before running the  ./generator generate \ cmd I did make generate as 
>>> well to create the generator execution file. 
>>> getting the same error no matter what I change. 
>>> On Monday, April 29, 2024 at 8:18:19 AM UTC+1 Mehran Saeed wrote:
>>>
>>>> Yes correct thats the one 
>>>>
>>>> On Sun, Apr 28, 2024 at 5:43 PM Alexander Wilke  
>>>> wrote:
>>>>
>>>>> Is this MIB in the MIBs folder?
>>>>> https://www.circitor.fr/Mibs/Mib/A/A10-AX-MIB.mib
>>>>>
>>>>> Mehran Saeed schrieb am Samstag, 27. April 2024 um 21:08:54 UTC+2:
>>>>>
>>>>>> Thanks for responding. 
>>>>>>
>>>>>> below are the two MIBs for memory I am trying to use:
>>>>>>
>>>>>> memory usage: 1.3.6.1.4.1.22610.2.4.1.2.2
>>>>>>
>>>>>> Field Name : axSysMemoryUsage 
>>>>>> Field Type: Integer32 
>>>>>> Field Status : current 
>>>>>> Description : The usage memory(KB). 
>>>>>> OID : 1.3.6.1.4.1.22610.2.4.1.2.2  
>>>>>>
>>>>>> memory total: 1.3.6.1.4.1.22610.2.4.1.2.1
>>>>>>
>>>>>>   Field Name : axSysMemoryTotal 
>>>>>> Field Type: Integer32
>>>>>> Field Status : current 
>>>>>> Description : The total memory(KB). 
>>>>>> OID : 1.3.6.1.4.1.22610.2.4.1.2.1  
>>>>>>
>>>>>>
>>>>>> I used the cmd below to get the generator executable file:
>>>>>>
>>>>>> make generate
>>>>>>
>>>>>> then to run the generator ran this cmd:
>>>>>>
>>>>>> ./generator generate   -m /snmp_exporter/generator/mibs  -g 
>>>>>> /generator/generator.yml   -o /snmp_exporter/mib/snmp.yml
>>>>>>
>>>>>>
>>>>>> I tried changing the generator.yml file by mentioning just object id 
>>>>>> and OID but no luck
>>>>>>
>>>>>> ---
>>>>>> auths:
>>>>>>   public_v1:
>>>>>> version: 1
>>>>>>   public_v2:
>>>>>> version: 2
>>>>>>
>>>>>>
>>>>>>   prometheus_v3:
>>>>>> username: user
>>>>>> password: pwd
>>>&

Re: [prometheus-users] snmp generator.yml fails error err="cannot find oid '1.22610.2.4.1.2.2' to walk

2024-05-03 Thread Alexander Wilke
Try this instead of oid
axSysMemory
axSysCpu




Mehran Saeed schrieb am Mittwoch, 1. Mai 2024 um 12:42:06 UTC+2:

> Just wondering if this format correct for the generator.yml as in is this 
> how mibs or defined 
>
> auths:
>   public_v1:
> version: 1
>   public_v2:
> version: 2
>
>
>   prometheus_v3:
> username: user
> password: pwd
> auth_protocol: SHA
> priv_protocol: AES
> security_level: authPriv
> priv_password: pwd
> version: 3
>
> modules:
>   
>   a10:
> walk:
>   - 1.3.6.1.4.1.22610.2.4.1.2.1
>
>
> before running the  ./generator generate \ cmd I did make generate as 
> well to create the generator execution file. 
> getting the same error no matter what I change. 
> On Monday, April 29, 2024 at 8:18:19 AM UTC+1 Mehran Saeed wrote:
>
>> Yes correct thats the one 
>>
>> On Sun, Apr 28, 2024 at 5:43 PM Alexander Wilke  
>> wrote:
>>
>>> Is this MIB in the MIBs folder?
>>> https://www.circitor.fr/Mibs/Mib/A/A10-AX-MIB.mib
>>>
>>> Mehran Saeed schrieb am Samstag, 27. April 2024 um 21:08:54 UTC+2:
>>>
>>>> Thanks for responding. 
>>>>
>>>> below are the two MIBs for memory I am trying to use:
>>>>
>>>> memory usage: 1.3.6.1.4.1.22610.2.4.1.2.2
>>>>
>>>> Field Name : axSysMemoryUsage 
>>>> Field Type: Integer32 
>>>> Field Status : current 
>>>> Description : The usage memory(KB). 
>>>> OID : 1.3.6.1.4.1.22610.2.4.1.2.2  
>>>>
>>>> memory total: 1.3.6.1.4.1.22610.2.4.1.2.1
>>>>
>>>>   Field Name : axSysMemoryTotal 
>>>> Field Type: Integer32
>>>> Field Status : current 
>>>> Description : The total memory(KB). 
>>>> OID : 1.3.6.1.4.1.22610.2.4.1.2.1  
>>>>
>>>>
>>>> I used the cmd below to get the generator executable file:
>>>>
>>>> make generate
>>>>
>>>> then to run the generator ran this cmd:
>>>>
>>>> ./generator generate   -m /snmp_exporter/generator/mibs  -g 
>>>> /generator/generator.yml   -o /snmp_exporter/mib/snmp.yml
>>>>
>>>>
>>>> I tried changing the generator.yml file by mentioning just object id 
>>>> and OID but no luck
>>>>
>>>> ---
>>>> auths:
>>>>   public_v1:
>>>> version: 1
>>>>   public_v2:
>>>> version: 2
>>>>
>>>>
>>>>   prometheus_v3:
>>>> username: user
>>>> password: pwd
>>>> auth_protocol: SHA
>>>> priv_protocol: AES
>>>> security_level: authPriv
>>>> priv_password: pwd
>>>> version: 3
>>>>
>>>> modules:
>>>>   
>>>>   a10:
>>>> walk: 
>>>>   - 1.3.6.1.4.1.22610.2.4.1.2.1
>>>>
>>>>
>>>> On Saturday, April 27, 2024 at 7:22:32 PM UTC+1 Alexander Wilke wrote:
>>>>
>>>>> You probably do not have all required MIBs in the folder.
>>>>> Please post the MIBs which you want to use and the generator command
>>>>> you used.
>>>>>
>>>>> PS:
>>>>> At the top of each MIB there are IMPORTS statements which describe
>>>>> which other MIBs are needed.
>>>>> Further, use MIBv2 names instead of OIDs: put the name in the
>>>>> generator.yml, not the OID.
>>>>>
>>>>> Mehran Saeed schrieb am Samstag, 27. April 2024 um 19:38:49 UTC+2:
>>>>>
>>>>>> yes sure
>>>>>> below are the logs whilst generating 
>>>>>>
>>>>>> ```
>>>>>> MIBDIRS='mibs' ./generator --fail-on-parse-errors generate
>>>>>> ts=2024-04-27T17:34:02.776Z caller=net_snmp.go:175 level=info 
>>>>>> msg="Loading MIBs" from=mibs
>>>>>> ts=2024-04-27T17:34:03.011Z caller=main.go:124 level=warn 
>>>>>> msg="NetSNMP reported parse error(s)" errors=3839
>>>>>> ts=2024-04-27T17:34:03.094Z caller=main.go:53 level=info 
>>>>>> msg="Generating config for module" module=a10
>>>>>> ts=2024-04-27T17:34:03.115Z caller=main.go:134 level=error msg="Error 
>>>>>> generating config netsnmp" err="cannot find oid 
>>>>>> 'ax

Re: [prometheus-users] snmp generator.yml fails error err="cannot find oid '1.22610.2.4.1.2.2' to walk

2024-04-28 Thread Alexander Wilke
Is this MIB in the MIBs folder?
https://www.circitor.fr/Mibs/Mib/A/A10-AX-MIB.mib

Mehran Saeed schrieb am Samstag, 27. April 2024 um 21:08:54 UTC+2:

> Thanks for responding. 
>
> below are the two MIBs for memory I am trying to use:
>
> memory usage: 1.3.6.1.4.1.22610.2.4.1.2.2
>
> Field Name : axSysMemoryUsage 
> Field Type: Integer32 
> Field Status : current 
> Description : The usage memory(KB). 
> OID : 1.3.6.1.4.1.22610.2.4.1.2.2  
>
> memory total: 1.3.6.1.4.1.22610.2.4.1.2.1
>
>   Field Name : axSysMemoryTotal 
> Field Type: Integer32
> Field Status : current 
> Description : The total memory(KB). 
> OID : 1.3.6.1.4.1.22610.2.4.1.2.1  
>
>
> I used the cmd below to get the generator executable file:
>
> make generate
>
> then to run the generator ran this cmd:
>
> ./generator generate   -m /snmp_exporter/generator/mibs  -g 
> /generator/generator.yml   -o /snmp_exporter/mib/snmp.yml
>
>
> I tried changing the generator.yml file by mentioning just object id and 
> OID but no luck
>
> ---
> auths:
>   public_v1:
> version: 1
>   public_v2:
> version: 2
>
>
>   prometheus_v3:
> username: user
> password: pwd
> auth_protocol: SHA
> priv_protocol: AES
> security_level: authPriv
> priv_password: pwd
> version: 3
>
> modules:
>   
>   a10:
> walk: 
>   - 1.3.6.1.4.1.22610.2.4.1.2.1
>
>
> On Saturday, April 27, 2024 at 7:22:32 PM UTC+1 Alexander Wilke wrote:
>
>> You probably do not have all required MIBs in the folder.
>> Please post the MIBs which you want to use and the generator command you
>> used.
>>
>> PS:
>> At the top of each MIB there are IMPORTS statements which describe which
>> other MIBs are needed.
>> Further, use MIBv2 names instead of OIDs: put the name in the generator.yml,
>> not the OID.
>>
>> Mehran Saeed schrieb am Samstag, 27. April 2024 um 19:38:49 UTC+2:
>>
>>> yes sure
>>> below are the logs whilst generating 
>>>
>>> ```
>>> MIBDIRS='mibs' ./generator --fail-on-parse-errors generate
>>> ts=2024-04-27T17:34:02.776Z caller=net_snmp.go:175 level=info 
>>> msg="Loading MIBs" from=mibs
>>> ts=2024-04-27T17:34:03.011Z caller=main.go:124 level=warn msg="NetSNMP 
>>> reported parse error(s)" errors=3839
>>> ts=2024-04-27T17:34:03.094Z caller=main.go:53 level=info msg="Generating 
>>> config for module" module=a10
>>> ts=2024-04-27T17:34:03.115Z caller=main.go:134 level=error msg="Error 
>>> generating config netsnmp" err="cannot find oid 
>>> 'axSysSecondaryVersionOnDisk' to walk
>>> ```
>>>
>>> Also if I try generating the if-mib objects from if-mib module they work 
>>> fine. 
>>> I need to generate mibs for A10. have put the a10 mib file in the mibs 
>>> directory. 
>>> Looks like its not able to find the OIDs for a10. 
>>>
>>>
>>> On Sat, Apr 27, 2024 at 10:38 AM Ben Kochie  wrote:
>>>
>>>> Can you post the logs of the `generator generate`? What about `generator 
>>>> parse_errors`?
>>>>
>>>> On Sat, Apr 27, 2024 at 11:37 AM Mehran Saeed  
>>>> wrote:
>>>>
>>> Hello 
>>>>> I am trying to generate snmp.yml from generator but it fails. its for 
>>>>> a10 load balancers as I needed extra mibs for CPU and memory. below is 
>>>>> the 
>>>>> generator.yml config. 
>>>>> I have added the mib file into the correct mib directory as well. 
>>>>>
>>>>> ```
>>>>> ---
>>>>> auths:
>>>>>   public_v1:
>>>>> version: 1
>>>>>   public_v2:
>>>>> version: 2
>>>>>
>>>>>   prometheus_v3:
>>>>> username: user
>>>>> password: pwd
>>>>> auth_protocol: SHA
>>>>> priv_protocol: AES
>>>>> security_level: authPriv
>>>>> priv_password: pwd
>>>>> version: 3
>>>>>
>>>>>   
>>>>>
>>>>> modules:
>>>>>   # Default IF-MIB interfaces table with ifIndex.
>>>>>   a10:
>>>>> walk:
>>>>>   - 1.22610.2.4.1.2.2
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>



Re: [prometheus-users] snmp generator.yml fails error err="cannot find oid '1.22610.2.4.1.2.2' to walk

2024-04-27 Thread Alexander Wilke
You probably do not have all required MIBs in the folder.
Please post the MIBs which you want to use and the generator command you
used.

PS:
At the top of each MIB there are IMPORTS statements which describe which
other MIBs are needed.
Further, use MIBv2 names instead of OIDs: put the name in the generator.yml,
not the OID.
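
As a sketch of that advice, a generator.yml module walking the object names suggested earlier in this thread (assuming A10-AX-MIB and the MIBs it imports are present in the mibs directory):

```yaml
modules:
  a10:
    walk:
      # MIB object names instead of numeric OIDs
      - axSysMemory
      - axSysCpu
```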

Mehran Saeed schrieb am Samstag, 27. April 2024 um 19:38:49 UTC+2:

> yes sure
> below are the logs whilst generating 
>
> ```
> MIBDIRS='mibs' ./generator --fail-on-parse-errors generate
> ts=2024-04-27T17:34:02.776Z caller=net_snmp.go:175 level=info msg="Loading 
> MIBs" from=mibs
> ts=2024-04-27T17:34:03.011Z caller=main.go:124 level=warn msg="NetSNMP 
> reported parse error(s)" errors=3839
> ts=2024-04-27T17:34:03.094Z caller=main.go:53 level=info msg="Generating 
> config for module" module=a10
> ts=2024-04-27T17:34:03.115Z caller=main.go:134 level=error msg="Error 
> generating config netsnmp" err="cannot find oid 
> 'axSysSecondaryVersionOnDisk' to walk
> ```
>
> Also if I try generating the if-mib objects from if-mib module they work 
> fine. 
> I need to generate mibs for A10. have put the a10 mib file in the mibs 
> directory. 
> Looks like its not able to find the OIDs for a10. 
>
>
> On Sat, Apr 27, 2024 at 10:38 AM Ben Kochie  wrote:
>
>> Can you post the logs of the `generator generate`? What about `generator 
>> parse_errors`?
>>
>> On Sat, Apr 27, 2024 at 11:37 AM Mehran Saeed  
>> wrote:
>>
> Hello 
>>> I am trying to generate snmp.yml from generator but it fails. its for 
>>> a10 load balancers as I needed extra mibs for CPU and memory. below is the 
>>> generator.yml config. 
>>> I have added the mib file into the correct mib directory as well. 
>>>
>>> ```
>>> ---
>>> auths:
>>>   public_v1:
>>> version: 1
>>>   public_v2:
>>> version: 2
>>>
>>>   prometheus_v3:
>>> username: user
>>> password: pwd
>>> auth_protocol: SHA
>>> priv_protocol: AES
>>> security_level: authPriv
>>> priv_password: pwd
>>> version: 3
>>>
>>>   
>>>
>>> modules:
>>>   # Default IF-MIB interfaces table with ifIndex.
>>>   a10:
>>> walk:
>>>   - 1.22610.2.4.1.2.2
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>



Re: [prometheus-users] Generator snmp_exporter return error 500 in prometheus

2024-04-22 Thread Alexander Wilke
Is it possible the snmp.yml timeout is too low?
The scrape_config has 1m but the logs say 5s. Maybe you should try setting
timeout: 50s in snmp.yml too, and retries: 0.
Or retries: 3 and timeout: 15s.
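
A minimal sketch of that suggestion as module-level options in snmp.yml, using the field names shown in the generated snmp.yml near the top of this digest (module name and walk OID come from the thread below):

```yaml
modules:
  arte_mib:
    walk:
    - 1.3.6.1.2.1.1
    max_repetitions: 25
    retries: 0
    timeout: 50s   # keep this below the Prometheus scrape timeout (1m here)
```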

Ben Kochie schrieb am Dienstag, 16. April 2024 um 11:20:05 UTC+2:

> I've got a new packet debugging option that I've been working on:
> https://github.com/prometheus/snmp_exporter/pull/1157
>
> On Tue, Apr 16, 2024 at 10:56 AM Nicolas  wrote:
>
>> Hi again,
>>
>> So I confirm that the 1.3.6.1.2.1.1 doesn't work fine but I don't know 
>> why...
>>
>> *debug.log* :
>>
>>
>> 214 level=debug auth=cisco_v3 target=xx.xx.xx.xx module=arte_mib msg="Walking subtree" oid=1.3.6.1.2.1.1
>> 393 level=info auth=cisco_v3 target=xx.xx.xx.xx module=arte_mib msg="Error scraping target" err="error walking target xx.xx.xx.xx: request timeout (after 3 retries)"
>> 464 level=debug auth=cisco_v3 target=xx.xx.xx.xx module=arte_mib msg="Finished scrape" duration_seconds=20.048886702
>>
>> $ snmpwalk -v3 -l authPriv -u user -a SHA -A secret -x AES -X secret xx.xx.xx.xx 1.3.6.1.2.1.1
>>
>> SNMPv2-MIB::sysDescr.0 = STRING: Cisco IOS Software [Cupertino], Catalyst L3 Switch Software (CAT9K_IOSXE), Version 17.9.4, RELEASE SOFTWARE (fc5)
>> Technical Support: http://www.cisco.com/techsupport
>> Copyright (c) 1986-2023 by Cisco Systems, Inc.
>> Compiled Wed 26-Jul-23 10:26 by mcpre
>> SNMPv2-MIB::sysObjectID.0 = OID: SNMPv2-SMI::enterprises.9.1.2494
>> DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (725228220) 83 days, 22:31:22.20
>> SNMPv2-MIB::sysContact.0 = STRING:
>> SNMPv2-MIB::sysName.0 = STRING: xx.xx.xx.xx
>> SNMPv2-MIB::sysLocation.0 = STRING: SITE
>> SNMPv2-MIB::sysServices.0 = INTEGER: 6
>> SNMPv2-MIB::sysORLastChange.0 = Timeticks: (0) 0:00:00.00
>> SNMPv2-MIB::sysORID.1 = OID: SNMPv2-SMI::enterprises.9.7.129
>> SNMPv2-MIB::sysORID.2 = OID: SNMPv2-SMI::enterprises.9.7.115
>> SNMPv2-MIB::sysORID.3 = OID: SNMPv2-SMI::enterprises.9.7.265
>>
>> *generator.yml*
>>
>> auths:
>>   cisco_v3:
>>     security_level: authPriv
>>     username: user
>>     password: secret
>>     auth_protocol: SHA
>>     priv_protocol: AES
>>     priv_password: secret
>>     version: 3
>> modules:
>>   arte_mib:
>>     walk:
>>     - 1.3.6.1.2.1.1
>>
>> ./generator generate -m /usr/share/snmp/mibs/ -g generator.yml -o snmp.yml
>>
>>
>>
>> ts=2024-04-16T08:34:31.972Z caller=net_snmp.go:175 level=info msg="Loading MIBs" from=/usr/share/snmp/mibs/
>> ts=2024-04-16T08:34:32.016Z caller=main.go:53 level=info msg="Generating config for module" module=arte_mib
>> ts=2024-04-16T08:34:32.018Z caller=main.go:68 level=info msg="Generated metrics" module=arte_mib metrics=12
>> ts=2024-04-16T08:34:32.019Z caller=main.go:93 level=info msg="Config written" file=/etc/prometheus/snmp_generator/snmp_exporter-0.25.0/generator/snmp.yml
>>
>> *snmp.yml (generated by generator) *
>> # WARNING: This file was auto-generated using snmp_exporter generator, 
>> manual changes will be lost.
>> auths:
>>   cisco_v3:
>> community: public
>> security_level: authPriv
>> username: user
>> password: secret
>> auth_protocol: SHA
>> priv_protocol: AES
>> priv_password: secret
>> version: 3
>> modules:
>>   arte_mib:
>> walk:
>> - 1.3.6.1.2.1.1
>> metrics:
>> - name: sysDescr
>>   oid: 1.3.6.1.2.1.1.1
>>   type: DisplayString
>>   help: A textual description of the entity - 1.3.6.1.2.1.1.1
>> - name: sysObjectID
>>   oid: 1.3.6.1.2.1.1.2
>>   type: OctetString
>>   help: The vendor's authoritative identification of the network 
>> management subsystem
>> contained in the entity - 1.3.6.1.2.1.1.2
>> - name: sysUpTime
>>   oid: 1.3.6.1.2.1.1.3
>>   type: gauge
>>   help: The time (in hundredths of a second) since the network 
>> management portion
>> of the system was last re-initialized. - 1.3.6.1.2.1.1.3
>> - name: sysContact
>>   oid: 1.3.6.1.2.1.1.4
>>   type: DisplayString
>>   help: The textual identification of the contact person for this 
>> managed node,
>> together with information on how to contact this person - 
>> 1.3.6.1.2.1.1.4
>> - name: sysName
>>   oid: 1.3.6.1.2.1.1.5
>>   type: DisplayString
>>   help: An administratively-assigned name for this managed node - 
>> 1.3.6.1.2.1.1.5
>> - name: sysLocation
>>   oid: 1.3.6.1.2.1.1.6
>>   type: DisplayString
>>   help: The physical location of this node (e.g., 'telephone closet, 
>> 3rd floor')
>> - 1.3.6.1.2.1.1.6
>> - name: sysServices
>>   oid: 1.3.6.1.2.1.1.7
>>   type: gauge
>>   help: A value which indicates the set of services that this entity 
>> may potentially
>> offer - 1.3.6.1.2.1.1.7
>> - name: sysORLastChange
>>   oid: 1.3.6.1.2.1.1.8
>>   type: gauge
>>   help: The value of sysUpTime at 

[prometheus-users] Re: Prometheus Agent 2.51.1 - remote_write - write_relabel_configs

2024-04-17 Thread Alexander Wilke
Hello,

I don't know exactly what the issue was - however, this is working now.
Probably a visualization issue on my side with a (bad) Grafana query I used.

remote_write:
  - url: "https://prometheus-q.domain.de:9009/api/v1/push"
    basic_auth:
      username: "tenant_02"
      password: "tenant_02"
    queue_config:
      min_shards: 3
    write_relabel_configs:
      - source_labels: [job]
        regex: "(node_exporter|windows_exporter)"
        action: keep

Alexander Wilke schrieb am Mittwoch, 17. April 2024 um 09:51:57 UTC+2:

> Hello,
>
> I have several Prometheus agents doing remote write to another central
> Prometheus agent.
> On this central Prometheus agent I do two remote_writes to two
> destinations.
>
> Destination A should receive all metrics, this seems to work.
> Destination B should only receive metrics from specific jobs.
>
> the job names are:
> windows_exporter
> node_exporter
>
> I tried several combinations but it looks like I still send all metrics to 
> destination B:
>
> In the example below there is only destination B.
>
>
>
>
>
> remote_write:
>   - url: "https://prometheus-q.domain.de:9009/api/v1/push"
>     basic_auth:
>       username: "tenant_02"
>       password: "tenant_02"
>     queue_config:
>       min_shards: 3
>     write_relabel_configs:
>       - source_labels: [job]
>         regex: "node_exporter"
>         action: keep
>       - source_labels: [job]
>         regex: "windows_exporter"
>         action: keep
>       - source_labels: [job]
>         regex: ".*"
>         action: drop
>



[prometheus-users] Prometheus Agent 2.51.1 - remote_write - write_relabel_configs

2024-04-17 Thread Alexander Wilke
Hello,

I have several Prometheus agents doing remote write to another central
Prometheus agent.
On this central Prometheus agent I do two remote_writes to two
destinations.

Destination A should receive all metrics, this seems to work.
Destination B should only receive metrics from specific jobs.

the job names are:
windows_exporter
node_exporter

I tried several combinations but it looks like I still send all metrics to 
destination B:

In the example below there is only destination B.





remote_write:
  - url: "https://prometheus-q.domain.de:9009/api/v1/push"
    basic_auth:
      username: "tenant_02"
      password: "tenant_02"
    queue_config:
      min_shards: 3
    write_relabel_configs:
      - source_labels: [job]
        regex: "node_exporter"
        action: keep
      - source_labels: [job]
        regex: "windows_exporter"
        action: keep
      - source_labels: [job]
        regex: ".*"
        action: drop
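
For reference: write_relabel_configs rules are applied in sequence, and a keep action drops every sample whose source labels do not match its regex, so chaining two keep rules with different job regexes leaves nothing to forward. That is why the working configuration in the follow-up above combines both jobs into a single regex. A minimal sketch of that single rule:

```yaml
write_relabel_configs:
  - source_labels: [job]
    # one keep rule with an alternation; samples from any other job are dropped
    regex: "(node_exporter|windows_exporter)"
    action: keep
```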



Re: [prometheus-users] Correlation between snmp scrape time and massive rate output for ifHCInOctets

2024-03-16 Thread Alexander Wilke
https://github.com/prometheus/snmp_exporter/tree/main/generator

Alexander Wilke schrieb am Samstag, 16. März 2024 um 09:08:44 UTC+1:

> Check the file format example.
>
> Timeout, retries, max_repetitions.
>
> I use max_repetitions 50 or 100 with Cisco, retries 0, and a timeout 1s or 500ms
> below the Prometheus scrape timeout.
>
> Ben Kochie schrieb am Samstag, 16. März 2024 um 06:31:17 UTC+1:
>
>> This is very likely a problem with counter resets or some other kind of 
>> duplicate data.
>>
>> The best way to figure this out is to perform the query, but without the 
>> `rate()` function.
>>
>> This can be done via the Prometheus UI (harder to do in Grafana) in the 
>> "Table" view.
>>
>> Here is an example demo query 
>> <https://prometheus.demo.do.prometheus.io/graph?g0.expr=process_cpu_seconds_total%7Bjob%3D%22prometheus%22%7D%5B2m%5D=1_mode=lines_exemplars=0_input=1h>
>>
>> The results is a list of the raw samples that are needed to debug.
>>
>> On Fri, Mar 15, 2024 at 11:41 PM Nick Carlton  
>> wrote:
>>
>>> Hello Everyone,
>>>
>>> I have just seen something weird in my environment where I saw interface 
>>> bandwidth on a gigabit switch reach about 1tbps on some of the 
>>> interfaces.
>>>
>>> Here is the query im using:
>>>
>>> rate(ifHCInOctets{ifHCInOctetsIntfName=~".*.\\/.*.",instance=""}[2m])
>>>  
>>> * 8
>>>
>>> Which ive never had a problem with. Here is an image of the graph 
>>> showing the massive increase in bandwidth and then decrease back to normal:
>>>
>>> [image: Screenshot 2024-03-15 222353.png]
>>>
>>> When Ive done some more investigation into what could have happened, I 
>>> can see that the 'snmp_scrape_duration_seconds' metric increases to around 
>>> 20s at the time. So the cisco switch is talking 20 seconds to respond to 
>>> the SNMP request.
>>>
>>> [image: Screenshot 2024-03-15 44.png]
>>>
>>> Im a bit confused as to how this could cause the rate query to give 
>>> completely false data? Could the delay in data have caused prometheus to 
>>> think there was more bandwidth on the interface? The switch certainly 
>>> cannot do the speeds the graph is claiming!
>>>
>>> Im on v0.25.0 on the SNMP exporter and its normally sat around 2s for 
>>> the scrapes. Im not blaming the exporter for the high response times, thats 
>>> probably the switch. Just wondering if in some way the high response time 
>>> could cause the rate query to give incorrect data. The fact the graph went 
>>> back to normal post the high reponse times makes me think it wasn't the 
>>> switch giving duff data.
>>>
>>> Anyone seen this before and is there any way to mitigate? Happy to 
>>> provide more info if required :)
>>>
>>> Thanks
>>> Nick
>>>
>>>
>>



Re: [prometheus-users] Correlation between snmp scrape time and massive rate output for ifHCInOctets

2024-03-16 Thread Alexander Wilke
Check the file format example.

Timeout, retries, max_repetitions.

I use max_repetitions 50 or 100 with Cisco, retries 0, and a timeout 1s or 500ms
below the Prometheus scrape timeout.
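
A sketch of those settings in a generator.yml module (module name and walk entry are placeholders; the field names follow the snmp_exporter generator file format linked in the follow-up message above):

```yaml
modules:
  cisco_if:                  # placeholder module name
    walk:
      - ifHCInOctets         # the counter discussed in this thread
    max_repetitions: 100
    retries: 0
    timeout: 9s              # e.g. 1s below an assumed 10s Prometheus scrape timeout
```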

Ben Kochie schrieb am Samstag, 16. März 2024 um 06:31:17 UTC+1:

> This is very likely a problem with counter resets or some other kind of 
> duplicate data.
>
> The best way to figure this out is to perform the query, but without the 
> `rate()` function.
>
> This can be done via the Prometheus UI (harder to do in Grafana) in the 
> "Table" view.
>
> Here is an example demo query 
> 
>
> The results is a list of the raw samples that are needed to debug.
>
> On Fri, Mar 15, 2024 at 11:41 PM Nick Carlton  
> wrote:
>
>> Hello Everyone,
>>
>> I have just seen something weird in my environment where I saw interface 
>> bandwidth on a gigabit switch reach about 1tbps on some of the 
>> interfaces.
>>
>> Here is the query im using:
>>
>> rate(ifHCInOctets{ifHCInOctetsIntfName=~".*.\\/.*.",instance=""}[2m])
>>  
>> * 8
>>
>> Which ive never had a problem with. Here is an image of the graph showing 
>> the massive increase in bandwidth and then decrease back to normal:
>>
>> [image: Screenshot 2024-03-15 222353.png]
>>
>> When Ive done some more investigation into what could have happened, I 
>> can see that the 'snmp_scrape_duration_seconds' metric increases to around 
>> 20s at the time. So the cisco switch is talking 20 seconds to respond to 
>> the SNMP request.
>>
>> [image: Screenshot 2024-03-15 44.png]
>>
>> Im a bit confused as to how this could cause the rate query to give 
>> completely false data? Could the delay in data have caused prometheus to 
>> think there was more bandwidth on the interface? The switch certainly 
>> cannot do the speeds the graph is claiming!
>>
>> Im on v0.25.0 on the SNMP exporter and its normally sat around 2s for the 
>> scrapes. Im not blaming the exporter for the high response times, thats 
>> probably the switch. Just wondering if in some way the high response time 
>> could cause the rate query to give incorrect data. The fact the graph went 
>> back to normal post the high reponse times makes me think it wasn't the 
>> switch giving duff data.
>>
>> Anyone seen this before and is there any way to mitigate? Happy to 
>> provide more info if required :)
>>
>> Thanks
>> Nick
>>
>>
>



Re: [prometheus-users] blackbox_exporter 0.24.0 and smokeping_prober 0.7.1 - DNS cache "nscd" not working

2024-03-15 Thread Alexander Wilke
Thanks for the hint. I checked the Go DNS feature and found these hints:


   1. export GODEBUG=netdns=go # force pure Go resolver 
   2. export GODEBUG=netdns=cgo # force cgo resolver 
   


I tried to set the cgo env variable and restarted the services; however,
systemd-resolved and nscd still do not seem to cache the lookups.
I may have to wait for a colleague who is more experienced in Linux than me;
perhaps we can figure out why it is not working with the new behaviour.




Ben Kochie schrieb am Freitag, 15. März 2024 um 17:52:09 UTC+1:

> All of the Prometheus components you're talking about are 
> statically compiled Go binaries. These use Go's native DNS resolution. It 
> does not use glibc. So maybe looking for solutions related to Golang and 
> nscd would help. I've not looked into this myself.
>
> But on the subject of node local DNS caches. I can highly 
> recommend CoreDNS's cache plugin[0]. It even has built-in Prometheus 
> support so you can find how good your cache is working. The CoreDNS cache 
> specifically supports prefetching, which is important for making sure 
> there's no gap or latency in updating the cache when the TTL is close to 
> expiring.
>
> [0]: https://coredns.io/plugins/cache/
> [1]: https://coredns.io/plugins/metrics/
>
> On Fri, Mar 15, 2024 at 3:41 PM Alexander Wilke  
> wrote:
>
>> Hello,
>>
>> I am running blackbox_exporter and smokeping_prober in a RHEL8
>> environment. Unfortunately, with our config we have around 4-5 million DNS
>> queries per 24 hrs.
>>
>> The reason for that is that we do very frequent TCP probes to various
>> destinations, which results in many DNS requests.
>>
>> To reduce the DNS load on the DNS server we tried to implement "nscd" as
>> a DNS cache.
>>
>> However, running strace we noticed that blackbox_exporter checks
>> resolv.conf, then nsswitch.conf, then /etc/hosts, and then sends the query
>> directly to the DNS server, not using the DNS cache. That is for every target
>> of blackbox_exporter.
>>
>> For smokeping_prober I am aware that it resolves DNS only at restart, and
>> we notice the same: all requests are sent directly to the DNS server but not
>> to the cache.
>>
>> Is anyone using nscd on RHEL8 to cache blackbox_exporter and/or
>> smokeping_prober lookups?
>>
>> If not, does anyone have a working, simple configuration with unbound for
>> this specific scenario?
>>
>> Are blackbox and smokeping using glibc methods to resolve DNS, or something
>> else?
>>
>> Thank you very much!
>>
>>
>



[prometheus-users] Re: Correlation between snmp scrape time and massive rate output for ifHCInOctets

2024-03-15 Thread Alexander Wilke
Hello,

1.) Is the timeout of 50s the same in the Prometheus scrape_config and the snmp.yml
file?
2.) Is this really the name of the interface label: ifHCInOctetsIntfName?
3.) The =~".*.\\/.*." regex maybe matches many interfaces, maybe some internal
loopback which may count traffic twice? Further, it may match port channels
(Po), VLANs (Po.xy) and physical interfaces.

I am not sure, but the screenshots show "stacked lines" - is it possible
that in the first screenshot the throughput of all interfaces was stacked?


Nick Carlton schrieb am Freitag, 15. März 2024 um 23:43:19 UTC+1:

> To clarify, my scrapes for this data run every 1m and have a timeout of 50s
>
> On Friday 15 March 2024 at 22:41:52 UTC Nick Carlton wrote:
>
>> Hello Everyone,
>>
>> I have just seen something weird in my environment where I saw interface 
>> bandwidth on a gigabit switch reach about 1tbps on some of the 
>> interfaces.
>>
>> Here is the query im using:
>>
>> rate(ifHCInOctets{ifHCInOctetsIntfName=~".*.\\/.*.",instance=""}[2m])
>>  
>> * 8
>>
>> Which ive never had a problem with. Here is an image of the graph showing 
>> the massive increase in bandwidth and then decrease back to normal:
>>
>> [image: Screenshot 2024-03-15 222353.png]
>>
>> When Ive done some more investigation into what could have happened, I 
>> can see that the 'snmp_scrape_duration_seconds' metric increases to around 
>> 20s at the time. So the cisco switch is talking 20 seconds to respond to 
>> the SNMP request.
>>
>> [image: Screenshot 2024-03-15 44.png]
>>
>> Im a bit confused as to how this could cause the rate query to give 
>> completely false data? Could the delay in data have caused prometheus to 
>> think there was more bandwidth on the interface? The switch certainly 
>> cannot do the speeds the graph is claiming!
>>
>> Im on v0.25.0 on the SNMP exporter and its normally sat around 2s for the 
>> scrapes. Im not blaming the exporter for the high response times, thats 
>> probably the switch. Just wondering if in some way the high response time 
>> could cause the rate query to give incorrect data. The fact the graph went 
>> back to normal post the high reponse times makes me think it wasn't the 
>> switch giving duff data.
>>
>> Anyone seen this before and is there any way to mitigate? Happy to 
>> provide more info if required :)
>>
>> Thanks
>> Nick
>>
>



[prometheus-users] blackbox_exporter 0.24.0 and smokeping_prober 0.7.1 - DNS cache "nscd" not working

2024-03-15 Thread Alexander Wilke
Hello,

I am running blackbox_exporter and smokeping_prober in a RHEL8 environment.
Unfortunately, with our config we have around 4-5 million DNS queries per
24 hrs.

The reason for that is that we do very frequent TCP probes to various
destinations, which results in many DNS requests.

To reduce the DNS load on the DNS server we tried to implement "nscd" as a
DNS cache.

However, running strace we noticed that blackbox_exporter checks
resolv.conf, then nsswitch.conf, then /etc/hosts, and then sends the query
directly to the DNS server, not using the DNS cache. That is for every target
of blackbox_exporter.

For smokeping_prober I am aware that it resolves DNS only at restart, and we
notice the same: all requests are sent directly to the DNS server but not to
the cache.

Is anyone using nscd on RHEL8 to cache blackbox_exporter and/or
smokeping_prober lookups?

If not, does anyone have a working, simple configuration with unbound for this
specific scenario?

Are blackbox and smokeping using glibc methods to resolve DNS, or something
else?

Thank you very much!



[prometheus-users] Re: prometheus agent 2.49.1 - remote_write metrics while disconnected

2024-03-14 Thread Alexander Wilke
I found several posts asking whether the Prometheus agent can
store metrics locally if the remote_write destination is not available.

This post describes that it should work for at least 2 hrs:
https://prometheus.io/blog/2021/11/16/agent/

However, here is a GitHub issue which seems to describe the same issue I saw
in my environment:
https://github.com/prometheus/prometheus/issues/11919

Can someone confirm this works with 2.49.1, storing metrics for some
time on the Prometheus agent until remote_write is available again?
If yes, how did you configure this?
Alexander Wilke schrieb am Dienstag, 12. März 2024 um 23:16:10 UTC+1:

> Hello,
>
> I am running a prometheus server as a central instance.
> I configured Prometheus in agent mode on another system and send the metrics
> via remote_write from the agent to the central Prometheus. This works.
>
> I changed the password in web.config.yml on the central server so it does
> not match for the Prometheus agent anymore. I could see that metrics did
> not arrive anymore.
>
> I waited for 2-3 minutes and then reconfigured the correct password on
> the central Prometheus, did a reload, and I could see metrics arrive
> immediately after that change. Unfortunately I did not receive the metrics
> from the outage time.
>
> My main intention was exactly this - that I can get the metrics if there
> is an issue in the network or whatever between the agent and the central Prometheus.
>
> Is this expected behaviour, or a misconfiguration/misunderstanding on my side?
>



[prometheus-users] Re: Best practice: "job_name" in prometheus agent? Same job_name allowed?

2024-03-14 Thread Alexander Wilke
Thanks for your response.

What would you recommend in a situation with several hundreds or thousands
of servers, or systems within a Kubernetes cluster, which should have the
node_exporter installed?
My idea was to install node_exporter + prometheus agent. The agent scrapes
the local node_exporter and then remote_writes the results to a central
Prometheus server or a load balancer which distributes to different
Prometheus servers.
My idea was to use the same config for all node_exporter + prometheus
agents. For that reason they all have the same job name, which would be OK.

However, I think I will have a problem, because if I use "127.0.0.1:9100" as
the target to scrape then all instances are equal.

Is there any possibility to use a variable in the scrape_config which
reflects an environment variable from the Linux system, or any other mechanism
to make this instance unique?
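
One possible approach (an assumption on my part, not from the thread): recent Prometheus versions can expand environment variables in external_labels when started with --enable-feature=expand-external-labels, so every agent can attach an identifying label even though the scraped target is always 127.0.0.1:9100. A minimal sketch:

```yaml
# prometheus agent started with something like:
#   prometheus --enable-feature=agent,expand-external-labels --config.file=agent.yml
global:
  external_labels:
    # expanded from the host environment at startup; variable name is hypothetical
    node: ${HOSTNAME}

scrape_configs:
  - job_name: node_exporter
    static_configs:
      - targets: ["127.0.0.1:9100"]

remote_write:
  - url: "https://central-prometheus.example/api/v1/write"   # placeholder URL
```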


Brian Candler schrieb am Donnerstag, 14. März 2024 um 13:04:07 UTC+1:

> As long as all the time series have distinct label sets (in particular, 
> different "instance" labels), and you're not mixing scraping with 
> remote-writing for the same targets, then I don't see any problem with all 
> the agents using the same "job" label when remote-writing.
>
> On Tuesday 12 March 2024 at 22:30:22 UTC Alexander Wilke wrote:
>
>> At the moment I am running a job with the name
>> "node_exporter" which has 20 different targets (instances).
>> With this configuration there should not be any conflict.
>>
>> My idea is to install the prometheus agent on the nodes themselves.
>> Technically it looks like it works if I use the same job_name on the agent
>> and the central Prometheus, as long as the targets/instances are different.
>>
>> In general I avoid conflicting job_names, but in this situation it may be
>> OK from my point of view.
>>
>> What do you think or recommend in this specific scenario?
>>
>



[prometheus-users] Best practice: "job_name" in prometheus agent? Same job_name allowed?

2024-03-12 Thread Alexander Wilke
At the moment I am running a job with the name
"node_exporter" which has 20 different targets (instances).
With this configuration there should not be any conflict.

My idea is to install the prometheus agent on the nodes themselves.
Technically it looks like it works if I use the same job_name on the agent
and the central Prometheus, as long as the targets/instances are different.

In general I avoid conflicting job_names, but in this situation it may be OK
from my point of view.

What do you think or recommend in this specific scenario?



[prometheus-users] prometheus agent 2.49.1 - remote_write metrics while disconnected

2024-03-12 Thread Alexander Wilke
Hello,

I am running a prometheus server as a central instance.
I configured Prometheus in agent mode on another system and send the metrics
via remote_write from the agent to the central Prometheus. This works.

I changed the password in web.config.yml on the central server so it does
not match for the Prometheus agent anymore. I could see that metrics did
not arrive anymore.

I waited for 2-3 minutes and then reconfigured the correct password on the
central Prometheus, did a reload, and I could see metrics arrive immediately
after that change. Unfortunately I did not receive the metrics from the
outage time.

My main intention was exactly this - that I can get the metrics if there is
an issue in the network or whatever between the agent and the central Prometheus.

Is this expected behaviour, or a misconfiguration/misunderstanding on my side?



[prometheus-users] Re: snmp_exporter adding trailing 0's to me oid's

2024-03-09 Thread Alexander Wilke
Hi,

try this instead of the OID:

inQueueSize
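
A sketch of that suggestion in generator.yml, using the module name from the quoted question below (assuming the Barracuda MIB that defines inQueueSize is present in the mibs directory):

```yaml
modules:
  barracuda-spam:
    walk:
      # walk by object name instead of the numeric OID
      - inQueueSize
```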

Neil Stottler schrieb am Freitag, 8. März 2024 um 21:17:45 UTC+1:

> Hey all,
>
> Weird issue I'm seeing. In my generator.yml file I'm looking to add a 
> barracuda module as follows:
>   #barracuda
>   barracuda-spam:
> walk:
>   - 1.3.6.1.4.1.20632.2.2
> When I do this in the snmp.yml it adds a trailing 0 to the end making this 
> walk invalid. Any idea why?
>



Re: [prometheus-users] Smokeping_prober CPU usage optimization possible?

2024-02-27 Thread Alexander Wilke
Hello Ben,

I googled a little bit and found this:
https://github.com/prometheus/prometheus/issues/2665#issuecomment-342149607

As far as I understand, this variable is not working anymore, or not used
anymore!?
I tried in a test environment:

export GOGC=200

and then restarted (not reloaded) Prometheus, and in the UI under "Runtime & Build
Information" the GOGC field is still empty.

1.) Is this environment variable set correctly?
2.) Is the variable still working?
3.) If it is still working, can I apply it only to smokeping_prober but not
other services like Prometheus? It sounds like a higher GOGC has tradeoffs for
queries in the Prometheus TSDB?

Ben Kochie schrieb am Dienstag, 27. Februar 2024 um 16:59:31 UTC+1:

> Interesting, thanks for the data. It does seem like the process is 
> spending a lot of time doing GC like I thought.
>
> One trick you could try is to increase the memory allocated to the prober, 
> which would reduce the time spent on GC.
>
> The default setting is is GOGC=100.
>
> You could try increasing this by setting the environment variable, GOGC.
>
> Try something like GOGC=200 or GOGC=300.
>
> This will make the process use more memory, but it should reduce the CPU 
> time spent.
>
> On Sun, Feb 25, 2024 at 11:18 PM Alexander Wilke  
> wrote:
>
>> Hello,
>>
>> I attached a few screenshots showing the results and graphs for 1h and 6h.
>> In addition I added a screenshot from node_exporter metrics to give you 
>> an overview of the system itself.
>> On the same system there is prometheus, grafana, snmp_exporter (200-800% 
>> CPU), smokeping prober, node_exporter, blackbox_exporter.
>> The main CPU consumers are snmp_exporter and smokeping.
>>
>> Ben Kochie schrieb am Sonntag, 25. Februar 2024 um 19:22:35 UTC+1:
>>
>>> Looking at the CPU profile, I'm seeing almost all the time spent in the 
>>> Go runtime. Mostly the ICMP packet receiving code and garbage collection. 
>>> I'm not sure there's a lot we can optimize here as it's core Go code for 
>>> ICMP packet handling.
>>>
>>> Can you also post me a graph of a few metrics queries?
>>>
>>> rate(process_cpu_seconds_total{job="smokeping_prober"}[30s])
>>> rate(go_gc_duration_seconds_count{job="smokeping_prober"}[5m])
>>> rate(go_gc_duration_seconds_sum{job="smokeping_prober"}[5m])
>>>
>>>
>>> On Sun, Feb 25, 2024 at 7:08 PM Alexander Wilke  
>>> wrote:
>>>
>>>> Hello,
>>>> any Chance to investigate the Reports and any suggestions?
>>>>
>>>> Alexander Wilke schrieb am Donnerstag, 22. Februar 2024 um 12:40:09 
>>>> UTC+1:
>>>>
>>>>> Hello,
>>>>>
>>>>> sorry for the delay. here are the results. to be honest - I do not 
>>>>> understand anything of it.
>>>>>
>>>>> Smokeping_Prober Heap:
>>>>>
>>>>>
>>>>> https://pprof.me/a1e7400d32859dbc217e2182398485df/?profileType=profile%3Aalloc_objects%3Acount%3Aspace%3Abytes_items=icicle
>>>>>
>>>>>  
>>>>>
>>>>> smokeping_prober profile30s
>>>>>
>>>>>
>>>>> https://pprof.me/340674b335e114e4b0df6b4582f0644e/?profileType=profile%3Asamples%3Acount%3Acpu%3Ananoseconds%3Adelta
>>>>>
>>>>> Ben Kochie schrieb am Dienstag, 20. Februar 2024 um 10:27:10 UTC+1:
>>>>>
>>>>>> Best thing you can do is capture some pprof data. That will show you 
>>>>>> what it's spending the time on.
>>>>>>
>>>>>> :9374/debug/pprof/heap 
>>>>>> :9374/debug/pprof/profile?seconds=30
>>>>>>
>>>>>> You can post the results to https://pprof.me/ for sharing.
>>>>>>
>>>>>> On Tue, Feb 20, 2024 at 6:22 AM Alexander Wilke  
>>>>>> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>> I am running smokeping_prober from one VM to Monitor around 500 
>>>>>>> destinations.
>>>>>>> Around 30 devices are monitored with 0.2s Intervall and Others with 
>>>>>>> 1.65s Intervall.
>>>>>>>
>>>>>>> Prometheus scrapes every 5s.
>>>>>>>
>>>>>>> So there are roughly 600 icmp ipv4 24byte pings per Seconds.
>>>>>>> CPU usage jumps between 700-1200% using "top"
>>>>>>>
>>>>>>> What Else e

[prometheus-users] Prometheus CPU usage for a query - Single threaded? Grafana Mimir?

2024-02-27 Thread Alexander Wilke
Hello,
is it true that Prometheus 2.49.1 only uses a single thread/CPU per query,
like mentioned here:

https://grafana.com/blog/2022/07/20/how-we-improved-grafana-mimir-query-performance-by-up-to-10x/

How can I check the Prometheus CPU usage for a query?



Re: [prometheus-users] Re: Metrics from PUSH Consumer - Relabeled Metrics? Check "Up" state?

2024-02-26 Thread Alexander Wilke
Hello,
absent_over_time() looks good. It should allow me to alert if a metric is
not available for 10 minutes. This will not alert if the system reboots.

If I use a query with offset, I think it would alert only as long as the
offset still has a value; e.g. with offset 10m I will get an alarm for 10 minutes,
but after that the "now" is the same as the offset.

Will give it a try.

Thanks!!

Chris Siebenmann schrieb am Montag, 26. Februar 2024 um 17:22:10 UTC+1:

> > Will I run into issues with "staleness" if there aren't any metrics 
> anymore 
> > for (more) than 5 minutes?
> > Or perhaps can I use this "staleness" indicator in some way?
>
> Perhaps this is a use for absent() or absent_over_time(), if you know
> specific metrics that should always be present from the push sources
> (and you know the push sources).
>
> It might be possible to craft something clever with 'offset' and
> 'unless' to filter out metrics that are still present, eg:
>
> pushed_metric offset 10m unless pushed_metric
>
> I think this will give you every pushed_metric series that was present
> ten minutes ago and isn't now (because it's stale, since it hasn't been
> pushed recently enough). This is less clear than an explicit absent(),
> but means you don't have to statically know the job/instance/etc labels
> for all push sources that should be there.
>
> - cks
>
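
For illustration, a minimal alerting-rule sketch of the absent_over_time() approach (metric and label names are the ones mentioned in the thread; one such expression is needed per known push source, since you have to name the series you expect):

```yaml
groups:
  - name: push-device-liveness
    rules:
      - alert: PushDeviceSilent
        # fires when no push_device_uptime sample for this device
        # has been received within the last 10 minutes
        expr: absent_over_time(push_device_uptime{host_name="deviceName"}[10m])
        labels:
          severity: warning
        annotations:
          summary: "No metrics pushed by deviceName for 10 minutes"
```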



[prometheus-users] Re: Metrics from PUSH Consumer - Relabeled Metrics? Check "Up" state?

2024-02-26 Thread Alexander Wilke
Will I run into issues with "staleness" if there aren't any metrics anymore 
for (more) than 5 minutes?
Or perhaps can I use this "staleness" indicator in some way?

Brian Candler schrieb am Montag, 26. Februar 2024 um 16:15:57 UTC+1:

> > I am still looking for a solution to identify if a device which uses 
> "PUSH" method is not sending data anmore for e.g. 10 minutes.
>
> Push an additional metric which is "last push time", and check when that 
> value is more than 10 minutes earlier than the current time.
>
> If you already have a metric like "push_device_uptime", which I presume is 
> monotonically increasing, then you can check for when this stops increasing:
>
> expr: push_device_uptime <= push_device_uptime offset 10m
>



[prometheus-users] Re: Metrics from PUSH Consumer - Relabeled Metrics? Check "Up" state?

2024-02-26 Thread Alexander Wilke
Hello,

I am still looking for a solution to identify that a device which uses the "PUSH"
method has not been sending data anymore for, e.g., 10 minutes.
The number of devices is limited, around 20-30.

Any idea how to check if there are no new metrics from it? All metrics
from these devices share the same labels, like "host_name", so this is a
label I could check, and they all share the same metrics, like
push_device_uptime{host_name="deviceName"}.



Alexander Wilke schrieb am Samstag, 20. Januar 2024 um 10:15:08 UTC+1:

> Hello,
> I have some enterprise firewalls which unfortunately can only PUSH metrics
> to the remote_write API of Prometheus. The vendor has no plans to offer PULL.
>
> Which possibilities do I have in Prometheus to change some metrics at the
> time they arrive on the API? I want to add some custom labels based on
> existing label information.
>
> In addition, is there any possibility to check if a device is still
> sending metrics?
> Because Prometheus is not pulling, it cannot check the "up" state I think,
> but is it possible to query whether there are metrics in Prometheus with a
> specific label set and not older than e.g. 10 minutes?
>
> And how can I check if the Prometheus server can ingest all the metrics
> which arrive on the API? For pull I have some dashboards which show samples
> per second and so on.
>
> I am a little bit confused, because from my understanding remote write is
> in general used to write data from one Prometheus to another destination, so
> the configuration and tuning parameters seem not to fit for incoming traffic.
>
> Maybe you can help me.
>



[prometheus-users] Blackbox_exporter TCP TLS - time for TLS Handshake ?

2024-02-25 Thread Alexander Wilke
Hello,
The blackbox_exporter with the HTTP prober and TLS shows how long the TLS handshake 
takes. Is this not possible with TCP + TLS probes that are not HTTP/S?

Is it not possible at all, or just not implemented?
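
For reference, a minimal TCP + TLS module sketch for blackbox.yml (the module name and timeout are my own choices). The TCP prober exposes the overall probe_duration_seconds, but a separate handshake-phase duration like the HTTP prober's is exactly what is being asked about here:

  modules:
    tcp_tls_example:
      prober: tcp
      timeout: 10s
      tcp:
        tls: true
        tls_config:
          insecure_skip_verify: false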

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/284e3a50-6179-4438-8e41-1fed220c76fcn%40googlegroups.com.


Re: [prometheus-users] Smokeping_prober CPU usage optimization possible?

2024-02-25 Thread Alexander Wilke
Hello,
any chance to look into the reports, and are there any suggestions?

Alexander Wilke schrieb am Donnerstag, 22. Februar 2024 um 12:40:09 UTC+1:

> Hello,
>
> sorry for the delay. here are the results. to be honest - I do not 
> understand anything of it.
>
> Smokeping_Prober Heap:
>
>
> https://pprof.me/a1e7400d32859dbc217e2182398485df/?profileType=profile%3Aalloc_objects%3Acount%3Aspace%3Abytes_items=icicle
>
>  
>
> smokeping_prober profile30s
>
>
> https://pprof.me/340674b335e114e4b0df6b4582f0644e/?profileType=profile%3Asamples%3Acount%3Acpu%3Ananoseconds%3Adelta
>
> Ben Kochie schrieb am Dienstag, 20. Februar 2024 um 10:27:10 UTC+1:
>
>> Best thing you can do is capture some pprof data. That will show you what 
>> it's spending the time on.
>>
>> :9374/debug/pprof/heap 
>> :9374/debug/pprof/profile?seconds=30
>>
>> You can post the results to https://pprof.me/ for sharing.
>>
>> On Tue, Feb 20, 2024 at 6:22 AM Alexander Wilke  
>> wrote:
>>
>>> Hello,
>>> I am running smokeping_prober from one VM to Monitor around 500 
>>> destinations.
>>> Around 30 devices are monitored with 0.2s Intervall and Others with 
>>> 1.65s Intervall.
>>>
>>> Prometheus scrapes every 5s.
>>>
>>> So there are roughly 600 icmp ipv4 24byte pings per Seconds.
>>> CPU usage jumps between 700-1200% using "top"
>>>
>>> What Else except reducing Interval or Host Count could Help to reduce 
>>> CPU usage?
>>> Is the UDP Socket "better" or any other optimization which could be 
>>> relevant for that Type of Traffic? Running on RHEL8
>>>
>>> Someone with similar CPU usage and this amount of pings per Seconds? 
>>> Maybe Others Ping 6.000 Destination every 10s?
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to prometheus-use...@googlegroups.com.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/prometheus-users/d803c1a2-64ee-48d1-8513-b864856f53c8n%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/prometheus-users/d803c1a2-64ee-48d1-8513-b864856f53c8n%40googlegroups.com?utm_medium=email_source=footer>
>>> .
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/62aeaacb-9fd6-4d64-8bec-d7171f592766n%40googlegroups.com.


[prometheus-users] Re: PromQL: understanding the and operator

2024-02-23 Thread Alexander Wilke
Another possibility could be

QueryA + queryB == 0  #both down

Or the other way around:
QueryA + queryB == 2  # both up
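
A note on doing this in PromQL directly: the two sides of an "and" only match when their remaining label sets are identical, and since HOSTNAME and instance differ between the two servers, one option (a sketch built from the metric and label names in this thread) is to match on the empty label set:

  (sum without (USER, HOSTNAME, instance) (go_service_status{HOSTNAME="server1",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}) < 1)
  and on ()
  (sum without (USER, HOSTNAME, instance) (go_service_status{HOSTNAME="server2",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}) < 1)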



Alexander Wilke schrieb am Freitag, 23. Februar 2024 um 17:45:28 UTC+1:

> In Grafana i create query A and Query B and then an Expression C with 
> "Math" and then I can compare Like $A > 0 && B > 0.
> Maybe there is "Transform Data" and then a calcukation Option.
>
> Puneet Singh schrieb am Donnerstag, 22. Februar 2024 um 21:58:08 UTC+1:
>
>> okay, So I think should this be the correct way to perform the and 
>> operation ? - 
>> (sum without (USER, HOSTNAME ,instance ) (
>> *go_service_status{HOSTNAME="server1",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}*)
>>  
>> < 1) and (sum without ( USER, HOSTNAME ,instance  ) (
>> *go_service_status{HOSTNAME="server2",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}*)
>>  
>> < 1)
>>
>> Regards
>> P
>>
>>
>> On Friday 23 February 2024 at 00:58:52 UTC+5:30 Puneet Singh wrote:
>>
>>> Hi All, 
>>> I have a metric called go_service_status where  i use the "sum without" 
>>> operator to determine whether a service is up or down on a server. Now 
>>> there can be a situation where service can be down simultaneously on 2 
>>> master servers and I am unable to figure out a PromQL query to detect that 
>>> situation. Example -  
>>>
>>>
>>> *go_service_status{SERVICETYPE="grade1",SERVER_CATEGORY="db1",instance=~"server1:7878"}*
>>> and it can have 2 possible series -
>>> go_service_status{HOSTNAME="server1", SERVER_CATEGORY="db1", 
>>> SERVICETYPE="grade1", USER="admin", instance="server1:7878", 
>>> job="customprocessexporter01"} 0
>>> go_service_status{HOSTNAME="server1", SERVER_CATEGORY="db1", 
>>> SERVICETYPE="grade1", USER="root", instance="server1:7878", 
>>> job="customprocessexporter01"} 1
>>>
>>> and in the same way
>>>
>>> *go_service_status{SERVICETYPE="grade1",SERVER_CATEGORY="db1",instance=~"server2:7878"}*
>>> and it can have 2 possible series -
>>> go_service_status{HOSTNAME="server2", SERVER_CATEGORY="db1", 
>>> SERVICETYPE="grade1", USER="admin", instance="server2:7878", 
>>> job="customprocessexporter01"} 0
>>> go_service_status{HOSTNAME="server2", SERVER_CATEGORY="db1", 
>>> SERVICETYPE="grade1", USER="root", instance="server2:7878", 
>>> job="customprocessexporter01"} 0  
>>>
>>>
>>> Here;s the query using which i figure out status of the service on 
>>> server1.  Example - 
>>>
>>> (sum without (USER) (
>>> *go_service_status{HOSTNAME="server1",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}*)
>>>  
>>> < 1)[image: Untitled.png]
>>>
>>> so the server1's service is momentarily 0
>>>
>>>
>>> and server2's service is always down , example - 
>>> (sum without (USER) (
>>> *go_lsf_service_status{HOSTNAME="server2",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}*)
>>>  
>>> < 1)[image: Untitled.png]
>>>
>>>
>>> Now i tried to find the time duration where both these service were 
>>> simultaneously down / 0 on both server1 and server2 :
>>> (sum without (USER) (
>>> *go_service_status{HOSTNAME="server1",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}*)
>>>  
>>> < 1) and (sum without (USER) (
>>> *go_service_status{HOSTNAME="server2",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}*)
>>>  
>>> < 1)
>>>
>>>
>>> I was expecting a graph similar to the once for server2 , but i got :
>>> [image: Untitled.png]
>>>
>>> I think i need to ignore the HOSTNAME label , but unable to figure out 
>>> the way to ignore the HOSTNAME label in combination with sum without 
>>> clause.
>>>
>>> Any help/hint to improve this query will be very useful for me to 
>>> understand the and condition in context of sum without  clause.
>>>
>>> Thanks,
>>> Puneet
>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/add10f5e-b354-4d7d-8b7c-fbbf6389f372n%40googlegroups.com.


[prometheus-users] Re: PromQL: understanding the and operator

2024-02-23 Thread Alexander Wilke
In Grafana I create query A and query B and then an expression C of type 
"Math", where I can compare them like $A > 0 && $B > 0.
Alternatively, there may be a "Transform data" step with a calculation option.

Puneet Singh schrieb am Donnerstag, 22. Februar 2024 um 21:58:08 UTC+1:

> okay, So I think should this be the correct way to perform the and 
> operation ? - 
> (sum without (USER, HOSTNAME ,instance ) (
> *go_service_status{HOSTNAME="server1",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}*)
>  
> < 1) and (sum without ( USER, HOSTNAME ,instance  ) (
> *go_service_status{HOSTNAME="server2",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}*)
>  
> < 1)
>
> Regards
> P
>
>
> On Friday 23 February 2024 at 00:58:52 UTC+5:30 Puneet Singh wrote:
>
>> Hi All, 
>> I have a metric called go_service_status where  i use the "sum without" 
>> operator to determine whether a service is up or down on a server. Now 
>> there can be a situation where service can be down simultaneously on 2 
>> master servers and I am unable to figure out a PromQL query to detect that 
>> situation. Example -  
>>
>>
>> *go_service_status{SERVICETYPE="grade1",SERVER_CATEGORY="db1",instance=~"server1:7878"}*
>> and it can have 2 possible series -
>> go_service_status{HOSTNAME="server1", SERVER_CATEGORY="db1", 
>> SERVICETYPE="grade1", USER="admin", instance="server1:7878", 
>> job="customprocessexporter01"} 0
>> go_service_status{HOSTNAME="server1", SERVER_CATEGORY="db1", 
>> SERVICETYPE="grade1", USER="root", instance="server1:7878", 
>> job="customprocessexporter01"} 1
>>
>> and in the same way
>>
>> *go_service_status{SERVICETYPE="grade1",SERVER_CATEGORY="db1",instance=~"server2:7878"}*
>> and it can have 2 possible series -
>> go_service_status{HOSTNAME="server2", SERVER_CATEGORY="db1", 
>> SERVICETYPE="grade1", USER="admin", instance="server2:7878", 
>> job="customprocessexporter01"} 0
>> go_service_status{HOSTNAME="server2", SERVER_CATEGORY="db1", 
>> SERVICETYPE="grade1", USER="root", instance="server2:7878", 
>> job="customprocessexporter01"} 0  
>>
>>
>> Here;s the query using which i figure out status of the service on 
>> server1.  Example - 
>>
>> (sum without (USER) (
>> *go_service_status{HOSTNAME="server1",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}*)
>>  
>> < 1)[image: Untitled.png]
>>
>> so the server1's service is momentarily 0
>>
>>
>> and server2's service is always down , example - 
>> (sum without (USER) (
>> *go_lsf_service_status{HOSTNAME="server2",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}*)
>>  
>> < 1)[image: Untitled.png]
>>
>>
>> Now i tried to find the time duration where both these service were 
>> simultaneously down / 0 on both server1 and server2 :
>> (sum without (USER) (
>> *go_service_status{HOSTNAME="server1",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}*)
>>  
>> < 1) and (sum without (USER) (
>> *go_service_status{HOSTNAME="server2",SERVER_CATEGORY="db1",SERVICETYPE="grade1"}*)
>>  
>> < 1)
>>
>>
>> I was expecting a graph similar to the once for server2 , but i got :
>> [image: Untitled.png]
>>
>> I think i need to ignore the HOSTNAME label , but unable to figure out 
>> the way to ignore the HOSTNAME label in combination with sum without 
>> clause.
>>
>> Any help/hint to improve this query will be very useful for me to 
>> understand the and condition in context of sum without  clause.
>>
>> Thanks,
>> Puneet
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/91684962-e2f3-4ba2-ae74-9fd9bebce60en%40googlegroups.com.


Re: [prometheus-users] Smokeping_prober CPU usage optimization possible?

2024-02-22 Thread Alexander Wilke
Hello,

sorry for the delay, here are the results. To be honest, I do not 
understand any of it.

Smokeping_Prober Heap:

https://pprof.me/a1e7400d32859dbc217e2182398485df/?profileType=profile%3Aalloc_objects%3Acount%3Aspace%3Abytes_items=icicle

 

smokeping_prober profile30s

https://pprof.me/340674b335e114e4b0df6b4582f0644e/?profileType=profile%3Asamples%3Acount%3Acpu%3Ananoseconds%3Adelta

Ben Kochie schrieb am Dienstag, 20. Februar 2024 um 10:27:10 UTC+1:

> Best thing you can do is capture some pprof data. That will show you what 
> it's spending the time on.
>
> :9374/debug/pprof/heap 
> :9374/debug/pprof/profile?seconds=30
>
> You can post the results to https://pprof.me/ for sharing.
>
> On Tue, Feb 20, 2024 at 6:22 AM Alexander Wilke  
> wrote:
>
>> Hello,
>> I am running smokeping_prober from one VM to Monitor around 500 
>> destinations.
>> Around 30 devices are monitored with 0.2s Intervall and Others with 1.65s 
>> Intervall.
>>
>> Prometheus scrapes every 5s.
>>
>> So there are roughly 600 icmp ipv4 24byte pings per Seconds.
>> CPU usage jumps between 700-1200% using "top"
>>
>> What Else except reducing Interval or Host Count could Help to reduce CPU 
>> usage?
>> Is the UDP Socket "better" or any other optimization which could be 
>> relevant for that Type of Traffic? Running on RHEL8
>>
>> Someone with similar CPU usage and this amount of pings per Seconds? 
>> Maybe Others Ping 6.000 Destination every 10s?
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Prometheus Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to prometheus-use...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/prometheus-users/d803c1a2-64ee-48d1-8513-b864856f53c8n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/prometheus-users/d803c1a2-64ee-48d1-8513-b864856f53c8n%40googlegroups.com?utm_medium=email_source=footer>
>> .
>>
>
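
For anyone who wants to dig into profiles like these locally, one way (assuming the Go toolchain is installed; host and ports are placeholders) is to fetch them straight from the exporter and open the interactive pprof UI:

  # 30-second CPU profile, rendered in a local web UI
  go tool pprof -http=:8080 http://smokeping-host:9374/debug/pprof/profile?seconds=30

  # in-use heap profile
  go tool pprof -http=:8081 http://smokeping-host:9374/debug/pprof/heap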

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/dd03c0d3-9b52-4863-8e72-361a9bcfe20dn%40googlegroups.com.


Re: [prometheus-users] Alert Query

2024-02-19 Thread Alexander Wilke
Did you try with

topk(3,metrics) ?
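
A sketch of the idea, assuming a hypothetical per-process CPU metric called process_cpu_usage_percent from some process-level exporter (the real metric name depends on which exporter you use):

  # three most CPU-hungry processes on the node that triggered the alert
  topk(3, process_cpu_usage_percent{instance="node1:9100"})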

sri L  schrieb am Di., 20. Feb. 2024, 05:19:

> Hi all,
>
> I am looking for a way to send out an alert with top 3 cpu/memory
> utilization processes when total cpu/memory utilization goes above 80% for
> a node.
>
> I can create an alert using node metrics to send notification when cpu
> utilization is above 80% but unable to find a way to include top 3 process
> details in the same alert
>
> Kindly suggest there is any method to achieve this
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/f88f6b04-546c-4a8b-ba67-98a1928d638dn%40googlegroups.com
> 
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAJuaemamk4XOWKJyskG%3D-JwvdB1LFYz%2B-LAeufXkdrhYVPbTZg%40mail.gmail.com.


[prometheus-users] Smokeping_prober CPU usage optimization possible?

2024-02-19 Thread Alexander Wilke
Hello,
I am running smokeping_prober from one VM to monitor around 500 
destinations.
Around 30 devices are monitored with a 0.2s interval and the others with a 1.65s 
interval.

Prometheus scrapes every 5s.

So there are roughly 600 ICMP IPv4 24-byte pings per second.
CPU usage jumps between 700-1200% according to "top".

What else, except reducing the interval or the host count, could help to reduce CPU 
usage?
Is the UDP socket "better", or is there any other optimization which could be 
relevant for that type of traffic? Running on RHEL8.

Is anyone seeing similar CPU usage with this amount of pings per second? Maybe 
others ping 6,000 destinations every 10s?

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/d803c1a2-64ee-48d1-8513-b864856f53c8n%40googlegroups.com.


[prometheus-users] Prometheus MultiTenancy / separation of metrics / separation of passwords

2024-02-19 Thread Alexander Wilke
Hello,

in our company I maintain a Prometheus server to monitor my own devices, 
so there is no need to separate metrics and credentials.

There may be interest from other departments to monitor their devices with 
the same Prometheus instance, so that they do not have to maintain the server 
itself but just use the metrics.

Is there a possibility to separate the view of metrics, e.g. if I connect 
Grafana to the same Prometheus server?

Is there a way to separate the credentials?

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/a80ef4b9-6a8d-4a30-9d83-801225e3ae0cn%40googlegroups.com.


Re: [prometheus-users] blackbox_exporter - how to simplify my configuration

2024-02-19 Thread Alexander Wilke
Thanks for sharing your ideas. 

Chris Siebenmann schrieb am Montag, 19. Februar 2024 um 21:08:02 UTC+1:

> > In our DataCenter we have different security zones. In each zone I
> > want to place a blackbox_exporter. The goal is that each
> > blackbox_exporter monitors the same destinations eg. same DNS Server
> > or same webserver. All exporters are controlled by one single
> > prometheus server.
> >
> > If someone complains that something is wrong, slow or whatever I want
> > to have the possibility to compare each blackbox_exporter to
> > understand if there was an issue with the service (destnation) or the
> > newtwork or something else.
> >
> > I will have 26 blackbox exporter system in the DC. Each should use the
> > same destinations, protocols, etc. Is there a way to simplify my
> > configuration to not have 26 different configuration parts but merge
> > this into one part?
> >
> > at the moment the configuration is only different in three places:
> > - job name
> > - custom label
> > - replacement (IP of blackbox_exporter)
> >
> > Here an example which would result in 26 different configuration parts
> > and ideally I only want to have one configuration part because if I
> > want to add another test then I need to edit all 26 files again.
>
> It's possible to do somewhat better than what you've got today, but I
> don't think there's any native Prometheus way out of having 26 different
> stanzas, one for each Blackbox exporter, because there's no way of
> either iterating over something in a stanza or nesting service discovery
> (as far as I know). Doing better would require additional tools to
> automatically generate something.
>
> The simple way to improve things is to list all of the targets to be
> checked in a service discovery file. If you have multiple checks you
> want to do against each target, you can also include the module as a
> label:
>
> - targets:
> - https://hostname1.domain/
> - https://hostname2.domain/
> # hopefully this works, if not add a label rewrite for
> # 'module' or something
> labels:
> __param_module: http_post_2xx
>
> (Don't trust the exact YAML indentation here, I'm hand-writing it.)
>
> Then each blackbox DC job would look like:
> - job_name: 'blackbox_job_02'
> file_sd_configs:
> files:
> - yourstuff/blackbox-list.yaml
> relabel_configs:
> - source_labels: [__address__]
> target_label: __param_target
> - source_labels: [__param_target]
> target_label: instance
> - target_label: __address__
> replacement: blackbox_job_02.domain:9115
>
> Then at least you can add additional targets to be checked in some way
> in one place (the 'blackbox-list.yaml' file), instead of having to
> update all of the blackbox DC job configurations. But if you want to add
> or remove another blackbox DC instance, you'd still have to edit it in
> or out by hand.
>
> If you're in a position to automate generating the 'blackbox-list.yaml'
> file, you could also put the blackbox DC target in an additional label
> (say 'dc_blackbox: blackbox_job_02.domain:9115') and have a single
> Blackbox configuration stanza, which would set __address__ at the end
> with a label rewrite that would look something like this:
> - target_labels: [dc_blackbox]
> replacement: __address__
>
> You need to automatically generate the blackbox-list.yaml file because
> you're going to be repeating all of the targets N times, one for each
> blackbox DC.
>
> - cks
>
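
A sketch of the single-stanza variant described above, assuming each target entry in blackbox-list.yaml also carries a dc_blackbox label holding that zone's exporter address (note the relabel keys are source_labels/target_label):

  - job_name: 'blackbox_all_zones'
    metrics_path: /probe
    params:
      module: [http_post_2xx]
    file_sd_configs:
      - files:
          - yourstuff/blackbox-list.yaml
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - source_labels: [dc_blackbox]
        target_label: __address__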

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/d866f1c7-23e4-42e2-8e3b-246f5d91074cn%40googlegroups.com.


[prometheus-users] blackbox_exporter - how to simplify my configuration

2024-02-17 Thread Alexander Wilke
Hello,

In our DataCenter we have different security zones. In each zone I want to 
place a blackbox_exporter. The goal is that each blackbox_exporter monitors 
the same destinations eg. same DNS Server or same webserver. All exporters 
are controlled by one single prometheus server.

If someone complains that something is wrong, slow or whatever, I want to 
have the possibility to compare each blackbox_exporter to understand if 
there was an issue with the service (destination), the network, or 
something else.

I will have 26 blackbox exporter system in the DC. Each should use the same 
destinations, protocols, etc. Is there a way to simplify my configuration 
to not have 26 different configuration parts but merge this into one part?

at the moment the configuration is only different in three places:
- job name
- custom label
- replacement (IP of blackbox_exporter)

Here is an example which would result in 26 different configuration parts, and 
ideally I only want to have one configuration part, because if I want to add 
another test I would need to edit all 26 files again.



#===

  - job_name: 'blackbox_job_01'
scrape_interval: 15s
scrape_timeout: 14s
metrics_path: /probe
params:
  module: [http_post_2xx]
static_configs:
  - targets:
- https://hostname1.domain
- https://hostname2.domain
- https://hostname3.domain
labels:
   source_group: 'blackbox_job_01'

relabel_configs:
  - source_labels: [__address__]
target_label: __param_target
  - source_labels: [__param_target]
target_label: instance
  - target_label: __address__
replacement: blackbox_job_01.domain:9115

#===

  - job_name: 'blackbox_job_02'
scrape_interval: 15s
scrape_timeout: 14s
metrics_path: /probe
params:
  module: [http_post_2xx]
static_configs:
  - targets:
- https://hostname1.domain
- https://hostname2.domain
- https://hostname3.domain
labels:
   source_group: 'blackbox_job_02'

relabel_configs:
  - source_labels: [__address__]
target_label: __param_target
  - source_labels: [__param_target]
target_label: instance
  - target_label: __address__
replacement: blackbox_job_02.domain:9115

#===

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/5495f9a4-e80a-4dba-a601-86199221f6f6n%40googlegroups.com.


Re: [prometheus-users] snmp_exporter 0.25.0 + and prometheus 2.49.1 with "%" in label value - format issue

2024-02-12 Thread Alexander Wilke
Hope this is sufficient. If something else is needed, I will try to provide 
it.

https://github.com/prometheus/snmp_exporter/issues/1115

Alexander Wilke schrieb am Montag, 12. Februar 2024 um 21:29:12 UTC+1:

> Prometheus, snmp_exporter, node_exporter, smokeping_prober, 
> blackbox_exporter, Grafana on the Same Linux VM.
>
> Microsoft Edge on Windows :-(
>
> Brian Candler schrieb am Montag, 12. Februar 2024 um 21:11:36 UTC+1:
>
>> Are you running either the Prometheus server or the web browser under 
>> Windows? STATUS_BREAKPOINT appears here:
>> https://pkg.go.dev/golang.org/x/s...@v0.17.0/windows#pkg-constants 
>> <https://pkg.go.dev/golang.org/x/sys@v0.17.0/windows#pkg-constants>
>>
>> On Monday 12 February 2024 at 15:58:44 UTC Ben Kochie wrote:
>>
>>> On Mon, Feb 12, 2024, 16:39 Alexander Wilke  wrote:
>>>
>>>> Hello,
>>>
>>> thanks for the fast response. Unfortunately the linux environment I have 
>>>> is very restricted and I first have to check which snmpwalk tool I can use 
>>>> because downloads are very limited.
>>>> Will take me some time but I think I will open the issue with the 
>>>> information I have.
>>>>
>>>
>>> The output from the exporter is fine, no need for other tools.
>>>
>>>
>>>> if I run ltmNodeAddresstype I can see a value of (1) for the IPs in the 
>>>> /Common partition which is the base partition and has no suffix like %xyz.
>>>> Other partitions I have the suffix and the address Type value is (3).
>>>>
>>>
>>> Yup, that's what I thought.
>>>
>>>
>>>> So it is probably as you said:
>>>> ipv4z(3) A non-global IPv4 address including a zone index as defined by 
>>>> the InetAddressIPv4z textual convention.
>>>>
>>>>
>>>> PS:
>>>> is it possible that this may cause instability of the prometheus webui? 
>>>> If I browse the "graph" page and searching for f5 metrics sometimes the 
>>>> rbwoser is showing a white error page "STATUS_BREAKPOINT".
>>>> This is a test environment and maybe there something else wrong - 
>>>> however - it feels like it started with the monitoring of f5 devices via 
>>>> SNMP.
>>>>
>>>
>>> No, this is just a failed string conversion. So you get the default hex 
>>> conversion instead. 
>>>
>>> I don't know what your error is, but I am fairly sure this is unrelated 
>>> to Prometheus or SNMP data.
>>>
>>>
>>>> Ben Kochie schrieb am Montag, 12. Februar 2024 um 15:20:05 UTC+1:
>>>>
>>>>> Looking at the MIB (F5-BIGIP-LOCAL-MIB), I see this MIB definition:
>>>>>
>>>>> ltmPoolMemberAddr OBJECT-TYPE
>>>>>   SYNTAX InetAddress
>>>>>   MAX-ACCESS read-only
>>>>>   STATUS current
>>>>>   DESCRIPTION
>>>>> "The IP address of a pool member in the specified pool.
>>>>> It is interpreted within the context of an ltmPoolMemberAddrType 
>>>>> value."
>>>>>   ::= { ltmPoolMemberEntry 3 }
>>>>>
>>>>> InetAddress syntax comes from INET-ADDRESS-MIB, which has several 
>>>>> conversion types. Without knowing what the device is exposing 
>>>>> for ltmPoolMemberAddrType it's hard to say, but I'm guessing it's type 
>>>>> 3, InetAddressIPv4z.
>>>>>
>>>>> I don't think we have this textual convention implemented in the 
>>>>> exporter.
>>>>>
>>>>> Would you mind filing this as an issue on GitHub?
>>>>> * It would also be helpful to have the sample data as text, rather 
>>>>> than a screenshot. This makes it easier to work with for creating test 
>>>>> cases.
>>>>> * Please also include walks of `ltmPoolMemberAddrType` as well as 
>>>>> `ltmPoolMemberAddr`
>>>>>
>>>>> https://github.com/prometheus/snmp_exporter/issues
>>>>>
>>>>> It would also be helpful to have the sample data as text, rather than 
>>>>> a screenshot. This makes it easier to work with for creating test cases.
>>>>>
>>>>> On Mon, Feb 12, 2024 at 2:54 PM Alexander Wilke  
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I am using the snmp_exporter 0.25.0 and prometheus 2

Re: [prometheus-users] snmp_exporter 0.25.0 + and prometheus 2.49.1 with "%" in label value - format issue

2024-02-12 Thread Alexander Wilke
Prometheus, snmp_exporter, node_exporter, smokeping_prober, 
blackbox_exporter and Grafana on the same Linux VM.

Microsoft Edge on Windows :-(

Brian Candler schrieb am Montag, 12. Februar 2024 um 21:11:36 UTC+1:

> Are you running either the Prometheus server or the web browser under 
> Windows? STATUS_BREAKPOINT appears here:
> https://pkg.go.dev/golang.org/x/s...@v0.17.0/windows#pkg-constants 
> <https://pkg.go.dev/golang.org/x/sys@v0.17.0/windows#pkg-constants>
>
> On Monday 12 February 2024 at 15:58:44 UTC Ben Kochie wrote:
>
>> On Mon, Feb 12, 2024, 16:39 Alexander Wilke  wrote:
>>
>>> Hello,
>>
>> thanks for the fast response. Unfortunately the linux environment I have 
>>> is very restricted and I first have to check which snmpwalk tool I can use 
>>> because downloads are very limited.
>>> Will take me some time but I think I will open the issue with the 
>>> information I have.
>>>
>>
>> The output from the exporter is fine, no need for other tools.
>>
>>
>>> if I run ltmNodeAddresstype I can see a value of (1) for the IPs in the 
>>> /Common partition which is the base partition and has no suffix like %xyz.
>>> Other partitions I have the suffix and the address Type value is (3).
>>>
>>
>> Yup, that's what I thought.
>>
>>
>>> So it is probably as you said:
>>> ipv4z(3) A non-global IPv4 address including a zone index as defined by 
>>> the InetAddressIPv4z textual convention.
>>>
>>>
>>> PS:
>>> is it possible that this may cause instability of the prometheus webui? 
>>> If I browse the "graph" page and searching for f5 metrics sometimes the 
>>> rbwoser is showing a white error page "STATUS_BREAKPOINT".
>>> This is a test environment and maybe there something else wrong - 
>>> however - it feels like it started with the monitoring of f5 devices via 
>>> SNMP.
>>>
>>
>> No, this is just a failed string conversion. So you get the default hex 
>> conversion instead. 
>>
>> I don't know what your error is, but I am fairly sure this is unrelated 
>> to Prometheus or SNMP data.
>>
>>
>>> Ben Kochie schrieb am Montag, 12. Februar 2024 um 15:20:05 UTC+1:
>>>
>>>> Looking at the MIB (F5-BIGIP-LOCAL-MIB), I see this MIB definition:
>>>>
>>>> ltmPoolMemberAddr OBJECT-TYPE
>>>>   SYNTAX InetAddress
>>>>   MAX-ACCESS read-only
>>>>   STATUS current
>>>>   DESCRIPTION
>>>> "The IP address of a pool member in the specified pool.
>>>> It is interpreted within the context of an ltmPoolMemberAddrType 
>>>> value."
>>>>   ::= { ltmPoolMemberEntry 3 }
>>>>
>>>> InetAddress syntax comes from INET-ADDRESS-MIB, which has several 
>>>> conversion types. Without knowing what the device is exposing 
>>>> for ltmPoolMemberAddrType it's hard to say, but I'm guessing it's type 
>>>> 3, InetAddressIPv4z.
>>>>
>>>> I don't think we have this textual convention implemented in the 
>>>> exporter.
>>>>
>>>> Would you mind filing this as an issue on GitHub?
>>>> * It would also be helpful to have the sample data as text, rather than 
>>>> a screenshot. This makes it easier to work with for creating test cases.
>>>> * Please also include walks of `ltmPoolMemberAddrType` as well as 
>>>> `ltmPoolMemberAddr`
>>>>
>>>> https://github.com/prometheus/snmp_exporter/issues
>>>>
>>>> It would also be helpful to have the sample data as text, rather than a 
>>>> screenshot. This makes it easier to work with for creating test cases.
>>>>
>>>> On Mon, Feb 12, 2024 at 2:54 PM Alexander Wilke  
>>>> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am using the snmp_exporter 0.25.0 and prometheus 2.49.1.
>>>>>
>>>>> I am collecting metrics from F5 LTM Loadbalancers. I want to collect 
>>>>> the IP-Address.
>>>>>
>>>>>  
>>>>>
>>>>> in general it is working however some IP-address formats are looking 
>>>>> like that:
>>>>>
>>>>>  
>>>>>
>>>>> 10.10.10.10 which I can import in the correct fromat
>>>>>
>>>>>  
>>>>>
>>>>> Others a displayed by the F5 system like this:

Re: [prometheus-users] snmp_exporter 0.25.0 + and prometheus 2.49.1 with "%" in label value - format issue

2024-02-12 Thread Alexander Wilke
Hello,
thanks for the fast response. Unfortunately the Linux environment I have is 
very restricted, and I first have to check which snmpwalk tool I can use, 
because downloads are very limited.
It will take me some time, but I think I will open the issue with the 
information I have.

If I query ltmNodeAddresstype I can see a value of (1) for the IPs in the 
/Common partition, which is the base partition and has no suffix like %xyz.
For the other partitions I have the suffix, and the address type value is (3).

So it is probably as you said:
ipv4z(3) A non-global IPv4 address including a zone index as defined by the 
InetAddressIPv4z textual convention.


PS:
is it possible that this may cause instability of the Prometheus web UI? If 
I browse the "graph" page and search for F5 metrics, sometimes the 
browser shows a white error page saying "STATUS_BREAKPOINT".
This is a test environment and maybe something else is wrong - however, 
it feels like it started with the monitoring of F5 devices via SNMP.

Ben Kochie schrieb am Montag, 12. Februar 2024 um 15:20:05 UTC+1:

> Looking at the MIB (F5-BIGIP-LOCAL-MIB), I see this MIB definition:
>
> ltmPoolMemberAddr OBJECT-TYPE
>   SYNTAX InetAddress
>   MAX-ACCESS read-only
>   STATUS current
>   DESCRIPTION
> "The IP address of a pool member in the specified pool.
> It is interpreted within the context of an ltmPoolMemberAddrType 
> value."
>   ::= { ltmPoolMemberEntry 3 }
>
> InetAddress syntax comes from INET-ADDRESS-MIB, which has several 
> conversion types. Without knowing what the device is exposing 
> for ltmPoolMemberAddrType it's hard to say, but I'm guessing it's type 
> 3, InetAddressIPv4z.
>
> I don't think we have this textual convention implemented in the exporter.
>
> Would you mind filing this as an issue on GitHub?
> * It would also be helpful to have the sample data as text, rather than a 
> screenshot. This makes it easier to work with for creating test cases.
> * Please also include walks of `ltmPoolMemberAddrType` as well as 
> `ltmPoolMemberAddr`
>
> https://github.com/prometheus/snmp_exporter/issues
>
> It would also be helpful to have the sample data as text, rather than a 
> screenshot. This makes it easier to work with for creating test cases.
>
> On Mon, Feb 12, 2024 at 2:54 PM Alexander Wilke  
> wrote:
>
>> Hello,
>>
>> I am using the snmp_exporter 0.25.0 and prometheus 2.49.1.
>>
>> I am collecting metrics from F5 LTM Loadbalancers. I want to collect the 
>> IP-Address.
>>
>>  
>>
>> in general it is working however some IP-address formats are looking like 
>> that:
>>
>>  
>>
>> 10.10.10.10 which I can import in the correct fromat
>>
>>  
>>
>> Others a displayed by the F5 system like this:
>>
>>  
>>
>> 10.10.10.10%0
>>
>> or
>>
>> 10.10.10.10%1
>>
>>  
>>
>> The trailing  %0 or %1 ... represents a logical separation on the system.
>>
>>  
>>
>> The ingestion into prometheus works however the format is then different 
>> and looks like hex. Any chance to get the "raw" information or at least 
>> replace the trailing %0?
>>
>>
>> [image: ip_address_format_includes_percent.jpg]
>>
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Prometheus Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to prometheus-use...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/prometheus-users/dd89ed7e-a276-43ff-8bb1-5631ba98cfb7n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/prometheus-users/dd89ed7e-a276-43ff-8bb1-5631ba98cfb7n%40googlegroups.com?utm_medium=email_source=footer>
>> .
>>
>
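
If the device actually returns the human-readable address text (rather than raw InetAddressIPv4z octets), one thing that might be worth experimenting with is a type override in generator.yml; this is only a sketch with an assumed module name, and it may well not apply to the zone-indexed case discussed above:

  modules:
    f5_bigip:
      walk:
        - ltmPoolMemberAddr
      overrides:
        ltmPoolMemberAddr:
          type: DisplayString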

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/a5658075-d685-487e-9cac-5d16d3cb0e15n%40googlegroups.com.


[prometheus-users] Metrics from PUSH Consumer - Relabeled Metrics? Check "Up" state?

2024-01-20 Thread Alexander Wilke
Hello,
I have some enterprise firewalls which unfortunately can only PUSH metrics 
to the remote_write API of Prometheus. The vendor has no plans to offer PULL.

Which possibilities do I have in Prometheus to change some metrics at the 
time they arrive on the API? I want to add some custom labels based on 
existing label information.

In addition, is there any possibility to check if a device is still 
sending metrics?
Because Prometheus is not pulling, it cannot check the "up" state I think, 
but is it possible to query if there are metrics in Prometheus with a 
specific label set that are not older than e.g. 10 minutes?

And how can I check if the Prometheus server can ingest all the metrics 
which arrive on the API? For pull I have some dashboards which show samples 
per second and so on.

I am a little bit confused, because from my understanding remote write is in 
general used to write data from one Prometheus to another destination, so 
the configuration and tuning parameters seem not to fit for incoming traffic.

Maybe you can help me.

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/06c8a95a-0ac9-45aa-b306-bde0e0691704n%40googlegroups.com.


Re: [prometheus-users] snmp exporter & snmpv3

2024-01-20 Thread Alexander Wilke
If you have a working snmp.yml, then just add this at the top of the file:

CustomName:
version: 3
security_level: authPriv
username: "username"
auth_protocol: SHA
password: 'password'
priv_protocol: AES

The "CustomName" is what you use in Prometheus.yml as auth Module.
If you use Cisco devices you have to use
AES-128C or AES-256C

But  est way is to use Generator and Geräte fresh snmp.yml.
In Generator add the Same Part for snmpv3 at the top of the File.
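
As a sketch of how that custom auth name is then referenced from prometheus.yml, assuming snmp_exporter 0.23 or newer where auth and module are separate probe parameters (target address and exporter host/port are placeholders):

  - job_name: 'snmp_v3_devices'
    metrics_path: /snmp
    params:
      auth: [CustomName]
      module: [if_mib]
    static_configs:
      - targets:
          - 192.0.2.1
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9116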

jin lukexjin schrieb am Freitag, 19. Januar 2024 um 08:21:48 UTC+1:

> So,If I want to use SNMP v3,I need regenerator snmp.yml,I can not edit the 
> default snmp.yml,yes?
> In generator.xml auth part, v1 ,v2c,v3 all is there ok? 
>
> 在2017年12月18日星期一 UTC+8 01:49:24 写道:
>
>> On 17 December 2017 at 17:44,  wrote:
>>
>>> Thanks. Do you know how I can pass the username and password for SNMPv3 
>>> to snmp_exporter, as I would do in snmpwalk?  
>>
>>
>> See 
>> https://github.com/prometheus/snmp_exporter/tree/master/generator#file-format
>>  
>>
>>>
>>>
>>> On 15/12/2017 at 5:16 PM, "Ben Kochie"  wrote:
>>
>> Yes, v1, v2c, v3 are all supported.
>>>
>>> On Dec 15, 2017 17:45, "Adrian Lewis"  wrote:
>>>
>>> Does SNMP exporter support SNMP version 3, as I would like to pass a 
 username and password to the SNMP target/agent? 

 Thanks

 Aidy

 -- 
 You received this message because you are subscribed to the Google 
 Groups "Prometheus Users" group.

>>> To unsubscribe from this group and stop receiving emails from it, send 
 an email to prometheus-use...@googlegroups.com.
 To post to this group, send email to promethe...@googlegroups.com.
>>>
>>>
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/prometheus-users/18d702ce-372b-40de-85ac-483014f6c915%40googlegroups.com
  
 
 .
 For more options, visit https://groups.google.com/d/optout.

>>> -- 
>>>
>> You received this message because you are subscribed to the Google Groups 
>>> "Prometheus Users" group.
>>>
>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to prometheus-use...@googlegroups.com.
>>> To post to this group, send email to promethe...@googlegroups.com.
>>>
>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/prometheus-users/20171217174415.350DE400FF%40smtp.hushmail.com
>>>  
>>> 
>>> .
>>
>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> -- 
>> Brian Brazil
>> www.robustperception.io
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/6bcfb447-bf7e-43c6-8ea8-e44d4ef9543en%40googlegroups.com.


Re: [prometheus-users] delta/increase on a counter return wrong value

2024-01-18 Thread Alexander Wilke
You may use

rate(metric{}[15m])
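
For context on the numbers in this thread: increase() and delta() extrapolate the sampled rate to the ends of the full range window, so they can report slightly more than the raw difference between the first and last sample; comparing the two directly (metric is the placeholder name used here) makes the effect visible:

  # extrapolated over the whole 15m window
  increase(metric[15m])

  # raw difference between the newest and oldest sample in the window
  max_over_time(metric[15m]) - min_over_time(metric[15m])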


Jérôme Loyet  schrieb am Do., 18. Jan. 2024, 19:27:

> Hello,
>
> my previous was not clear, sorry for that. I don't want count the number
> of sample (count_over_time) but I want to calculate the difference (delta)
> or the increase (increase) of the metric value during the range (15
> minutes).
>
> As the metric is a counter that only grows (it counts the number of
> request the service has handled), it should be the last value of the sample
> range minus the first one.
>
> here is a screenshot of the corresponding metric:
> [image: image.png]
>
> There must be some black magic around increase/delta that I do not
> understand that gets me unexpected results.
>
> Le jeu. 18 janv. 2024 à 19:14, Alexander Wilke  a
> écrit :
>
>> Maybe you are looking for
>>
>> count_over_time
>>
>> https://promlabs.com/promql-cheat-sheet/
>>
>> Jérôme Loyet  schrieb am Do., 18. Jan. 2024, 18:56:
>>
>>> Hello,
>>>
>>> I have a counter and I want to counter the number of occurences on a
>>> duration (let's say 15m). I'm using delta() or increase but I'm not getting
>>> the result I'm expecting.
>>>
>>> value @t0: 30242494
>>> value @t0+15m: 30609457
>>> calculated diff: 366963
>>> round(max_over_time(metric[15m])) - round(min_over_time(metric[15m])):
>>> 366963
>>> round(delta(metric[15m])): 373183
>>> round(increase(metric[15m])): 373183
>>>
>>> increase and delta both return the same value but it appears to be wrong
>>> (+6220) while max_over_time - min_over_time return the expected value.
>>>
>>> I do not understand this behaviour. I must have miss something.
>>>
>>> Any help is appreciated, thx a lot.
>>>
>>> ++ Jerome
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to prometheus-users+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-users/e9864120-b1c2-4af9-91ee-9c9cbe0fb24an%40googlegroups.com
>>> <https://groups.google.com/d/msgid/prometheus-users/e9864120-b1c2-4af9-91ee-9c9cbe0fb24an%40googlegroups.com?utm_medium=email_source=footer>
>>> .
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAJuaemZtpBzHk5mpvFaAcW1ccxCi%3DnnythUrHC0zhxDVQCxFqA%40mail.gmail.com.


Re: [prometheus-users] delta/increase on a counter return wrong value

2024-01-18 Thread Alexander Wilke
Maybe you are looking for

count_over_time

https://promlabs.com/promql-cheat-sheet/

Jérôme Loyet  schrieb am Do., 18. Jan. 2024, 18:56:

> Hello,
>
> I have a counter and I want to counter the number of occurences on a
> duration (let's say 15m). I'm using delta() or increase but I'm not getting
> the result I'm expecting.
>
> value @t0: 30242494
> value @t0+15m: 30609457
> calculated diff: 366963
> round(max_over_time(metric[15m])) - round(min_over_time(metric[15m])):
> 366963
> round(delta(metric[15m])): 373183
> round(increase(metric[15m])): 373183
>
> increase and delta both return the same value but it appears to be wrong
> (+6220) while max_over_time - min_over_time return the expected value.
>
> I do not understand this behaviour. I must have miss something.
>
> Any help is appreciated, thx a lot.
>
> ++ Jerome
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/e9864120-b1c2-4af9-91ee-9c9cbe0fb24an%40googlegroups.com
> 
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAJuaemaHoToizzuGqneVV8-QJ9p0ogwP2aUbf%2BbkgwOt%2Bd%3DqNw%40mail.gmail.com.


[prometheus-users] Re: Node_exporter 1.7.0 - http_server_config - Strict-Transport-Security

2024-01-17 Thread Alexander Wilke
Hello Brian,

I am very sorry, I missed the "headers:" key between the lines of the description.
It is working now.
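
For completeness, a sketch of a working web-config file with HSTS enabled (cert paths, TLS versions and max-age reused from the original post; includeSubDomains is optional and only an example):

  tls_server_config:
    cert_file: "/opt/node_exporter/node_exporter.pem"
    key_file: "/opt/node_exporter/node_exporter.key"
    min_version: "TLS12"
    max_version: "TLS13"
    client_auth_type: "NoClientCert"

  basic_auth_users:
    user: 'xxx'

  http_server_config:
    headers:
      Strict-Transport-Security: "max-age=31536000; includeSubDomains"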

Brian Candler schrieb am Mittwoch, 17. Januar 2024 um 09:19:09 UTC+1:

> The YAML parsing error is simply saying that under "http_server_config", 
> you cannot put "Strict-Transport-Security".
>
> The documentation says that the only keys allowed under 
> "http_server_config" are "http2" and "headers". So it needs to be like this:
>
> http_server_config:
>   headers:
> Strict-Transport-Security: 
>
> On Wednesday 17 January 2024 at 15:43:06 UTC+8 Alexander Wilke wrote:
>
>> Hello,
>>
>> I am running:
>>
>> node_exporter, version 1.7.0 (branch: HEAD, revision: 
>> 7333465abf9efba81876303bb57e6fadb946041b)
>>   build date:   20231112-23:53:35
>>   go version:   go1.21.4
>>   platform: linux/amd64
>>   tags: netgo osusergo static_build
>>
>>
>>
>> Vulnerability scan complained that HSTS is not enabled so I wanted to 
>> enable it:
>>
>> tls_server_config:
>>   cert_file: "/opt/node_exporter/node_exporter.pem"
>>   key_file: "/opt/node_exporter/node_exporter.key"
>>
>>   min_version: "TLS12"
>>   max_version: "TLS13"
>>
>>   client_auth_type: "NoClientCert"
>>
>> basic_auth_users:
>> user: 'xxx'
>>
>> http_server_config:
>>   Strict-Transport-Security: max-age=31536000  # 1 year
>>
>>
>> Unfortunately I get this error:
>>
>> node_exporter: ts=2024-01-17T07:30:04.483Z caller=node_exporter.go:223 
>> level=error err="yaml: unmarshal errors:\n  line 14: field 
>> Strict-Transport-Security not found in type web.HTTPConfig"
>> systemd: node_exporter.service: main process exited, code=exited, 
>> status=1/FAILURE
>>
>>
>> I tried to configure it based on this documentation:
>> https://prometheus.io/docs/prometheus/latest/configuration/https/
>>
>> probably I need the other parameters, too like:
>> Strict-Transport-Security: max-age=; includeSubDomains; 
>> preload 
>> How to get this working?
>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/f4a72eeb-133b-495a-9a26-d8023038278cn%40googlegroups.com.


[prometheus-users] Node_exporter 1.7.0 - http_server_config - Strict-Transport-Security

2024-01-16 Thread Alexander Wilke
Hello,

I am running:

node_exporter, version 1.7.0 (branch: HEAD, revision: 
7333465abf9efba81876303bb57e6fadb946041b)
  build date:   20231112-23:53:35
  go version:   go1.21.4
  platform: linux/amd64
  tags: netgo osusergo static_build



Vulnerability scan complained that HSTS is not enabled so I wanted to 
enable it:

tls_server_config:
  cert_file: "/opt/node_exporter/node_exporter.pem"
  key_file: "/opt/node_exporter/node_exporter.key"

  min_version: "TLS12"
  max_version: "TLS13"

  client_auth_type: "NoClientCert"

basic_auth_users:
user: 'xxx'

http_server_config:
  Strict-Transport-Security: max-age=31536000  # 1 year


Unfortunately I get this error:

node_exporter: ts=2024-01-17T07:30:04.483Z caller=node_exporter.go:223 
level=error err="yaml: unmarshal errors:\n  line 14: field 
Strict-Transport-Security not found in type web.HTTPConfig"
systemd: node_exporter.service: main process exited, code=exited, 
status=1/FAILURE


I tried to configure it based on this documentation:
https://prometheus.io/docs/prometheus/latest/configuration/https/

probably I need the other parameters, too like:
Strict-Transport-Security: max-age=; includeSubDomains; 
preload 
How to get this working?

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/51df262d-deb5-42ab-9e68-a7dd75e63cffn%40googlegroups.com.


Re: [prometheus-users] Smokeping_prober 0.7.1 - amount or buckets

2024-01-16 Thread Alexander Wilke
Thanks for your feedback.

I had a look at Prometheus and it looks like this feature is still in 
development and may still change.
So I will probably stay with what I have right now and enable it in the 
future, once the feature is stable and has reached its final design.

Ben Kochie schrieb am Montag, 15. Januar 2024 um 16:54:54 UTC+1:

> More buckets cost more to store and process, but thankfully there is now 
> Prometheus "native histograms", which give you high resolution for less 
> cost.
>
>
> https://prometheus.io/docs/prometheus/latest/feature_flags/#native-histograms
>
> https://www.usenix.org/conference/srecon23emea/presentation/rabenstein
>
> This is already supported by the smokeping_prober, but you may need to 
> enable it in Prometheus.
>
> On Mon, Jan 15, 2024 at 4:35 PM Alexander Wilke  
> wrote:
>
>> Hello,
>>
>> I would Like to know If there are any technical disadvantages or 
>> limitations If I use a high(er) amount of buckets?
>>
>> And is there a possibility to Aggregate buckets later in PromQL/Grafana?
>> My Idea was so unsere more buckets to have detailed View in latencies If 
>> I want to investigate Something specific in a specific time frame.
>>
>> But as for Overview i do Not need all the fine grained bucket steps.
>>
>> For time range i can use rate.
>> But is there a possibility to lets say Aggregate buckets
>>
>> 0.5-1.0s
>> 1.0-2.0s
>> 2.0-3.0s
>>
>> To one bucket 0.5-3.0s
>>
>> TL;DR.
>> Any technical limitations or disadvantages with Higher amount of buckets
>> How to Aggregate buckets for Overview but expand it for detailed Analysis 
>> If needed?
>>
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Prometheus Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to prometheus-use...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/prometheus-users/345a4c69-11cb-4e72-8af8-22dab5dc23e0n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/prometheus-users/345a4c69-11cb-4e72-8af8-22dab5dc23e0n%40googlegroups.com?utm_medium=email_source=footer>
>> .
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/a8bbdc60-ebc1-4084-b1f2-20d9470138f5n%40googlegroups.com.


[prometheus-users] Smokeping_prober 0.7.1 - amount or buckets

2024-01-15 Thread Alexander Wilke
Hello,

I would like to know if there are any technical disadvantages or 
limitations if I use a high(er) number of buckets.

And is there a possibility to aggregate buckets later in PromQL/Grafana?
My idea was to use more buckets in order to have a detailed view of latencies if I 
want to investigate something specific in a specific time frame.

But for the overview I do not need all the fine-grained bucket steps.

For the time range I can use rate.
But is there a possibility to, let's say, aggregate the buckets

0.5-1.0s
1.0-2.0s
2.0-3.0s

into one bucket 0.5-3.0s?

TL;DR:
Are there any technical limitations or disadvantages with a higher number of buckets?
How can buckets be aggregated for an overview but expanded for detailed analysis 
if needed?
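
Since classic histogram buckets are cumulative (each le bucket already contains everything below it), coarser views can be built at query time. A sketch, assuming the prober's latency histogram is called smokeping_response_duration_seconds and has bucket boundaries at 0.5 and 3.0 seconds (check the exact metric name and le label values on the /metrics page):

  # rate of pings that took between 0.5s and 3.0s over the last 5 minutes
  sum(rate(smokeping_response_duration_seconds_bucket{le="3.0"}[5m]))
    - sum(rate(smokeping_response_duration_seconds_bucket{le="0.5"}[5m]))

  # 95th percentile latency from the full set of buckets
  histogram_quantile(0.95, sum by (le) (rate(smokeping_response_duration_seconds_bucket[5m])))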


-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/345a4c69-11cb-4e72-8af8-22dab5dc23e0n%40googlegroups.com.


[prometheus-users] Re: Weird node_exporter network metrics behaviour - NIC problem?

2024-01-14 Thread Alexander Wilke
Do you have the same scrape_interval for both machines?
Are you running "irate" in both queries, or "rate" in the one and "irate" in 
the other?
Are the iperf intervals the same for both tests?
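
One thing worth checking in this context: irate() only uses the last two samples inside the window, while rate() averages over the whole window, so short dips can look very different between the two. Comparing them side by side (selector copied from the query above, instance redacted as in the original) can show whether the drops are a sampling artifact:

  irate(node_network_receive_bytes_total{instance="xxx", device="eno1"}[1m]) * 8
  rate(node_network_receive_bytes_total{instance="xxx", device="eno1"}[1m]) * 8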

Dito Windyaksa schrieb am Montag, 15. Januar 2024 um 00:02:26 UTC+1:

> Hi,
>
> We're migrating to a new bare metal provider and noticed that the network 
> metrics doesnt add up.
>
> We conducted an iperf test between A and B, and noticed there are "drops" 
> on the new machine during an ongoing iperf test.
>
> We also did not see any bandwidth drops from both iperf server/client side.
>
> [image: Screenshot 2024-01-13 at 06.27.43.png]
>
> Both are running similar queries:
> irate(node_network_receive_bytes_total{instance="xxx", 
> device="eno1"}[1m])*8
>
> One thing is certain: green line machine is running an Intel 10G NIC, 
> while blue line machine is running an Broadcom 10G NIC.
>
> Any ideas?
> Dito
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/b1cded13-dc77-451c-a106-fb6d61785799n%40googlegroups.com.


Re: [prometheus-users] Why is snmp_exporter not recognizing the custom OID added?

2024-01-14 Thread Alexander Wilke
Tell me the MIBs you want to use, and maybe I can tell you what is needed in 
generator.yml and which additional MIBs are required.

Awemnhd schrieb am Sonntag, 14. Januar 2024 um 13:02:32 UTC+1:

> Thank you. I will try to download the MIB file from this website later, 
> but I have a question. Why does this website not mark the MIB file 
> corresponding to the network device software version? The Huawei network 
> device I use is the same model, but the OID is different in different 
> software versions!
>
> 在2024年1月13日星期六 UTC+8 19:30:39 写道:
>
>> Keep in mind that you so Not only need the vendor specific MIBs but you 
>> need in Addition all MIBs which are listet at the to of the MIB as 
>> "IMPORTED".  So you maybe need other MIBs in the folder to have the 
>> complete OID path.
>>
>> Maybe post:
>> - Your MIBs folder with the MIBs you use
>> - The Generator.yml you use
>> - The Generator command you use.
>> - The Output of the Generator 
>>
>>
>> I Like this Website to Download MIBs and IT tells me the other needed 
>> MIBs as "IMPORTED". I need These MIBs, too in the folder.
>>
>> https://www.circitor.fr/Mibs/Html/C/CISCO-IP-IF-MIB.php
>>
>> Ben Kochie  schrieb am Sa., 13. Jan. 2024, 11:17:
>>
>>> You need to look at your generator output. I'm guessing there were 
>>> errors.
>>>
>>> The HUAWEI-ENTITY-EXTENT-MIB OIDs you listed are in a table, so you 
>>> can't *get* them, you need to *walk* them. When you generate, the 
>>> output should have indexes that need to be used.
>>>
>>> You need to make sure your vendor MIBs are added to your MIBDIRS path(s).
>>>
>>> The output works just fine when I added your example MIB to a generator 
>>> config.
>>>
>>> https://github.com/SuperQ/tools/tree/master/snmp_exporter/huawei
>>>
>>> On Sat, Jan 13, 2024 at 10:46 AM Awemnhd  wrote:
>>>
 Currently using snmp_exporter version 0.25.0

 The problem I had with SNMPv3 has been solved by recompiling 
 generator.yml!

 However, the if_mib used by default uses the standard SMP management 
 protocol. Some functions on network devices cannot be monitored. You need 
 to add the manufacturer's private MIB OID value to be monitored, for 
 example:
 get:
   - 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.5
   - 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.7

 - name: hwEntityCpuUsage
   oid: 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.5
   type: Integer
   help: 'entity CPU usage, value range: 0~100'
 - name: hwEntityMemUsage
   oid: 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.7
   type: Integer
   help: 'Machine memory usage, value range: 0~100'

 I loaded the above configuration section into the generator.yml file 
 and recompiled it. The generated snmp.yml did not have the above 
 configuration section. I manually added it to the compiled snmp.yml file. 
 There was no error message at startup. Check /snmp?target= 
 xx.xxx.xxx.xxx=snmp_v3=if_mib, but I can’t see the oid I added 
 get, even if I add the following configuration

 walk:
 - 1.3.6.1.4.1.2011.5.25.31

 Still can't see it after startup!

 Final note: I can obtain the above OID values normally using snmpwalk 
 in the same environment. How should I set snmp_exporter correctly?

 -- 
 You received this message because you are subscribed to the Google 
 Groups "Prometheus Users" group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to prometheus-use...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/prometheus-users/9fcf5d63-20ca-45bb-8263-2a6e15302023n%40googlegroups.com.

>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to prometheus-use...@googlegroups.com.
>>>
>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/prometheus-users/CABbyFmpzoq11NEz5FXkpZgWhts17rW%3DbGVgh%2BDNumzWmDYg%2BPg%40mail.gmail.com.
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/b7fd8384-6722-4dde-b6db-ef757cdb49c6n%40googlegroups.com.


[prometheus-users] Windows_exporter what permissions are needed

2024-01-14 Thread Alexander Wilke
Hello,
is there any documentation on which type of permissions is needed for the 
specific collectors? For node_exporter there is a hint that it can run 
without root as a regular user.

If I install the windows_exporter via the MSI, I probably need higher rights 
because it runs as a service. If I run the .exe with the additional 
parameters as a normal user, it worked at least for the few collectors I tested.
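
For reference, this is roughly how I start it for testing without the MSI (a 
sketch; the collector list is just an example):

  .\windows_exporter.exe --collectors.enabled "cpu,cs,logical_disk,net,os,system"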

However - is there any documentation on which collector needs which rights, 
or can everything be collected as a normal user?

PS:
Does anyone have experience with how expensive it is to collect the MS 
Exchange metrics on a server with 10,000+ mailboxes and 8,000+ users working 
concurrently with the Outlook client?

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/0b986053-4128-41b8-b154-8231bc90d982n%40googlegroups.com.


Re: [prometheus-users] Maximum targets for exporter

2024-01-13 Thread Alexander Wilke
Thank you for the clarification. I was interested in whether there are any 
disadvantages if the number of CPU cores is too high, maybe because of the 
overhead of sharing the load.

Good to know I can scale it easily if I run it on VMs.
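
As a note, the load-balancer approach Ben mentions below could look roughly 
like this with HAProxy (a sketch; addresses and ports are placeholders):

  frontend snmp_exporters
      bind :9116
      default_backend snmp_exporter_pool

  backend snmp_exporter_pool
      balance roundrobin
      server snmp1 10.0.0.11:9116 check
      server snmp2 10.0.0.12:9116 check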

Ben Kochie wrote on Saturday, 13 January 2024 at 10:51:49 UTC+1:

> No, Go is not specifically limited to a number of cores. For the 
> exporters, they should scale vertically just fine as well as horizontally.
>
> The only limit I've seen is how well the SNMP exporter's UDP packet 
> handling works. IIRC you may run into UDP packets per second limits before 
> you run into actual CPU limits.
>
> It's not something a lot of people have tested/used in production that 
> scale. At least not enough that I've gotten any good feedback.
>
> On Sat, Jan 13, 2024 at 1:20 AM Alexander Wilke  
> wrote:
>
>> Hello,
>> sorry to hijack this thread a little bit but Brian talks about "4 CPU 
>> cores" and Ben says "scale horizontally".
>>
>> Just out of interest - why not just use 8, 16, or 32 CPU cores? Is Go 
>> limited to a specific number of CPUs, or is there a disadvantage in having 
>> too many cores?
>> I think if someone is monitoring that many devices, this is an enterprise 
>> network, and servers/VMs with more CPUs are no problem.
>>
>> Ben Kochie wrote on Friday, 12 January 2024 at 21:50:57 UTC+1:
>>
>>> Those sound like reasonable amounts for those exporters.
>>>
>>> I've heard of people hitting thousands of SNMP devices from the 
>>> snmp_exporter.
>>>
>>> Since the exporters are in Go, they scale well. But if it's not enough, 
>>> the advantage of their design means they can be deployed horizontally. You 
>>> could run several exporters in parallel and use a simple http load balancer 
>>> like Envoy or HAProxy. 
>>>
>>> On Fri, Jan 12, 2024, 02:32 'Elliott Balsley' via Prometheus Users <
>>> promethe...@googlegroups.com> wrote:
>>>
>>>> I'm curious if anyone has experimented to find out how many targets can 
>>>> reasonably be scraped by a single instance of blackbox and snmp exporters. 
>>>>  
>>>> I know Prometheus itself can handle tens of thousands of targets, but I'm 
>>>> wondering at what point it becomes necessary to split up the scraping.  
>>>> I'll find out for myself soon enough, I just wanted to check and see if 
>>>> anyone has tested this already.  I'm thinking I would have around 10K 
>>>> targets for blackbox, and 1K for snmp.
>>>>
>>>> I'm using http_sd_config with a 15 second refresh interval, so that's 
>>>> another potential bottleneck I'll have to test.
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "Prometheus Users" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to prometheus-use...@googlegroups.com.
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/prometheus-users/CALajkdh7EhHAVN5nJNYqJjKvcH_rfT1L7ZaPvPR4L-xjypKSbg%40mail.gmail.com.
>>>>
>>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Prometheus Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to prometheus-use...@googlegroups.com.
>>
> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/prometheus-users/ae1c9448-82ea-4bf3-b2d8-f620de2444a6n%40googlegroups.com.
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/2dffb8bc-c071-4328-89f8-f60366fad29fn%40googlegroups.com.


Re: [prometheus-users] Why is snmp_exporter not recognizing the custom OID added?

2024-01-13 Thread Alexander Wilke
Keep in mind that you do not only need the vendor-specific MIBs but also
all the MIBs which are listed at the top of the MIB as "IMPORTED". So you
may need other MIBs in the folder to have the complete OID path.

Maybe post:
- Your MIBs folder with the MIBs you use
- The generator.yml you use
- The generator command you use
- The output of the generator


I like this website to download MIBs, and it tells me the other needed MIBs
listed as "IMPORTED". I need these MIBs in the folder, too.

https://www.circitor.fr/Mibs/Html/C/CISCO-IP-IF-MIB.php
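
For the Huawei OIDs mentioned below, a minimal generator.yml module could 
look like this - a sketch, assuming HUAWEI-ENTITY-EXTENT-MIB and all the MIBs 
it imports are in the mibs folder (the module name is just an example):

  modules:
    huawei_entity:
      walk:
        - 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.5   # hwEntityCpuUsage
        - 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.7   # hwEntityMemUsage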

Ben Kochie wrote on Sat., 13 Jan. 2024, 11:17:

> You need to look at your generator output. I'm guessing there were errors.
>
> The HUAWEI-ENTITY-EXTENT-MIB OIDs you listed are in a table, so you can't
> *get* them, you need to *walk* them. When you generate, the output should
> have indexes that need to be used.
>
> You need to make sure your vendor MIBs are added to your MIBDIRS path(s).
>
> The output works just fine when I added your example MIB to a generator
> config.
>
> https://github.com/SuperQ/tools/tree/master/snmp_exporter/huawei
>
> On Sat, Jan 13, 2024 at 10:46 AM Awemnhd  wrote:
>
>> Currently using snmp_exporter version 0.25.0
>>
>> The problem I had with SNMPv3 has been solved by recompiling
>> generator.yml!
>>
>> However, the if_mib used by default uses the standard SNMP management
>> protocol. Some functions on network devices cannot be monitored. You need
>> to add the manufacturer's private MIB OID value to be monitored, for
>> example:
>> get:
>>   - 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.5
>>   - 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.7
>>
>> - name: hwEntityCpuUsage
>>   oid: 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.5
>>   type: Integer
>>   help: 'entity CPU usage, value range: 0~100'
>> - name: hwEntityMemUsage
>>   oid: 1.3.6.1.4.1.2011.5.25.31.1.1.1.1.7
>>   type: Integer
>>   help: 'Machine memory usage, value range: 0~100'
>>
>> I loaded the above configuration section into the generator.yml file and
>> recompiled it. The generated snmp.yml did not have the above configuration
>> section. I manually added it to the compiled snmp.yml file. There was no
>> error message at startup. Checking /snmp?target=xx.xxx.xxx.xxx&auth=snmp_v3&module=if_mib,
>> I can't see the OIDs I added under "get", even if I add the following
>> configuration
>>
>> walk:
>> - 1.3.6.1.4.1.2011.5.25.31
>>
>> Still can't see it after startup!
>>
>> Final note: I can obtain the above OID values normally using snmpwalk in
>> the same environment. How should I set snmp_exporter correctly?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Prometheus Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to prometheus-users+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/prometheus-users/9fcf5d63-20ca-45bb-8263-2a6e15302023n%40googlegroups.com.
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to prometheus-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/CABbyFmpzoq11NEz5FXkpZgWhts17rW%3DbGVgh%2BDNumzWmDYg%2BPg%40mail.gmail.com.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAJuaemY%2BLd%2BOrmm0RXa1U4K0NxGSz8T%3DYJjUiyiDcTi3aaqPEA%40mail.gmail.com.


Re: [prometheus-users] Maximum targets for exporter

2024-01-12 Thread Alexander Wilke
Hello,
sorry to hijack this thread a little bit, but Brian talks about "4 CPU 
cores" and Ben says "scale horizontally".

Just out of interest - why not just use 8, 16, or 32 CPU cores? Is Go limited 
to a specific number of CPUs, or is there a disadvantage in having too many cores?
I think if someone is monitoring that many devices, this is an enterprise 
network, and servers/VMs with more CPUs are no problem.

Ben Kochie wrote on Friday, 12 January 2024 at 21:50:57 UTC+1:

> Those sound like reasonable amounts for those exporters.
>
> I've heard of people hitting thousands of SNMP devices from the 
> snmp_exporter.
>
> Since the exporters are in Go, they scale well. But if it's not enough, 
> the advantage of their design means they can be deployed horizontally. You 
> could run several exporters in parallel and use a simple http load balancer 
> like Envoy or HAProxy. 
>
> On Fri, Jan 12, 2024, 02:32 'Elliott Balsley' via Prometheus Users <
> promethe...@googlegroups.com> wrote:
>
>> I'm curious if anyone has experimented to find out how many targets can 
>> reasonably be scraped by a single instance of blackbox and snmp exporters.  
>> I know Prometheus itself can handle tens of thousands of targets, but I'm 
>> wondering at what point it becomes necessary to split up the scraping.  
>> I'll find out for myself soon enough, I just wanted to check and see if 
>> anyone has tested this already.  I'm thinking I would have around 10K 
>> targets for blackbox, and 1K for snmp.
>>
>> I'm using http_sd_config with a 15 second refresh interval, so that's 
>> another potential bottleneck I'll have to test.
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Prometheus Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to prometheus-use...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/prometheus-users/CALajkdh7EhHAVN5nJNYqJjKvcH_rfT1L7ZaPvPR4L-xjypKSbg%40mail.gmail.com.
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/ae1c9448-82ea-4bf3-b2d8-f620de2444a6n%40googlegroups.com.


[prometheus-users] Re: snmp_exporter 0.25.0 - IF-MIB and CISCO-IF-EXTENSION-MIB

2024-01-11 Thread Alexander Wilke
Hello Brian,

thank you for that snippet. I could use it to solve my issue:
(sysUpTime - on (instance) group_right () ifLastChange) / 100
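
For my dashboard panel I ended up with a variant like this (a sketch; the 
$interface variable comes from my Grafana dashboard, and ifName is one of the 
labels added by the if_mib lookups):

  (sysUpTime - on (instance) group_right () ifLastChange{ifName=~"$interface"}) / 100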

However, I need to find some time to better understand how these 
vector-matching operations work.

PS:
Is there some sort of query builder for PromQL?

Brian Candler wrote on Tuesday, 9 January 2024 at 15:27:54 UTC+1:

> For the first part, you should look at how ifTable and ifXTable are 
> handled in the default generator.yml:
>
>   if_mib:
> walk: [sysUpTime, interfaces, ifXTable]
>
> Note that you don't walk the individual columns, you walk the whole table.
>
> For the second part:
>
> (sysUpTime - on (instance) group_right () ifLastChange) / 100
>
> Reference:
>
> https://prometheus.io/docs/prometheus/latest/querying/operators/#many-to-one-and-one-to-many-vector-matches
>
> Also useful for understanding:
> https://www.robustperception.io/how-to-have-labels-for-machine-roles
> https://www.robustperception.io/exposing-the-software-version-to-prometheus
> https://www.robustperception.io/left-joins-in-promql
>
> On Tuesday 9 January 2024 at 10:04:36 UTC Alexander Wilke wrote:
>
>> Hello,
>>
>> I am using snmp-exporter to monitor CISCO IOS and IOS-XE devices.
>> However I have issues with merging the "IF-MIB" and 
>> "CISCO-IF-EXTENSION-MIB".
>>
>> IF-MIB provides information with the following labels:
>> ifIndex
>> ifName
>> ifDescr
>> ifAlias
>>
>>
>> if I add the "CISCO-IF-EXTENSION-MIB" to the generator, I get the results 
>> from the Cisco device, but the metrics do not contain the ifIndex, ifName, 
>> ifDescr, ifAlias information.
>>
>> Unfortunately I do not know if this can be configured in the 
>> generator.yml file or not and in addition I do not really understand the 
>> lookup and override configuration.
>>
>>
>>
>> This is the part of IF-MIB and CISCO-IF-EXTENSION-MIB in my generator.yml.
>> I get the metrics, but the Cisco MIB is missing the labels - relevant for 
>> me - that I have in the IF-MIB.
>>
>>   if_mib_15s:
>> walk: 
>> [ifName,ifAlias,ifDescr,ifIndex,ifMtu,ifHighSpeed,ifAdminStatus,ifOperStatus,ifLastChange,ifConnectorPresent,ifHCInOctets,ifHCInUcastPkts,ifHCInMulticastPkts,ifHCInBroadcastPkts,ifHCOutOctets,ifHCOutUcastPkts,ifHCOutMulticastPkts,ifHCOutBroadcastPkts,ifInDiscards,ifOutDiscards,ifInErrors,ifOutErrors,ifInUnknownProtos]
>> lookups:
>>   - source_indexes: [ifIndex]
>> lookup: ifAlias
>>   - source_indexes: [ifIndex]
>> # Use OID to avoid conflict with PaloAlto PAN-COMMON-MIB.
>> lookup: 1.3.6.1.2.1.2.2.1.2 # ifDescr
>>   - source_indexes: [ifIndex]
>> # Use OID to avoid conflict with Netscaler NS-ROOT-MIB.
>> lookup: 1.3.6.1.2.1.31.1.1.1.1 # ifName
>> overrides:
>>   ifAlias:
>> ignore: true # Lookup metric
>>   ifDescr:
>> ignore: true # Lookup metric
>>   ifName:
>> ignore: true # Lookup metric
>>   ifType:
>> type: EnumAsInfo
>> max_repetitions: 50
>> timeout: 5s
>> retries: 3
>>
>>   # CISCO-IF-EXTENSION-MIB
>>   ciscoIfExtension_15s:
>> walk: 
>> [cieIfIndex,cieInterfacesIndex,cieIfName,cieIfNameMappingEntry,cieIfNameMappingTable,cieIfInRuntsErrs,cieIfInGiantsErrs,cieIfInFramingErrs,cieIfInOverrunErrs,cieIfInIgnored,cieIfInputQueueDrops,cieIfOutputQueueDrops,cieIfStateChangeReason,cieIfOperStatusCause,cieIfOperStatusCauseDescr]
>> max_repetitions: 50
>> timeout: 5s
>> retries: 3
>>
>>
>>
>> The second part of the question is:
>> In IF-MIB I can get the system uptime of "ifLastChange". It contains 
>> interface information (ifIndex, ifName, ...).
>>
>> The other metric is "sysUpTime".
>>
>> I want to generate a PromQL query in Grafana which shows me the time that 
>> has passed since the last change of the interface. So if I open my 
>> dashboard, I want to see that the interface's status changed 3 min ago. I 
>> do not want the information that the last change happened when the system 
>> was up for 128d 18h 25min.
>>
>> Any chance to calculate this, and if yes, can you provide the query?
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/111bf709-c5d0-4d61-8d82-a51513174c8an%40googlegroups.com.


[prometheus-users] Re: smokeping_prober - $(target:raw) - help with ":raw" and how to use multiple targets

2024-01-11 Thread Alexander Wilke
Hello,

thank you for the clarification. I thought this :raw was something specific 
to histograms/buckets/heatmaps.
In the past I tried using "=~" instead of "=" but it did not work. I tried 
again today, and it was a combination of changes I had not made before.

I changed it like this and it works as I expect:

host="${target:raw}" >  host=~"$target"

Brian Candler wrote on Thursday, 11 January 2024 at 09:56:24 UTC+1:

> This is a question about Grafana and/or the smokeping_exporter Grafana 
> dashboard, not Prometheus.
>
> ${target:raw} is a Grafana variable expansion, and the :raw suffix is a 
> format specifier:
>
> https://grafana.com/docs/grafana/latest/dashboards/variables/variable-syntax/#variable-syntax
>
> https://grafana.com/docs/grafana/latest/dashboards/variables/variable-syntax/#raw
>
> If you want multiple Grafana selections to be active at once in a PromQL 
> query, then in general you need to use regex: *foo{host=~"${target}"}*
> Because Grafana understands PromQL it shouldn't be necessary to add a 
> :regex suffix here, although it's probably OK to add it. It should expand 
> to something like
> *foo{host=~"1\.1\.1\.1|8\.8\.8\.8"}*
> The important thing is that you use =~ instead of =.
>
> All this is standard Grafana functionality, and therefore further 
> questions about this would best be asked in the Grafana Community forum.
>
> If the published smokeping_exporter dashboard allows multiple selections 
> in its target var, but uses host= instead of host=~, then that's a bug in 
> the dashboard which you'd need to raise with the author.
>
> However, if the published dashboard only allows a single target selection 
> and you *modified* it to allow multiple selections, then you broke it. At 
> this point you've become a Grafana dashboard developer, and again, the 
> Grafana Community would be the best place to ask for help. It's Grafana 
> that builds the query; Prometheus can only process whatever query it's 
> given.
>
> On Thursday 11 January 2024 at 08:01:15 UTC Alexander Wilke wrote:
>
>> Hello,
>>
>> I am using the smokeping_prober (
>> https://github.com/SuperQ/smokeping_prober) v0.7.1 and the provided 
>> dashboard.json.
>>
>> For whatever reason the queries contain ".raw" endings for the targets.
>> This leads to a problem if I want to show several targets in the same 
>> graph because targets are not added with "|" in between but with ","
>>
>> Here is the query with one selected target which is working:
>> [image: smoke_ping_one_target.JPG]
>>
>>
>> If I select two or more targets than the query looks like this but not 
>> data anymore:
>> [image: smoke_ping_more_targets_no_data.JPG]
>>
>> If I remove the ":raw" at the end of the target I do not get any data no 
>> difference if one or more clients. So this is somehow relevant.
>>
>>
>> My idea was to have an overview panel which shows the latency of several 
>> smokeping probes and to compare them. I want to place several "sensors" in 
>> our data center and they should ping each other. If someone tells me he has 
>> performance issues with an application, I can select the relevant 
>> zones/probers and compare whether the latency changed or not.
>>
>> If I want to compare 6 probers and have to scroll through 6 panels, it is 
>> not so elegant, because depending on the latency the scale of the panels is 
>> different and may lead to wrong assumptions.
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/54dc0292-2746-46fe-ab62-325661666ffen%40googlegroups.com.


[prometheus-users] Blackbox exporter - add Header dynamically after First requests to server

2024-01-10 Thread Alexander Wilke
Hello,
I connect with the blackbox exporter via HTTP POST to an API.
I connect with a body_file which contains credentials for the login and some 
other parameters.

The API answers with two headers:
sid: XYZ
uid: ABC

For further actions in the API I have to add these dynamic headers to each 
request.

Is it possible with the blackbox exporter to read this content and add it to 
the next body_file?


-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/5946d008-d2c3-415b-9865-74893d6f40f2n%40googlegroups.com.


[prometheus-users] Re: snmp_exporter-0.20 cannot monitor SNMP V3?

2024-01-10 Thread Alexander Wilke
If you use Cisco devices, then you have to use a "C" at the end of the 
privacy protocol because it seems Cisco has a specific implementation.

I use

*priv_protocol: AES256C*

for Cisco IOS and IOS XE devices running 17.x.y version.
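
A minimal auths entry as a sketch (the user name and secrets are placeholders):

  auths:
    cisco_v3:
      version: 3
      security_level: authPriv
      username: snmpuser
      auth_protocol: SHA
      password: XXX
      priv_protocol: AES256C
      priv_password: YYY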


Brian Candler wrote on Wednesday, 10 January 2024 at 12:32:08 UTC+1:

> > Please list the SNMP V3 instance configuration in generator.yml. I want 
> to know where the configuration error is!
>
> It's in the documentation:
>
> https://github.com/prometheus/snmp_exporter/blob/main/generator/README.md#file-format
>
> However, you don't need to compile anything to get started. Just use the 
> supplied snmp.yml, and edit the section under "auths" so it looks like this:
>
> auths:
>   public_v1:
> community: public
> security_level: noAuthNoPriv
> auth_protocol: MD5
> priv_protocol: DES
> version: 1
>   public_v2:
> community: public
> security_level: noAuthNoPriv
> auth_protocol: MD5
> priv_protocol: DES
> version: 2
>
>
>
>
>
>
>
>   prod_v3:
> version: 3
> security_level: authPriv
> username: admin
> auth_protocol: SHA
> password: XXX
> priv_protocol: AES
> priv_password: YYY
>
> And you're done.
>
> The next simplest option is to load multiple config files. This means you 
> can use the existing snmp.yml completely unchanged, and a separate yml file 
> that has just your auth(s) in it.  I use the following:
>
> *snmp_exporter --config.file=/etc/prometheus/snmp.d/*.yml*
>
> Then I have /etc/prometheus/snmp.d/auth.yml (which is mine) 
> and /etc/prometheus/snmp.d/snmp.yml (which is the standard one).
>
> You only need to use the generator if you want to scrape MIBs other than 
> the supplied example ones. You can do this by starting with the supplied 
> generator.yml 
> and modifying it. But if all you want to do is change the auths, I wouldn't 
> bother, since the generator essentially just copies the auths from its 
> input to its output.
>
> On Wednesday 10 January 2024 at 10:36:09 UTC Awemnhd wrote:
>
>> I tried using snmp_exporter-0.25.0, using SNMP v3 mode, SHA and AES still 
>> not successful, and I have to recompile the generator.yml file, otherwise 
>> using the default snmp.yml file will have no effect!
>>
>> Please list the SNMP V3 instance configuration in generator.yml. I want 
>> to know where the configuration error is!
>>
>> Written on Tuesday, 9 January 2024 at 22:54:36 UTC+8:
>>
>>> > Why is SNMP v3 so difficult to implement?
>>>
>>> It's not. It's dead easy. Do you have a working snmpwalk command line 
>>> which talks to your device? Then you just transfer the settings to your 
>>> snmp_exporter configuration.
>>>
>>> This has been made easier since snmp_exporter v0.23.0, 
>>> because the "modules" which define the OID walking and the "auths" which 
>>> provide the credentials have been made orthogonal. You can add new auths, 
>>> without touching modules. You can also put them in separate files.
>>>
>>> So you end up with e.g.
>>>
>>> auths:
>>>   prod_v3:
>>> version: 3
>>> security_level: authPriv
>>> username: admin
>>> auth_protocol: SHA
>>> password: XXX
>>> priv_protocol: AES
>>> priv_password: YYY
>>>
>>> then you call /snmp?target=x.x.x.x=if_mib=prod_v3
>>>
>>> The default is indeed still public_v2. The only other option would be to 
>>> have no default, i.e. snmp_exporter would fail unless you provide an 
>>> explicit set of credentials.
>>>
>>> Hence I'd definitely recommend moving to snmp_exporter 0.25.0. If you 
>>> can't do that, then there is a YAML trick you can do to make adding new 
>>> auths easier:
>>>
>>> modules:
>>>   if_mib: &if_mib
>>>     ... etc
>>>
>>> # Append to end of file
>>>
>>> if_mib_prod_v3:
>>>   <<: *if_mib
>>>   version: 3
>>>   timeout: 3s
>>>   retries: 3
>>>   auth:
>>>     security_level: authPriv
>>>     username: admin
>>>     auth_protocol: SHA
>>>     password: 
>>>     ... etc
>>>
>>> This effectively "clones" the if_mib module under a new module 
>>> "if_mib_prod_v3", and then overrides parts of it.
>>>
>>> On Tuesday 9 January 2024 at 10:04:57 UTC Awemnhd wrote:
>>>
 see 
 https://github.com/prometheus/snmp_exporter/tree/main/generator#file-format

 Tried various ways to achieve some parameter passing
 username:
 security_level:
 password: SHA
 auth_protocol: AES
 priv_protocol:
 priv_password:

 As a result, when the service is started, the default access method is 
 community: public_v2!

 Why is SNMP v3 so difficult to implement? Why are they all in SNMP V2 
 mode? Why?

>>>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.

[prometheus-users] Re: Blackbox_Exporter 0.24.0 - probe endpoint every 15s with 30s timeout?

2024-01-09 Thread Alexander Wilke
Hello,
it's only working partly, I think. If I add the same target several times to 
the same job, then Prometheus treats targets with the exact same naming as one.
This results in one target in Prometheus' web UI target list, and tcpdump 
confirms only one scrape per 60s.

  - targets:
- pfsense.oberndorf.ca:443# pfsense webui tcp tls test
- pfsense.oberndorf.ca:443# pfsense webui tcp tls test
- pfsense.oberndorf.ca:443# pfsense webui tcp tls test
- pfsense.oberndorf.ca:443# pfsense webui tcp tls test

If I use this, I have 4 different namings for the same target, which results 
in 4 scrapes. However, with this at most 4 permutations are possible I think, 
and with plain http only 2.

scheme: https
  - targets:
- pfsense.oberndorf.ca:443# pfsense webui tcp tls test
- https://pfsense.oberndorf.ca# pfsense webui tcp tls test
- https://pfsense.oberndorf.ca:443# pfsense webui tcp tls 
test
- pfsense.oberndorf.ca# pfsense webui tcp tls test


And at last, they do not spread as equally as I hoped, and in addition I now 
have 4 different instances.
Maybe I could fix this by relabeling the "instance" field, but this sounds 
as wrong as relabeling the "job".

[image: same_target_4times.JPG]
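
Following Brian's suggestion (quoted below) of a distinct label per copy, a 
sketch of what I have in mind - the module name, the probe_slot label and the 
exporter address are made up for the example:

  scrape_configs:
    - job_name: blackbox_api
      metrics_path: /probe
      params:
        module: [http_post_api]          # assumed blackbox module name
      scrape_interval: 60s
      scrape_timeout: 60s
      static_configs:
        - targets: ['https://pfsense.oberndorf.ca:443']
          labels: {probe_slot: 'a'}
        - targets: ['https://pfsense.oberndorf.ca:443']
          labels: {probe_slot: 'b'}
      relabel_configs:
        - source_labels: [__address__]
          target_label: __param_target
        - source_labels: [__param_target]
          target_label: instance
        - target_label: __address__
          replacement: 127.0.0.1:9115    # blackbox exporter address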


Back to your question:
"Does it really matter whether it was 20 seconds or 25 seconds?"

I don't know if this is relevant. It's a rare issue and I am in discussion 
with the vendor of the API/appliance. However, it could maybe give me some 
more indication if the API would respond after, let's say, 50s or 3 minutes.
If scrape_timeout is reached, the exporter sends a RST if I remember 
correctly, which is good to close the connections, but it will also close the 
connection to the API, and the API server maybe just writes "client closed 
connection" or something similar to the log.

I don't know if this is really a problem if the answers of two parallel 
probes overlap (timeout longer than the interval), because the connections 
use different source ports and Prometheus allows "out-of-order" ingestion if 
I remember correctly.
Perhaps it could lead to many unclosed connections which need memory. Let's 
say the interval is 1s and the timeout is 60s; then there could be 60 
connections in parallel.

Maybe a longer timeout than scrape_interval could be handled like this:

scrape_interval: 15s
scrape_timeout: 60s

If scrape_timeout is longer than scrape_interval, check whether the probe 
succeeded before scrape_timeout and do the next scrape according to 
scrape_interval.
If the scrape duration is longer than scrape_interval and shorter than 
scrape_timeout, skip the next scrape until the timeout is reached or the 
scrape succeeded.

However this would not allow parallel scrapes.


Probably this is a rare scenario and debugging an API with 
blackbox_exporter was only an idea. I just wanted to ask if I miss 
something :-)

Thanks for sharing your ideas.





Brian Candler wrote on Tuesday, 9 January 2024 at 15:45:42 UTC+1:

> (Thinks: maybe it's *not* necessary to apply distinct labels? This feels 
> wrong somehow, but I can't pinpoint exactly why it would be bad)
>
> On Tuesday 9 January 2024 at 14:43:51 UTC Brian Candler wrote:
>
>> Unfortunately, the timeout can't be longer than the scrape interval, 
>> firstly because this would require overlapping scrapes, and secondly the 
>> results could be returned out-of-order: e.g.
>>
>> xx:yy:00 scrape 1: takes 25 seconds, gives result at xx:yy:25
>> xx:yy:15 scrape 2: takes 5 seconds, gives result at xx:yy:20
>>
>> > If I run two blackbox_probes in parallel with scrape_interval: 30s and 
>> scrape_timeout: 30s this will work but both probes will start more or less 
>> at the same time.
>>
>> Actually I think you'll find they'd be evenly spread out over the scrape 
>> interval - try it.
>>
>> For example, make a single scrape job with a 60 second scrape interval, 
>> and list the same target 4 times - but make sure you apply some distinct 
>> label to each instance, so that they generate 4 separate timeseries.  You 
>> can then look at the raw timestamps in the database to check the actual 
>> scrape times: easiest way is by using the PromQL web interface and 
>> supplying a range vector query, like probe_success{instance="foo"}[5m].  
>> This has to be in table view, not graph view.  Don't mix any other targets 
>> into that scrape job, because they'll be spread together.
>>
>> Alternatively, KISS: use a 15 second scrape interval, and simply accept 
>> that "scrape failed" = "took longer than 15 seconds". Does it really matter 
>> whether it was 20 seconds or 25 seconds? Can you get that information from 
>> somewhere else if needed, e.g. web server logs?
>>

[prometheus-users] Re: prometheus 2.48.1 - web-config.yml - cipher_suites "unknown cipher"

2024-01-09 Thread Alexander Wilke
Hello Brian,
thank you for investigating.

I tried several ciphers a few days ago. Every time I cut more and more 
ciphers from the configuration, but it did not work - probably because they 
were ciphers which are insecure. However, for the first try I wanted to 
allow all of them, check that all exporters work, and then narrow it down.

As I can see you already opened a post here:
https://groups.google.com/g/golang-nuts/c/niIG6PaTXZg

I will proceed with these ciphers which should be secure:
  cipher_suites:
   - TLS_RSA_WITH_RC4_128_SHA      # 0x0005
   - TLS_RSA_WITH_3DES_EDE_CBC_SHA # 0x000a
   - TLS_RSA_WITH_AES_128_CBC_SHA  # 0x002f
   - TLS_RSA_WITH_AES_256_CBC_SHA  # 0x0035

However - if the default library allows insecure ciphers, then any default 
configuration lower than TLS 1.3 is "insecure", and this should be fixed.

Thanks again! I appreciate it!
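
For reference, a minimal web-config.yml sketch that should pass promtool, 
using suites from Go's secure list (the cert/key paths are placeholders):

  tls_server_config:
    cert_file: /etc/prometheus/prometheus.crt
    key_file: /etc/prometheus/prometheus.key
    min_version: TLS12
    cipher_suites:
      - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
      - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384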

Brian Candler wrote on Tuesday, 9 January 2024 at 22:57:52 UTC+1:

> Only the first cipher you listed is rejected.
>
> The code in exporter_toolkit just iterates over tls.CipherSuites():
>
> https://github.com/prometheus/exporter-toolkit/blob/v0.11.0/web/tls_config.go#L401-L407
>
> which you can replicate like this:
> https://go.dev/play/p/yFl-V5MrGHh
>
> It turns out that TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA exists, but 
> TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 does not.
>
> The one you want is instead listed in InsecureCipherSuites:
> https://go.dev/play/p/ey1z_wG4Ezw
>
> Why is the cipher with SHA(1) secure, but SHA256 insecure??! I have no 
> idea. Maybe worth asking on golang-nuts.
>
> On Tuesday 9 January 2024 at 10:04:21 UTC Alexander Wilke wrote:
>
>> Hello,
>> I am running Prometheus 2.48.1 and I have problems finding the correct 
>> syntax for "cipher_suites" in the web-config.yml file:
>>
>>
>> https://cs.opensource.google/go/go/+/refs/tags/go1.21.5:src/crypto/tls/cipher_suites.go;l=656
>> https://pkg.go.dev/crypto/tls#CipherSuitesi
>>
>> web-config.yml
>>
>>   cipher_suites:
>> - TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
>> - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
>> - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
>> - TLS_AES_128_GCM_SHA256
>> - TLS_AES_256_GCM_SHA384
>>
>> /opt/prometheus# ./promtool check web-config web-config.yml
>> web-config.yml FAILED: unknown cipher: 
>> TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
>>
>> If I remove the cipher_suites block, the configuration file works.
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/dde2a446-44e3-4fd4-b9e3-bcdbd7a92a06n%40googlegroups.com.


[prometheus-users] Blackbox_Exporter 0.24.0 - probe endpoint every 15s with 30s timeout?

2024-01-09 Thread Alexander Wilke
Hello,
I want to use the blackbox_exporter and the http prober to log in to an API.

My goal is to do the login every 15s, which could be:

xx:yy:00
xx:yy:15
xx:yy:30
xx:yy:45

I could solve this with scrape_interval: 15s.
But in addition I want to allow a scrape timeout of 30s, which is longer 
than the scrape_interval.

If I run two blackbox_probes in parallel with scrape_interval: 30s and 
scrape_timeout: 30s this will work but both probes will start more or less 
at the same time.

xx:yy:00
xx:yy:30

The idea behind that is:
In general the API response for the login is very fast. For whatever reason, 
sometimes it takes 30s or more. I do not want the probe to just fail after 
15s, but want to see and understand how long a login request takes.

If I abort a long-lasting request or do a parallel login, it may work very 
fast. So it is probably not a problem with the API in general, but with a 
specific user session or other unknown circumstances. So I want many scrape 
intervals, but the timeout sometimes needs to be higher, OR I need several 
blackbox probes which do not start at the same time but are spread equally.

Any ideas?
Is this possible with prometheus 2.48.1 and blackbox_exporter 0.24.0 ?



-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/ef7dead8-d26e-48de-af2e-28e2919410e3n%40googlegroups.com.


[prometheus-users] snmp_exporter 0.25.0 - IF-MIB and CISCO-IF-EXTENSION-MIB

2024-01-09 Thread Alexander Wilke
Hello,

I am using snmp-exporter to monitor CISCO IOS and IOS-XE devices.
However I have issues with merging the "IF-MIB" and 
"CISCO-IF-EXTENSION-MIB".

IF-MIB provides information with the following labels:
ifIndex
ifName
ifDescr
ifAlias


if I add the "CISCO-IF-EXTENSION-MIB" to the generator I get the results 
from the Cisco device but the metrics do not caontain the ifIndex, ifName, 
ifDesc, ifAlias information.

Unfortunately I do not know if this can be configured in the generator.yml 
file or not and in addition I do not really understand the lookup and 
override configuration.



This is the part of IF-MIB and CISCO-IF-EXTENSION-MIB in my generator.yml.
I get the metrics, but the Cisco MIB is missing the labels - relevant for me - 
that I have in the IF-MIB.

  if_mib_15s:
walk: 
[ifName,ifAlias,ifDescr,ifIndex,ifMtu,ifHighSpeed,ifAdminStatus,ifOperStatus,ifLastChange,ifConnectorPresent,ifHCInOctets,ifHCInUcastPkts,ifHCInMulticastPkts,ifHCInBroadcastPkts,ifHCOutOctets,ifHCOutUcastPkts,ifHCOutMulticastPkts,ifHCOutBroadcastPkts,ifInDiscards,ifOutDiscards,ifInErrors,ifOutErrors,ifInUnknownProtos]
lookups:
  - source_indexes: [ifIndex]
lookup: ifAlias
  - source_indexes: [ifIndex]
# Use OID to avoid conflict with PaloAlto PAN-COMMON-MIB.
lookup: 1.3.6.1.2.1.2.2.1.2 # ifDescr
  - source_indexes: [ifIndex]
# Use OID to avoid conflict with Netscaler NS-ROOT-MIB.
lookup: 1.3.6.1.2.1.31.1.1.1.1 # ifName
overrides:
  ifAlias:
ignore: true # Lookup metric
  ifDescr:
ignore: true # Lookup metric
  ifName:
ignore: true # Lookup metric
  ifType:
type: EnumAsInfo
max_repetitions: 50
timeout: 5s
retries: 3

  # CISCO-IF-EXTENSION-MIB
  ciscoIfExtension_15s:
walk: 
[cieIfIndex,cieInterfacesIndex,cieIfName,cieIfNameMappingEntry,cieIfNameMappingTable,cieIfInRuntsErrs,cieIfInGiantsErrs,cieIfInFramingErrs,cieIfInOverrunErrs,cieIfInIgnored,cieIfInputQueueDrops,cieIfOutputQueueDrops,cieIfStateChangeReason,cieIfOperStatusCause,cieIfOperStatusCauseDescr]
max_repetitions: 50
timeout: 5s
retries: 3
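
If I understand the lookups correctly, I would probably need something like 
this - an untested sketch, assuming the cieIf* tables are indexed by ifIndex 
so the same lookups as in the if_mib module attach ifName/ifAlias:

  ciscoIfExtension_15s:
    walk: [cieIfInRuntsErrs,cieIfInGiantsErrs,cieIfInputQueueDrops,cieIfOutputQueueDrops]
    lookups:
      - source_indexes: [ifIndex]
        lookup: ifAlias
      - source_indexes: [ifIndex]
        lookup: 1.3.6.1.2.1.31.1.1.1.1 # ifName
    max_repetitions: 50
    timeout: 5s
    retries: 3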



The second part of the question is:
In IF-MIB I can get the system uptime of "ifLastChange". It contains 
interface information (ifIndex, ifName, ...).

The other metric is "sysUpTime".

I want to generate a PromQL query in Grafana which shows me the time that has 
passed since the last change of the interface. So if I open my dashboard, I 
want to see that the interface's status changed 3 min ago. I do not want the 
information that the last change happened when the system was up for 
128d 18h 25min.

Any chance to calculate this, and if yes, can you provide the query?

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/71b3ab4b-8710-4a09-83aa-640715ca844cn%40googlegroups.com.


[prometheus-users] prometheus 2.48.1 - web-config.yml - cipher_suites "unknown cipher"

2024-01-09 Thread Alexander Wilke
Hello,
I am running Prometheus 2.48.1 and I have problems finding the correct 
syntax for "cipher_suites" in the web-config.yml file:

https://cs.opensource.google/go/go/+/refs/tags/go1.21.5:src/crypto/tls/cipher_suites.go;l=656
https://pkg.go.dev/crypto/tls#CipherSuitesi

web-config.yml

  cipher_suites:
- TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
- TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
- TLS_AES_128_GCM_SHA256
- TLS_AES_256_GCM_SHA384

/opt/prometheus# ./promtool check web-config web-config.yml
web-config.yml FAILED: unknown cipher: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256

If I remove the cipher_suites block, the configuration file works.

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/8453dfb7-6cd2-4780-a70b-4f19fe38341bn%40googlegroups.com.