More precisely the problem is the following:

If I set period=1 for a "rate" sensor (network speed, NSD read/write speed, 
PDisk read/write speed), everything is correct, because every second the sensors 
get the values of the cumulative counters (and do not divide them by 1, which 
would change nothing for a 1-second period).
If I set period=2, the "rate" sensors collect the values from the 
cumulative counters every two seconds, but they do not divide those values by 2 
(because pmsensors do not actually divide; they seem to simply report what they 
read, which is understandable from a performance point of view); grafana then 
receives twice the real speed.
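
To make the arithmetic concrete, here is a minimal Python sketch of the effect described above. It is only a model of the behaviour I observe, not pmsensors code: the counter values, the 100 bytes/s rate and the helper names are all made up for illustration.

```python
def deltas(samples):
    """Differences between consecutive readings of a cumulative counter."""
    return [b - a for a, b in zip(samples, samples[1:])]


def rates(samples, period):
    """Per-second rates: each delta divided by the sampling period (seconds)."""
    return [d / period for d in deltas(samples)]


# Hypothetical cumulative byte counter growing at a steady 100 bytes/s,
# listed once per second for 8 seconds.
TRUE_RATE = 100
counter = [TRUE_RATE * t for t in range(9)]

every_1s = counter[::1]  # what a period=1 sensor would read
every_2s = counter[::2]  # what a period=2 sensor would read

print(deltas(every_1s))    # [100, 100, 100, 100, 100, 100, 100, 100]
print(deltas(every_2s))    # [200, 200, 200, 200]  (twice the real rate)
print(rates(every_2s, 2))  # [100.0, 100.0, 100.0, 100.0]
```

With period=1 the raw delta already equals the rate; with period=2 the raw delta is double the rate until it is divided by the period.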

I have to correct myself: the point here is not how sampling/downsampling is 
done by grafana/grafana-bridge/whatever, as I wrongly wrote in my first email.
The point is: if I collect data every N seconds (because I do not want to 
overload the pmcollector node), how can I divide (in grafana) the collected 
data by N to get the real average speed over that N-second interval?

At the moment it seems that the only option is using N=1, which is bad because, 
as I stated, it overloads the collector when many nodes run many pmsensors...
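
For what it is worth, outside grafana the division itself is easy if the data points carry timestamps, because N can be read from the spacing between points. The sketch below assumes the points arrive as (timestamp in seconds, per-interval delta) pairs; that input shape and the function name `per_second` are my assumptions, not something grafana-bridge is known to provide.

```python
def per_second(points):
    """Convert per-interval deltas into per-second rates.

    `points` is a time-sorted list of (timestamp_seconds, delta) pairs.
    Each delta is divided by the gap to the previous timestamp, so the
    sampling period N does not have to be known in advance.
    """
    out = []
    for (t_prev, _), (t, value) in zip(points, points[1:]):
        interval = t - t_prev
        if interval > 0:  # skip duplicate or out-of-order timestamps
            out.append((t, value / interval))
    return out


# Hypothetical data collected with period=2: 200 bytes per 2-second interval.
collected = [(0, 200), (2, 200), (4, 400)]
print(per_second(collected))  # [(2, 100.0), (4, 200.0)]
```

The first point is dropped because there is no earlier timestamp to measure its interval against.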

  A

________________________________
From: [email protected] 
[[email protected]] on behalf of IBM Spectrum Scale 
[[email protected]]
Sent: Friday, July 27, 2018 8:27 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] How Zimon/Grafana-bridge process data


Hi,
as similar questions come up quite often, we have just published an article 
on the topic on the Spectrum Scale Wiki:
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/Downsampling%2C%20Upsampling%20and%20Aggregation%20of%20the%20performance%20data

While there will be some minor updates to the article in the near future, it 
might already answer your questions.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale 
(GPFS), then please post it to the public IBM developerWorks Forum at 
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.

If your query concerns a potential software error in Spectrum Scale (GPFS) and 
you have an IBM software maintenance contract please contact 1-800-237-5511 in 
the United States or your local IBM Service Center in other countries.

The forum is informally monitored as time permits and should not be used for 
priority messages to the Spectrum Scale (GPFS) team.


From: "Dorigo Alvise (PSI)" <[email protected]>
To: "[email protected]" <[email protected]>
Date: 13.07.2018 12:08
Subject: [gpfsug-discuss] How Zimon/Grafana-bridge process data
Sent by: [email protected]

________________________________



Hi,
I've a GL2 cluster based on gpfs 4.2.3-6, with 1 support node and 2 IO/NSD 
nodes.

I've the following perfmon configuration for the metric-group GPFSNSDDisk:

{
name = "GPFSNSDDisk"
period = 2
restrict = "nsdNodes"
},

which, as far as I know, sends data to the collector every 2 seconds (correct?). 
But how? Does it send what it reads from the counter every two seconds? Or 
does it aggregate it in some way? Or something else?

On the collector node, pmcollector, grafana-bridge and grafana-server are running.

Now I need to understand how to play with the grafana parameters:
- Down sample (or Disable downsampling)
- Aggregator (following on the same row the metrics).

See attached picture 4s.png as reference.

In the past I had the period set to 1, and grafana used to display correct data 
(bytes/s for the metric gpfs_nsdds_bytes_written) with the aggregator set to 
"sum", which AFAIK means "sum all the metrics that match the filter below" 
(again, see the attached picture for how the filter is set to collect data only 
from the IO nodes).

Today I changed to "period=2"... and grafana started to display odd data 
rates (double or quadruple the real rate).

I had to play (almost randomly) with "Aggregator" (from sum to avg, which as 
far as I understand doesn't mean anything in my case... the average between the 
two IO nodes? Or what?) and "Down sample" (from empty to 2s, and then to 4s) to 
get back a real data rate that is consistent with what I get from dstat.
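
To show what I think the two knobs are supposed to do, here is a small Python sketch of a generic OpenTSDB-style model: "Aggregator" combines several series point by point, while "Down sample" reduces one series over time buckets. Whether grafana-bridge behaves exactly like this is an assumption on my side, and the node data and function names are invented.

```python
from collections import defaultdict


def avg(values):
    return sum(values) / len(values)


def aggregate(series_list, fn):
    """Combine several series point by point (identical timestamps assumed)."""
    return [(pts[0][0], fn([v for _, v in pts])) for pts in zip(*series_list)]


def downsample(points, bucket, fn):
    """Group (timestamp, value) points into `bucket`-second bins, reduce with fn."""
    groups = defaultdict(list)
    for t, v in points:
        groups[t - t % bucket].append(v)
    return [(t, fn(vs)) for t, vs in sorted(groups.items())]


# Hypothetical per-second write rates (bytes/s) from the two IO nodes.
node_a = [(0, 100), (1, 100), (2, 300), (3, 300)]
node_b = [(0, 50), (1, 50), (2, 50), (3, 50)]

total = aggregate([node_a, node_b], sum)  # cluster-wide rate
mean = aggregate([node_a, node_b], avg)   # average rate per node
total_2s = downsample(total, 2, avg)      # cluster-wide rate, 2-second bins

print(total)     # [(0, 150), (1, 150), (2, 350), (3, 350)]
print(mean)      # [(0, 75.0), (1, 75.0), (2, 175.0), (3, 175.0)]
print(total_2s)  # [(0, 150.0), (2, 350.0)]
```

In this model, switching the aggregator from sum to avg halves the displayed value for two nodes, which could mask a doubling caused by period=2 without actually correcting it.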

Can someone kindly explain how to set these parameters when a zimon 
sensor's period is changed?

Many thanks in advance
Regards,

Alvise Dorigo
[attachment "4s.png" deleted by Manfred Haubrich/Germany/IBM]
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


