Based on the errors and your configuration, I assume you're using a single instance of the FPing probe? A quick and easy fix, network capacity permitting, is to use multiple probe instances and split your targets among them. That way, instead of one probe trying to poll >1000 devices in 5 minutes, each probe only polls, say, 200 in 5 minutes, and the 5 batches of 200 run concurrently.
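To make the arithmetic concrete, here is a rough Python sketch of dealing a target list out into near-equal batches, one batch per probe instance. The function name `split_targets` and the host names are made up for illustration; this is not part of Smokeping, and you would still have to assign each batch to its probe in the Smokeping config by hand.

```python
def split_targets(targets, n_probes):
    """Deal targets round-robin into n_probes batches of near-equal size."""
    if n_probes < 1:
        raise ValueError("n_probes must be at least 1")
    # Slice i takes every n_probes-th target starting at offset i.
    return [targets[i::n_probes] for i in range(n_probes)]


# Hypothetical host names standing in for the 1057 targets in the report.
targets = [f"host{i:04d}" for i in range(1057)]

batches = split_targets(targets, 5)
batch_sizes = [len(batch) for batch in batches]
print(batch_sizes)
```

With 1057 targets and 5 probes, each batch ends up with 211 or 212 targets, so no single probe has to get through the whole list within one 300-second step.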
Details on how to set this up are in the Smokeping docs and in previous threads on this mailing list.

- Pete

On 03/12/2013 10:45 AM, Christoph Schwarz wrote:
> Hello List,
>
> I experienced some interrupts in all our graphs at the same time, which is
> very unusual and tends to be a general problem of smokeping or the server
> it is running on.
>
> The log file (/var/log/messages) logs:
>
> ...
> Mar 12 13:57:52 monitoring smokeping[31332]: FPing: WARNING: smokeping took
> 301 seconds to complete 1 round of polling. It should complete polling in 300
> seconds. You may have unresponsive devices in your setup.
> Mar 12 14:07:51 monitoring smokeping[31332]: FPing: NOTE: smokeping took 300
> seconds to complete 1 round of polling. This is over 80%% of the max time
> available for a polling cycle (300 seconds).
> Mar 12 14:17:51 monitoring smokeping[31332]: FPing: NOTE: smokeping took 300
> seconds to complete 1 round of polling. This is over 80%% of the max time
> available for a polling cycle (300 seconds).
> Mar 12 14:27:51 monitoring smokeping[31332]: FPing: NOTE: smokeping took 300
> seconds to complete 1 round of polling. This is over 80%% of the max time
> available for a polling cycle (300 seconds).
> Mar 12 14:37:52 monitoring smokeping[31332]: FPing: WARNING: smokeping took
> 301 seconds to complete 1 round of polling. It should complete polling in 300
> seconds. You may have unresponsive devices in your setup.
> Mar 12 14:47:50 monitoring smokeping[31332]: FPing: NOTE: smokeping took 299
> seconds to complete 1 round of polling. This is over 80%% of the max time
> available for a polling cycle (300 seconds).
> Mar 12 14:52:50 monitoring smokeping[31332]: FPing: NOTE: smokeping took 299
> seconds to complete 1 round of polling. This is over 80%% of the max time
> available for a polling cycle (300 seconds).
> Mar 12 14:57:51 monitoring smokeping[31332]: FPing: NOTE: smokeping took 300
> seconds to complete 1 round of polling. This is over 80%% of the max time
> available for a polling cycle (300 seconds).
> Mar 12 15:07:52 monitoring smokeping[31332]: FPing: WARNING: smokeping took
> 301 seconds to complete 1 round of polling. It should complete polling in 300
> seconds. You may have unresponsive devices in your setup.
> ...
>
> Environment:
> CPU: 2x Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
> Mem.: 2048M
>
> Debian Squeeze (6.0.7)
> Linux 2.6.32-5-amd64 #1 SMP Mon Jan 16 16:22:28 UTC 2012 x86_64 GNU/Linux
> Smokeping Version: 2.003006
> Probes: 1057
>
> I tried to increase the step from 300 to 400 but I got an error because the
> old RRDs have a step of 300.
> After removing all unresponsive devices I still got those log messages. Do I
> need to increase the step and how can I achieve that (I still need the
> graphs)?
>
> Thanks.
>
> Regards
> Christoph
>
> _______________________________________________
> smokeping-users mailing list
> [email protected]
> https://lists.oetiker.ch/cgi-bin/listinfo/smokeping-users
