Hi Niko, thanks for your prompt reply. Responses inline.

Niko Tyni wrote:
> Hi,
>
> some clarifications:
>
> - Do you have any alerts enabled in the Targets section?

Yes. I cut the Targets section for brevity, but the alerts are set up as follows on both servers:

*** Targets ***

probe = FPing

menu = Top
title = Network Latency Grapher
remark = Welcome to SmokePing

+ Server B
menu = Server B
title = Server B

++ core
menu = Core
title = Core
alerts = bigloss,someloss,startloss

+++ router1
menu = router1
title = router1
rawlog=%Y-%m-%d
host = <IP Address>

and so on...

> - Is the above quote from server A or server B? If from A, please include
> it from server B too. (Server A is not interesting here; it's working
> 'well enough' and is an ancient version.)

It was from Server B but, as I pointed out, the only difference in the base config is the addition of the concurrentprobes line on Server B.

> - When server B stops logging, does the smokeping daemon die or is it
> just doing nothing? Does it recover when the unresponsible devices
> come back?

It's still running but doing nothing. As soon as the unresponsive device recovers, it starts logging data again.

> There are two problems here: the parameters should be tuned so that
> you never get the 'smokeping took ... seconds' message, even when
> the targets are down, but obviously Smokeping should recover from it.

I thought that too. Our step times are only 60 seconds on both systems, so with 127 targets on Server A it's probably no surprise that we get a lot of log messages to this effect. However, the time to check the 11 targets on Server B when one was not responding was sitting fairly consistently at 130 seconds, which seems unreasonably long.

> I don't quite understand why the messages show up in the first place
> with the FPing parameters you have, but I'll look into that.
>
> The best help would be the output of 'smokeping --debug-daemon' at
> outage time, but of course I realize that might be a bit hard to get
> given the verboseness of it even when everything is OK.
>
> The output of 'smokeping --debug' when everything works would be good
> to have too.

OK, catching it while it's failing will indeed be difficult, but if the opportunity presents itself I will try. Do you mind if I send you the debug output off-list? It's large and would need fairly extensive editing for confidentiality reasons.

Thanks again,
Craig
