Hi,
In my original message I mentioned several other timeouts, which may have
misled other readers into thinking that it's safe to change all of them.
It isn't.
I've had the best results when I changed only worker_timeout in
ProcessManager.pm, setting it at least 10 seconds lower than the global
timeout. With that change, a node that times out shows up in the logs like
this:
2016/02/06 10:53:42 [FATAL] Socket read timed out to server. Terminating process. at /usr/share/perl5/Munin/Master/UpdateWorker.pm line 254.
2016/02/06 10:53:42 [ERROR] Munin::Master::UpdateWorker<badnode> died with '[FATAL] Socket read timed out to server. Terminating process. at /usr/share/perl5/Munin/Master/UpdateWorker.pm line 254.
'
2016/02/06 10:53:50 [INFO] Remaining workers: badnode
2016/02/06 10:53:50 [INFO] Reaping Munin::Master::UpdateWorker<badnode>. Exit value/signal: 18/0
2016/02/06 10:53:51 [INFO]: Munin-update finished (49.94 sec)
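For anyone who wants to sanity-check their own numbers, here is a minimal Python sketch of the rule above. It is not part of Munin: the names worker_timeout_for, finished_seconds and MIN_MARGIN are made up for illustration, and the only things taken from this message are the "10+ seconds" margin and the format of the "Munin-update finished" log line.

```python
import re

# Hypothetical constant: the "10+ seconds" margin described in this message.
MIN_MARGIN = 10

def worker_timeout_for(global_timeout, margin=MIN_MARGIN):
    """Return a worker timeout that is `margin` seconds below the global timeout."""
    if margin < MIN_MARGIN:
        raise ValueError("margin should be at least %d seconds" % MIN_MARGIN)
    worker_timeout = global_timeout - margin
    if worker_timeout <= 0:
        raise ValueError("global timeout is too small for this margin")
    return worker_timeout

# Matches the "Munin-update finished (49.94 sec)" line from the log above.
FINISHED_RE = re.compile(r"Munin-update finished \((\d+(?:\.\d+)?) sec\)")

def finished_seconds(log_line):
    """Return the run duration in seconds from a log line, or None if absent."""
    match = FINISHED_RE.search(log_line)
    return float(match.group(1)) if match else None
```

The first helper only does the subtraction and refuses margins under 10 seconds; the second pulls the run time out of a log line, so you can compare it against whatever global timeout you actually use.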
When I messed with the other variables, there were nasty side effects, such
as thousands of repeated lines in the logs during downtimes, which
effectively amounts to a DoS on the log disk. So just don't do that.(TM)
--
2. That which causes joy or happiness.