Bug#786997: [Packaging] Bug#786997: munin-update doesn't work reliably with 4 min update intervals because of hardcoded timeouts

2015-06-01 Thread Holger Levsen
control: tags -1 + upstream

Hi Josip,

On Wednesday, 27 May 2015, Josip Rodin wrote:
> This feels like #464880 all over again :) You can easily change the cron
> jobs and it generally works. However, as soon as any one of your nodes
> croaks, everything basically grinds to a halt way too easily.
[...]
> So you *think* you set a timeout smaller than a minute, but actually the
> socket read function waits for much longer than that.
>
> Please make these configurable just like the other timeout is. TIA.

Thanks for the bug report and analysis! Upstream is reading the BTS.


cheers,
Holger






Bug#786997: munin-update doesn't work reliably with 4 min update intervals because of hardcoded timeouts

2015-05-27 Thread Josip Rodin
Package: munin
Version: 2.0.25-1

Hi,

This feels like #464880 all over again :) You can easily change the cron
jobs and it generally works. However, as soon as any one of your nodes
croaks, everything basically grinds to a halt way too easily.

...
2015/05/27 14:47:53 [INFO] Remaining workers: backends;foo5n
2015/05/27 14:47:53 [INFO] Reaping Munin::Master::UpdateWorkerbackends;foo5p.  Exit value/signal: 0/0
2015/05/27 14:48:01 [FATAL ERROR] Lock already exists: /var/run/munin/munin-update.lock. Dying.
2015/05/27 14:48:01  at /usr/share/perl5/Munin/Master/Update.pm line 128.
2015/05/27 14:48:03 [INFO] Remaining workers: backends;foo5n
2015/05/27 14:48:13 [INFO] Remaining workers: backends;foo5n
2015/05/27 14:48:23 [INFO] Remaining workers: backends;foo5n
2015/05/27 14:48:33 [INFO] Remaining workers: backends;foo5n
2015/05/27 14:48:43 [INFO] Remaining workers: backends;foo5n
2015/05/27 14:48:53 [INFO] Remaining workers: backends;foo5n
2015/05/27 14:49:02 [FATAL ERROR] Lock already exists: /var/run/munin/munin-update.lock. Dying.
2015/05/27 14:49:02  at /usr/share/perl5/Munin/Master/Update.pm line 128.
2015/05/27 14:49:03 [INFO] Remaining workers: backends;foo5n
2015/05/27 14:49:10 [FATAL] Socket read from foo5n failed.  Terminating process. at /usr/share/perl5/Munin/Master/UpdateWorker.pm line 254.
2015/05/27 14:49:10 [ERROR] Munin::Master::UpdateWorkerbackends;foo5n died with '[FATAL] Socket read from foo5n failed.  Terminating process. at /usr/share/perl5/Munin/Master/UpdateWorker.pm line 254.
'
2015/05/27 14:49:13 [INFO] Remaining workers: backends;foo5n
2015/05/27 14:49:13 [INFO] Reaping Munin::Master::UpdateWorkerbackends;foo5n.  Exit value/signal: 18/0
2015/05/27 14:49:14 [INFO]: Munin-update finished (132.65 sec)
...
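
For anyone reading the excerpt above, here is a rough Python sketch (not part of Munin, which is Perl) that pulls the timestamps out of lines in this format, counts the "Lock already exists" failures and measures the time the excerpt covers. The sample lines are copied from the log above; the function names parse_line and summarize are made up for illustration.

```python
from datetime import datetime

# A few lines copied from the log excerpt (wrapped lines joined).
SAMPLE = """\
2015/05/27 14:47:53 [INFO] Remaining workers: backends;foo5n
2015/05/27 14:48:01 [FATAL ERROR] Lock already exists: /var/run/munin/munin-update.lock. Dying.
2015/05/27 14:49:02 [FATAL ERROR] Lock already exists: /var/run/munin/munin-update.lock. Dying.
2015/05/27 14:49:10 [FATAL] Socket read from foo5n failed.  Terminating process.
2015/05/27 14:49:14 [INFO]: Munin-update finished (132.65 sec)
"""


def parse_line(line):
    """Split one log line into (timestamp, message)."""
    # The timestamp "YYYY/MM/DD HH:MM:SS" is always the first 19 characters.
    stamp, message = line[:19], line[20:]
    return datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S"), message


def summarize(text):
    """Return (number of lock failures, seconds between first and last line)."""
    entries = [parse_line(line) for line in text.splitlines() if line.strip()]
    lock_failures = [t for t, msg in entries if "Lock already exists" in msg]
    span = (entries[-1][0] - entries[0][0]).total_seconds()
    return len(lock_failures), span


if __name__ == "__main__":
    failures, span = summarize(SAMPLE)
    print(f"{failures} lock failures within {span:.0f} seconds")
```

On these sample lines it reports two lock failures within 81 seconds, i.e. two later runs gave up on the lock while the run waiting on foo5n was still going.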

The problem is that there are several hardcoded timeout variables,
*independent* of the value you can pass via munin-update --timeout=N:

/usr/share/perl5/Munin/Master/Node.pm:io_timeout => 120,
/usr/share/perl5/Munin/Master/ProcessManager.pm:worker_timeout  => 180,
/usr/share/perl5/Munin/Master/ProcessManager.pm:timeout => 240,

So you *think* you set a timeout smaller than a minute, but actually the
socket read function waits for much longer than that.
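
To make the mismatch concrete, here is a small sketch in Python (again, not Munin's Perl and not its API). The three numbers are the hardcoded values quoted above. The function names and the clamp option are hypothetical and only illustrate what "configurable" could mean: capping every internal timeout at the value the user asked for.

```python
# Hardcoded values quoted in this report (Node.pm and ProcessManager.pm).
HARDCODED = {"io_timeout": 120, "worker_timeout": 180, "timeout": 240}


def effective_timeouts(requested, hardcoded=HARDCODED, clamp=False):
    """Return the timeouts (in seconds) that would actually apply.

    clamp=False models the behaviour described in the report: the
    hardcoded values apply regardless of the requested timeout.
    clamp=True models a hypothetical fix: each value is capped at the
    requested timeout.
    """
    if not clamp:
        return dict(hardcoded)
    return {name: min(value, requested) for name, value in hardcoded.items()}


def fits_in_interval(timeouts, interval_seconds):
    """True if even the longest timeout expires before the next scheduled run."""
    return max(timeouts.values()) < interval_seconds


if __name__ == "__main__":
    interval = 4 * 60  # 4 minute update interval, in seconds
    for clamp in (False, True):
        applied = effective_timeouts(50, clamp=clamp)
        print(clamp, applied, fits_in_interval(applied, interval))
```

With a requested timeout of 50 seconds and a 4 minute interval, the unclamped case does not fit (the longest timeout is 240 seconds, as long as the interval itself), while the clamped case does.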

Please make these configurable just like the other timeout is. TIA.

-- 
 2. That which causes joy or happiness.

