Issue #11360 has been updated by Josh Cooper.
Hi Jo,

The strace output from this report is identical to the hung agent in [https://projects.puppetlabs.com/issues/10418#note-1](https://projects.puppetlabs.com/issues/10418#note-1). The agent is in a select loop, waiting on two file descriptors. I understand that #10418 was redhat specific, but the fact that these are identical is a bit uncanny.

Can you try setting `listen=false` on at least one system that is hung and let us know if the problem goes away?

----------------------------------------
Bug #11360: puppet client hangs after period of being unable to contact server
https://projects.puppetlabs.com/issues/11360

Author: Jo Rhett
Status: Re-opened
Priority: Normal
Assignee:
Category: agent
Target version:
Affected Puppet version: 2.6.12
Keywords:
Branch:

We had some serious memory/swap issues with the puppet master today. I spent a few hours getting that worked out, and upgrading to passenger 3.0.11. After clearing up the issues we found that some 50 systems weren't up to date. Logging into these systems I found a puppetdlock file which was about 4.5 hours old and a running puppetd which was looping doing nothing.

<pre>
[03:30 root@ald002 ~]$ ls -la /var/lib/puppet/state
total 164
drwxr-xr-t 3 root root 4096 Dec 12 22:45 .
drwxr-xr-x 10 puppet puppet 4096 Oct 24 18:28 ..
drwxr-xr-x 2 root root 4096 Sep 23 05:03 graphs
-rw-rw---- 1 root root 1448 Dec 12 22:15 last_run_report.yaml
-rw-rw---- 1 root root 38 Dec 12 22:15 last_run_summary.yaml
-rw-r--r-- 1 root root 4 Dec 12 22:45 puppetdlock
-rw-rw---- 1 root root 109285 Dec 12 22:15 state.yaml
</pre>

Here's an example log:

<pre>
Dec 12 21:12:27 ald002 puppet-agent[6945]: Could not retrieve catalog from remote server: Connection refused - connect(2)
Dec 12 21:12:27 ald002 puppet-agent[6945]: Using cached catalog
Dec 12 21:12:27 ald002 puppet-agent[6945]: Could not retrieve catalog; skipping run
Dec 12 21:12:27 ald002 puppet-agent[6945]: Could not send report: Connection refused - connect(2)
Dec 12 21:42:30 ald002 puppet-agent[6945]: Could not retrieve catalog from remote server: Connection refused - connect(2)
Dec 12 21:42:30 ald002 puppet-agent[6945]: Using cached catalog
Dec 12 21:42:30 ald002 puppet-agent[6945]: Could not retrieve catalog; skipping run
Dec 12 21:42:30 ald002 puppet-agent[6945]: Could not send report: Connection refused - connect(2)
Dec 12 22:15:57 ald002 puppet-agent[6945]: Could not run Puppet configuration client: execution expired
</pre>

<pre>
[03:32 root@ald002 ~]$ strace -p 6945
Process 6945 attached - interrupt to quit
select(11, [7 9], [], [], {1, 107000}) = 0 (Timeout)
select(11, [7 9], [], [], {0, 0}) = 0 (Timeout)
select(11, [9], [], [], {0, 0}) = 0 (Timeout)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
select(11, [7 9], [], [], {1, 999999}) = 0 (Timeout)
select(11, [7 9], [], [], {0, 0}) = 0 (Timeout)
select(11, [9], [], [], {0, 0}) = 0 (Timeout)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
select(11, [7 9], [], [], {1, 999999}) = 0 (Timeout)
select(11, [7 9], [], [], {0, 602}) = 0 (Timeout)
select(11, [7 9], [], [], {0, 0}) = 0 (Timeout)
select(11, [9], [], [], {0, 0}) = 0 (Timeout)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
select(11, [7 9], [], [], {1, 999997}) = 0 (Timeout)
select(11, [7 9], [], [], {0, 566}) = 0 (Timeout)
select(11, [7 9], [], [], {0, 0}) = 0 (Timeout)
select(11, [9], [], [], {0, 0}) = 0 (Timeout)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
select(11, [7 9], [], [], {1, 999998}) = 0 (Timeout)
select(11, [7 9], [], [], {0, 0}) = 0 (Timeout)
select(11, [9], [], [], {0, 0}) = 0 (Timeout)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
select(11, [7 9], [], [], {1, 999999} <unfinished ...>
Process 6945 detached
</pre>
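
The report describes finding hung agents by noticing a puppetdlock file that was about 4.5 hours old. A check along the following lines might help spot other affected systems. This is only a minimal sketch, not part of Puppet: the lock path is the one shown in the `ls` output above, while the two-hour threshold and the helper names `lock_age_seconds` and `is_stale` are assumptions made for illustration.

```python
import os
import time

# Path taken from the report's ls output; adjust if your vardir differs.
LOCK_PATH = "/var/lib/puppet/state/puppetdlock"
# Assumed threshold for illustration only: treat locks older than 2 hours as stale.
MAX_AGE_SECONDS = 2 * 60 * 60


def lock_age_seconds(path, now=None):
    """Return the lock file's age in seconds, or None if the file does not exist."""
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return None
    if now is None:
        now = time.time()
    # Clamp to zero in case of small clock skew.
    return max(0.0, now - mtime)


def is_stale(path, max_age_seconds=MAX_AGE_SECONDS, now=None):
    """Return True if the lock file exists and is older than max_age_seconds."""
    age = lock_age_seconds(path, now=now)
    return age is not None and age > max_age_seconds


if __name__ == "__main__":
    age = lock_age_seconds(LOCK_PATH)
    if age is None:
        print("no lock file at %s" % LOCK_PATH)
    elif is_stale(LOCK_PATH):
        print("stale lock: %.1f hours old" % (age / 3600.0))
    else:
        print("lock is %.1f minutes old" % (age / 60.0))
```

An old lock file alone does not prove the agent is hung; it would still need confirming with something like the `strace -p` check shown in the report.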
