Issue #2888 has been updated by Jo Rhett.
Hm. I thought I posted a long update here yesterday, but it seems to have gotten lost. Here's some state information from a locked node:

1. The puppetdlock file's mtime is usually 30 minutes later than the previous run
2. Nothing was logged in /var/log/messages
3. The previous run was aborted because puppet was already running (likely 'puppet agent --test' by hand)

<pre>
[root@q1-m1 ~]# sudo puppet agent --test
notice: Ignoring --listen on onetime run
notice: Run of Puppet configuration client already in progress; skipping
[root@q1-m1 ~]# ls -la /var/lib/puppet/state
total 160
drwxr-xr-t  3 root   root     4096 Jun 18 21:13 .
drwxr-xr-x 10 puppet puppet   4096 Jun 18 17:11 ..
drwxr-xr-x  2 root   root     4096 Jun 18 17:09 graphs
-rw-r--r--  1 root   root   105644 Jun 18 20:43 last_run_report.yaml
-rw-r--r--  1 root   root      607 Jun 18 20:43 last_run_summary.yaml
-rw-r--r--  1 root   root        4 Jun 18 21:13 puppetdlock
-rw-r--r--  1 root   root     3318 Jun 18 20:43 resources.txt
-rw-rw----  1 root   root    26840 Jun 18 20:43 state.yaml
[root@q1-m1 ~]# date
Tue Jun 19 17:33:17 UTC 2012
[root@q1-m1 ~]# cat /var/lib/puppet/date
cat: /var/lib/puppet/date: No such file or directory
[root@q1-m1 ~]# cat /var/lib/puppet/state/puppetdlock
5602[root@q1-m1 ~]# grep 5602 /var/log/messages
Jun 18 20:43:11 q1-m1 puppet-agent[5602]: Reopening log files
Jun 18 20:43:11 q1-m1 puppet-agent[5602]: Starting Puppet client version 2.7.14
Jun 18 20:43:11 q1-m1 puppet-agent[5602]: Run of Puppet configuration client already in progress; skipping
[root@q1-m1 ~]#
</pre>

So the logic here says that the way to reproduce this is to run "puppet agent --test" by hand just seconds before a normal puppet run.
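The check done by hand above (read the PID out of puppetdlock, then look for that process) can be scripted. The following is a minimal sketch, not anything shipped with Puppet: `lock_is_stale` is a hypothetical helper name, and it assumes the lock file contains nothing but a PID, as the `cat` output above suggests. It also assumes it runs as root, since `kill -0` reports failure for a live process owned by another user.

```shell
# Hypothetical helper, not part of Puppet.
# Exit status 0: the lock file exists but its PID is not a live process.
# Exit status 1: there is no lock file, or the recorded PID is still alive.
lock_is_stale() {
    lockfile=$1

    # No lock file means there is nothing stale to report.
    [ -f "$lockfile" ] || return 1

    # Keep only the digits; the file has no trailing newline.
    pid=$(tr -cd '0-9' < "$lockfile")

    # A lock with no PID in it cannot belong to a running agent.
    [ -n "$pid" ] || return 0

    # kill -0 delivers no signal; it only tests whether the PID exists.
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}
```

On an affected node this might be called as `lock_is_stale /var/lib/puppet/state/puppetdlock && echo "stale lock"`. It only reports; whether it is safe to delete the file is a separate decision, because a PID can be reused by an unrelated process.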
----------------------------------------
Bug #2888: puppetd doesn't always cleanup lockfile properly
https://projects.puppetlabs.com/issues/2888#change-65375

Author: Peter Meier
Status: Accepted
Priority: Normal
Assignee: Andrew Parker
Category: plumbing
Target version:
Affected Puppet version: 0.25.1
Keywords:
Branch:

OK, I have had the patch #2661 running for some weeks now and have had nearly no problems since. However, from time to time (maybe once or twice a week) a random client doesn't remove its lockfile (@/var/lib/puppet/state/puppetdlock@), hence future runs fail.

I assume this might still happen due to an uncaught exception (as in #2261); however, the problem a) is hard or nearly impossible to reproduce and b) occurs at random.

The only thing I can see in the logs:

<pre>
Nov 30 19:27:41 foobar puppetd[26228]: Finished catalog run in 98.79 seconds
Nov 30 20:00:02 foobar puppetd[3000]: Could not retrieve catalog from remote server: Error 502 on SERVER: <html>^M
<head><title>502 Bad Gateway</title></head>^M
<body bgcolor="white">^M
<center><h1>502 Bad Gateway</h1></center>^M
<hr><center>nginx/0.6.39</center>^M
</body>^M
</html>^M
Nov 30 20:00:03 foobar puppetd[3000]: Using cached catalog
Nov 30 20:00:03 foobar puppetd[3000]: Could not retrieve catalog; skipping run
Nov 30 20:00:04 foobar puppetd[12169]: Run of Puppet configuration client already in progress; skipping
Nov 30 20:30:04 foobar puppetd[21230]: Run of Puppet configuration client already in progress; skipping
</pre>

As I run puppetd by cron twice an hour with --splay, I assume that the run between 19:30 and 20:00 got delayed until 20:00. At that time (20:00) a puppetmaster restart happens, and that is why the 502 occurred. This was the run of pid 3000. The next run (pid 12169) failed, either because pid 3000 was still running or because no puppetd was running anymore and the lock file hadn't been removed. However, every future run failed as well, as the lockfile wasn't removed.
So somehow puppet doesn't remove lockfiles properly under certain conditions.

PS: If you think it's better to reopen the old bug report, close this one as a duplicate and re-open #2261.
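To illustrate the failure mode suspected in the report (an error path that exits without removing the lock), here is a minimal sketch of a lock wrapper that cleans up on every normal or failing exit. It is not Puppet's implementation: `run_with_lock` and the exit status 3 are hypothetical choices made for this example only.

```shell
# Hypothetical wrapper, not Puppet's code: run a command under a PID lock
# file and remove the lock whether the command succeeds or fails.
# The body is a subshell, so the trap and shell options stay local to it.
run_with_lock() (
    lockfile=$1
    shift

    # With noclobber set, the redirection fails if the file already exists,
    # so creating the lock and testing for it happen in one step.
    set -C
    # Note: inside a subshell, $$ is still the PID of the calling shell.
    if ! echo "$$" 2>/dev/null > "$lockfile"; then
        echo "run already in progress; skipping" >&2
        exit 3
    fi
    set +C

    # The EXIT trap fires on success and on failure alike, which is the
    # step an uncaught error path would otherwise skip.
    trap 'rm -f "$lockfile"' EXIT

    "$@"
    status=$?
    exit "$status"
)
```

A call such as `run_with_lock /tmp/example.lock some-command` leaves no lock behind when `some-command` fails. A SIGKILL or a host crash would still leave the file in place, so a cleanup hook like this narrows the problem but does not rule out stale locks.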
