Issue #5139 has been updated by Daniel Drake.
We are running puppet in 50 primary schools (and counting) in Nicaragua for the OLPC project. We are seeing various schools stop updating due to a puppetdlock file of 0 bytes, which results in the behaviour described here.

We don't really know how or why this happens. It could be due to filesystem corruption related to the bad electricity conditions that we work under, but I'm a little doubtful, as we have yet to find damage to other areas of the filesystem.

Anyway, I agree with the comment above about separating the puppet lock from the admin-defined lock. That way, if puppet finds that the puppet lock is 0 bytes, it could delete it and continue.

----------------------------------------
Bug #5139: puppetdlock file can be empty
https://projects.puppetlabs.com/issues/5139#change-62431

Author: Alan Barrett
Status: Needs More Information
Priority: High
Assignee: Andrew Forgue
Category: agent
Target version: 2.7.x
Affected Puppet version:
Keywords: mcollective enabledisable
Branch:

There seems to be something wrong with the way the $statedir/puppetdlock file is created.

Under normal circumstances, when puppetd is running, the file contains the PID of the running puppetd process. For example:

<pre>
$ cat /var/puppet/state/puppetdlock
26898 [no newline at end of file]
</pre>

If puppetd crashes or is killed, then the file may be empty:

<pre>
$ cat /var/puppet/state/puppetdlock
[empty]
$ ls -l /var/puppet/state/puppetdlock
-rw-r--r-- 1 root root 0 Oct 28 14:32 /var/puppet/state/puppetdlock
</pre>

This causes future puppetd runs to fail with "notice: Run of Puppet configuration client already in progress; skipping".

I suspect that the problem is that the lock file is created in such a way that a crash between creating the file and writing to the file leaves the file empty. A safe technique is to write to a temporary file and then rename the temporary file, so that the actual lock file either does not exist, or exists with correct contents, but never exists with partial contents.

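The write-then-rename technique described above can be sketched roughly as follows. This is an illustration in Python, not Puppet's actual code (Puppet itself is written in Ruby); the function name `write_lockfile` and the temporary-file prefix are made up for the example. It only addresses the partial-contents problem: the rename replaces any existing file, so a real implementation would still need to check for an existing lock first.

```python
import os
import tempfile


def write_lockfile(path, pid=None):
    """Publish `path` containing a PID so that it never exists half-written.

    Hypothetical helper for illustration; not part of Puppet.
    """
    pid = os.getpid() if pid is None else pid
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory as the lock file:
    # a rename is only atomic within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(prefix=".puppetdlock.", dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(str(pid))      # PID only, no trailing newline
            tmp.flush()
            os.fsync(tmp.fileno())   # push the contents to disk before renaming
        # Atomic on POSIX: the lock file appears with its full contents.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave the temporary file behind if anything failed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A crash before `os.replace` leaves at most a stray temporary file, never an empty lock file. Note that `mkstemp` creates the file with mode 0600, so the permissions differ from the `-rw-r--r--` shown in the listing above.
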
Also, the code that complains about "client already in progress" could be smarter; it should read the PID from the lock file and verify that the process is actually running. If you need sample code, then see NetBSD's shlock(1) utility (http://cvsweb.netbsd.org/bsdweb.cgi/usr.bin/shlock/), which is derived from the code that was in HoneyDanBer UUCP.
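The smarter check suggested in the bug description (read the PID from the lock file and verify that the process is running) might look something like the sketch below. It is written in Python for illustration and is not taken from Puppet or from shlock; `process_is_running` and `lock_is_stale` are hypothetical names. It assumes a POSIX system, treats an empty or malformed lock file as stale, and does not guard against the PID having been reused by an unrelated process.

```python
import os


def process_is_running(pid):
    """Return True if a process with this PID currently exists (POSIX)."""
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check, nothing is sent
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # process exists but is owned by another user
    return True


def lock_is_stale(path):
    """Return True if the lock file is empty, malformed, or names a dead process."""
    with open(path) as f:
        content = f.read().strip()
    try:
        pid = int(content)
    except ValueError:
        return True      # covers the 0-byte lock file from this report
    if pid <= 0:
        return True      # never probe PID 0 or negative values (process groups)
    return not process_is_running(pid)
```

A caller could delete the lock file and continue when `lock_is_stale` returns True, and only print "already in progress" when the recorded process is really alive.
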
