Issue #5139 has been updated by Daniel Drake.

We are running puppet in 50 primary schools (and counting) in Nicaragua for the 
OLPC project. We are seeing various schools stop updating due to a puppetdlock 
file of 0 bytes, which results in the behaviour described here. We don't really 
know how/why this happens.

This could be due to filesystem corruption related to the bad electricity 
conditions that we work under, but I'm a little doubtful, as we have yet to 
find damage to other areas of the filesystem.

Anyway, I agree with the comment above about separating the puppet lock from 
the admin-defined lock. That way, if puppet finds its own lock file at 0 bytes, 
it could delete it and continue.
----------------------------------------
Bug #5139: puppetdlock file can be empty
https://projects.puppetlabs.com/issues/5139#change-62431

Author: Alan Barrett
Status: Needs More Information
Priority: High
Assignee: Andrew Forgue
Category: agent
Target version: 2.7.x
Affected Puppet version: 
Keywords: mcollective enabledisable
Branch: 


There seems to be something wrong with the way the $statedir/puppetdlock file 
is created.  Under normal circumstances, when puppetd is running, the file 
contains the PID of the running puppetd process.  For example:

<pre>
$ cat /var/puppet/state/puppetdlock
26898 [no newline at end of file]
</pre>

If puppetd crashes or is killed, then the file may be empty:

<pre>
$ cat /var/puppet/state/puppetdlock
[empty]
$ ls -l /var/puppet/state/puppetdlock
-rw-r--r--   1 root     root           0 Oct 28 14:32 /var/puppet/state/puppetdlock
</pre>

This causes future puppetd runs to fail with "notice: Run of Puppet 
configuration client already in progress; skipping".

I suspect that the problem is that the lock file is created in such a way that 
a crash between creating the file and writing to the file leaves the file 
empty.  A safe technique is to write to a temporary file and then rename the 
temporary file, so that the actual lock file either does not exist, or exists 
with correct contents, but never exists with partial contents.
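
As a rough illustration of the write-then-rename technique, here is a minimal 
Ruby sketch. The helper name write_lock_atomically and the temporary file 
naming are made up for this example and are not taken from Puppet's code.

```ruby
# Hypothetical helper (not Puppet's actual implementation): write the PID to a
# temporary file in the same directory, then rename it over the lock path.
# rename(2) is atomic within one filesystem, so a reader sees either no lock
# file or a complete one, never an empty or partially written one.
def write_lock_atomically(lock_path, pid = Process.pid)
  tmp_path = "#{lock_path}.#{pid}.tmp"   # assumed naming scheme
  File.open(tmp_path, File::WRONLY | File::CREAT | File::EXCL, 0o644) do |f|
    f.write(pid.to_s)
    f.fsync                              # push the contents to disk first
  end
  File.rename(tmp_path, lock_path)
ensure
  # Clean up the temporary file if we failed before the rename.
  File.delete(tmp_path) if tmp_path && File.exist?(tmp_path)
end
```

Note that rename replaces any existing lock file, so this sketch only covers 
the "never partial" property. Deciding whether another run already holds the 
lock has to happen separately; File.link, which fails if the target exists, 
would be one way to combine both steps.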

Also, the code that complains about "client already in progress" could be 
smarter; it should read the PID from the lock file and verify that the process 
is actually running.
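
A check along those lines might look like the following Ruby sketch. The 
helper name lock_held? is hypothetical, and treating an empty or unparsable 
file as "not held" is an assumption of this example, not existing Puppet 
behaviour.

```ruby
# Hypothetical helper: returns true only if the lock file names a process
# that currently exists.
def lock_held?(lock_path)
  pid = Integer(File.read(lock_path).strip, 10)
  return false unless pid > 0
  Process.kill(0, pid)   # signal 0 delivers nothing; it only tests existence
  true
rescue Errno::ENOENT, ArgumentError
  false                  # no lock file, or empty/garbled contents
rescue Errno::ESRCH
  false                  # PID recorded, but no such process: stale lock
rescue Errno::EPERM
  true                   # process exists but belongs to another user
end
```

A PID can be reused by an unrelated process after a reboot or a long uptime, 
so this check can still report a stale lock as held; it just makes the empty 
file and dead process cases recoverable.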

If you need sample code, then see NetBSD's shlock(1) utility 
(http://cvsweb.netbsd.org/bsdweb.cgi/usr.bin/shlock/), which is derived from 
the code that was in HoneyDanBer UUCP.

