Issue #14108 has been reported by John Vestrum.
----------------------------------------
Bug #14108: puppet agent stuck in poll(), getting POLLERR
https://projects.puppetlabs.com/issues/14108
Author: John Vestrum
Status: Unreviewed
Priority: Normal
Assignee:
Category:
Target version:
Affected Puppet version: 2.7.13
Keywords:
Branch:
On a couple of my nodes (running Debian squeeze, kernel package
2.6.32-41squeeze2), the puppet agent appears to start normally and
"puppet agent --test" completes fine. However, within a couple of hours or
less, the agent is running at 99% CPU, and when I strace it I see this:
<pre>
poll([{fd=7, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 ([{fd=7, revents=POLLIN|POLLERR|POLLHUP}])
poll([{fd=7, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 ([{fd=7, revents=POLLIN|POLLERR|POLLHUP}])
poll([{fd=7, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 ([{fd=7, revents=POLLIN|POLLERR|POLLHUP}])
poll([{fd=7, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 ([{fd=7, revents=POLLIN|POLLERR|POLLHUP}])
... etc, 1000's per second ...
</pre>
/proc/pid/fd shows that fd=7 (and it is _always_ fd=7 when this happens) is
pointing to a socket:
<pre>
# ls -l /proc/18026/fd
total 0
lr-x------ 1 root root 64 Apr 20 08:30 0 -> /dev/null
l-wx------ 1 root root 64 Apr 20 08:30 1 -> /dev/null
l-wx------ 1 root root 64 Apr 20 08:30 2 -> /dev/null
lr-x------ 1 root root 64 Apr 20 08:30 3 -> pipe:[3687384]
l-wx------ 1 root root 64 Apr 20 08:30 4 -> pipe:[3687384]
lrwx------ 1 root root 64 Apr 20 08:30 5 -> socket:[3687393]
lrwx------ 1 root root 64 Apr 20 08:30 7 -> socket:[3689060]
lr-x------ 1 root root 64 Apr 20 08:30 8 -> /dev/urandom
</pre>
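For anyone who wants to script the same check across several nodes, here is a rough sketch that reads /proc/<pid>/fd and reports which descriptors are sockets, together with their inode numbers. It is Linux-only, and the function name `socket_fds` is made up for illustration; it is not part of puppet or any tool mentioned here.

```python
import os


def socket_fds(pid):
    """Return {fd: inode} for every socket the process has open.

    Hypothetical helper: relies on the Linux /proc layout, where each
    entry in /proc/<pid>/fd is a symlink such as 'socket:[3689060]'.
    """
    fd_dir = "/proc/%d/fd" % pid
    result = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The fd was closed between listdir() and readlink().
            continue
        if target.startswith("socket:[") and target.endswith("]"):
            result[int(name)] = int(target[len("socket:["):-1])
    return result


if __name__ == "__main__":
    # Inspect the current process; pass another pid (as root) to inspect
    # a different process such as the agent.
    for fd, inode in sorted(socket_fds(os.getpid()).items()):
        print("fd %d -> socket inode %d" % (fd, inode))
```

The inode number it returns is the same number shown in the `socket:[...]` link target, so it can be grepped for in lsof or netstat output.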
But that socket doesn't seem to exist:
<pre>
# lsof -p 18026 | grep 3689060
puppet 18026 root 7u sock 0,6 0t0 3689060 can't identify protocol
# netstat -lanep | grep 3689060
#
</pre>
So it appears that poll() is returning POLLERR but puppet continues to attempt
to use the fd instead of closing it. I think that is a bug, but I'm not really
a socket programmer. I'm still trying to figure out why this happens on only a
couple of my nodes and not others. I upgraded the client from 2.6.2 to 2.7.13
and see the same behaviour.
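To illustrate the suspected failure mode, here is a minimal sketch in Python. It is not puppet's code (puppet is Ruby, and I have not traced where the agent calls poll). It only shows that a loop which polls a dead socket and never acts on POLLERR/POLLHUP wakes up immediately every time, which would match the 99% CPU. The helper names are made up, and a closed socketpair peer stands in for whatever killed fd 7; in this setup the kernel reports POLLHUP rather than POLLERR, but the loop behaves the same way. Note that poll() reports POLLERR and POLLHUP whether or not they were requested in `events`.

```python
import select
import socket

# Conditions after which the descriptor will never become usable again.
ERROR_MASK = select.POLLERR | select.POLLHUP | select.POLLNVAL


def make_dead_socket():
    """Return one end of a socketpair whose peer is already closed."""
    a, b = socket.socketpair()
    b.close()
    return a


def count_wakeups_ignoring_errors(sock, iterations=1000, timeout_ms=1000):
    """Buggy pattern: poll again without reading or closing the fd."""
    poller = select.poll()
    poller.register(sock, select.POLLIN | select.POLLPRI)
    wakeups = 0
    for _ in range(iterations):
        # Returns at once on a hung-up socket, so this loop never sleeps.
        if poller.poll(timeout_ms):
            wakeups += 1
    return wakeups


def poll_until_closed(sock, max_wakeups=1000, timeout_ms=1000):
    """Safer pattern: treat EOF or an error event as terminal."""
    poller = select.poll()
    poller.register(sock, select.POLLIN | select.POLLPRI)
    wakeups = 0
    while wakeups < max_wakeups:
        events = poller.poll(timeout_ms)
        if not events:
            break  # timed out, nothing pending
        wakeups += 1
        _fd, revents = events[0]
        eof = False
        if revents & select.POLLIN:
            # An empty read on a stream socket means the peer is gone.
            eof = sock.recv(4096) == b""
        if eof or revents & ERROR_MASK:
            # Unregister before closing, then stop polling this fd.
            poller.unregister(sock)
            sock.close()
            break
    return wakeups


if __name__ == "__main__":
    s = make_dead_socket()
    print("ignoring errors:", count_wakeups_ignoring_errors(s), "wakeups")
    print("handling errors:", poll_until_closed(s), "wakeups")
```

The first function wakes up on every iteration without ever blocking, while the second stops after the first wakeup because it closes the descriptor as soon as it sees EOF or an error event.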