Issue #14108 has been updated by Daniel Pittman.

Status changed from Unreviewed to Investigating

Hey.  I don't think Puppet calls poll() directly anywhere; that call seems
to come from deep inside the Ruby interpreter.  I don't know why it is
failing.  I will take a look to see if I can find a more direct cause, but I
can't say for sure that we can actually do anything about this.
----------------------------------------
Bug #14108: puppet agent stuck in poll(), getting POLLERR
https://projects.puppetlabs.com/issues/14108#change-61060

Author: John Vestrum
Status: Investigating
Priority: Normal
Assignee: 
Category: 
Target version: 
Affected Puppet version: 2.7.13
Keywords: 
Branch: 


On a couple of my nodes (running Debian 2.6.32-41squeeze2), the puppet agent 
appears to start normally and "puppet agent --test" completes fine. However, 
within a couple of hours or less, the agent is running at 99% CPU, and when I 
strace it I see this:

<pre>
poll([{fd=7, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 ([{fd=7, revents=POLLIN|POLLERR|POLLHUP}])
poll([{fd=7, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 ([{fd=7, revents=POLLIN|POLLERR|POLLHUP}])
poll([{fd=7, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 ([{fd=7, revents=POLLIN|POLLERR|POLLHUP}])
poll([{fd=7, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 ([{fd=7, revents=POLLIN|POLLERR|POLLHUP}])
... etc, 1000's per second ...
</pre>

/proc/pid/fd shows that fd=7 (and it is _always_ fd=7 when this happens) is 
pointing to a socket:

<pre>
# ls -l /proc/18026/fd
total 0
lr-x------ 1 root root 64 Apr 20 08:30 0 -> /dev/null
l-wx------ 1 root root 64 Apr 20 08:30 1 -> /dev/null
l-wx------ 1 root root 64 Apr 20 08:30 2 -> /dev/null
lr-x------ 1 root root 64 Apr 20 08:30 3 -> pipe:[3687384]
l-wx------ 1 root root 64 Apr 20 08:30 4 -> pipe:[3687384]
lrwx------ 1 root root 64 Apr 20 08:30 5 -> socket:[3687393]
lrwx------ 1 root root 64 Apr 20 08:30 7 -> socket:[3689060]
lr-x------ 1 root root 64 Apr 20 08:30 8 -> /dev/urandom
</pre>

But neither lsof nor netstat can identify that socket:

<pre>
# lsof -p 18026 | grep 3689060
puppet  18026 root    7u  sock                0,6      0t0  3689060 can't identify protocol
     
# netstat -lanep | grep 3689060
#
</pre>
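
The manual lookup above (read the /proc/pid/fd links, then search for the socket inode) could be scripted. A rough sketch in Python, assuming a Linux /proc filesystem; `socket_inode` and `socket_fds` are hypothetical helper names made up for this illustration and are not part of Puppet:

```python
import os
import re

# Link targets for sockets under /proc/<pid>/fd look like "socket:[3689060]".
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inode(link_target):
    """Return the inode number from a socket link target, or None otherwise."""
    match = SOCKET_LINK.match(link_target)
    return int(match.group(1)) if match else None


def socket_fds(pid="self"):
    """Map fd number -> socket inode for one process (needs Linux /proc)."""
    fd_dir = "/proc/{}/fd".format(pid)
    result = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the fd was closed between listdir() and readlink()
        inode = socket_inode(target)
        if inode is not None:
            result[int(name)] = inode
    return result
```

Calling `socket_fds(18026)` as root should return one entry per socket descriptor, and the inode values can then be passed to grep against lsof or netstat output as shown above.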

So it appears that poll() is returning POLLERR, but puppet continues to try 
to use the fd. I think that is a bug, though I'm not really a socket 
programmer. I'm still trying to figure out why this happens on only a couple 
of my nodes and not on others. Upgrading the client from 2.6.2 to 2.7.13 did 
not change the behaviour.
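
The spinning poll() calls in the strace output are what one would expect from an event loop that keeps polling a descriptor after poll() has flagged it with POLLERR or POLLHUP: the call returns immediately every time. The following is a minimal sketch in Python, not Puppet or Ruby code, of a loop that closes the descriptor instead. The names `drain` and `max_wakeups` are hypothetical and exist only for this illustration.

```python
import select
import socket


def drain(sock, max_wakeups=100):
    """Read from one socket until EOF or error; return (data, wakeups)."""
    poller = select.poll()
    # Only POLLIN is requested; POLLERR, POLLHUP and POLLNVAL are always
    # reported by poll() whether or not they are asked for.
    poller.register(sock, select.POLLIN)
    chunks = []
    wakeups = 0
    while wakeups < max_wakeups:
        events = poller.poll(1000)  # timeout in milliseconds
        if not events:
            break  # nothing happened within the timeout
        wakeups += 1
        _, revents = events[0]
        if revents & select.POLLIN:
            try:
                data = sock.recv(4096)
            except OSError:
                data = b""  # pending socket error: treat it like EOF
            if data:
                chunks.append(data)
                continue  # more data may follow, poll again
        # Reached on EOF, POLLERR, POLLHUP or POLLNVAL: the descriptor will
        # never go quiet again, so stop polling it and close it.
        poller.unregister(sock)
        sock.close()
        break
    return b"".join(chunks), wakeups
```

Without the unregister and close step, every later poll() on the dead descriptor would return at once, which would match the 99% CPU symptom. Whether this is what actually happens inside the Ruby interpreter is not confirmed here.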


