Issue #1095 has been updated by Ricky Zhou.

Status changed from Closed to Re-opened
Affected version changed from 0.22.1 to 0.25.0beta2
Hi, I am currently testing 0.25beta2 and I am also seeing this issue (specifically when stored configs are enabled). I have ruby-mysql installed and I don't see any warnings about using the built-in mysql functions, so that doesn't seem to be the problem in this case. Specifically, I am seeing the same symptom: after starting puppetmasterd (I use mongrel behind apache/mod_proxy_balancer) and attempting a puppet run, the puppet run times out, with the connection stuck in CLOSE_WAIT. The mongrel instance then stops handling requests (so something like curl http://127.0.0.1:18141/ will hang after this happens).

The output of puppetd -t I get (after a long wait) is:

<pre>
notice: Ignoring --listen on onetime run
/usr/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill': execution expired (Timeout::Error)
        from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
        from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
        from /usr/lib/ruby/1.8/net/http.rb:2020:in `read_status_line'
        from /usr/lib/ruby/1.8/net/http.rb:2009:in `read_new'
        from /usr/lib/ruby/1.8/net/http.rb:1050:in `request'
        from /usr/lib/ruby/1.8/net/http.rb:1037:in `request'
        from /usr/lib/ruby/1.8/net/http.rb:543:in `start'
        from /usr/lib/ruby/1.8/net/http.rb:1035:in `request'
         ... 18 levels...
        from /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:226:in `send'
        from /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:226:in `run_command'
        from /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:217:in `run'
        from /usr/sbin/puppetd:159
</pre>

and netstat -pan | grep 1814 gives:

<pre>
tcp   0   0 127.0.0.1:18140   0.0.0.0:*           LISTEN        23420/ruby
tcp   0   0 127.0.0.1:18141   0.0.0.0:*           LISTEN        23472/ruby
tcp   0   0 127.0.0.1:18142   0.0.0.0:*           LISTEN        23524/ruby
tcp   0   0 127.0.0.1:18143   0.0.0.0:*           LISTEN        23576/ruby
tcp   0   0 127.0.0.1:18140   127.0.0.1:46870     ESTABLISHED   23420/ruby
tcp   0   0 127.0.0.1:47823   127.0.0.1:18141     FIN_WAIT2     -
tcp   0   0 127.0.0.1:46870   127.0.0.1:18140     ESTABLISHED   6050/httpd
tcp   1   0 127.0.0.1:18141   127.0.0.1:47823     CLOSE_WAIT    23472/ruby
tcp   1   0 127.0.0.1:18141   127.0.0.1:47829     CLOSE_WAIT    23472/ruby
</pre>

I did one more test where I pointed mod_proxy_balancer at only one mongrel instance, then ran strace -p on it and tried a puppet run. The strace output I got from that is available at http://ricky.fedorapeople.org/puppetmaster.strace. If I can help by providing any more debugging information, I'm ricky in #puppet on Freenode - feel free to ping me any time.

----------------------------------------
Bug #1095: Puppetmaster leaving half-open connections
http://projects.reductivelabs.com/issues/1095

Author: Frank Sweetser
Status: Re-opened
Priority: High
Assigned to: Luke Kanies
Category: network
Target version: 0.24.8
Complexity: Medium
Patch: None
Affected version: 0.25.0beta2
Keywords:

After a period of time ranging from a few hours to several days, puppetmaster begins leaving half-open TCP connections in a CLOSE_WAIT state. It usually seems to happen to connections from clients, though at least once I've seen it hit the database connection (MySQL).
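For reference, the client-side Timeout::Error in the trace above (raised from rbuf_fill) is what Net::HTTP produces when a server accepts the TCP connection but never sends a response. A minimal sketch reproducing just that symptom in isolation — this is illustrative only, not Puppet's code; the port and timeout are arbitrary:

```ruby
require 'net/http'
require 'socket'

# Hypothetical stand-in for a wedged mongrel: accept the connection,
# then never read the request or send a response.
server  = TCPServer.new('127.0.0.1', 0)   # 0 = pick any free port
port    = server.addr[1]
handler = Thread.new { server.accept; sleep }

http = Net::HTTP.new('127.0.0.1', port)
http.read_timeout = 1   # fail fast instead of the long default wait

begin
  http.get('/')
  result = :responded
rescue Timeout::Error, IOError
  # Ruby 1.8 raises Timeout::Error here, as in the trace above;
  # newer Rubies raise Net::ReadTimeout instead.
  result = :timed_out
end

handler.kill
```

The client hangs in read_status_line until read_timeout expires, matching the backtrace in the report.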
Here's an example:

<pre>
[r...@lorien ~]# lsof -i | grep 8140
puppetd   13420   root    7u IPv4 48150014 TCP lorien.wpi.edu:52225->lorien.wpi.edu:8140 (ESTABLISHED)
puppetmas 13744 puppet   10u IPv4 47981997 TCP *:8140 (LISTEN)
puppetmas 13744 puppet  205u IPv4 48146861 TCP lorien.wpi.edu:8140->DELENN.WPI.EDU:63688 (CLOSE_WAIT)
puppetmas 13744 puppet  206u IPv4 48145681 TCP lorien.wpi.edu:8140->IVANOVA.WPI.EDU:54630 (CLOSE_WAIT)
puppetmas 13744 puppet  208u IPv4 48146636 TCP lorien.wpi.edu:8140->DELENN.WPI.EDU:63687 (CLOSE_WAIT)
puppetmas 13744 puppet  210u IPv4 48146848 TCP lorien.wpi.edu:8140->IVANOVA.WPI.EDU:58605 (CLOSE_WAIT)
</pre>

Once puppetmaster starts leaking sockets like this, it seems unable to answer any new requests. In this example, you can see that the puppet client on the local machine (lorien) has opened a connection to puppetmaster, but puppetmaster has not responded. None of the log files on either master or client show that any progress has been made.

Sending a HUP to the server generates "Restarting" and "Shutting down" messages in syslog, but it never restarts. lsof shows that there are puppetmaster processes hanging around keeping the original set of half-open sockets open, but nothing is listening for new connections anymore:

<pre>
[r...@lorien ~]# lsof -i | grep 8140
puppetmas 13744 puppet  205u IPv4 48146861 TCP lorien.wpi.edu:8140->DELENN.WPI.EDU:63688 (CLOSE_WAIT)
puppetmas 13744 puppet  206u IPv4 48145681 TCP lorien.wpi.edu:8140->IVANOVA.WPI.EDU:54630 (CLOSE_WAIT)
puppetmas 13744 puppet  208u IPv4 48146636 TCP lorien.wpi.edu:8140->DELENN.WPI.EDU:63687 (CLOSE_WAIT)
puppetmas 13744 puppet  210u IPv4 48146848 TCP lorien.wpi.edu:8140->IVANOVA.WPI.EDU:58605 (CLOSE_WAIT)
</pre>

A full restart of puppetmaster appears to be the only way to get things flowing again. This is on 0.24.1 plus the patch from ticket 959. Let me know what other debugging info you'd like me to gather up.
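For context on the lsof output: CLOSE_WAIT means the remote peer has already sent its FIN, and the local process simply never called close() on its own descriptor. A minimal sketch of that leak pattern — an assumed mechanism for illustration, not Puppet's actual code:

```ruby
require 'socket'

# Illustrative leak pattern (not Puppet's actual code): the client closes,
# the kernel receives the FIN, but the server process never calls close()
# on its descriptor, so the socket lingers in CLOSE_WAIT.
server = TCPServer.new('127.0.0.1', 0)
client = TCPSocket.new('127.0.0.1', server.addr[1])
conn   = server.accept

client.close   # peer sends FIN; the server's side enters CLOSE_WAIT
# A handler that returns here without calling conn.close leaks the fd.

eof_seen   = conn.eof?     # the half-close is visible as EOF on read...
still_open = !conn.closed? # ...yet the process still holds the descriptor
```

Until the process calls conn.close (or exits), netstat/lsof will show that socket in CLOSE_WAIT, matching the listings above; a pile-up of them suggests a handler path that returns without closing its client socket.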
--
You have received this notification because you have either subscribed to it, or are involved in it. To change your notification preferences, please click here: http://reductivelabs.com/redmine/my/account

You received this message because you are subscribed to the Google Groups "Puppet Bugs" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/puppet-bugs?hl=en
