Jira (PUP-1965) Clients are hung when server has intermittent service
Josh Cooper commented on PUP-1965 Re: Clients are hung when server has intermittent service
Given the changes in PUP-7517 and PUP-8683, and lack of feedback, I'm going to close this ticket. Please reopen if you see this issue when using a non-infinite http_read_timeout.
This message was sent by Atlassian JIRA (v7.7.1#77002-sha1:e75ca93) -- You received this message because you are subscribed to the Google Groups "Puppet Bugs" group. To unsubscribe from this group and stop receiving emails from it, send an email to puppet-bugs+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-bugs/JIRA.28890.1395087652000.152.1582677780651%40Atlassian.JIRA.
Jira (PUP-1965) Clients are hung when server has intermittent service
Josh Cooper updated an issue Puppet / PUP-1965 Clients are hung when server has intermittent service
Change By: Josh Cooper Team: Coremunity
Jira (PUP-1965) Clients are hung when server has intermittent service
Josh Cooper commented on PUP-1965 Re: Clients are hung when server has intermittent service
> We have runtimeout set to 1800s and it is ineffective.
Michael Marod, did you mean runtimeout is ineffective if http_read_timeout is left as the default infinite value? That would make sense since ruby is stuck in the call to @io.to_io.wait_readable(@read_timeout). Also Timeout.timeout is evil and shouldn't be used. Btw, we will be changing http_read_timeout and runtimeout to default to a non-infinite value in puppet 6, see PUP-8683.
Jira (PUP-1965) Clients are hung when server has intermittent service
Michael Marod commented on PUP-1965 Re: Clients are hung when server has intermittent service
Suffering from the same symptoms and am led to believe that setting http_read_timeout will resolve the problem. I hooked up gdb to one of our stuck puppet agents and grabbed a backtrace:
(gdb) call (void) rb_backtrace()
from /opt/puppetlabs/puppet/bin/puppet:5:in `'
from /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/command_line.rb:73:in `execute'
from /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/command_line.rb:137:in `run'
from /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/application.rb:375:in `run'
from /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util.rb:661:in `exit_on_fail'
from /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/application.rb:375:in `block in run'
from /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/application/agent.rb:352:in `run_command'
from /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/application/agent.rb:390:in `main'
from /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/daemon.rb:149:in `start'
from /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/daemon.rb:193:in `run_event_loop'
Jira (PUP-1965) Clients are hung when server has intermittent service
Josh Cooper assigned an issue to Sam McLeod Puppet / PUP-1965 Clients are hung when server has intermittent service
Change By: Josh Cooper Assignee: changed from jon yeargers to Sam McLeod
Jira (PUP-1965) Clients are hung when server has intermittent service
Josh Cooper commented on PUP-1965 Re: Clients are hung when server has intermittent service
Sam McLeod can you verify that setting http_read_timeout to a non-infinite value resolves the issue? If not please capture strace output when the agent is stuck.
Jira (PUP-1965) Clients are hung when server has intermittent service
Josh Cooper commented on PUP-1965 Re: Clients are hung when server has intermittent service
We've added a watchdog/supervisor to the agent (in PUP-7517), but it is not activated by default in 5.x. I recommend setting runtimeout to a non-infinite value to activate it. Also note you may want to change http_read_timeout to a non-infinite value. We will be changing the default value of both settings in 6.0 in PUP-8683.
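The general idea behind a run-time watchdog can be sketched in a few lines of Ruby. This is not the PUP-7517 implementation; `run_with_deadline` is a hypothetical helper, and the commands and limits are made-up values chosen only to show a supervisor killing a child that outlives its allowance.

```ruby
require 'rbconfig'

# Hypothetical helper: run a command and kill it if it outlives `limit` seconds.
# Returns :finished or :killed.
def run_with_deadline(cmd, limit)
  pid = Process.spawn(*cmd)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + limit
  loop do
    # WNOHANG makes wait return nil instead of blocking while the child runs.
    return :finished if Process.wait(pid, Process::WNOHANG)
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      Process.kill('KILL', pid)
      Process.wait(pid) # reap the child so it does not linger as a zombie
      return :killed
    end
    sleep 0.05
  end
end

stuck = [RbConfig.ruby, '-e', 'sleep 30'] # stands in for a hung agent run
quick = [RbConfig.ruby, '-e', 'exit 0']   # stands in for a healthy run
puts run_with_deadline(stuck, 0.5)
puts run_with_deadline(quick, 5)
```

The supervisor never depends on the child noticing that it is stuck, which is why a limit of this kind can help even when the child is blocked inside a read.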
Jira (PUP-1965) Clients are hung when server has intermittent service
R.I.Pienaar commented on PUP-1965 Re: Clients are hung when server has intermittent service
This can probably be closed after PUP-7517
Jira (PUP-1965) Clients are hung when server has intermittent service
Sam McLeod updated an issue Puppet / PUP-1965 Clients are hung when server has intermittent service
Change By: Sam McLeod Comment: We have this issue as well. We can reproduce it on Puppet Enterprise 2018.1 and CentOS 7.5 clients.
Jira (PUP-1965) Clients are hung when server has intermittent service
Sam McLeod commented on PUP-1965 Re: Clients are hung when server has intermittent service
We have this issue as well. We can reproduce it on Puppet Enterprise 2018.1 and CentOS 7.5 clients.
Jira (PUP-1965) Clients are hung when server has intermittent service
Moses Mendoza updated an issue Puppet / PUP-1965 Clients are hung when server has intermittent service
Change By: Moses Mendoza Labels: triaged
Jira (PUP-1965) Clients are hung when server has intermittent service
Maggie Dreyer updated an issue Puppet / PUP-1965 Clients are hung when server has intermittent service
Change By: Maggie Dreyer Labels: triaged
Jira (PUP-1965) Clients are hung when server has intermittent service
Mike Fitzgerald commented on an issue Re: Clients are hung when server has intermittent service
I have updated ruby on these clients to version 1.9.3 and the problem looks to no longer be occurring.
Puppet / PUP-1965 Clients are hung when server has intermittent service
Unrelated server in data center (same subnet) comes under DOS attack. Network service to other machines in subnet is slow, sparse and subject to failure during this time. All puppet clients that attempt to connect / update during this period are hung. Process list shows puppet agent service process and 'puppet agent: applying configuration' process. ...
Jira (PUP-1965) Clients are hung when server has intermittent service
Henrik Lindberg commented on an issue Re: Clients are hung when server has intermittent service
Hm, I believe the timeout is the parameter after the two NULLs in the second output you showed. It would be interesting to see that, since it would reveal whether it is the 2 min timeout or something else. Your Ruby 1.8.7 version is a bit out of date (on my machine I have p370 and p371, and I think there are even later patches). Are all your machines that hang Ubuntu Precise? Do you have any that run Ruby 1.9.3 and hang?
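Ruby's `IO.select` takes its arguments in the same order as the `select(5, [4], NULL, NULL` line in Mike's strace, so it gives a quick way to see what that trailing parameter does. A minimal sketch, assuming an idle local pipe in place of the agent's socket and an arbitrary 0.3 second timeout:

```ruby
# A pipe with nothing written to it stands in for the silent socket (fd 4 in the trace).
reader, _writer = IO.pipe

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
# Arguments mirror select(2): read set, write set, error set, then the timeout.
result = IO.select([reader], nil, nil, 0.3)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

# A nil result means the timeout expired with no descriptor ready.
puts "result: #{result.inspect}, waited about #{elapsed.round(1)}s"
# IO.select([reader], nil, nil, nil) would block until data arrives, i.e. forever here.
```

If the traced call shows NULL in that last position as well, the process is waiting with no timeout at all rather than with a 2 minute one.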
Jira (PUP-1965) Clients are hung when server has intermittent service
Mike Fitzgerald commented on an issue Re: Clients are hung when server has intermittent service
I don't currently have anything that is running with a newer version of ruby. I will update some hosts to 1.9.3 and see if I can reproduce the problem.
Jira (PUP-1965) Clients are hung when server has intermittent service
Henrik Lindberg commented on an issue Re: Clients are hung when server has intermittent service
What have you set the timeout to be? Currently if the agent loses the connection in a read, but never gets the packet that tells it about the packet loss, it will wait on the read until the timeout. If timeout is infinite then the agent hangs. In versions before 3.7, the timeout is also applied to the total time it takes to do a plugin-sync. Hence, the timeout must be set to a value longer than that, but to a shorter time than the time you are willing to wait until manually restarting the agent. (In 3.7, the timeout is only applied to connect and read.)
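The read-timeout behaviour described here can be reproduced with plain Net::HTTP on a modern Ruby. The sketch below is an illustration, not agent code: the silent local server, the port choice and the timeout values are all assumptions made for the demonstration.

```ruby
require 'net/http'
require 'socket'

# A local server that accepts connections but never answers, standing in for
# a master whose replies are lost on the network.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
held = []
Thread.new { loop { held << server.accept } }

# The third argument (nil) disables any proxy picked up from the environment.
http = Net::HTTP.new('127.0.0.1', port, nil)
http.open_timeout = 1    # seconds allowed for the TCP connect
http.read_timeout = 0.5  # seconds allowed for each read to produce data
# Newer Rubies retry idempotent requests once; turn that off where supported.
http.max_retries = 0 if http.respond_to?(:max_retries=)

outcome =
  begin
    http.get('/')
    :responded
  rescue Net::ReadTimeout
    :read_timeout
  end

puts outcome
# With http.read_timeout = nil the same request would wait indefinitely.
```

The connection itself succeeds, so only the read timeout stands between the client and an indefinite wait, which matches the hang described above.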
Jira (PUP-1965) Clients are hung when server has intermittent service
Henrik Lindberg updated an issue Puppet / PUP-1965 Clients are hung when server has intermittent service
Change By: Henrik Lindberg Assignee: jonyeargers
Jira (PUP-1965) Clients are hung when server has intermittent service
Mike Fitzgerald commented on an issue Re: Clients are hung when server has intermittent service
I don't have a timeout specified in my puppet agent or server configuration. I'm under the impression that it will default to 2 minutes if this is not set.
Jira (PUP-1965) Clients are hung when server has intermittent service
Henrik Lindberg commented on an issue Re: Clients are hung when server has intermittent service
Hm, that suggests that timeouts are not working; if they were, the agent should time out a read after 2 minutes of no activity and then fail. Maybe it manages to get a read through every 2 minutes when transferring a larger file, and thus takes a very long time to complete a file transfer. Do you have any / many larger files being transferred?
I am not 100% sure how the read_timeout in Ruby Net::HTTP translates to packet reads. If we assume this is at the Ethernet MTU level (i.e. ~1500 bytes), then the maximum transfer time for 1 MB (with a 2 min read timeout) would be something like 23 hours. It is unclear when reading the Ruby docs if the read_timeout applies per packet (as HTTP Session suggests), or to an entire read (the Net::HTTP docs simply state that it is the read_timeout without further explanation). It is also unclear if the timeout is per the underlying media packet size, or the size requested from the socket (consisting of potentially many media packets over Ethernet).
It is unclear if the Ruby implementation relies on the socket option SO_RCVTIMEO (which seems to not be supported on all platforms). It may be that we run into this: http://stackoverflow.com/questions/9853516/set-socket-timeout-in-ruby-via-so-rcvtimeo-socket-option (again, I am not sure if what is stated there is correct; if timeout does not work under Ruby 1.9, that seems pretty major, and is now perhaps fixed). It does raise the question of which version of Ruby is used on the agents, since we do see timeouts occurring in our test environments (we are using 1.9.3-p194 or later).
Unfortunately, until 3.7, the read_timeout was also applied to the overall transfer time of all plugins. That issue meant that plugins that took more than 2 minutes to complete in total failed. Thus if you try to set the timeout to a shorter value (to fail if the transfer is going to take a very long time) you may always fail in versions before 3.7 because there is not enough time to complete the plugin-sync.
I guess that was a long way of asking which Ruby version the hanging agents are using. More information about what the hanging agent is doing can be obtained by running strace. Look for the call to setsockopt to see what options have been set, and for what the process is hanging on: a recvfrom or ppoll or something like that. (It is not meaningful to send an strace for the entire agent run as it will potentially be huge.)
Jira (PUP-1965) Clients are hung when server has intermittent service
Henrik Lindberg commented on an issue Re: Clients are hung when server has intermittent service
With a bit of digging, it turns out that it likely reads in chunks of 16k, and thus a 1 MB transfer can take at most ~2 hours. It may be hard to distinguish that from a hung agent. It would be valuable to try to figure out if the agent is receiving at all, or if it is indeed hung, by tracing it at the system level.
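The two worst-case figures quoted in this thread (about 23 hours and about 2 hours) follow from the same arithmetic: every read is assumed to deliver one unit of data just before the read timeout expires. A small sketch, taking "1 Mb" to mean one mebibyte and using a hypothetical `worst_case_hours` helper:

```ruby
READ_TIMEOUT = 120      # seconds, the 2 minute default mentioned in the thread
TRANSFER = 1024 * 1024  # bytes, assuming "1 Mb" means one mebibyte

# Worst case: each read delivers one unit just before the timeout expires.
def worst_case_hours(total_bytes, unit_bytes, timeout_s)
  reads = (total_bytes.to_f / unit_bytes).ceil
  reads * timeout_s / 3600.0
end

mtu_hours   = worst_case_hours(TRANSFER, 1500, READ_TIMEOUT)      # one Ethernet-sized packet per read
chunk_hours = worst_case_hours(TRANSFER, 16 * 1024, READ_TIMEOUT) # one 16k chunk per read

puts format('1500-byte reads: %.1f h, 16k reads: %.1f h', mtu_hours, chunk_hours)
```

Either way the transfer never trips the timeout, so from the outside a very slow agent and a hung one look the same until the process is traced.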
Jira (PUP-1965) Clients are hung when server has intermittent service
Mike Fitzgerald commented on an issue Re: Clients are hung when server has intermittent service
Thank you for the detailed reply. I'm running ruby 1.8.7 (2011-06-30 patchlevel 352). The behaviour is occurring on puppet runs that have no updates to the configuration/catalog - so there are no large files being transferred.
Here is an example of an agent process, and a stuck 'applying configuration' process (running since yesterday):
root 2200 0.0 0.1 127444 49436 ? S Jul21 0:00 puppet agent: applying configuration
root 30667 0.0 0.1 114712 45568 ? Ss Jul18 0:09 /usr/bin/ruby1.8 /usr/local/bin/puppet agent
Attaching strace to process 30667 (puppet agent):
wait4(2200,
Attaching strace to process 2200 (sits here forever):
select(5, [4], NULL, NULL
If I attach strace to the puppet agent process (following any child process), and kill the stuck 'applying configuration' process, it will spawn another which runs through the puppet configuration successfully. Because of this, I am unable to reproduce the problem with strace attached.
Jira (PUP-1965) Clients are hung when server has intermittent service
Mike Fitzgerald commented on an issue Re: Clients are hung when server has intermittent service
I am experiencing this issue with Puppet 3.5.1. I have about 50 nodes on the end of an unreliable connection, and several of them get into this state each day.
Jira (PUP-1965) Clients are hung when server has intermittent service
Michael Bottoms commented on an issue Re: Clients are hung when server has intermittent service
I have ~250 machines in my environment running Ubuntu Precise and updating over a WAN link. We see this issue regularly when connection issues interrupt an update process. I tend to restart puppet processes on 10-15 servers a week.
Jira (PUP-1965) Clients are hung when server has intermittent service
jon created an issue
Puppet / PUP-1965 Clients are hung when server has intermittent service
Issue Type: Bug
Affects Versions: 3.4.0
Assignee: Unassigned
Created: 17/Mar/14 1:20 PM
Environment: Clients: debian 6.x (ARM) Server: CentOS 6.4 x64
Priority: Normal
Reporter: jon
Unrelated server in data center (same subnet) comes under DOS attack. Network service to other machines in subnet is slow, sparse and subject to failure during this time. All puppet clients that attempt to connect / update during this period are hung. Process list shows puppet agent service process and 'puppet agent: applying configuration' process.
Doing a 'kill -9' on the 'update' process without stopping the puppet agent service results in a new 'update' process being started. This process will hang also. If the agent process is stopped as well as the 'kill' on the update process then it is possible to run 'puppet agent --test' successfully. (note: problem at datacenter must be resolved first). Every client must be touched as a result.
NOTE: both client(s) and server are running open source build(s). Server is running apache2 / passenger to service puppet requests.