One addendum - the box is absolutely not io or cpu bound:

Cpu(s): 83.0%us, 13.1%sy, 0.0%ni, 2.5%id, 0.0%wa, 0.1%hi, 1.3%si, 0.0%st
(64bit kvm vm w/ 6 3.5ghz amd64 cpus, on an lvm partition - raw disk -
with 5G ram, but only using 3 gigs. PLENTY of power, and monitoring
supports that..)

On Wed, Dec 15, 2010 at 1:35 PM, Disconnect <dc.disconn...@gmail.com> wrote:

> "me too". All the logs show nice quick compilations but the actual wall
> clock to get anything done is HUGE.
>
> Dec 15 13:10:29 puppet puppet-master[31406]: Compiled catalog for
> puppet.foo.com in environment production in 21.52 seconds
> Dec 15 13:10:51 puppet puppet-agent[8251]: Caching catalog for
> puppet.foo.com
>
> That was almost 30 minutes ago. Since then, it has sat there doing
> nothing...
>
> $ sudo strace -p 8251
> Process 8251 attached - interrupt to quit
> select(7, [6], [], [], {866, 578560}
>
> lsof shows:
> puppetd 8251 root 6u IPv4 11016045 0t0 TCP
> puppet.foo.com:33065->puppet.foo.com:8140 (ESTABLISHED)
>
> On Wed, Dec 15, 2010 at 1:27 PM, Ashley Penney <apen...@gmail.com> wrote:
>
>> This issue is definitely a problem. I have a support ticket in with
>> Puppet Labs about the same thing. My CPU remains at 100% almost
>> constantly and it slows things down significantly. If you strace it you
>> can see that very little appears to be going on. This is absolutely not
>> normal behavior. Even when I had 1 client checking in I had all cores
>> fully used.
>>
>> On Wed, Dec 15, 2010 at 12:15 PM, Brice Figureau
>> <brice-pup...@daysofwonder.com> wrote:
>>
>>> On Wed, 2010-12-15 at 05:28 -0800, Chris wrote:
>>> > On Dec 15, 12:42 pm, Brice Figureau <brice-pup...@daysofwonder.com>
>>> > wrote:
>>> > > On Tue, 2010-12-14 at 00:24 -0800, Chris wrote:
>>> > > > Hi
>>> > > >
>>> > > > I recently upgraded my puppet masters (and clients) from 0.24.8
>>> > > > to 2.6.4
>>> > > >
>>> > > > Previously, my most busy puppet master would hover around about
>>> > > > a 0.9 load average; after the upgrade, its load hovers around 5.
>>> > > >
>>> > > > I am running passenger and mysql based stored configs.
>>> > > >
>>> > > > Checking my running processes, ruby (puppetmasterd) shoots up to
>>> > > > 99% cpu load and stays there for a few seconds before dropping
>>> > > > again. Often there are 4 of these running simultaneously, pegging
>>> > > > each core at 99% cpu.
>>> > >
>>> > > I would say it is perfectly normal. Compiling the catalog is a hard
>>> > > and complex problem and requires CPU.
>>> > >
>>> > > The difference between 0.24.8 and 2.6 (or 0.25 for that matter) is
>>> > > that some performance issues have been fixed. Those issues made the
>>> > > master more I/O bound under 0.24, whereas it is now mostly CPU
>>> > > bound in later versions.
>>> >
>>> > If we were talking about only cpu usage, I would agree with you. But
>>> > in this case, the load average of the machine has gone up over 5x.
>>> > And as a high load average indicates processes not getting enough
>>> > runtime, in this case it is an indication to me that 2.6 is
>>> > performing worse than 0.24 (previously, on average, all processes got
>>> > enough runtime and did not have to wait for system resources; now
>>> > processes are sitting in the run queue, waiting to get a chance to
>>> > run).
>>>
>>> Load is not necessarily an indication of an issue. It can also mean
>>> some tasks are waiting for I/O, not only CPU.
>>> The only real issue under load is if service time goes beyond an
>>> acceptable value; otherwise you can't say whether it's bad or not.
>>> If you see some hosts reporting timeouts, then it's an indication that
>>> service time is not good :)
>>>
>>> BTW, do you run your mysql storedconfig instance on the same server?
>>> You can activate thin_storeconfigs to reduce the load on the mysql db.
>>>
>>> > > I don't really get what is the issue about using 100% of CPU?
>>> >
>>> > That's not the issue, just an indication of what is causing it.
>>> >
>>> > > You're paying about the same price when your CPU is used and when
>>> > > it's idle, so that shouldn't make a difference :)
>>> >
>>> > Generally true, but this is on a VM which is also running some of my
>>> > radius and proxy instances, amongst others.
>>> >
>>> > > If that's an issue, reduce the concurrency of your setup (run fewer
>>> > > compilations in parallel, implement splay time, etc...).
>>> >
>>> > splay has been enabled since 0.24
>>> >
>>> > My apache maxclients is set to 15 to limit concurrency.
>>>
>>> I think this is too many unless you have 8 cores. As Trevor said in
>>> another e-mail in this thread, 2 PM/core is best.
>>>
>>> Now it all depends on your number of nodes and sleep time. I suggest
>>> you use ext/puppet-load to find your setup's real concurrency.
>>> --
>>> Brice Figureau
>>> Follow the latest Puppet Community evolutions on www.planetpuppet.org!
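For reference, the tuning knobs discussed above map roughly onto settings
like the ones below. This is only a sketch: the section names follow
Puppet 2.6-era puppet.conf conventions, and the concrete numbers (pool
size, run interval) are placeholder assumptions to adjust for your own
core count and node count.

    # puppet.conf on the master - thin storedconfigs keeps only facts and
    # exported resources in MySQL instead of the full catalog
    [master]
        storeconfigs      = true
        thin_storeconfigs = true

    # puppet.conf on the agents - splay spreads agent check-ins over the
    # run interval so compilations don't all hit the master at once
    [agent]
        splay       = true
        runinterval = 1800

    # Apache vhost for the Passenger-run master - cap the number of
    # concurrent puppetmaster processes at roughly 2 per core (the value
    # here assumes a 4-core box)
    PassengerMaxPoolSize 8
    PassengerPoolIdleTime 300

Whether Apache's MaxClients or Passenger's pool size ends up being the
effective limit depends on how the vhost is set up, so it is worth
checking both when reducing concurrency.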