On Dec 2, 2011, at 1:30 PM, Nigel Kersten wrote:
> On Fri, Dec 2, 2011 at 1:03 PM, Jo Rhett <[email protected]> wrote:
> Okay, this has happened again.  Puppet master stopped logging catalog 
> compiles, every server stopped returning results, and the global queue 
> shot up within about 9 minutes.  It appears puppet master is stopping 
> dead in its tracks without logging any errors.
> 
> A really quick test would be to start a webrick puppetmaster on an alternate 
> port with the same configuration file in debug mode, and then run an agent 
> against it to see if there's a problem at that level:
> 
> (on master)
> puppet master --no-daemonize --verbose --debug --masterport 9140 (for example)
> 
> (on an agent)
> puppet agent --test --masterport 9140

This works perfectly fine.

> If that doesn't show anything, let us know whether you're running Apache 
> prefork or worker, and your relevant pool regulation settings like:
> 
> StartServers
> MinSpareServers
> MaxSpareServers
> ServerLimit
> MaxClients
> MaxRequestsPerChild

Prefork, with the following settings:

StartServers       8
MinSpareServers    5
MaxSpareServers   20
ServerLimit      256
MaxClients       256
MaxRequestsPerChild  4000
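
To catch this condition earlier, here is a minimal Python sketch that reads passenger-status output like the listings quoted below and flags two things: a saturated pool (every worker active while requests wait on the global queue) and workers that have been up far longer than normal. The function names are hypothetical, the line formats are assumed from the output pasted in this thread only, and the 20-minute threshold is just the recycle time mentioned further down, so treat it as a starting point rather than a tested monitor.

```python
import re

# Assumption: uptime appears as combinations of "<n>h", "<n>m", "<n>s",
# as in the passenger-status output pasted in this thread.
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_uptime(text):
    """Convert an uptime string such as '2h 52m 7s' into seconds."""
    return sum(int(n) * UNIT_SECONDS[u] for n, u in re.findall(r"(\d+)([hms])", text))


def parse_status(output):
    """Return (summary, workers) parsed from passenger-status text.

    Leading '>' quote markers are stripped so text pasted from mail also parses.
    """
    summary, workers = {}, []
    for raw in output.splitlines():
        line = raw.lstrip("> ").strip()
        m = re.match(r"(max|count|active|inactive)\s*=\s*(\d+)$", line)
        if m:
            summary[m.group(1)] = int(m.group(2))
            continue
        m = re.match(r"Waiting on global queue:\s*(\d+)", line)
        if m:
            summary["queue"] = int(m.group(1))
            continue
        m = re.match(r"PID:\s*(\d+)\s+Sessions:\s*(\d+)\s+Processed:\s*(\d+)\s+Uptime:\s*(.+)$", line)
        if m:
            workers.append({
                "pid": int(m.group(1)),
                "sessions": int(m.group(2)),
                "processed": int(m.group(3)),
                "uptime": parse_uptime(m.group(4)),
            })
    return summary, workers


def is_saturated(summary):
    """True when every pool slot is busy and requests are queued."""
    return summary.get("active", 0) >= summary.get("max", 0) and summary.get("queue", 0) > 0


def stale_workers(workers, max_age=20 * 60):
    """Workers whose uptime exceeds max_age seconds (threshold is an assumption)."""
    return [w for w in workers if w["uptime"] > max_age]


SAMPLE = """\
----------- General information -----------
max      = 20
count    = 20
active   = 20
inactive = 0
Waiting on global queue: 209

----------- Domains -----------
/etc/puppet/rack:
  PID: 25783   Sessions: 1    Processed: 329     Uptime: 2h 52m 7s
  PID: 8149    Sessions: 1    Processed: 541     Uptime: 11m 21s
"""

if __name__ == "__main__":
    summary, workers = parse_status(SAMPLE)
    print("saturated:", is_saturated(summary))
    print("stale pids:", [w["pid"] for w in stale_workers(workers)])
```

In practice the text would come from running passenger-status (for example via subprocess) instead of the SAMPLE string, which is only a shortened copy of the listing below.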

> # passenger-status
> ----------- General information -----------
> max      = 20
> count    = 20
> active   = 20
> inactive = 0
> Waiting on global queue: 209
> 
> ----------- Domains -----------
> /etc/puppet/rack: 
>   PID: 25783   Sessions: 1    Processed: 329     Uptime: 2h 52m 7s
>   PID: 25831   Sessions: 1    Processed: 4       Uptime: 2h 52m 5s
>   PID: 28517   Sessions: 1    Processed: 6       Uptime: 2h 22m 0s
>   PID: 25802   Sessions: 1    Processed: 714     Uptime: 2h 52m 6s
>   PID: 30905   Sessions: 1    Processed: 13      Uptime: 1h 50m 27s
>   PID: 25864   Sessions: 1    Processed: 709     Uptime: 2h 52m 4s
>   PID: 31028   Sessions: 1    Processed: 347     Uptime: 1h 50m 21s
>   PID: 28944   Sessions: 1    Processed: 377     Uptime: 2h 21m 50s
>   PID: 31090   Sessions: 1    Processed: 266     Uptime: 1h 50m 18s
>   PID: 577     Sessions: 1    Processed: 400     Uptime: 1h 27m 27s
>   PID: 418     Sessions: 1    Processed: 647     Uptime: 1h 28m 2s
>   PID: 1247    Sessions: 1    Processed: 133     Uptime: 1h 19m 3s
>   PID: 1474    Sessions: 1    Processed: 52      Uptime: 1h 18m 9s
>   PID: 594     Sessions: 1    Processed: 378     Uptime: 1h 27m 26s
>   PID: 4706    Sessions: 1    Processed: 414     Uptime: 48m 5s
>   PID: 4775    Sessions: 1    Processed: 218     Uptime: 47m 28s
>   PID: 4854    Sessions: 1    Processed: 584     Uptime: 47m 23s
>   PID: 7774    Sessions: 1    Processed: 165     Uptime: 14m 27s
>   PID: 7902    Sessions: 1    Processed: 44      Uptime: 13m 44s
>   PID: 8149    Sessions: 1    Processed: 541     Uptime: 11m 21s
> 
> 
> On Dec 2, 2011, at 10:58 AM, Jo Rhett wrote:
>> I came in this morning to find all the servers locked up solid:
>> 
>> # passenger-status
>> ----------- General information -----------
>> max      = 20
>> count    = 20
>> active   = 20
>> inactive = 0
>> Waiting on global queue: 236
>> 
>> ----------- Domains -----------
>> /etc/puppet/rack: 
>>  PID: 2720    Sessions: 1    Processed: 939     Uptime: 9h 22m 18s
>>  PID: 1615    Sessions: 1    Processed: 947     Uptime: 9h 23m 14s
>>  PID: 1596    Sessions: 1    Processed: 607     Uptime: 9h 23m 15s
>>  PID: 1722    Sessions: 1    Processed: 953     Uptime: 9h 23m 9s
>>  PID: 2218    Sessions: 1    Processed: 378     Uptime: 9h 22m 43s
>>  PID: 4286    Sessions: 1    Processed: 178     Uptime: 8h 50m 58s
>>  PID: 5749    Sessions: 1    Processed: 708     Uptime: 8h 20m 20s
>>  PID: 4253    Sessions: 1    Processed: 820     Uptime: 8h 51m 1s
>>  PID: 5624    Sessions: 1    Processed: 126     Uptime: 8h 20m 24s
>>  PID: 7328    Sessions: 1    Processed: 811     Uptime: 7h 49m 17s
>>  PID: 7274    Sessions: 1    Processed: 984     Uptime: 7h 49m 20s
>>  PID: 8761    Sessions: 1    Processed: 85      Uptime: 7h 18m 50s
>>  PID: 9135    Sessions: 1    Processed: 907     Uptime: 7h 16m 27s
>>  PID: 8777    Sessions: 1    Processed: 342     Uptime: 7h 18m 49s
>>  PID: 10508   Sessions: 1    Processed: 51      Uptime: 6h 47m 6s
>>  PID: 10853   Sessions: 1    Processed: 603     Uptime: 6h 43m 9s
>>  PID: 10620   Sessions: 1    Processed: 939     Uptime: 6h 45m 52s
>>  PID: 11438   Sessions: 1    Processed: 870     Uptime: 6h 30m 8s
>>  PID: 12582   Sessions: 1    Processed: 448     Uptime: 6h 9m 59s
>>  PID: 12670   Sessions: 1    Processed: 400     Uptime: 6h 8m 46s
>> 
>> For comparison, our server processes normally recycle within 20 minutes, 
>> because they reach the 1000-request limit (PassengerMaxRequests) quickly.
>> 
>> # you probably want to tune these settings
>> PassengerHighPerformance on
>> PassengerUseGlobalQueue on
>> PassengerMaxPoolSize 20
>> PassengerPoolIdleTime 1800
>> PassengerMaxRequests 1000
>> #PassengerStatThrottleRate 120
>> RackAutoDetect Off
>> RailsAutoDetect Off
>> 
>> There is nothing useful in the system logs.  They just stopped:
>> 
>> Dec  2 12:06:34 axxats003 puppet-master[12670]: Compiled catalog for 
>> axxamx001.sjc.company.com in environment production 
>> in 1.76 seconds
>> Dec  2 12:06:37 axxats003 puppet-master[12670]: Compiled catalog for 
>> axxatn016.sjc.company.com in environment production 
>> in 1.64 seconds
>> Dec  2 12:06:40 axxats003 puppet-master[12670]: Compiled catalog for 
>> axaafc001.company.com in environment production 
>> in 1.70 seconds
>> Dec  2 14:10:02 axxats003 puppet-agent[16965]: Reopening log files
>> Dec  2 14:10:02 axxats003 puppet-agent[16965]: Starting Puppet client 
>> version 2.6.12
>> Dec  2 14:12:04 axxats003 puppet-agent[16965]: Could not retrieve catalog 
>> from remote server: execution expired
>> Dec  2 14:12:04 axxats003 puppet-agent[16965]: Using cached catalog
>> 
>> (every 30 minutes puppet agent says the same thing until I restart the 
>> puppet master)
>> 
>> Dec  2 18:06:09 axxats003 puppet-master[25783]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:06:10 axxats003 puppet-master[25802]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:06:11 axxats003 puppet-master[25831]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:06:12 axxats003 puppet-master[25864]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:06:13 axxats003 puppet-master[25897]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:06:14 axxats003 puppet-master[25922]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:06:15 axxats003 puppet-master[25947]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:06:16 axxats003 puppet-master[25972]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:06:17 axxats003 puppet-master[25997]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:06:18 axxats003 puppet-master[26019]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:06:19 axxats003 puppet-master[26056]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:06:20 axxats003 puppet-master[26081]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:06:21 axxats003 puppet-master[26115]: Starting Puppet master 
>> version 2.6.12
>> Dec  2 18:14:32 axxats003 puppet-master[26115]: Compiled catalog for 
>> axxatn018.sjc.company.com in environment production in 3.63 seconds
>> Dec  2 18:14:37 axxats003 puppet-master[26115]: Compiled catalog for 
>> axxamb002.sjc.company.com in environment production in 1.47 seconds
>> Dec  2 18:14:50 axxats003 puppet-master[26115]: Compiled catalog for 
>> axxasn001.sjc.company.com in environment production in 1.57 seconds
>> 
>> There are no other messages in /var/log/messages -- the system was otherwise 
>> not busy.  The Apache error log shows only that MaxClients was reached:
>> [Fri Dec 02 08:42:43 2011] [notice] Apache/2.2.3 (CentOS) configured -- 
>> resuming normal operations
>> [Fri Dec 02 12:23:46 2011] [error] server reached MaxClients setting, 
>> consider raising the MaxClients setting
>> [Fri Dec 02 18:06:07 2011] [notice] caught SIGTERM, shutting down
>> [Fri Dec 02 18:06:08 2011] [notice] suEXEC mechanism enabled (wrapper: 
>> /usr/sbin/suexec)
>> [Fri Dec 02 18:06:08 2011] [warn] RSA server certificate CommonName (CN) 
>> `puppetmaster.company.com' does NOT match server name!?
>> [Fri Dec 02 18:06:08 2011] [notice] Digest: generating secret for digest 
>> authentication ...
>> [Fri Dec 02 18:06:08 2011] [notice] Digest: done
>> [Fri Dec 02 18:06:08 2011] [warn] RSA server certificate CommonName (CN) 
>> `puppetmaster.company.com' does NOT match server name!?
>> [Fri Dec 02 18:06:08 2011] [notice] Apache/2.2.3 (CentOS) configured -- 
>> resuming normal operations
>> 
>> 
>> -- 
>> Jo Rhett
>> [email protected]
>> (415) 999-1798
>> 
>> -- 
>> Jo Rhett
>> Net Consonance : consonant endings by net philanthropy, open source and 
>> other randomness
>> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Puppet Users" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/puppet-users?hl=en.
> 
> 
> 
> -- 
> Nigel Kersten
> Product Manager, Puppet Labs

-- 
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other 
randomness
