Arnau Bria wrote:
> My current conf splits 188 clients execution in one hour, and puppet
> runs as a cron job. My server (2cpu 2 GB RAM) runs with mongrel (with 8
> puppetmasterd) and this conf works fine.
> 
> We'd like puppet to run on all clients at the same time (e.g. to force
> a change), so we're testing several things. (The previous conf does not
> support mass execution.)
> 
> First and most important, is moving our server to a better host: 4cpus
> 8GB RAM. We also run mongrel there, but now with 4 puppetmasterd, so
> each one has its own cpu.
> 
> With the first server we could run up to 40 clients at one time; now,
> 135. So we're improving.
> 
> The error on the nodes where puppet did not run is:
> 
> err: Connection timeout calling puppetmaster.getconfig: execution expired
> err: Could not retrieve catalog: Connection Timeout
> warning: Not using cache on failed catalog
> 
> 
> What step of
> http://reductivelabs.com/trac/puppet/wiki/PuppetInternals gives this error?
> Compiling? It's not clear to me; I see no transfer action in that
> schema. We have no network errors.

This error means that the client timed out when waiting for the 
configuration from the puppetmasterd, marked as "Request to apply the 
configuration" in the diagram and "Configuration Transport" in the text.

> So, in order to avoid this error, our first idea was to increase the
> timeout variable on both sides, client and server:
> 
>     # How long the client should wait for the configuration to be retrieved
>     # before considering it a failure.  This can help reduce flapping if too
>     # many clients contact the server at one time.
>     # The default value is '120'.
> 
> 
> server:
> # puppetmasterd --genconf|grep timeout
>      configtimeout = 360
> 
> client:
> # puppetd --genconf|grep timeout
>      configtimeout = 360
> 
> But then we get more errors! Only 1 client is able to run its conf.
> 
> Does this make sense to anyone?

Not to me. I'm wondering whether those are the same "getconfig" errors, 
or whether clients that already have a configuration are timing out 
while trying to fetch file resources.

> What other tuning could we test in order to reduce connection timeouts?
> 
> server:
> # rpm -qa|grep puppet
> puppet-0.24.7-4.el5
> puppet-server-0.24.7-4.el5
> # rpm -qa|grep mongrel
> rubygem-mongrel-1.0.1-6.el5
> 
> 
> client:
> # rpm -qa|grep puppet
> puppet-0.24.7-4.el4.x86_64


Two things I would recommend:

1) Do not start all your clients at once. Look at the fqdn_rand 
function[1] or --splay[2]. Even spreading the updates over just a few 
minutes could make a big difference for you.


2) Upgrade to 0.24.8. If you are using storeconfigs, this is an absolute 
must.
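
To illustrate the idea behind point 1, here is a minimal shell sketch that derives a stable per-host delay from the hostname, so each client always starts at the same offset within a window. This is not how fqdn_rand or --splay are implemented; the function name splay_offset, the use of cksum as the hash, and the 300-second window are all assumptions made for the example.

```shell
#!/bin/sh
# Illustration only: mimics the idea of a per-host offset,
# not Puppet's own fqdn_rand or --splay implementation.

# splay_offset HOST WINDOW
# Prints an integer in [0, WINDOW) that is always the same for a given HOST.
splay_offset() {
    host=$1
    window=$2
    # cksum prints "<crc> <byte count>"; keep only the CRC.
    crc=$(printf '%s' "$host" | cksum | cut -d ' ' -f 1)
    echo $((crc % window))
}

# Hypothetical usage from a cron job or a forced run:
# wait for this host's offset within a 300-second window, then run once.
# sleep "$(splay_offset "$(hostname)" 300)" && puppetd --test
```

Because the offset depends only on the hostname, a forced run on all 188 clients would be spread across the window rather than hitting the puppetmasterd processes in the same second, and each client would keep its slot from run to run.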

Regards, DavidS

[1] http://reductivelabs.com/trac/puppet/wiki/FunctionReference#fqdn-rand
[2] http://reductivelabs.com/trac/puppet/wiki/ConfigurationReference#configuration-parameter-reference
