On Mar 10, 2010, at 1:38 PM, Douglas Garstang wrote:

On Wed, Mar 10, 2010 at 1:34 PM, Douglas Garstang
<[email protected]> wrote:
On Wed, Mar 10, 2010 at 1:17 PM, Brice Figureau
<[email protected]> wrote:
On 10/03/10 22:06, Douglas Garstang wrote:
So, it became apparent to me, after emailing someone off list, that
managing a lot of files in deep directory structures might be part of
the cause.

We are running 10 instances of JBOSS and 10 instances of tomcat on
each of these servers. Don't ask me why, it's just the way it was done
before I arrived and changing it is not trivial.

On disk, each instance of JBOSS starts at
/opt/jboss/current/server/tfelN (where N is the instance number)

and each instance of tomcat starts at:
/opt/tomcat/tfelN/starterkit/current (where N is the instance number)

Do you source the whole hierarchy?
Or do you only manage it?

I manually looked through the puppet config and counted 25 unique
files that are being managed for jboss and tomcat within these paths.
If you do the math, 25 x 10 x 2 = 500. That's therefore (currently)
500 unique files that are being managed in these deep directory
structures. Could that potentially be the reason behind puppet's crap
performance?

What do you manage for those files?
But no, 500 doesn't seem like a high number to me.

You mentioned in another e-mail in this thread that the problem is more
the 20 minutes run than the CPU.
Could it be possible you have many "slow" execs?
Or you manage many packages?

This also reminds me of Ohad's bug:
http://projects.reductivelabs.com/issues/1719

At this stage you should probably run puppetd on the console with --debug
(and with --summarize too) to see what happens and whether it stalls.

I just ran puppet in debug mode and it was obvious that most of the
puppet run time was spent in checksumming files.

Eg:

debug: //Node[app01.fr.xxx.com]/Jboss::Instance[tfel8]/File[/opt/jboss/current/server/tfel8/conf/jboss.web/localhost/rewrite.properties]:
Creating checksum {md5}f5d16bcc20b92631eb59514018fd34e5

... takes a long time to run. Multiply that by several hundred files...

However, when I run this on the command line:
md5sum /opt/jboss/current/server/tfel8/conf/jboss.web/localhost/rewrite.properties

... the result is instantaneous... So... is puppet using a ruby library
for performing md5 checksums? Is that where the performance bottleneck
could be?

Doug


Also...

I just grabbed an example online of performing an md5 checksum on a
file in ruby.
Ran it on the same file above.
Result was instantaneous... the question remains... what is puppet doing???

The short answer is, more than md5sum is. All you're seeing is the log message; you don't really know whether that's what's taking all of the time.

We've always known about the performance problems of using Puppet to manage large file hierarchies, which is why we generally recommend you don't do it unless you've tested that it works for your use cases.

I've basically only ever seen two non-pathological cases where client runs take a long time: either you're using a lot of yum (which we've mostly resolved), or you're managing a lot of large file hierarchies (or a few very large ones).

You can look at the reports coming out of your systems to see where time is being spent, and that should tell you almost immediately. It already looks like it's files, so I'd start by trying to trim back recursion where you can, and try not to manage large files where you can avoid it.

--
Silence is a text easy to misread.
    -- A. A. Attanasio, 'The Eagle and the Sword'
---------------------------------------------------------------------
Luke Kanies  -|-   http://reductivelabs.com   -|-   +1(615)594-8199
