On Mar 10, 2010, at 1:38 PM, Douglas Garstang wrote:
On Wed, Mar 10, 2010 at 1:34 PM, Douglas Garstang
<[email protected]> wrote:
On Wed, Mar 10, 2010 at 1:17 PM, Brice Figureau
<[email protected]> wrote:
On 10/03/10 22:06, Douglas Garstang wrote:
So, it became apparent to me, after emailing someone off list, that
managing a lot of files in deep directory structures might be part of
the cause.
We are running 10 instances of JBoss and 10 instances of Tomcat on
each of these servers. Don't ask me why; it's just the way it was done
before I arrived, and changing it is not trivial.
On disk, each instance of JBOSS starts at
/opt/jboss/current/server/tfelN (where N is the instance number)
and each instance of tomcat starts at:
/opt/tomcat/tfelN/starterkit/current (where N is the instance number)
Do you source the whole hierarchy?
Or do you only manage it?
I manually looked through the puppet config and counted 25 unique
files that are being managed for jboss and tomcat within these paths.
If you do the math, 25 files x 10 instances x 2 applications = 500.
That's (currently) 500 unique files being managed in these deep
directory structures. Could that be the reason behind Puppet's crap
performance?
What do you manage for those files?
But no, 500 doesn't seem like a high number to me.
You mentioned in another e-mail in this thread that the problem is
more the 20-minute run than the CPU.
Could it be that you have many "slow" execs?
Or that you manage many packages?
This also reminds me of Ohad's bug:
http://projects.reductivelabs.com/issues/1719
At this stage you should probably run puppetd on the console with
--debug (and with --summarize too) to see what happens and whether it
stalls.
I just ran puppet in debug mode and it was obvious that most of the
puppet run time was spent in checksumming files.
E.g.:
debug: //Node[app01.fr.xxx.com]/Jboss::Instance[tfel8]/File[/opt/jboss/current/server/tfel8/conf/jboss.web/localhost/rewrite.properties]: Creating checksum {md5}f5d16bcc20b92631eb59514018fd34e5
... takes a long time to run. Multiply that by several hundred
files...
However, when I run this on the command line:
md5sum /opt/jboss/current/server/tfel8/conf/jboss.web/localhost/rewrite.properties
... the result is instantaneous. So, is puppet using a Ruby library
for performing md5 checksums? Is that where the performance
bottleneck could be?
Doug
Also...
I just grabbed an example online of performing an md5 checksum on a
file in ruby.
I ran it on the same file as above.
The result was instantaneous. The question remains: what is puppet
doing?
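For anyone who wants to repeat that comparison, here is a minimal sketch of timing an MD5 checksum in plain Ruby. It is not the example Doug ran and not Puppet's own checksum code; `timed_md5` is a hypothetical helper name, and it uses only the standard `Digest::MD5` and `Benchmark` libraries.

```ruby
require 'digest/md5'
require 'benchmark'

# Hypothetical helper (not part of Puppet): checksum one file and
# measure how long the checksum alone takes.
# Returns [hex_digest, elapsed_seconds].
def timed_md5(path)
  digest = nil
  elapsed = Benchmark.realtime do
    digest = Digest::MD5.file(path).hexdigest
  end
  [digest, elapsed]
end

# When run as a script with file paths as arguments, print one line per file.
if __FILE__ == $PROGRAM_NAME && !ARGV.empty?
  ARGV.each do |path|
    digest, elapsed = timed_md5(path)
    puts format('{md5}%s  %.4fs  %s', digest, elapsed, path)
  end
end
```

If this is fast for the same files that appear slow in the debug log, the checksum itself is probably not where the time goes, and the surrounding per-file work is the more likely suspect.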
The short answer is, more than md5sum is. All you're seeing is the
log message; you don't really know whether that's what's taking all of
the time.
We've always known about the performance problems of using Puppet to
manage large file hierarchies, which is why we generally recommend you
don't do it unless you've tested that it works for your use cases.
I've basically only ever seen two non-pathological cases where client
runs take a long time: either you're using a lot of yum (which we've
mostly resolved), or you're managing a lot of large file hierarchies (or
a few very large ones).
You can look at the reports coming out of your systems to see where
time is being spent, and that should tell you almost immediately. It
already looks like it's files, so I'd start by trying to trim back
recursion where you can, and try not to manage large files where you
can avoid it.
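
As a rough way to judge how much a recursive file resource would have to walk, the following sketch counts the regular files under a directory and totals their sizes. It is only an illustration under stated assumptions: `tree_stats` is a hypothetical helper, it is not how Puppet itself measures anything, and file count and byte total are just proxies for the work a run might do.

```ruby
require 'find'

# Hypothetical helper (not part of Puppet): walk a directory tree and
# return the number of regular files and their combined size in bytes.
# File.file? follows symlinks, so a link to a regular file is counted too.
def tree_stats(root)
  files = 0
  bytes = 0
  Find.find(root) do |path|
    next unless File.file?(path)
    files += 1
    bytes += File.size(path)
  end
  { files: files, bytes: bytes }
end

# When run as a script with directories as arguments, print one line per tree.
if __FILE__ == $PROGRAM_NAME && !ARGV.empty?
  ARGV.each do |root|
    stats = tree_stats(root)
    puts format('%s: %d files, %d bytes', root, stats[:files], stats[:bytes])
  end
end
```

Running it against each managed path gives a quick list of which trees are the biggest candidates for trimming recursion.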
--
Silence is a text easy to misread.
-- A. A. Attanasio, 'The Eagle and the Sword'
---------------------------------------------------------------------
Luke Kanies -|- http://reductivelabs.com -|- +1(615)594-8199