Forum: Cfengine Help
Subject: Re: db errors
Author: sauer
Link to topic: https://cfengine.com/forum/read.php?3,22382,22472#msg-22472

Mikhail Gusarov Wrote:
-------------------------------------------------------
> On 06/07/2011 07:20 PM, Michael Stevens wrote:
> > We're only running cf3 on a subset of our
> machines for now, so I'm inclined to wait for
> 3.1.6, as functionality doesn't seem to be
> impaired. Any adverse impact from this beyond
> warning messages?
> 
> Well if you don't see any problems, then you're
> lucky -- given it's only
> locks database which is seriously affected, the
> only side effect I can
> think of is verifying some promises more often
> than expected.
> 
> The problem manifests itself badly on high-loaded
> cf-serverd instances
> where different threads stomp on each other toes
> trying to verify and
> write to same database often.

With several thousand machines running, I've been seeing cf-agent processes 
hang pretty frequently since moving from 3.0.5 to 3.1.5, leaving 2-10 cf-agent 
processes taking 100% CPU and just running "forever".  This is with a promise 
set which takes about a minute to run; it's not complicated, but it still 
blows up seemingly at random.  Sometimes the agent just dies, other times it 
hangs and blocks.

On a couple of machines where I have really long-running promises (~20 minutes) 
the cf-agent processes build up until the system is completely brought down - 
usually within 24-36 hours.
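
For anyone who wants to spot the pile-up before it takes a box down, here is a 
rough sketch of a watchdog check, not something shipped with cfengine.  It 
assumes a POSIX-style ps, and the function names and the 30-minute threshold 
are made up for illustration; pick a threshold comfortably above your longest 
legitimate run.

```python
import subprocess


def parse_etime(etime: str) -> int:
    """Convert the ps 'etime' format [[dd-]hh:]mm:ss into seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    # Pad missing leading fields (hours, or hours and minutes) with zeros.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def find_stuck(ps_output: str, name: str = "cf-agent", max_age: int = 1800):
    """Return (pid, age_in_seconds) for each 'name' process older than max_age.

    max_age defaults to 1800 seconds, an assumed threshold for illustration.
    """
    stuck = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        pid, etime, comm = fields
        age = parse_etime(etime)
        if comm.strip() == name and age > max_age:
            stuck.append((int(pid), age))
    return stuck


def list_processes() -> str:
    """Run ps and return 'pid etime comm' lines with the headers suppressed."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "etime=", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Running find_stuck(list_processes()) from cron and alerting on a non-empty 
result would at least flag the problem; whether it is safe to kill what it 
finds is a separate question.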

In both cases, the problems that appear to be db-related are only showing up 
in cf-agent; I haven't upgraded any of the policy servers to 3.1.5 
yet, and probably won't.  But since the question was asked "how does this show 
up"... :)

I'm pretty sure we'll have to stop using the community RPM and start building 
our own, since the fixes are always in SVN.  This issue really seems like it 
justifies building a patch-level package, but I guess time will tell what 
happens with that.  Unfortunately, it's not a trivial task (mostly politically) 
to deploy a new binary across a very large network, so I'm screwed either way. 
:)

_______________________________________________
Help-cfengine mailing list
Help-cfengine@cfengine.org
https://cfengine.org/mailman/listinfo/help-cfengine
