Hi,

I'm running CFEngine 3.0.2 (community edition) on a grid of around 600
systems. We use it to distribute /etc/passwd and related files, and have
run into a couple of issues. The most persistent is that the cf-serverd
process segfaults at random intervals, several times per day. Load does
not appear to be a direct factor, since I can force an update and the
service will usually "survive".

I did try recompiling with "CFLAGS=-D_FORTIFY_SOURCE=0", which has
improved things but not enough. Currently I'm running a cron job that
checks whether the service is running, as a stopgap until I can try a
newer version. Does this tally with anyone else's experience? I can send
the configuration if that is of use.
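
A minimal sketch of the kind of check such a cron job could perform,
assuming pgrep is available on the host. The function names are made up
for illustration and this is not the exact script in use:

```python
import subprocess


def is_running(name, run=subprocess.run):
    """Return True if pgrep finds a process whose name is exactly `name`."""
    # pgrep exits with status 0 when at least one process matches.
    result = run(["pgrep", "-x", name], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def ensure_running(name, start_cmd, run=subprocess.run):
    """Run start_cmd if `name` is not running.

    Returns True if a restart was attempted, False if the process was
    already up. `run` is injectable so the logic can be tested without
    touching real processes.
    """
    if is_running(name, run):
        return False
    run(start_cmd)
    return True
```

From cron this would be called as something like
ensure_running("cf-serverd", ["/var/cfengine/bin/cf-serverd"]), where
the binary path is an assumption and depends on how CFEngine was
installed.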

The second issue occurred yesterday on a client system that was under 
heavy load. The cf-agent service failed with the following error:

May 19 05:30:59 n123 cf-agent:  Can't stat new file /etc/passwd.cfnew
May 19 05:30:59 n123 cf-agent:   !!! System error for stat: "No such file or directory"
May 19 05:30:59 n123 cf-agent:  Unknown user root
May 19 05:30:59 n123 last message repeated 15 times
May 19 05:31:25 n123 sshd[6098]: fatal: Privilege separation user sshd does not exist

That cf-agent can't find the newly downloaded file is a bit worrying
but, by itself, survivable. However, it appears that the password file
itself was deleted during this run: we were unable to log in to the
system and some services on the host were impacted. Quite aside from
the merits or otherwise of using CFEngine to distribute identity
information in this way, it seems there may be cases where a file can
be removed even though the updated (.cfnew) file is not available (I
distribute other, non-identity-related updates using a similar
mechanism). I have not been able to reproduce this, and the system
healed itself once it returned to an idle state.
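
To make the behaviour I would expect concrete, here is a minimal sketch
of the invariant: the existing file is only ever replaced, never
removed, and only once the new copy is confirmed to be present. This is
not a description of what cf-agent does internally; install_new and its
parameters are hypothetical names used for illustration.

```python
import os


def install_new(target, suffix=".cfnew"):
    """Replace `target` with `target + suffix` only if the new file exists.

    Returns True if the replacement happened, False if the new file was
    missing or empty, in which case `target` is left untouched.
    """
    new_path = target + suffix
    try:
        size = os.stat(new_path).st_size
    except FileNotFoundError:
        # The downloaded copy is not there: keep the old file.
        return False
    if size == 0:
        # Treat an empty download as a failed transfer (an assumption).
        return False
    # On POSIX, os.replace is an atomic rename when both paths are on
    # the same filesystem, so the target is never absent in between.
    os.replace(new_path, target)
    return True
```

With this ordering, a failed stat on the .cfnew file leaves the old
/etc/passwd in place instead of producing "Unknown user root".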

I am already looking into alternatives to distributing /etc/passwd,
shadow and company throughout the grid while still using file-based
authentication. But I think there is a more general problem here with
the copy_from method I use, which is why I've written to the list.

Any guidance greatly appreciated.

Regards,

Malcolm.
_______________________________________________
Help-cfengine mailing list
Help-cfengine@cfengine.org
https://cfengine.org/mailman/listinfo/help-cfengine