Dear All,

Yesterday one of the users filled up /tmp on a main node with junk, and that rendered cfengine unusable. First it reported

daemon.log:May 6 21:11:23 ravana cfservd[16657]: Couldn't open checksum database /tmp/testDATABASEcache
daemon.log:May 6 21:11:23 ravana cfservd[16657]: db_open: No space left on device

and it seems that after that, whenever any node connects to it, cfservd becomes extremely busy and finally fails, with the following message reported by the nodes:

cfengine:node20: Received signal 13 (SIGPIPE) while doing [no_active_lock]
cfengine:node20: Logical start time Fri May 6 23:51:10 2005
cfengine:node20: This sub-task started really at Fri May 6 23:51:10 2005

or, as of now, for some reason without a node name:

cfengine:: Received signal 13 (SIGPIPE) while doing [pre-lock-state]
cfengine:: Logical start time Sat May 7 11:00:33 2005
cfengine:: This sub-task started really at Sat May 7 11:00:33 2005

and then another one reporting that a copy was refused:

cfengine:: Transmission refused or failed statting /etc/cfengine/inputs/CVS/Repository Got:
cfengine:: Received signal 13 (SIGPIPE) while doing [lock.cfagent_conf.node2.copy.copy_3343]
cfengine:: Logical start time Sat May 7 04:30:29 2005
cfengine:: This sub-task started really at Sat May 7 04:30:29 2005

I've tried restarting the cfengine parts on both ends; it doesn't help. Running cfservd with -d2 while trying to run the update script (copy /etc/cfengine/input files across the nodes into /etc/cfengine) gave the following:

----------------------------------------
...
Access privileges - match found
cfservd: Host node2.ravana.rutgers.edu granted access to /etc/cfengine/inputs/CVS/Root
Clocks were off by 0
StatFile(/etc/cfengine/inputs/CVS/Root)
OK: type=0 mode=644 lmode=0 uid=0 gid=0 size=10 atime=1115477605 mtime=1067285389
Transaction Send[t 65][Packed text]
Attempting to send 73 bytes
SendSocketStream, sent 73
Transaction Send[t 3][Packed text]
Attempting to send 11 bytes
SendSocketStream, sent 11
RecvSocketStream(8)
    (Concatenated 8 from stream)
Transaction Receive [t 51][]
RecvSocketStream(51)
    (Concatenated 51 from stream)
Received: [MD5 /etc/cfengine/inputs/CVS/Root] on socket 5
CompareLocalChecksums(/etc/cfengine/inputs/CVS/Root,MD5=05e8d918529f204488a626792c4f8a6f)
ChecksumChanged: key /etc/cfengine/inputs/CVS/Root with data MD5=05e8d918529f204488a626792c4f8a6f
<At this point it stalls for a minute or two, although cfservd is running busy>
IPV4 address
sockaddr_ntop(10.0.0.2)
Obtained IP address of 10.0.0.2 on socket 7 from accept
FuzzyItemIn(LIST,10.0.0.2)
Purging Old Connections...
Done purging
FuzzyItemIn(LIST,10.0.0.2)
cfservd: Denying repeated connection from 10.0.0.2
----------------------------------------

From the client (cfagent) side it looks like this:

----------------------------------------
Compare binary sums on ravana:/etc/cfengine/inputs/CVS/Root & /var/lib/cfengine2/inputs/CVS/Root
Using network md5 checksum instead
ChecksumFile(m,/var/lib/cfengine2/inputs/CVS/Root)
Send digest of /var/lib/cfengine2/inputs/CVS/Root to server, MD5=05e8d918529f204488a626792c4f8a6f
Transaction Send[t 51][Packed text]
Attempting to send 59 bytes
SendSocketStream, sent 59
RecvSocketStream(8)
<STALLS HERE and I got bored waiting till it dies... maybe it never dies this time>
----------------------------------------

So here are the questions:

1. How do I fix the current situation?
   Clearly something is broken in the current state, so maybe I can clean out the cfengine state and start from a clean one. I wouldn't mind if the first run takes longer ;-) Sure, I could reinstall completely, and I believe it would then work, but...

2. What would be a nice policy to enforce over /tmp so that I don't remove anything valuable (like ssh-agent sockets and other stuff held open by running programs)? I'm thinking about something like this: files and directories that are large (>1M) should be forbidden if they are older than an hour. I'm not sure I can discard data solely on age, so age+size sounds good to me.

-- 
Yaroslav Halchenko
Research Assistant, Psychology Department, Rutgers-Newark
Office: (973) 353-5440x263 | FWD: 82823 | Fax: (973) 353-1171
101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07105
Student Ph.D. @ CS Dept. NJIT

_______________________________________________
Help-cfengine mailing list
Help-cfengine@gnu.org
http://lists.gnu.org/mailman/listinfo/help-cfengine
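
P.S. The age+size idea from question 2 could be sketched in Python roughly as below. This is only an illustration under assumptions that are not part of the setup described above: the function name find_tmp_candidates and the two constants are made up, "older than an hour" is taken to mean the modification time, and only regular files are checked (directories are not sized). Sockets, FIFOs, symlinks and devices are skipped, so ssh-agent sockets stay untouched, but the sketch cannot tell whether a regular file is still held open by a running program. It only lists candidates and deletes nothing.

```python
import os
import stat
import time

ONE_MIB = 1024 * 1024   # hypothetical size threshold (">1M")
ONE_HOUR = 60 * 60      # hypothetical age threshold, in seconds


def find_tmp_candidates(root="/tmp", min_size=ONE_MIB, min_age=ONE_HOUR, now=None):
    """Return regular files under root that are larger than min_size bytes
    and whose mtime is more than min_age seconds old. Nothing is deleted."""
    now = time.time() if now is None else now
    candidates = []
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # leave sockets, FIFOs, symlinks and devices alone
            if st.st_size > min_size and now - st.st_mtime > min_age:
                candidates.append(path)
    return sorted(candidates)
```

Calling find_tmp_candidates() with no arguments scans /tmp with the 1 MiB and one hour thresholds; reviewing the returned list by hand before removing anything seems the safer way to start.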