This isn't quite the same issue, but several times I have observed a large multi-CPU machine lock up because the accounting records associated with a zillion tiny, rapidly launched jobs made an enormous /var/account/pacct file and filled the small root filesystem. Actually, it usually wasn't pacct itself that brought the system to its knees; the cron-scheduled gzip of that file delivered the coup de grace. That left the original big pacct plus a very large partial pacct-$DATE.gz, which used up the last few free bytes.
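
A rough way to catch this before the compression job runs is to compare the size of the accounting file with the free space on its filesystem. The following is only a sketch under assumptions: the function names are made up for illustration, the default path is the one mentioned above, and the worst-case ratio of 1.0 (compressed copy as large as the original) is a deliberately pessimistic guess, not a measured figure.

```python
import os
import shutil

# Path as given in the message above; adjust for other systems.
PACCT_PATH = "/var/account/pacct"


def pacct_headroom(path=PACCT_PATH):
    """Return (size of the file, free bytes on the filesystem that holds it)."""
    size = os.path.getsize(path)
    # disk_usage().free is the space available to unprivileged users.
    free = shutil.disk_usage(os.path.dirname(os.path.abspath(path))).free
    return size, free


def safe_to_compress(size, free, worst_case_ratio=1.0):
    """Conservative check: is there room for a compressed copy beside the original?

    worst_case_ratio is an assumed upper bound on compressed/original size.
    The file can still grow while the check runs, so this is a hint only.
    """
    return free >= size * worst_case_ratio
```

A wrapper around the cron job could call pacct_headroom() and skip or warn when safe_to_compress() returns False, rather than letting gzip run the filesystem down to zero.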

As far as I know, there is no way to selectively disable saving process accounting records for "all children of process PID". Accounting is either on or off. Now, when I run scripts prone to this, I turn accounting off first.
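
For illustration, the off-then-on pattern can be wrapped so that accounting is restored even if the script fails. This sketch uses the Linux acct(2) system call through ctypes (a NULL argument turns accounting off, a filename turns it on), which is not necessarily how it was done on the machines described here. It assumes root privileges, that accounting was on beforehand, and that the accounting file is /var/account/pacct; the helper names are hypothetical.

```python
import ctypes
import os
from contextlib import contextmanager

PACCT_PATH = "/var/account/pacct"  # assumed location, from the message above


def _acct(path):
    """Call Linux acct(2): a filename enables accounting, None disables it."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.acct.argtypes = [ctypes.c_char_p]
    arg = None if path is None else os.fsencode(path)
    if libc.acct(arg) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


@contextmanager
def accounting_disabled(path=PACCT_PATH, acct=_acct):
    """Switch process accounting off for the duration of the block, then on again.

    The switch is system-wide: no process is recorded while the block runs.
    """
    acct(None)        # acct(NULL) turns accounting off
    try:
        yield
    finally:
        acct(path)    # turn it back on, appending to the same file
```

A job launcher would then run the troublesome script inside "with accounting_disabled():". On a system where a service manages accounting, stopping and starting that service may be the safer route; the acct parameter exists mainly so the behavior can be swapped out or exercised without root.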

This was on CentOS 6.9, on machines reporting (via /proc/cpuinfo) 48 and 56 CPUs.

Regards,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
