Sean Murphy wrote:
> I am getting these errors on my var filesystem but df -h shows there is
> plenty of space available.
>
> I am running FreeBSD 5.4
>
> muse2# df -h
> Filesystem       Size    Used   Avail Capacity  Mounted on
> /dev/amrd0s1a    989M     56M    854M     6%    /
> devfs            1.0K    1.0K      0B   100%    /dev
> /dev/amrd0s1e    989M     32M    878M     4%    /tmp
> /dev/amrd0s1f    9.5G    4.2G    4.6G    48%    /usr
> /dev/amrd0s1g    245G    9.4G    216G     4%    /usr/home
> /dev/amrd0s1d    1.9G    526M    1.3G    29%    /var
>
> muse2# tail /var/log/messages
> Apr 3 09:00:44 muse2 kernel: pid 537 (mimedefang), uid 26 inumber
> 126291 on /var: filesystem full
> Apr 3 09:09:55 muse2 kernel: pid 52000 (httpd), uid 80 inumber 170037
> on /var: filesystem full
> Apr 3 09:12:59 muse2 kernel: pid 34758 (mimedefang), uid 26 inumber
> 127701 on /var: filesystem full
>
> I have restarted the mimdefang process but I get the same messages.
>
> What can I do?

There are two reasons why a filesystem may give 'out of space' errors
when df(1) still shows plenty of space available.

  i) Out of inodes. You can tell this by running 'df -i'. You're
     unlikely to run into this unless either you used non-standard
     settings when you newfs'd the partition or else the partition is
     full of a very large number of very small files. If this is the
     case, then apart from rampantly deleting lots of stuff the only
     solution is to back up the filesystem somewhere, recreate the
     filesystem by running newfs with a more realistic set of
     parameters (bytes-per-inode should be smaller) and then recover
     the data from backup.

 ii) Open file descriptor on an unlinked file. This is much more
     likely to be the problem with the /var partition, seeing as it's
     a favourite place for log files.

What can happen is this: a process has a file open (i.e. it has an open
file descriptor on the data) but a second process comes along and
unlinks the original file. That means that the file name and other
metadata are removed from the directory contents, but since another
process has the file open, the space taken up by the file's data is not
returned to the generally available pool.

Sounds daft at first, but that's the way Unix has worked since the
epoch and if you think about it, it makes sense really. Doing that
deliberately can be exceedingly useful -- a program can reserve itself
some scratch space that can't be accessed or altered by anything
else[*].

What tends to happen in /var is a side effect of not rotating log files
correctly.
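
Before looking at how log rotation triggers this, the unlink behaviour
from (ii) can be reproduced by hand. The following is only a minimal
sketch: the scratch directory, the file name demo.log and the use of
descriptor 3 are all illustrative assumptions, standing in for a daemon
that holds its log file open.

```sh
#!/bin/sh
# Demonstration: a file unlinked while still open keeps its data.
# Everything happens in a mktemp scratch directory; /var is not touched.
dir=$(mktemp -d) || exit 1
echo "some log data" > "$dir/demo.log"

# Open the file on descriptor 3 (the "daemon" holding its log open).
exec 3< "$dir/demo.log"

# A second party unlinks the name; the directory entry disappears...
rm "$dir/demo.log"
[ -e "$dir/demo.log" ] || echo "name is gone"

# ...but the data is still reachable through the open descriptor.
content=$(cat <&3)
echo "still readable: $content"

# Only closing the descriptor lets the filesystem reclaim the blocks.
exec 3<&-
rmdir "$dir"
```

Until the final close, the file has an inode and data blocks but no
name, which is exactly the state a badly rotated log file ends up in.
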
newsyslog(8) and pals will move aside and compress an existing log file
very happily, then they will send a signal to the program generating
the log file (by default assumed to be syslogd) to tell it to close and
reopen any files it is logging to -- it's a common behaviour for Unix
daemons to understand a SIGHUP to mean 'reinitialise yourself and
reopen any files you're using'.

If newsyslog(8) signals the wrong process, or doesn't signal any
process at all, or the process doesn't grok the SIGHUP, then you'll
find you get exactly the sort of orphaned file with an open descriptor
on it as described above.

The way to debug this is to list all of the processes that have open
descriptors on the partition:

    # fstat -f /var

then it's a case of doing some detective work to try and identify
which out of the many processes listed is the culprit. Unfortunately
fstat(1) doesn't tell you file names -- instead you get the file's
inode number as column 6 of the output. There is no generic method of
mapping from inode number to filename (indeed, orphaned files like
we've been discussing have an inode number, but *no* filename) other
than by doing exhaustive searches using e.g. find(1):

    # find /var -inum nnnnn -print

In this case you're looking for the ones that don't return an answer.

	Cheers,

	Matthew

[*] Well, not without rootly powers, ample clue and a reasonable
expenditure of effort.

-- 
Dr Matthew J Seaman MA, D.Phil.                       7 Priory Courtyard
                                                      Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey         Ramsgate
                                                      Kent, CT11 9PW
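
The fstat/find detective work described in the message could be
scripted along these lines. This is a sketch under assumptions, not a
tested tool: list_orphans is a hypothetical helper name, and it relies
on the inode number being column 6 of the fstat(1) output, as stated in
the message. It runs one find(1) per inode, so it is just as exhaustive
(and slow) as doing the search by hand.

```sh
#!/bin/sh
# Sketch: print inodes that are open but have no name under a mount point.
# list_orphans (hypothetical name) reads fstat-style lines on stdin and
# takes the mount point as its only argument.
list_orphans() {
    mnt=$1
    # Skip the header, keep numeric values from column 6, de-duplicate.
    awk 'NR > 1 && $6 ~ /^[0-9]+$/ { print $6 }' | sort -un |
    while read -r inum; do
        # -xdev keeps find on this one filesystem; empty output means
        # the inode has no remaining directory entry.
        if [ -z "$(find "$mnt" -xdev -inum "$inum" -print 2>/dev/null)" ]; then
            echo "$inum"
        fi
    done
}

# Intended use on the affected machine, as root:
#   fstat -f /var | list_orphans /var
```

Each number printed can then be matched back to the fstat(1) line it
came from to see which process is holding the orphaned file open.
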