This is probably a one-off (actually two, but more about that later) that will 
only ever bite me and never be heard of again, but I have to ask:

What could cause your /dev/, which is normally in the kilobytes in size, to 
swell to *gigabyte* range?

The reason I ask is that when I was attempting to upgrade my laptop to the 
latest amd64 snapshot, the upgrade failed due to a full root file system.

I thought that to be distinctly odd, because the file system layout is very 
close to the default with a gigabyte for root, to wit:

[Thu Nov 17 20:03:37] peter@elke:~$ df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd1a     1005M    103M    852M    11%    /
/dev/sd1d      3.9G   18.6M    3.7G     0%    /tmp
/dev/sd1f      100G    554M   94.8G     1%    /usr
/dev/sd1h     29.5G    6.1G   22.0G    22%    /usr/local
/dev/sd1j      3.2G    2.0K    3.0G     0%    /usr/obj
/dev/sd1i     21.6G    2.0K   20.6G     0%    /usr/src
/dev/sd1g     1005M    2.0K    955M     0%    /usr/x11R6
/dev/sd1e     27.8G   39.5M   26.4G     0%    /var
/dev/sd0d      950G    370G    532G    41%    /home

as we see the world after a successful reinstall, including packages.

But before that reinstall, the root file system was indeed full, and /dev 
consumed more than 900 megabytes (the exact number is lost, but take my word for 
it).

Even stranger, another machine here (this one running recent i386 snapshots) 
shows this:

[Thu Nov 17 20:09:11] peter@skapet:~$ doas du -hs /*
4.0K    /altroot
5.4M    /bin
88.0K   /boot
10.4M   /bsd
6.9M    /bsd.rd
10.4M   /bsd.sp
1.1G    /dev
8.3M    /etc

Note the size of /dev here. This one has a larger root file system, so there is 
no immediate danger of it filling to capacity yet.
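
One way to narrow this down would be to look for ordinary files hiding among 
the device nodes. This is only a sketch and rests on an assumption that nothing 
above confirms: that the space is taken up by regular files under /dev (for 
instance, output written to a device name that did not exist as a node) rather 
than by file system damage. The function name big_regular_files is made up for 
illustration; find's -xdev, -type and -size options are standard.

```shell
# Hypothetical helper: list regular files larger than about 1 MB under a
# directory (default /dev) without descending into other mounted file systems.
big_regular_files() {
    dir="${1:-/dev}"
    # find's -size counts 512-byte blocks, so +2048 means "more than 1 MB".
    find "$dir" -xdev -type f -size +2048 -print
}
```

Running "big_regular_files /dev" (possibly via doas) should print nothing on a 
healthy system; any path it does print is a candidate for a closer look with 
ls -l.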

The only common denominator here I can think of is that both machines have 
suffered kernel panics with subsequent fsck on boot recently. In the case of 
this last one the panic was almost certainly due to a RAM chip failing, with 
fsck interrupted due to panic when hitting that bad RAM, and so forth. Even 
after the hardware had been swapped out, that machine was seriously sick in 
other ways. Anyway, this last machine has only gone through an OS and package 
upgrade since the panic, so most likely more evidence is preserved here than in 
the elke case.

The sane way forward is of course to reinstall and get on with life, but a part 
of me still wonders how this could have happened on two systems at roughly the 
same time.

If any devs are interested, I'll probably let the last box run for a few days 
more before doing any major surgery (assuming nothing else weird happens).

-- 
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
http://bsdly.blogspot.com/ http://www.bsdly.net/ http://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.
