John Ekins wrote:
Hello Bill,

On Fri, 27 Jun 2003 23:53:30 -0400
Bill Moran <[EMAIL PROTECTED]> wrote:

-> I don't know what's wrong, but does unmounting and remounting the partition
-> reclaim the lost space?

Alas, I can't umount the partition; my guess is that it is unable to sync
(nothing to do with open files, and no error message saying "device busy"). The
command just doesn't return after I've issued it.

Hmmm ... not good. A little more research might qualify this problem for a PR.


-> If there's a LOT of inodes with problems, it could easily take a while to fix.
-> Also, if you run fsck without specifying a filesystem to fix, it exhaustively
-> checks all filesystems. So even if the problem is on /var, it might spend a
-> long time checking /usr as well. You can work around this by calling fsck
-> with the filesystem to check.
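
A rough sketch of the single-filesystem check suggested above, written as a small Python wrapper. The helper names fsck_command and check_filesystem are made up for illustration, /var is simply the partition under discussion, and the -n flag (answer "no" to every repair prompt) is an assumption added here so that the check stays read-only.

```python
import subprocess


def fsck_command(mount_point):
    """Build an fsck invocation limited to a single filesystem."""
    # Naming the filesystem keeps fsck from walking every entry in fstab.
    # -n answers "no" to all repair prompts (assumption: report only).
    return ["fsck", "-n", mount_point]


def check_filesystem(mount_point):
    """Run the check and return (exit status, combined output)."""
    result = subprocess.run(
        fsck_command(mount_point),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.returncode, result.stdout
```

Calling check_filesystem("/var") as root would then check only /var and leave /usr alone. Checking a filesystem that is mounted read-write can report spurious inconsistencies, so the output is only a hint unless the partition is quiescent.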


I don't think it's to do with inodes or block size, etc. There's about 2M inodes
on /var. A manual fsck on a dirty shutdown on this partition (ignoring the problem
in hand) takes a couple of minutes.

Hmmm ...


-> If these are production boxes, I'd recommend turning it off until you resolve
-> the problem.

Indeed, I tried that last night on one machine and it put the load through the
roof (48).

Yikes! Is the machine still responsive? Sometimes you can put the load that high and still have a functional box. I'm guessing by the way the conversation is going that you're able to grab one of these boxes and make some tweaks. Possibly try putting the spool directory on a dedicated partition and mounting it async? If the box shuts down dirty, you'll probably have to newfs the partition before you can use it again. At least make sure the spool partition is separate from your log partition; that should help to mitigate the problem (although you may already have done that).

-> I don't know if this would qualify as "advice", but since nobody else
-> seems to have any suggestions, I figured I'd throw my thoughts in.

-> Are you using ATA or SCSI drives?

SCSI.

-> Does issuing a manual "sync" once you've stopped the spooling process help
-> any?


No. I'd already tried numerous syncs, and of course a clean shutdown tries that
too.

I was wondering if maybe the syncs were taking longer than the shutdown process was willing to wait.
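
One way to test that guess is to time a sync by hand and compare it with however long the shutdown sequence waits. Below is a minimal sketch in Python; the function names and the 30-second budget in the usage note are hypothetical, not values taken from the shutdown code. On some systems sync can return before the buffers are completely flushed, so treat the measured time as a lower bound.

```python
import os
import time


def timed_sync():
    """Ask the kernel to flush filesystem buffers; return elapsed seconds."""
    start = time.monotonic()
    os.sync()  # the same request the sync(8) command makes
    return time.monotonic() - start


def sync_fits_in(budget_seconds):
    """Return (fits, elapsed): whether one sync finished within the budget."""
    elapsed = timed_sync()
    return elapsed <= budget_seconds, elapsed
```

For example, sync_fits_in(30) run right after stopping the spooling process would show whether a single sync completes inside an assumed 30-second window. If the call never returns, that matches the hanging umount described earlier.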

-> Are these all identical mobos ... possibly a BIOS update available?

Haven't looked for an update, but I think they're all identical.

Hmmm ... but the fact that you're using SCSI makes this less of an issue, unless it's onboard SCSI. Possibly an update to the SCSI BIOS?

-> These aren't IBM ATA drives are they? I had one of those give me grief for
-> months (if you look in the archives, you should be able to find details on
-> which drives caused problems).

Alas not! They're straightforward Seagates, which don't have this problem in the
other machines we use them in (much lighter load).

-> Have you tried updating one of the machines to 4.8 to see if the problem
-> has been fixed?

I haven't tried that yet but will do so. I'm also going to test a 5.1R machine,
perhaps the background fsck will help when I alas come to reboot.

It may save you some time to look in CVS under the files for the drivers for the SCSI subsystem as well as the drivers for your specific cards to see if any commit messages talk about fixing problems like this. My experience with background fsck is that the machine is slow as hell while the background fsck is running. Whether this is better or worse than what you're experiencing with 4.7 is a question only you can answer.

-> Like I said, not good advice, just some ideas for you.

All advice and ideas are welcome.

Well ... I'm really shooting in the dark with these suggestions, but hopefully there will be something useful.

--
Bill Moran
Potential Technologies
http://www.potentialtech.com

_______________________________________________
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
