On Tue, Jan 16, 2007 at 09:55:00PM +0100, Willem Jan Withagen wrote:
> Kris Kennaway wrote:
> ......
> 
> >>>The file system would come to a stop, with processes stuck on bio, snapshots
> >>>not finishing etc.  This was caused by the system running out of usable
> >>>buffers.  The change forces them to be flushed every so often.  This is
> >>>independent of locking.  10 might be too aggressive.  Some scaling of
> >>>nbuf would probably be better.
> >>When I run mksnap_ffs it runs to the point where ANY access to the 
> >>filesystem locks up the accessing process.
> >
> >Yes, that is expected.  Actually it begins when something accesses the
> >directory in which the snapshot is being made, since that causes the
> >parent directory to be locked...then something tries to access the
> >parent directory, which eventually cascades back to the root.
> >
> >>Getting the file system back is only possible through a "hard reboot".
> >>Trying to do it the gentle way locks the whole system.
> >
> >Or waiting until the snapshot operation finishes.  You (still) haven't
> >determined that it's actually hanging as opposed to just waiting for
> >the snapshot operation to finish.
> 
> True, and that is what I was referring to.
> 
> * I've let it run for 12 hours on 1.5T (that's why I asked for other
>       experiences)
> * I looked at the disk statistics with gstat:
>       it turned out that everything was idle for > 5 minutes
> 
> Then I concluded that it was locked.

OK, that does sound like it's deadlocked.  You could try Doug's patch,
or it might be another (unknown) condition.  If so, you'll need to do
some additional debugging with a serial console to figure out what is
wrong.

Kris

