On Wed, Apr 26, 2006 at 06:42:28PM +0200, Pawel Jakub Dawidek wrote:
> On Wed, Apr 26, 2006 at 04:36:17PM +0300, Kostik Belousov wrote:
> +> On Wed, Apr 26, 2006 at 01:43:42PM +0400, Dmitry Morozovsky wrote:
> +> > On Tue, 25 Apr 2006, Kris Kennaway wrote:
> +> > 
> +> > KK> What people are seeing now must be some other problem that I wasn't
> +> > KK> able to reproduce.
> +> > KK> 
> +> > KK> Once I hear back from someone who can reproduce it with debugging
> +> > KK> enabled (I'm also trying) we can try to fix it.
> +> > 
> +> > Please try to simulate user who is over soft quota and is out of grace period.
> +> > I'm trying to do so as well, but currently quite busy with other tasks :(
> +> 
> +> I'm not sure whether the following is the issue you met, but:
> +> 
> +> dqsync from sys/ufs/ufs/ufs_quota.c calls vn_start_secondary_write()
> +> unconditionally. As a result, the mp->mnt_secondary_accwrites counter
> +> in struct mount always increases on entry to dqsync.
> +> ffs_snapshot calls ffs_sync, which calls qsync, which
> +> iterates over vnodes and calls dqsync on them.
> +> After the qsync, ffs_sync checks whether mp->mnt_secondary_accwrites
> +> has changed by calling softdep_check_suspend (see line 1221 of ffs_vfsops.c).
> +> If it has changed, ffs_sync restarts the syncing loop, which never finishes.
> +> 
> +> This is very strange, since if true, it basically means that snapshots
> +> and quotas would lead to an immediate deadlock ...
> +> 
> +> The following patch moves the call to vn_start_secondary_write after
> +> the check for DQ_MOD. Please try it.
> 
> Your patch must not be against HEAD, because in HEAD we have:
> 
>       if ((dq->dq_flags & DQ_MOD) == 0)
>               return (0);
>       if ((dqvp = dq->dq_ump->um_quotas[dq->dq_type]) == NULLVP)
>               panic("dqsync: file");
>       (void) vn_start_secondary_write(dqvp, &mp, V_WAIT);
>       if (vp != dqvp)
>               vn_lock(dqvp, LK_EXCLUSIVE | LK_RETRY, td);
> 
> As you can see DQ_MOD is checked before vn_start_secondary_write().
Aha, I overlooked this first check. It explains why the deadlock is rare.

Look: _after_ this check and the vn_start_secondary_write() call there is
a sleep point, and after that there is (correctly) another check for DQ_MOD.
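
To make the restart condition concrete, here is a toy userland model of it.
This is only a sketch, not the kernel code: toy_mount, toy_dquot, toy_dqsync,
toy_sync, the check_first switch and the max_passes cap are all made-up names
for illustration. The counter stands in for mp->mnt_secondary_accwrites, and
the outer loop stands in for the "restart if the counter changed" behaviour
described above.

```c
#include <stdbool.h>

/* Toy stand-ins, NOT the real struct mount / struct dquot. */
struct toy_mount {
	int secondary_accwrites;	/* models mp->mnt_secondary_accwrites */
};

struct toy_dquot {
	bool modified;			/* models the DQ_MOD flag */
};

/*
 * Models the shape of dqsync.  With check_first set, the dirty check
 * comes before the counter bump (as in the HEAD code quoted above).
 * With it clear, the counter is bumped unconditionally.
 */
static void
toy_dqsync(struct toy_mount *mp, struct toy_dquot *dq, bool check_first)
{
	if (check_first && !dq->modified)
		return;
	mp->secondary_accwrites++;	/* stands in for vn_start_secondary_write() */
	/* In the real code a sleep point sits here; the flag may change. */
	if (!dq->modified)
		return;			/* second check after the sleep point */
	dq->modified = false;		/* "write out" the dquot */
}

/*
 * Models the sync loop: repeat until a full pass leaves the counter
 * unchanged.  Returns the number of passes, or -1 if max_passes is hit
 * (the cap exists only so that the toy terminates).
 */
static int
toy_sync(struct toy_mount *mp, struct toy_dquot *dqs, int n,
    bool check_first, int max_passes)
{
	for (int pass = 1; pass <= max_passes; pass++) {
		int before = mp->secondary_accwrites;

		for (int i = 0; i < n; i++)
			toy_dqsync(mp, &dqs[i], check_first);
		if (mp->secondary_accwrites == before)
			return (pass);	/* counter stable, loop can stop */
	}
	return (-1);			/* counter changed on every pass */
}
```

In this model, with one dirty and one clean entry, toy_sync() with
check_first set stops on the second pass, because the second pass finds
nothing dirty and leaves the counter alone. With check_first clear, every
pass bumps the counter and the loop only stops at the cap. The model does
not simulate another thread changing the flag around the sleep point, so it
says nothing about how the rare case actually gets triggered.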
