On Fri, 9 Mar 2012, Alexandre Oliva wrote:
> On Mar  3, 2012, Sage Weil <[email protected]> wrote:
> 
> > It looks like the problem is that CInode::first isn't being journaled.  
> > Normally, that's fine because it matches the referring dentry, but for 
> > multiversion inodes (like snapped directories), it won't match.  On replay 
> > we end up with a bad value of 2, and it re-cows and clobbers the original 
> > old value.
> 
> > I pushed wip-1946 with a fix.  Want to give it a go?
> 
> Sorry about the delay, I spent the week facing disk full problems that
> followed a major crushmap rearrangement and cluster_snaps that I'd
> rather not remove before the rearrangement was complete.  Fun! :-)
> 
> I gave it a go, and I'm afraid it doesn't look like it fixed the
> problem.  bb85a7270 (wip-1946^) is the one patch I tested with, because
> the subsequent patch in wip-1946 failed the assertion during recovery,
> while replaying AFAICT a directory move.  in->first was 2 (the bad value
> you mention above?).

You didn't by chance keep the log?

Anyway, a defensive fix would be to replace that patch's

-       in->first = p->dnfirst;
+       assert(in->first == p->dnfirst ||
+              (in->is_multiversion() && in->first > p->dnfirst));

with

        if (!in->is_multiversion())
          in->first = p->dnfirst;
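
To make the difference between the two variants concrete, here is a minimal
sketch. The types below (Inode, ReplayedDentry, the multiversion flag) are
hypothetical stand-ins invented for illustration, not the real CInode or
journal structures; only the two conditions are taken from the snippets above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a snapshot id; not the real Ceph type.
using snapid_t = std::uint64_t;

// Hypothetical, heavily reduced stand-in for CInode.
struct Inode {
  snapid_t first = 0;
  bool multiversion = false;  // assumption: true for e.g. snapped directories
  bool is_multiversion() const { return multiversion; }
};

// Hypothetical stand-in for the journaled dentry record seen on replay.
struct ReplayedDentry {
  snapid_t dnfirst = 0;
};

// Variant from the patch: verify the relationship instead of overwriting.
// Aborts if a non-multiversion inode disagrees with its dentry, or if a
// multiversion inode has first <= dnfirst without being equal to it.
void replay_first_strict(const Inode& in, const ReplayedDentry& p) {
  assert(in.first == p.dnfirst ||
         (in.is_multiversion() && in.first > p.dnfirst));
}

// Defensive variant: only take the dentry's value when the inode is not
// multiversion, so a multiversion inode's first is left untouched.
void replay_first_defensive(Inode& in, const ReplayedDentry& p) {
  if (!in.is_multiversion())
    in.first = p.dnfirst;
}
```

With the defensive variant, a plain inode still picks up dnfirst on replay,
while a multiversion inode keeps whatever first it already had instead of
tripping an assertion.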

> Observed behavior was still the same: remove old snapshot, touch dir
> with old timestamp, remount, re-create snapshot, remount, check
> timestamps in snapshot dir, all fine, restart mds, all fine, touch dir,
> remount, all fine, restart mds again, snapshot timestamp changes to
> match.

Sorry I don't have more time to mess with this.  FWIW I'd want to test any 
fix by running it through the snap workunits (particularly snaptest-2.sh).  
Those are probably a good smoke test for any changes in this area, 
especially if your bug is tedious to reproduce.

sage