On Fri, 9 Mar 2012, Alexandre Oliva wrote:
> On Mar 3, 2012, Sage Weil <[email protected]> wrote:
>
> > It looks like the problem is that CInode::first isn't being journaled.
> > Normally, that's fine because it matches the referring dentry, but for
> > multiversion inodes (like snapped directories), it won't match. On replay
> > we end up with a bad value of 2, and it re-cows and clobbers the original
> > old value.
>
> > I pushed wip-1946 with a fix. Want to give it a go?
>
> Sorry about the delay, I spent the week facing disk full problems that
> followed a major crushmap rearrangement and cluster_snaps that I'd
> rather not remove before the rearrangement was complete. Fun! :-)
>
> I gave it a go, and I'm afraid it doesn't look like it fixed the
> problem. bb85a7270 (wip-1946^) is the one patch I tested with, because
> the subsequent patch in wip-1946 failed the assertion during recovery,
> while replaying what AFAICT was a directory move. in->first was 2 (the
> bad value you mention above?).

You didn't by chance keep the log?

Anyway, a defensive fix would be to replace that patch's

 -      in->first = p->dnfirst;
 +      assert(in->first == p->dnfirst ||
 +             (in->is_multiversion() && in->first > p->dnfirst));

with

        if (!in->is_multiversion())
                in->first = p->dnfirst;

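
To spell out the intent of that guard, here is a rough standalone sketch.
InodeSketch, ReplayedDentrySketch and replay_first are made-up stand-ins
for illustration, not the real CInode or journal replay types; only the
is_multiversion() check and the first/dnfirst fields come from the snippet
above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not the real Ceph types.
using snapid_t = std::uint64_t;

struct InodeSketch {
  snapid_t first;     // first snapid this inode version covers
  bool multiversion;  // assumed flag backing is_multiversion()
  bool is_multiversion() const { return multiversion; }
};

struct ReplayedDentrySketch {
  snapid_t dnfirst;   // 'first' recorded for the referring dentry
};

// On replay, take 'first' from the dentry only for plain inodes.
// A multiversion inode (e.g. a snapped directory) keeps its own value,
// so replay cannot overwrite it with the dentry's.
inline void replay_first(InodeSketch& in, const ReplayedDentrySketch& p) {
  if (!in.is_multiversion())
    in.first = p.dnfirst;
}
```

The sketch drops the assert on purpose: a mismatch on a multiversion inode
is left alone rather than aborting replay.
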
> Observed behavior was still the same: remove old snapshot, touch dir
> with old timestamp, remount, re-create snapshot, remount, check
> timestamps in snapshot dir, all fine, restart mds, all fine, touch dir,
> remount, all fine, restart mds again, snapshot timestamp changes to
> match.

Sorry I don't have more time to mess with this. FWIW I'd want to test any
fix by running it through the snap workunits (particularly snaptest-2.sh).
Those are probably a good smoke test for any changes in this area,
especially if your bug is tedious to reproduce.

sage