On Thu, Apr 10, 2014 at 11:02 AM, Raghavendra Gowdappa <rgowd...@redhat.com> wrote:
> Hi all,
>
> I was trying to come up with some consistency issues. I am not sure
> whether case 5 is a valid one, since lookup would succeed and mkdir would
> fail with EEXIST (scroll down to the case for more detailed explanation).
>

Case 5 is a valid one. This comment was based on an earlier test case which
seemed to be invalid. Sorry about the confusion.

>
> We are considering a distribute of 3 bricks - b1, b2, b3.
>
> Case 1:
> =======
>
> Operation: rename (src, dst) - dst does not exist
>
> T0: rename successful on hashed subvol but not on other bricks
> T1: snapshot on b1, b2, b3
>
> Result: After snapshot is restored and healing is complete on src, dst we
> end up with two directories src and dst having gfid of src
>
> Case 2:
> =======
>
> Operation: Two parallel renames, rename (src, dst) and rename (dst, src).
> Both src and dst exist and hash to b1 and b2 respectively
>
> T0: rename (src, dst) successful on b1
> T1: rename (dst, src) successful on b2
> T3: snapshot on b1, b2, b3
>
> Result:
> After restore, if lookup happens on src and is healed to b1 from b2, gfids
> of src on each brick will be,
> b1 - (src, dst-gfid)
> b2 - (src, dst-gfid)
> b3 - (src, src-gfid)
>
> Case 3:
> =======
>
> Operation: Parallel rename and two mkdirs. Only src exists. Both hash to
> same brick b1.
>
> T0: two lookups triggered as part of application mkdir1 and mkdir2
> complete with ENOENT.
> T1: mkdir2 goes ahead and creates directory with gfid, gfid1
> T2: rename (src, dst) on b1
> T3: mkdir1 (src) on b1
> T4: snapshot on b1, b2 and b3
>
> Result:
> After restore and healing of src and dst, we end up with,
> b1 - (src, gfid2) and (dst, gfid1)
> b2 - (src, gfid1) and (dst, gfid1)
> b3 - (src, gfid1) and (dst, gfid1)
>
> Another reason for this inconsistency is that dht doesn't consider mkdir
> failures with EEXIST on subvols as failures. More details can be found in
> [2].
>
> Case 4:
> =======
>
> Operation: Parallel rename (src, dst) and rmdir (src). Both src and dst
> exist with gfids gfid1 and gfid2 respectively
>
> T0: rename (src, dst) on b1
> T1: rmdir (src) on b2 and b3
> T2: snapshot on b1, b2 and b3
>
> Result: After restore and healing,
> b1 - (dst, gfid1)
> b2 - (dst, gfid2)
> b3 - (dst, gfid2)
>
> Case 5:
> =======
>
> This bug was hit and a fix is being reviewed at [1]
>
> Operation: Two parallel rmdirs and two mkdirs. Directory dir does not exist
> to start with.
>
> T0: two lookups triggered as part of application mkdir1 and mkdir2
> complete with ENOENT.
> T1: mkdir2 goes ahead and creates directory with gfid, gfid1
> T2: rmdir1 (dir) on b1
> T3: lookup (dir) triggered as part of rmdir2 (or any name based
> operation), heals dir on b1 with gfid, gfid2
> T4: mkdir1 (dir, gfid2) on b2 and b3
> T5: snapshots on b1, b2 and b3
>
> Result:
> b1 - (dir, gfid1)
> b2 - (dir, gfid2)
> b3 - (dir, gfid2)
>
> Considering all these issues, the following set of fixes has been proposed:
>
> 1. in posix, if we receive mkdir (dir1) on an existing gfid (with name
> dir2), posix will convert mkdir (dir1) into rename (dir1, dir2). This
> solves case 1
>
> 2. in case of rename (src, dst), if dst already exists, rmdir (dst), so
> that we don't bring in inconsistency into dst gfid space. This solves all
> the cases of inconsistencies in dst gfid with rename failing.
>
> 3. hold entrylks in directory heal (part of lookup) and rmdir. This solves
> consistency issues because of races b/w mkdir and rmdir.
>
> [1] http://review.gluster.org/#/c/4846/
> [2] http://review.gluster.org/4459
>
> regards,
> Raghavendra.
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@nongnu.org
> https://lists.nongnu.org/mailman/listinfo/gluster-devel
>

--
Raghavendra G
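
For anyone who wants to check the end states listed in the cases above mechanically, here is a minimal Python sketch. It is not GlusterFS code: the function names `find_gfid_mismatches` and `find_shared_gfids`, and the dict layout (brick -> {name: gfid}), are assumptions made only for illustration. It flags the two kinds of inconsistency the cases describe: one name carrying different gfids on different bricks, and two names carrying the same gfid on one brick.

```python
from collections import defaultdict


def find_gfid_mismatches(bricks):
    """Names whose gfid differs across bricks.

    bricks: {brick: {name: gfid}} (hypothetical layout, not a dht structure).
    Returns {name: {gfid: [bricks holding that gfid]}}.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for brick, entries in bricks.items():
        for name, gfid in entries.items():
            seen[name][gfid].append(brick)
    return {
        name: {gfid: sorted(holders) for gfid, holders in by_gfid.items()}
        for name, by_gfid in seen.items()
        if len(by_gfid) > 1
    }


def find_shared_gfids(bricks):
    """Gfids that are attached to more than one name on the same brick.

    Returns {brick: {gfid: [names sharing it]}}.
    """
    result = {}
    for brick, entries in bricks.items():
        by_gfid = defaultdict(list)
        for name, gfid in entries.items():
            by_gfid[gfid].append(name)
        shared = {g: sorted(n) for g, n in by_gfid.items() if len(n) > 1}
        if shared:
            result[brick] = shared
    return result


# End state of Case 4 as listed in the mail.
case4 = {
    "b1": {"dst": "gfid1"},
    "b2": {"dst": "gfid2"},
    "b3": {"dst": "gfid2"},
}

# End state of Case 3 as listed in the mail.
case3 = {
    "b1": {"src": "gfid2", "dst": "gfid1"},
    "b2": {"src": "gfid1", "dst": "gfid1"},
    "b3": {"src": "gfid1", "dst": "gfid1"},
}

print(find_gfid_mismatches(case4))
print(find_gfid_mismatches(case3))
print(find_shared_gfids(case3))
```

On the Case 3 state, both checks fire: src has two gfids across the bricks, and on b2 and b3 the names src and dst share gfid1.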