TL;DR: We need to come up with a fix for AFR data self-heal from clients (mounts).

The test data-self-heal.t creates a 1x2 (replica 2) volume, sets AFR changelog xattrs directly on the files on the backend bricks, and then runs a full heal to heal the files; a rough sketch of those steps follows.
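
This is only an illustration, not the actual contents of data-self-heal.t; the volume name, host, paths and xattr value are made up:

gluster volume create testvol replica 2 host1:/bricks/b0 host1:/bricks/b1 force
gluster volume start testvol
mount -t glusterfs host1:/testvol /mnt/testvol
echo "some data" > /mnt/testvol/file

# Mark brick b1's copy of the file as stale by bumping the pending
# data counter in the AFR changelog xattr on brick b0's copy. The
# value packs three 32-bit counters: data/metadata/entry pending ops.
setfattr -n trusted.afr.testvol-client-1 \
         -v 0x000000010000000000000000 /bricks/b0/file

# Run a full self-heal crawl so the stale copy gets repaired.
gluster volume heal testvol full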

The test fails intermittently when run in a loop: data self-heal attempts non-blocking locks before healing, and the two heal threads (one per brick) can try to acquire the lock at the same time, so that both fail. In afr-v1, only one thread gets spawned if both bricks are on the same node. We cannot do the same in afr-v2 because, unlike v1, it has no conservative merge in afr_opendir_cbk(). We are not sure that adding a conservative merge to v2 is a good idea, because it involves (multiple) readdirs on both bricks and computing checksums on the entries to detect mismatches, which can be costly when done from clients. Making the locks blocking is not ideal either: if one heal thread holds the lock, the other would block on that file instead of moving on to heal other files.

One approach is to do what ec does: use a virtual xattr, handled in the getxattr FOP, to trigger data heals from clients, as sketched below. More thought needs to be given to this.
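
If I remember right, ec intercepts a getfattr on its heal xattr inside ec_getxattr() and performs the heal inline before unwinding. From a mount it looks roughly like this (the AFR xattr name below is purely hypothetical, just to show the analogue):

getfattr -n trusted.ec.heal /mnt/testvol/file    # existing ec trigger
getfattr -n trusted.afr.heal /mnt/testvol/file   # hypothetical AFR analogue

AFR would similarly recognise the virtual name in its getxattr fop, run the data self-heal for that inode from the client, and unwind with the result.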

Regards,
Ravi

