On 12/03/2014 12:09 PM, Krutika Dhananjay wrote:
>
> ------------------------------------------------------------------------
> From: "Krutika Dhananjay" <[email protected]>
> To: "Emmanuel Dreyfus" <[email protected]>
> Cc: "Gluster Devel" <[email protected]>
> Sent: Wednesday, December 3, 2014 11:54:03 AM
> Subject: Re: [Gluster-devel] question on glustershd
>
> ------------------------------------------------------------------------
> From: "Emmanuel Dreyfus" <[email protected]>
> To: "Ravishankar N" <[email protected]>, "Gluster Devel" <[email protected]>
> Sent: Wednesday, December 3, 2014 10:14:22 AM
> Subject: Re: [Gluster-devel] question on glustershd
>
> Ravishankar N <[email protected]> wrote:
>
> > afr_shd_full_healer() is run only when we run 'gluster vol heal <volname>
> > full', doing a full brick traversal (readdirp) from the root and
> > attempting heal for each entry.
>
> Then we agree that "gluster vol heal $volume full" may fail to heal some
> files because of inode lock contention, right?
>
> If that is expected behavior, then the tests are wrong. For instance in
> tests/basic/afr/entry-self-heal.t we do "gluster vol heal $volume full"
> and we check that no unhealed files are left behind.
>
> Did I miss something, or do we have to either fix afr_shd_full_healer()
> or tests/basic/afr/entry-self-heal.t?
>
> Typical use of "heal full" is in the event of a disk replacement where
> one of the bricks in the replica set is totally empty.
> And in a volume where both children of AFR (assuming 2-way replication
> to keep the discussion simple) are on the same node, SHD would launch
> two healers.
> Each healer does readdirp() only on the brick associated with it (see
> how @subvol is initialised in afr_shd_full_sweep()).
> I guess in such scenarios, the healer associated with the brick that was
> empty would have no entries to read, and as a result, nothing to heal
> from it to the other brick.
> In that case, there is no question of lock contention of the kind that
> you explained above?
>
> Come to think of it, it does not really matter whether the two bricks
> are on the same node or not.
> In either case, there may not be lock contention between healers
> associated with different bricks, irrespective of whether they are part
> of the same SHD or SHDs on different nodes.
>
> -Krutika
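(Aside, for anyone not familiar with the afr code: the per-brick sweep Krutika
describes amounts to roughly the toy program below. This is not the actual
afr_shd_full_sweep()/afr_shd_full_healer() code, just a self-contained POSIX
sketch of the traversal pattern, with a hypothetical heal_entry() callback
standing in for the real heal attempt. The point is that each healer is handed
exactly one brick -- its @subvol -- and walks only that tree.

/*
 * Toy model only -- NOT the real afr_shd_full_sweep()/afr_shd_full_healer().
 * It shows the traversal pattern: one healer, one brick, a full walk from
 * the brick root, with a heal attempt per entry found.
 */
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

static const char *brick_root;  /* plays the role of the healer's @subvol */

/* Called for every entry under brick_root; the real healer would attempt
 * entry/metadata/data heal here instead of printing. */
static int
heal_entry (const char *path, const struct stat *sb, int typeflag,
            struct FTW *ftwbuf)
{
        (void) sb; (void) typeflag; (void) ftwbuf;
        printf ("would attempt heal of %s (seen on brick %s)\n",
                path, brick_root);
        return 0;   /* keep walking */
}

int
main (int argc, char **argv)
{
        if (argc != 2) {
                fprintf (stderr, "usage: %s <brick-root>\n", argv[0]);
                return 1;
        }
        brick_root = argv[1];
        /* full traversal from the brick root, one brick only */
        return nftw (brick_root, heal_entry, 16, FTW_PHYS);
}

Run it against a brick root, e.g. ./full_sweep /bricks/b0 (made-up path), and
it prints one line per entry it would try to heal; run it against an empty
brick and it has nothing to do, which is exactly Krutika's point above.)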
Actually, there is a bug with full heal in afr-v2. When full heal is
triggered, glusterd sends the heal op to only one of the shds serving the
replica pair: the one on the node with the highest uuid. That shd, in turn,
triggers heal only on the bricks that are local to it. So in a 1x2 volume
where the bricks are on different nodes, only one shd gets the op, and it
does the readdirp + heal on its local client (brick) only. (See BZ 1112158.)

In afr-v1 too, only one shd receives the heal-full op, but there the readdirp
is done at the afr level (as opposed to the client xlator level in v2), so
the entries of both bricks are merged conservatively.
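To make that concrete, here is a rough standalone model of the selection.
Again, this is not glusterd/shd code: the node uuids and brick paths are made
up, and plain strcmp() stands in for the real uuid comparison. Whichever node
"wins" is the only one whose shd receives the op, and that shd sweeps only its
local brick; if that brick happens to be the freshly replaced, empty one, the
sweep finds nothing and "heal full" effectively does nothing.

/*
 * Standalone model of the afr-v2 "heal full" problem (BZ 1112158), not
 * glusterd/shd code: the op is handed only to the shd on the node with
 * the highest uuid, and that shd sweeps only the bricks local to it.
 * Node uuids and brick paths below are made-up examples.
 */
#include <stdio.h>
#include <string.h>

struct brick {
        const char *node_uuid;  /* node hosting the brick */
        const char *path;
        int         empty;      /* 1 => freshly replaced disk */
};

int
main (void)
{
        /* a 1x2 volume with its bricks on two different nodes */
        struct brick bricks[] = {
                { "uuid-2b7d", "/bricks/b0", 0 },  /* has the data */
                { "uuid-9f31", "/bricks/b1", 1 },  /* replaced, empty */
        };
        int n = sizeof (bricks) / sizeof (bricks[0]);

        /* pick the "highest" node uuid; strcmp stands in for uuid compare */
        const char *chosen = bricks[0].node_uuid;
        for (int i = 1; i < n; i++)
                if (strcmp (bricks[i].node_uuid, chosen) > 0)
                        chosen = bricks[i].node_uuid;

        for (int i = 0; i < n; i++) {
                if (strcmp (bricks[i].node_uuid, chosen) != 0)
                        printf ("%s: its shd never gets the heal-full op\n",
                                bricks[i].path);
                else if (bricks[i].empty)
                        printf ("%s: swept, but empty -- nothing is healed\n",
                                bricks[i].path);
                else
                        printf ("%s: swept, entries healed to the other brick\n",
                                bricks[i].path);
        }
        return 0;
}

With the uuids above, the node holding the empty brick wins the comparison,
so the only sweep that happens runs over an empty tree and nothing is healed;
with the uuids swapped, the same command would presumably heal everything,
since the sweep would then run on the brick that still has the data.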
