On 11/03/2018 04:13 PM, mabi wrote:
Ravi (or anyone else who can help), I now have even more files that are 
pending heal.
If the count is increasing, there is likely a network (disconnect) problem between the gluster clients and the bricks that needs fixing.
  Here is the output of a "volume heal info summary":

Brick node1:/data/myvol-private/brick
Status: Connected
Total Number of entries: 49845
Number of entries in heal pending: 49845
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick node2:/data/myvol-private/brick
Status: Connected
Total Number of entries: 26644
Number of entries in heal pending: 26644
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick node3:/srv/glusterfs/myvol-private/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0
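
To check whether the pending count keeps growing, the summary output above can be turned into per-brick numbers and compared between runs. The following is a minimal Python sketch, assuming the output always has the "Brick ..." / "key: number" layout shown here; the function names `parse_heal_summary` and `pending_by_brick` are hypothetical helpers, not part of GlusterFS.

```python
import re

# Sample taken verbatim from the "volume heal info summary" output above.
SAMPLE = """\
Brick node1:/data/myvol-private/brick
Status: Connected
Total Number of entries: 49845
Number of entries in heal pending: 49845
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick node2:/data/myvol-private/brick
Status: Connected
Total Number of entries: 26644
Number of entries in heal pending: 26644
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick node3:/srv/glusterfs/myvol-private/brick
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0
"""


def parse_heal_summary(text):
    """Return {brick: {field name: count}} for every brick in the output."""
    bricks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            current = line[len("Brick "):]
            bricks[current] = {}
        elif current is not None:
            # Only numeric fields are kept; "Status: Connected" is skipped.
            match = re.match(r"(.+?):\s*(\d+)$", line)
            if match:
                bricks[current][match.group(1)] = int(match.group(2))
    return bricks


def pending_by_brick(text):
    """Return {brick: number of entries in heal pending}."""
    key = "Number of entries in heal pending"
    return {b: f.get(key, 0) for b, f in parse_heal_summary(text).items()}
```

Calling `pending_by_brick()` on two outputs captured some time apart and comparing the values per brick shows whether the backlog is growing or shrinking.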

Should I try to set the "cluster.data-self-heal" parameter of that volume to 
"off" as mentioned in the bug?
Yes, as mentioned in the workaround in the thread that I shared.

And by doing that, does it mean that my files pending heal are in danger of 
being lost?
No.

Also, is it dangerous to leave "cluster.data-self-heal" set to "off"?
No. This only disables client-side data healing. The self-heal daemon will still heal the files.
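
For reference, the option change discussed here could be scripted roughly as follows. This is only a sketch: it covers the single "cluster.data-self-heal" option from this message (the linked workaround thread may list more), it assumes the `gluster` CLI is available on the node, and the helper names `data_self_heal_cmd` and `apply` are hypothetical. By default it only returns the command string instead of running it.

```python
import subprocess


def data_self_heal_cmd(volume, value="off"):
    """Build the 'gluster volume set' command for cluster.data-self-heal."""
    if value not in ("on", "off"):
        raise ValueError("value must be 'on' or 'off'")
    return ["gluster", "volume", "set", volume, "cluster.data-self-heal", value]


def apply(cmd, dry_run=True):
    """Return the command as a string; execute it only when dry_run is False."""
    if not dry_run:
        # Needs to run on a cluster node with sufficient privileges.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)
```

For the volume in this thread, `apply(data_self_heal_cmd("myvol-private"))` returns the command line to review before running it for real with `dry_run=False`.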
-Ravi



‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Saturday, November 3, 2018 1:31 AM, Ravishankar N <ravishan...@redhat.com> 
wrote:

Mabi,

If bug 1637953 is what you are experiencing, then you need to follow the
workarounds mentioned in
https://lists.gluster.org/pipermail/gluster-users/2018-October/035178.html.
Can you see if this works?

-Ravi

On 11/02/2018 11:40 PM, mabi wrote:

I tried again to manually run a heal by using the "gluster volume heal" command 
because still no files have been healed, and noticed the following warning in the 
glusterd.log file:
[2018-11-02 18:04:19.454702] I [MSGID: 106533] 
[glusterd-volume-ops.c:938:__glusterd_handle_cli_heal_volume] 0-management: 
Received heal vol req for volume myvol-private
[2018-11-02 18:04:19.457311] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glustershd: 
error returned while attempting to connect to host:(null), port:0
It looks like glustershd can't connect to "host:(null)". Could that be the reason why 
no healing is taking place? If so, why do I see "host:(null)" here, and what needs 
fixing?
This seems to have happened since I upgraded from 3.12.14 to 4.1.5.
I really would appreciate some help here; I suspect this is an issue with 
GlusterFS 4.1.5.
Thank you in advance for any feedback.
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, October 31, 2018 11:13 AM, mabi m...@protonmail.ch wrote:

Hello,
I have a GlusterFS 4.1.5 cluster with 3 nodes (including 1 arbiter) and currently have a 
volume with around 27174 files which are not being healed. The "volume heal 
info" command shows the same 27k files under the first node and the second node but 
there is nothing under the 3rd node (arbiter).
I already tried running a "volume heal" but none of the files got healed.
In the glfsheal log file for that particular volume the only errors I see are a 
few of these entries:
[2018-10-31 10:06:41.524300] E [rpc-clnt.c:184:call_bail] 
0-myvol-private-client-0: bailing out frame type(GlusterFS 4.x v1) 
op(INODELK(29)) xid = 0x108b sent = 2018-10-31 09:36:41.314203. timeout = 1800 
for 127.0.1.1:49152
and then a few of these warnings:
[2018-10-31 10:08:12.161498] W [dict.c:671:dict_ref] 
(-->/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/cluster/replicate.so(+0x6734a) 
[0x7f2a6dff434a] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x5da84) 
[0x7f2a798e8a84] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_ref+0x58) 
[0x7f2a798a37f8] ) 0-dict: dict is NULL [Invalid argument]
The glustershd.log file shows the following:
[2018-10-31 10:10:52.502453] E [rpc-clnt.c:184:call_bail] 
0-myvol-private-client-0: bailing out frame type(GlusterFS 4.x v1) 
op(INODELK(29)) xid = 0xaa398 sent = 2018-10-31 09:40:50.927816. timeout = 1800 
for 127.0.1.1:49152
[2018-10-31 10:10:52.502502] E [MSGID: 114031] 
[client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-myvol-private-client-0: 
remote operation failed [Transport endpoint is not connected]
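
When sifting through larger log files for entries like the ones quoted above, a small filter can pull out just the error and warning lines. The sketch below is in Python and makes two assumptions: each entry occupies a single line in the actual log file (the wrapping seen in this message is assumed to come from the mail client), and the header layout is "[timestamp] LEVEL [optional MSGID] [file:line:function] rest", as in the quoted entries. The names `LOG_HEADER`, `parse_line` and `errors_and_warnings` are hypothetical.

```python
import re

# Header layout assumed from the entries quoted in this thread.
LOG_HEADER = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?:\[MSGID: (?P<msgid>\d+)\]\s+)?"
    r"\[(?P<source>[^\]]+)\]\s+"
    r"(?P<rest>.*)$"
)

# Entries from this thread, re-joined into single lines.
SAMPLE_LINES = [
    "[2018-11-02 18:04:19.454702] I [MSGID: 106533] "
    "[glusterd-volume-ops.c:938:__glusterd_handle_cli_heal_volume] 0-management: "
    "Received heal vol req for volume myvol-private",
    "[2018-10-31 10:10:52.502453] E [rpc-clnt.c:184:call_bail] "
    "0-myvol-private-client-0: bailing out frame type(GlusterFS 4.x v1) "
    "op(INODELK(29)) xid = 0xaa398 sent = 2018-10-31 09:40:50.927816. timeout = 1800 "
    "for 127.0.1.1:49152",
    "[2018-10-31 10:10:52.502502] E [MSGID: 114031] "
    "[client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-myvol-private-client-0: "
    "remote operation failed [Transport endpoint is not connected]",
]


def parse_line(line):
    """Return the header fields of one log line as a dict, or None if no match."""
    match = LOG_HEADER.match(line.strip())
    return match.groupdict() if match else None


def errors_and_warnings(lines):
    """Keep only the parsed entries whose level is E (error) or W (warning)."""
    records = []
    for line in lines:
        record = parse_line(line)
        if record and record["level"] in ("E", "W"):
            records.append(record)
    return records
```

Passing the lines of a log file to `errors_and_warnings()` yields one dict per error or warning, with the timestamp, message ID (if any) and source location split out.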
Any idea what could be wrong here?
Regards,
Mabi
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

