Ravi, I actually restarted glustershd yesterday evening by unmounting the volume 
on the clients, stopping and starting the volume on the cluster, and re-mounting 
it on the clients. That cleared around 1,500 files from the "volume heal info" 
output, so I am now down to roughly 25k files left to heal. While restarting the 
volume I saw the following log entries in the brick log file:

[2018-11-02 17:51:07.078738] W [inodelk.c:610:pl_inodelk_log_cleanup] 
0-myvol-private-server: releasing lock on da4f31fb-ac53-4d78-a633-f0046ac3ebcc 
held by {client=0x7fd48400c160, pid=-6 lk-owner=b0d405e0167f0000}
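
For reference, the restart sequence I used was roughly the following (a sketch 
from memory; the volume name myvol-private is mine, while the mount point 
/mnt/myvol and server name node1 are placeholders for my actual setup):

```shell
# On each client: unmount the GlusterFS volume
umount /mnt/myvol

# On one cluster node: stop and then start the volume
# (this also restarts the brick processes and glustershd)
gluster volume stop myvol-private
gluster volume start myvol-private

# On each client: re-mount the volume via FUSE
mount -t glusterfs node1:/myvol-private /mnt/myvol
```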


What also bothers me is that if I run a manual "volume heal", nothing happens 
except for the following log entry in the glusterd log:

[2018-11-03 06:32:16.033214] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glustershd: 
error returned while attempting to connect to host:(null), port:0
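
For completeness, the manual heal I am triggering is just the standard index 
heal (with my volume name; the "full" variant is shown as a heavier alternative 
in case the index heal keeps doing nothing):

```shell
# Trigger an index heal (only files already marked as needing heal)
gluster volume heal myvol-private

# Heavier alternative: crawl the entire volume
gluster volume heal myvol-private full

# Re-check what is still pending afterwards
gluster volume heal myvol-private info
```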

That does not seem normal... what do you think?
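
In case it helps with debugging, this is how I have been checking whether the 
self-heal daemon is actually up and connected (commands as I understand them 
for 4.1; the volume name is from my setup):

```shell
# Lists brick processes plus the Self-heal Daemon entry for each node
gluster volume status myvol-private

# Confirm the glustershd process is running on each cluster node
ps aux | grep '[g]lustershd'

# Per-brick summary of pending heal counts
gluster volume heal myvol-private info summary
```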


‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Saturday, November 3, 2018 1:31 AM, Ravishankar N <ravishan...@redhat.com> 
wrote:

> Mabi,
>
> If bug 1637953 is what you are experiencing, then you need to follow the
> workarounds mentioned in
> https://lists.gluster.org/pipermail/gluster-users/2018-October/035178.html.
> Can you see if this works?
>
> -Ravi
>
> On 11/02/2018 11:40 PM, mabi wrote:
>
> > I tried again to manually run a heal using the "gluster volume heal" 
> > command because still no files have been healed, and I noticed the following 
> > warning in the glusterd.log file:
> > [2018-11-02 18:04:19.454702] I [MSGID: 106533] 
> > [glusterd-volume-ops.c:938:__glusterd_handle_cli_heal_volume] 0-management: 
> > Received heal vol req for volume myvol-private
> > [2018-11-02 18:04:19.457311] W [rpc-clnt.c:1753:rpc_clnt_submit] 
> > 0-glustershd: error returned while attempting to connect to host:(null), 
> > port:0
> > It looks like glustershd can't connect to "host:(null)". Could that be 
> > the reason why no healing is taking place? If so, why do I see 
> > "host:(null)" here, and what needs fixing?
> > This seems to have happened since I upgraded from 3.12.14 to 4.1.5.
> > I would really appreciate some help here; I suspect this is an issue with 
> > GlusterFS 4.1.5.
> > Thank you in advance for any feedback.
> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > On Wednesday, October 31, 2018 11:13 AM, mabi m...@protonmail.ch wrote:
> >
> > > Hello,
> > > I have a GlusterFS 4.1.5 cluster with 3 nodes (including 1 arbiter) and 
> > > currently have a volume with around 27174 files which are not being 
> > > healed. The "volume heal info" command shows the same 27k files under the 
> > > first node and the second node but there is nothing under the 3rd node 
> > > (arbiter).
> > > I already tried running a "volume heal" but none of the files got healed.
> > > In the glfsheal log file for that particular volume, the only errors I 
> > > see are a few of these entries:
> > > [2018-10-31 10:06:41.524300] E [rpc-clnt.c:184:call_bail] 
> > > 0-myvol-private-client-0: bailing out frame type(GlusterFS 4.x v1) 
> > > op(INODELK(29)) xid = 0x108b sent = 2018-10-31 09:36:41.314203. timeout = 
> > > 1800 for 127.0.1.1:49152
> > > and then a few of these warnings:
> > > [2018-10-31 10:08:12.161498] W [dict.c:671:dict_ref] 
> > > (-->/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/cluster/replicate.so(+0x6734a)
> > >  [0x7f2a6dff434a] 
> > > -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x5da84) [0x7f2a798e8a84] 
> > > -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_ref+0x58) 
> > > [0x7f2a798a37f8] ) 0-dict: dict is NULL [Invalid argument]
> > > the glustershd.log file shows the following:
> > > [2018-10-31 10:10:52.502453] E [rpc-clnt.c:184:call_bail] 
> > > 0-myvol-private-client-0: bailing out frame type(GlusterFS 4.x v1) 
> > > op(INODELK(29)) xid = 0xaa398 sent = 2018-10-31 09:40:50.927816. timeout 
> > > = 1800 for 127.0.1.1:49152
> > > [2018-10-31 10:10:52.502502] E [MSGID: 114031] 
> > > [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 
> > > 0-myvol-private-client-0: remote operation failed [Transport endpoint is 
> > > not connected]
> > > any idea what could be wrong here?
> > > Regards,
> > > Mabi
> >
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users


