>> All these IPs are pingable and the hosts are resolvable across all 3
>> nodes, but only the 10.10.10.0 network is the dedicated network for
>> gluster (resolved using the gdnode* host names) ... Do you think that
>> removing the other entries could fix the problem? If so, sorry, but how
>> can I remove the other entries?
>>
> I don't think having extra entries could be a problem. Did you check the
> fuse mount logs for the disconnect messages that I referred to in the
> other email?
>
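For context, the dedicated gluster network mentioned above would be resolved
by entries roughly like these in /etc/hosts on each node (a minimal sketch
only: the gdnode* names come from this thread, the exact 10.10.10.x
addresses are assumed):

    # /etc/hosts, sketch only; addresses assumed for illustration
    10.10.10.1   gdnode01
    10.10.10.2   gdnode02
    10.10.10.3   gdnode03   # arbiter, since removed from the pool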
tail -f /var/log/glusterfs/rhev-data-center-mnt-glusterSD-dvirtgluster\:engine.log

*NODE01:*

[2017-07-24 07:34:00.799347] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: gdnode03 (Transport endpoint is not connected)
[2017-07-24 07:44:46.687334] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-07-24 09:04:25.951350] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: gdnode03 (Transport endpoint is not connected)
[2017-07-24 09:15:11.839357] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-07-24 10:34:51.231353] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: gdnode03 (Transport endpoint is not connected)
[2017-07-24 10:45:36.991321] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-07-24 12:05:16.383323] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: gdnode03 (Transport endpoint is not connected)
[2017-07-24 12:16:02.271320] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-07-24 13:35:41.535308] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: gdnode03 (Transport endpoint is not connected)
[2017-07-24 13:46:27.423304] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers

Why gdnode03 again? It was removed from gluster! It was the arbiter node...

*NODE02:*

[2017-07-24 14:08:18.709209] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on db56ac00-fd5b-4326-a879-326ff56181de. sources=0 [1] sinks=2
[2017-07-24 14:08:38.746688] I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-engine-replicate-0: performing metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81
[2017-07-24 14:08:38.749379] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81. sources=0 [1] sinks=2
[2017-07-24 14:08:46.068001] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on db56ac00-fd5b-4326-a879-326ff56181de. sources=0 [1] sinks=2
The message "I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-engine-replicate-0: performing metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81" repeated 3 times between [2017-07-24 14:08:38.746688] and [2017-07-24 14:10:09.088625]
The message "I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81. sources=0 [1] sinks=2" repeated 3 times between [2017-07-24 14:08:38.749379] and [2017-07-24 14:10:09.091377]
[2017-07-24 14:10:19.384379] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on db56ac00-fd5b-4326-a879-326ff56181de. sources=0 [1] sinks=2
[2017-07-24 14:10:39.433155] I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-engine-replicate-0: performing metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81
[2017-07-24 14:10:39.435847] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81. sources=0 [1] sinks=2

*NODE04:*

[2017-07-24 14:08:56.789598] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources=[0] 1 sinks=2
[2017-07-24 14:09:17.231987] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on db56ac00-fd5b-4326-a879-326ff56181de. sources=[0] 1 sinks=2
[2017-07-24 14:09:38.039541] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources=[0] 1 sinks=2
[2017-07-24 14:09:48.875602] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on db56ac00-fd5b-4326-a879-326ff56181de. sources=[0] 1 sinks=2
[2017-07-24 14:10:39.832068] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources=[0] 1 sinks=2
The message "I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources=[0] 1 sinks=2" repeated 3 times between [2017-07-24 14:10:39.832068] and [2017-07-24 14:12:22.686142]

The last message was (I think) because I re-executed a "heal" command.

N.B.: dvirtgluster is the round-robin (RR) DNS name for all the gluster nodes.

>> And what about SELinux?
>>
> Not sure about this. See if there are disconnect messages in the mount
> logs first.
> -Ravi
>
>> Thank you
>>

No SELinux-related messages... Thank you!
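P.S. A few checks that may help chase the gdnode03 reconnect attempts shown
above (a sketch only, assuming the volume is named "engine", as the
0-engine-replicate-0 log prefix and the mount log name suggest):

    # Is gdnode03 really gone from the trusted pool and the volume?
    gluster peer status
    gluster volume info engine    # brick list should no longer mention gdnode03

    # Does the round-robin name still resolve to the removed node?
    host dvirtgluster             # lists the A records; does one still
                                  # point at gdnode03?

    # Which volfile servers is the fuse client actually using?
    ps ax | grep '[g]lusterfs' | grep engine    # check the --volfile-server= arguments

    # Is the self-heal backlog draining?
    gluster volume heal engine info

If dvirtgluster still returns the old arbiter's address, dropping it from
the RR DNS record (or mounting with backup-volfile-servers pointing only at
the remaining nodes) may stop the "Exhausted all volfile servers" cycle.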
