Is it possible that version 9.1 and 9.6 can't talk to each other? My understanding was that they should be able to.
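For what it's worth, 9.1 and 9.6 are both in the 9.x release line, and my understanding is that minor releases within a line share the same cluster op-version, so they should normally interoperate. A quick way to compare what each node thinks it is running (a sketch, not from the thread; on "sg", where the CLI times out, the on-disk glusterd.info file may be the only readable source):

```shell
# On a node where the gluster CLI still responds ("br"):
gluster --version                          # installed package version
gluster volume get all cluster.op-version  # op-version the cluster is operating at

# On a node where glusterd is not responding ("sg"),
# the same value is recorded on disk:
grep operating-version /var/lib/glusterd/glusterd.info
```

If the operating-version values match on both nodes, a plain 9.1/9.6 protocol mismatch is unlikely to be the cause by itself.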
On Fri, 24 Feb 2023 at 10:36, David Cunningham <dcunning...@voisonics.com> wrote:

> We've tried to remove "sg" from the cluster so we can re-install the
> GlusterFS node on it, but the following command run on "br" also gives a
> timeout error:
>
> gluster volume remove-brick gvol0 replica 1 sg:/nodirectwritedata/gluster/gvol0 force
>
> How can we tell "br" to just remove "sg" without trying to contact it?
>
> On Fri, 24 Feb 2023 at 10:31, David Cunningham <dcunning...@voisonics.com> wrote:
>
>> Hello,
>>
>> We have a cluster with two nodes, "sg" and "br", which were running
>> GlusterFS 9.1, installed via the Ubuntu package manager. We updated the
>> Ubuntu packages on "sg" to version 9.6, and now have big problems. The
>> "br" node is still on version 9.1.
>>
>> Running "gluster volume status" on either host gives "Error : Request
>> timed out". On "sg" not all processes are running, compared to "br", as
>> below. Restarting the services on "sg" doesn't help. Can anyone advise
>> how we should proceed? This is a production system.
>>
>> root@sg:~# ps -ef | grep gluster
>> root 15196 1 0 22:37 ? 00:00:00 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
>> root 15426 1 0 22:39 ? 00:00:00 /usr/bin/python3 /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
>> root 15457 15426 0 22:39 ? 00:00:00 /usr/bin/python3 /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
>> root 19341 13695 0 23:24 pts/1 00:00:00 grep --color=auto gluster
>>
>> root@br:~# ps -ef | grep gluster
>> root 2052 1 0 2022 ? 00:00:00 /usr/bin/python3 /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
>> root 2062 1 3 2022 ? 10-11:57:16 /usr/sbin/glusterfs --fuse-mountopts=noatime --process-name fuse --volfile-server=br --volfile-server=sg --volfile-id=/gvol0 --fuse-mountopts=noatime /mnt/glusterfs
>> root 2379 2052 0 2022 ? 00:00:00 /usr/bin/python3 /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
>> root 5884 1 5 2022 ? 18-16:08:53 /usr/sbin/glusterfsd -s br --volfile-id gvol0.br.nodirectwritedata-gluster-gvol0 -p /var/run/gluster/vols/gvol0/br-nodirectwritedata-gluster-gvol0.pid -S /var/run/gluster/61df1d4e1c65300e.socket --brick-name /nodirectwritedata/gluster/gvol0 -l /var/log/glusterfs/bricks/nodirectwritedata-gluster-gvol0.log --xlator-option *-posix.glusterd-uuid=11e528b0-8c69-4b5d-82ed-c41dd25536d6 --process-name brick --brick-port 49152 --xlator-option gvol0-server.listen-port=49152
>> root 10463 18747 0 23:24 pts/1 00:00:00 grep --color=auto gluster
>> root 27744 1 0 2022 ? 03:55:10 /usr/sbin/glusterfsd -s br --volfile-id gvol0.br.nodirectwritedata-gluster-gvol0 -p /var/run/gluster/vols/gvol0/br-nodirectwritedata-gluster-gvol0.pid -S /var/run/gluster/61df1d4e1c65300e.socket --brick-name /nodirectwritedata/gluster/gvol0 -l /var/log/glusterfs/bricks/nodirectwritedata-gluster-gvol0.log --xlator-option *-posix.glusterd-uuid=11e528b0-8c69-4b5d-82ed-c41dd25536d6 --process-name brick --brick-port 49153 --xlator-option gvol0-server.listen-port=49153
>> root 48227 1 0 Feb17 ? 00:00:26 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
>>
>> On "sg" in glusterd.log we're seeing:
>>
>> [2023-02-23 20:26:57.619318 +0000] E [rpc-clnt.c:181:call_bail] 0-management: bailing out frame type(glusterd mgmt v3), op(--(6)), xid = 0x11, unique = 27, sent = 2023-02-23 20:16:50.596447 +0000, timeout = 600 for 10.20.20.11:24007
>> [2023-02-23 20:26:57.619425 +0000] E [MSGID: 106115] [glusterd-mgmt.c:122:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on br. Please check log file for details.
>> [2023-02-23 20:26:57.619545 +0000] E [MSGID: 106151] [glusterd-syncop.c:1655:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
>> [2023-02-23 20:26:57.619693 +0000] W [glusterd-locks.c:817:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe19b9) [0x7fadf47fa9b9] -->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe0e20) [0x7fadf47f9e20] -->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe7904) [0x7fadf4800904] ) 0-management: Lock owner mismatch. Lock for vol gvol0 held by 11e528b0-8c69-4b5d-82ed-c41dd25536d6
>> [2023-02-23 20:26:57.619780 +0000] E [MSGID: 106117] [glusterd-syncop.c:1679:gd_unlock_op_phase] 0-management: Unable to release lock for gvol0
>> [2023-02-23 20:26:57.619939 +0000] I [socket.c:3811:socket_submit_outgoing_msg] 0-socket.management: not connected (priv->connected = -1)
>> [2023-02-23 20:26:57.619969 +0000] E [rpcsvc.c:1567:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x3, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
>> [2023-02-23 20:26:57.619995 +0000] E [MSGID: 106430] [glusterd-utils.c:678:glusterd_submit_reply] 0-glusterd: Reply submission failed
>>
>> And in the brick log:
>>
>> [2023-02-23 20:22:56.717721 +0000] I [addr.c:54:compare_addr_and_update] 0-/nodirectwritedata/gluster/gvol0: allowed = "*", received addr = "10.20.20.11"
>> [2023-02-23 20:22:56.717817 +0000] I [login.c:110:gf_auth] 0-auth/login: allowed user names: a26c7de4-1236-4e0a-944a-cb82de7f7f0e
>> [2023-02-23 20:22:56.717840 +0000] I [MSGID: 115029] [server-handshake.c:561:server_setvolume] 0-gvol0-server: accepted client from CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0 (version: 9.1) with subvol /nodirectwritedata/gluster/gvol0
>> [2023-02-23 20:22:56.741545 +0000] W [socket.c:766:__socket_rwv] 0-tcp.gvol0-server: readv on 10.20.20.11:49144 failed (No data available)
>> [2023-02-23 20:22:56.741599 +0000] I [MSGID: 115036] [server.c:500:server_rpc_notify] 0-gvol0-server: disconnecting connection [{client-uid=CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0}]
>> [2023-02-23 20:22:56.741866 +0000] I [MSGID: 101055] [client_t.c:397:gf_client_unref] 0-gvol0-server: Shutting down connection CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0
>>
>> Thanks for your help,

-- 
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782
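For readers of the archive: the "Lock owner mismatch" and "Unable to release lock for gvol0" entries point at a stale mgmt_v3 volume lock held by a glusterd that can no longer reach its peer. A commonly suggested recovery sequence, sketched below with the host names and brick path from this thread, is to restart glusterd on every node close together in time (this drops the management locks; brick processes and client mounts are left running) and then retry. This is a hedged sketch of general practice, not a fix verified on this cluster:

```shell
# Restart only the management daemon, on BOTH "sg" and "br":
systemctl restart glusterd

# Then, from "br", retry the stuck operations:
gluster volume status
gluster volume remove-brick gvol0 replica 1 sg:/nodirectwritedata/gluster/gvol0 force

# If "sg" is going to be rebuilt anyway, drop it from the peer list:
gluster peer detach sg force
```

If the CLI still times out after the restarts, the peer state files under /var/lib/glusterd/ on each node are the next thing to compare.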
________

Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users