This logfile doesn't contain any heal-related messages. Could you find the same file on the other node and attach it to the mail thread as well? We should also look at the mount logs to confirm whether replication did anything, and failing that, the brick logs. Rather than checking file by file, could you tar up /var/log/glusterfs on both nodes and share download links, so that we can download the archives and check what might have happened?
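Something along these lines should work on each node (a sketch; the `pack_logs` helper name and the /tmp output location are just suggestions, adjust to taste):

```shell
#!/bin/sh
# Pack a log directory into a per-host, dated tarball.
# Usage: pack_logs <log-dir> <output-dir>
pack_logs() {
    logdir="$1"
    outdir="$2"
    out="$outdir/glusterfs-logs-$(hostname -s)-$(date +%Y%m%d).tar.gz"
    # -C keeps the archive paths relative instead of absolute
    tar -czf "$out" -C "$(dirname "$logdir")" "$(basename "$logdir")"
    echo "$out"
}

# On gfs01 and gfs02:
# pack_logs /var/log/glusterfs /tmp
```

Then upload the two tarballs somewhere we can fetch them from.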
On Thu, Sep 20, 2018 at 2:00 PM Johan Karlsson <[email protected]> wrote: > I understand that a 2 way replica can require some fiddling with heal, but > how is it possible that all data just vanished, even from the bricks? > > --- > gluster> volume info > > Volume Name: gvol0 > Type: Replicate > Volume ID: 17ed4d1c-2120-4fe8-abd6-dd77d7ddac59 > Status: Started > Snapshot Count: 0 > Number of Bricks: 1 x 2 = 2 > Transport-type: tcp > Bricks: > Brick1: gfs01:/glusterdata/brick1/gvol0 > Brick2: gfs02:/glusterdata/brick2/gvol0 > Options Reconfigured: > performance.client-io-threads: off > nfs.disable: on > transport.address-family: inet > --- > > --- > gfs01 - Standard upgrade: > > Start-Date: 2018-09-12 12:51:51 > Commandline: apt-get dist-upgrade > --- > > --- > gfs02 - standard upgrade: > > Start-Date: 2018-09-12 13:28:32 > Commandline: apt-get dist-upgrade > --- > > --- > gfs01 glustershd.log > > [2018-09-12 12:52:56.211130] W [socket.c:592:__socket_rwv] 0-glusterfs: > readv on 127.0.0.1:24007 failed (No data available) > [2018-09-12 12:52:56.211155] I [glusterfsd-mgmt.c:2341:mgmt_rpc_notify] > 0-glusterfsd-mgmt: disconnected from remote-host: localhost > [2018-09-12 12:53:06.844040] E [socket.c:2517:socket_connect_finish] > 0-glusterfs: connection to 127.0.0.1:24007 failed (Connection refused); > disconnecting socket > [2018-09-12 12:53:06.844066] I [glusterfsd-mgmt.c:2362:mgmt_rpc_notify] > 0-glusterfsd-mgmt: Exhausted all volfile servers > [2018-09-12 12:54:04.224545] W [glusterfsd.c:1514:cleanup_and_exit] > (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7fee21cfa6ba] > -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xed) [0x55872a03a70d] > -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x55872a03a524 > ] ) 0-: received signum (15), shutting down > [2018-09-12 12:54:05.221508] I [MSGID: 100030] [glusterfsd.c:2741:main] > 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.4 > (args: /usr/sbin/glusterfs -s localhost --volfile-id 
gluster/glustershd -p > /var/run/gluster/glustershd/glustershd.pid > -l /var/log/glusterfs/glustershd.log -S > /var/run/gluster/c7535c5e8ebaab32.socket --xlator-option > *replicate*.node-uuid=5865e739-3c64-4039-8f96-5fc7a75d00fe --process-name > glustershd) > [2018-09-12 12:54:05.225264] I [MSGID: 101190] > [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread > with index 1 > [2018-09-12 12:54:06.246818] I [MSGID: 101190] > [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread > with index 2 > [2018-09-12 12:54:06.247109] I [MSGID: 114020] [client.c:2328:notify] > 0-gvol0-client-0: parent translators are ready, attempting connect on > transport > [2018-09-12 12:54:06.247236] I [MSGID: 114020] [client.c:2328:notify] > 0-gvol0-client-1: parent translators are ready, attempting connect on > transport > [2018-09-12 12:54:06.247269] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > Final graph: > > +------------------------------------------------------------------------------+ > 1: volume gvol0-client-0 > 2: type protocol/client > 3: option ping-timeout 42 > 4: option remote-host gfs01 > 5: option remote-subvolume /glusterdata/brick1/gvol0 > 6: option transport-type socket > 7: option transport.address-family inet > 8: option username d5e3e173-156f-46c1-9eb7-a35b201fc311 > 9: option password 8d3c3564-cef3-4261-90bd-c64e85c6d267 > 10: option transport.tcp-user-timeout 0 > 11: option transport.socket.keepalive-time 20 > 12: option transport.socket.keepalive-interval 2 > 13: option transport.socket.keepalive-count 9 > 14: end-volume > 15: > 16: volume gvol0-client-1 > 17: type protocol/client > 18: option ping-timeout 42 > 19: option remote-host gfs02 > 20: option remote-subvolume /glusterdata/brick2/gvol0 > 21: option transport-type socket > 22: option transport.address-family inet > 23: option username d5e3e173-156f-46c1-9eb7-a35b201fc311 > 24: option 
password 8d3c3564-cef3-4261-90bd-c64e85c6d267 > 25: option transport.tcp-user-timeout 0 > 26: option transport.socket.keepalive-time 20 > 27: option transport.socket.keepalive-interval 2 > 28: option transport.socket.keepalive-count 9 > 29: end-volume > 30: > 31: volume gvol0-replicate-0 > 32: type cluster/replicate > 33: option node-uuid 5865e739-3c64-4039-8f96-5fc7a75d00fe > 34: option afr-pending-xattr gvol0-client-0,gvol0-client-1 > 35: option background-self-heal-count 0 > 36: option metadata-self-heal on > 37: option data-self-heal on > 38: option entry-self-heal on > 39: option self-heal-daemon enable > 40: option use-compound-fops off > 41: option iam-self-heal-daemon yes > 42: subvolumes gvol0-client-0 gvol0-client-1 > 43: end-volume > 44: > 45: volume glustershd > [2018-09-12 12:54:06.247484] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > 46: type debug/io-stats > 47: option log-level INFO > 48: subvolumes gvol0-replicate-0 > 49: end-volume > 50: > > +------------------------------------------------------------------------------+ > [2018-09-12 12:54:06.249099] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 12:54:06.249561] I [rpc-clnt.c:2105:rpc_clnt_reconfig] > 0-gvol0-client-1: changing port to 49152 (from 0) > [2018-09-12 12:54:06.249790] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 12:54:06.250309] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 12:54:06.250889] I [MSGID: 114046] > [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-1: Connected > to gvol0-client-1, attached to remote volume '/glusterdata/brick2/gvol0'. 
> [2018-09-12 12:54:06.250904] I [MSGID: 108005] > [afr-common.c:5240:__afr_handle_child_up_event] 0-gvol0-replicate-0: > Subvolume 'gvol0-client-1' came back up; going online. > [2018-09-12 12:54:06.260091] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 12:54:06.269981] I [rpc-clnt.c:2105:rpc_clnt_reconfig] > 0-gvol0-client-0: changing port to 49152 (from 0) > [2018-09-12 12:54:06.270175] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 12:54:06.270309] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 12:54:06.270698] I [MSGID: 114046] > [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-0: Connected > to gvol0-client-0, attached to remote volume '/glusterdata/brick1/gvol0'. > [2018-09-12 13:57:40.616257] W [socket.c:592:__socket_rwv] 0-glusterfs: > readv on 127.0.0.1:24007 failed (No data available) > [2018-09-12 13:57:40.616312] I [glusterfsd-mgmt.c:2348:mgmt_rpc_notify] > 0-glusterfsd-mgmt: disconnected from remote-host: localhost > [2018-09-12 13:57:50.942555] W [glusterfsd.c:1514:cleanup_and_exit] > (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7fb690a156ba] > -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xed) [0x561b24e0d70d] > -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x561b24e0d524 > ] ) 0-: received signum (15), shutting down > [2018-09-12 13:58:06.192019] I [MSGID: 100030] [glusterfsd.c:2741:main] > 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.4 > (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p > /var/run/gluster/glustershd/glustershd.pid > -l /var/log/glusterfs/glustershd.log -S > /var/run/gluster/c7535c5e8ebaab32.socket --xlator-option > *replicate*.node-uuid=5865e739-3c64-4039-8f96-5fc7a75d00fe --process-name > 
glustershd) > [2018-09-12 13:58:06.196996] I [MSGID: 101190] > [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread > with index 1 > [2018-09-12 13:58:07.322458] I [MSGID: 101190] > [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread > with index 2 > [2018-09-12 13:58:07.322772] I [MSGID: 114020] [client.c:2328:notify] > 0-gvol0-client-0: parent translators are ready, attempting connect on > transport > [2018-09-12 13:58:07.323166] I [MSGID: 114020] [client.c:2328:notify] > 0-gvol0-client-1: parent translators are ready, attempting connect on > transport > [2018-09-12 13:58:07.323196] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:07.323327] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:07.323420] E [MSGID: 114058] > [client-handshake.c:1523:client_query_portmap_cbk] 0-gvol0-client-0: failed > to get the port number for remote subvolume. Please run 'gluster volume > status' on server to see if brick process is running. > [2018-09-12 13:58:07.323459] I [MSGID: 114018] > [client.c:2254:client_rpc_notify] 0-gvol0-client-0: disconnected from > gvol0-client-0. Client process will keep trying to connect to glusterd > until brick's port is available > [2018-09-12 13:58:07.323486] E [MSGID: 108006] > [afr-common.c:5317:__afr_handle_child_down_event] 0-gvol0-replicate-0: All > subvolumes are down. Going offline until atleast one of them comes back up. 
> Final graph: > > +------------------------------------------------------------------------------+ > 1: volume gvol0-client-0 > 2: type protocol/client > 3: option ping-timeout 42 > 4: option remote-host gfs01 > 5: option remote-subvolume /glusterdata/brick1/gvol0 > 6: option transport-type socket > 7: option transport.address-family inet > 8: option username d5e3e173-156f-46c1-9eb7-a35b201fc311 > 9: option password 8d3c3564-cef3-4261-90bd-c64e85c6d267 > 10: option transport.tcp-user-timeout 0 > 11: option transport.socket.keepalive-time 20 > 12: option transport.socket.keepalive-interval 2 > 13: option transport.socket.keepalive-count 9 > 14: end-volume > 15: > 16: volume gvol0-client-1 > 17: type protocol/client > 18: option ping-timeout 42 > 19: option remote-host gfs02 > 20: option remote-subvolume /glusterdata/brick2/gvol0 > 21: option transport-type socket > 22: option transport.address-family inet > 23: option username d5e3e173-156f-46c1-9eb7-a35b201fc311 > 24: option password 8d3c3564-cef3-4261-90bd-c64e85c6d267 > 25: option transport.tcp-user-timeout 0 > 26: option transport.socket.keepalive-time 20 > 27: option transport.socket.keepalive-interval 2 > 28: option transport.socket.keepalive-count 9 > 29: end-volume > 30: > 31: volume gvol0-replicate-0 > 32: type cluster/replicate > 33: option node-uuid 5865e739-3c64-4039-8f96-5fc7a75d00fe > 34: option afr-pending-xattr gvol0-client-0,gvol0-client-1 > 35: option background-self-heal-count 0 > 36: option metadata-self-heal on > 37: option data-self-heal on > 38: option entry-self-heal on > 39: option self-heal-daemon enable > 40: option use-compound-fops off > 41: option iam-self-heal-daemon yes > 42: subvolumes gvol0-client-0 gvol0-client-1 > 43: end-volume > 44: > 45: volume glustershd > 46: type debug/io-stats > 47: option log-level INFO > 48: subvolumes gvol0-replicate-0 > 49: end-volume > 50: > > +------------------------------------------------------------------------------+ > [2018-09-12 
13:58:07.323808] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:07.324101] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:07.324288] I [rpc-clnt.c:2105:rpc_clnt_reconfig] > 0-gvol0-client-1: changing port to 49152 (from 0) > [2018-09-12 13:58:07.324737] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:07.325066] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:07.337185] I [MSGID: 114046] > [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-1: Connected > to gvol0-client-1, attached to remote volume '/glusterdata/brick2/gvol0'. > [2018-09-12 13:58:07.337202] I [MSGID: 108005] > [afr-common.c:5240:__afr_handle_child_up_event] 0-gvol0-replicate-0: > Subvolume 'gvol0-client-1' came back up; going online. 
> [2018-09-12 13:58:11.193402] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:11.193575] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:11.193661] I [rpc-clnt.c:2105:rpc_clnt_reconfig] > 0-gvol0-client-0: changing port to 49152 (from 0) > [2018-09-12 13:58:11.193975] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:11.194217] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:11.194773] I [MSGID: 114046] > [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-0: Connected > to gvol0-client-0, attached to remote volume '/glusterdata/brick1/gvol0'. > [2018-09-12 13:59:05.215057] W [socket.c:592:__socket_rwv] > 0-gvol0-client-1: readv on 192.168.4.85:49152 failed (No data available) > [2018-09-12 13:59:05.215112] I [MSGID: 114018] > [client.c:2254:client_rpc_notify] 0-gvol0-client-1: disconnected from > gvol0-client-1. Client process will keep trying to connect to glusterd > until brick's port is available > [2018-09-12 13:59:18.521991] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:59:19.504398] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:59:19.505038] E [MSGID: 114058] > [client-handshake.c:1523:client_query_portmap_cbk] 0-gvol0-client-1: failed > to get the port number for remote subvolume. Please run 'gluster volume > status' on server to see if brick process is running. 
> [2018-09-12 13:59:19.505088] I [MSGID: 114018] > [client.c:2254:client_rpc_notify] 0-gvol0-client-1: disconnected from > gvol0-client-1. Client process will keep trying to connect to glusterd > until brick's port is available > [2018-09-12 13:59:21.519674] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:59:21.519929] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:59:21.520103] I [rpc-clnt.c:2105:rpc_clnt_reconfig] > 0-gvol0-client-1: changing port to 49152 (from 0) > [2018-09-12 13:59:21.520531] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:59:21.520754] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:59:21.521890] I [MSGID: 114046] > [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-1: Connected > to gvol0-client-1, attached to remote volume '/glusterdata/brick2/gvol0'. 
> --- > > --- > gfs01 mountpoint log: > > [2018-09-12 13:58:06.497145] I [MSGID: 100030] [glusterfsd.c:2741:main] > 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.4 > (args: /usr/sbin/glusterfs --process-name fuse --volfile-server=gfs01 > --volfile-id=/gvol0 /tss/filestore) > [2018-09-12 13:58:06.534575] I [MSGID: 101190] > [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread > with index 1 > [2018-09-12 13:58:07.381591] I [MSGID: 101190] > [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread > with index 2 > [2018-09-12 13:58:07.386730] I [MSGID: 114020] [client.c:2328:notify] > 0-gvol0-client-0: parent translators are ready, attempting connect on > transport > [2018-09-12 13:58:07.387087] I [MSGID: 114020] [client.c:2328:notify] > 0-gvol0-client-1: parent translators are ready, attempting connect on > transport > [2018-09-12 13:58:07.387129] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:07.387268] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > Final graph: > > +------------------------------------------------------------------------------+ > 1: volume gvol0-client-0 > 2: type protocol/client > 3: option ping-timeout 42 > 4: option remote-host gfs01 > 5: option remote-subvolume /glusterdata/brick1/gvol0 > 6: option transport-type socket > 7: option transport.address-family inet > 8: option username d5e3e173-156f-46c1-9eb7-a35b201fc311 > 9: option password 8d3c3564-cef3-4261-90bd-c64e85c6d267 > 10: option transport.tcp-user-timeout 0 > 11: option transport.socket.keepalive-time 20 > 12: option transport.socket.keepalive-interval 2 > 13: option transport.socket.keepalive-count 9 > 14: option send-gids true > 15: end-volume > 16: > 17: volume gvol0-client-1 > 18: type protocol/client > 19: option ping-timeout 42 > 20: option 
remote-host gfs02 > 21: option remote-subvolume /glusterdata/brick2/gvol0 > 22: option transport-type socket > 23: option transport.address-family inet > 24: option username d5e3e173-156f-46c1-9eb7-a35b201fc311 > 25: option password 8d3c3564-cef3-4261-90bd-c64e85c6d267 > 26: option transport.tcp-user-timeout 0 > 27: option transport.socket.keepalive-time 20 > 28: option transport.socket.keepalive-interval 2 > 29: option transport.socket.keepalive-count 9 > [2018-09-12 13:58:07.387367] E [MSGID: 114058] > [client-handshake.c:1523:client_query_portmap_cbk] 0-gvol0-client-0: failed > to get the port number for remote subvolume. Please run 'gluster volume > status' on server to see if brick process is running. > 30: option send-gids true > 31: end-volume > 32: > 33: volume gvol0-replicate-0 > 34: type cluster/replicate > 35: option afr-pending-xattr gvol0-client-0,gvol0-client-1 > 36: option use-compound-fops off > 37: subvolumes gvol0-client-0 gvol0-client-1 > 38: end-volume > 39: > 40: volume gvol0-dht > 41: type cluster/distribute > 42: option lock-migration off > 43: option force-migration off > [2018-09-12 13:58:07.387461] I [MSGID: 114018] > [client.c:2254:client_rpc_notify] 0-gvol0-client-0: disconnected from > gvol0-client-0. Client process will keep trying to connect to glusterd > until brick's port is available > [2018-09-12 13:58:07.387490] E [MSGID: 108006] > [afr-common.c:5317:__afr_handle_child_down_event] 0-gvol0-replicate-0: All > subvolumes are down. Going offline until atleast one of them comes back up. 
> 44: subvolumes gvol0-replicate-0 > 45: end-volume > 46: > 47: volume gvol0-write-behind > 48: type performance/write-behind > 49: subvolumes gvol0-dht > 50: end-volume > 51: > 52: volume gvol0-read-ahead > 53: type performance/read-ahead > 54: subvolumes gvol0-write-behind > 55: end-volume > 56: > 57: volume gvol0-readdir-ahead > 58: type performance/readdir-ahead > 59: option parallel-readdir off > 60: option rda-request-size 131072 > 61: option rda-cache-limit 10MB > 62: subvolumes gvol0-read-ahead > 63: end-volume > 64: > 65: volume gvol0-io-cache > 66: type performance/io-cache > 67: subvolumes gvol0-readdir-ahead > 68: end-volume > 69: > 70: volume gvol0-quick-read > 71: type performance/quick-read > 72: subvolumes gvol0-io-cache > 73: end-volume > 74: > 75: volume gvol0-open-behind > 76: type performance/open-behind > 77: subvolumes gvol0-quick-read > 78: end-volume > 79: > 80: volume gvol0-md-cache > [2018-09-12 13:58:07.387621] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > 81: type performance/md-cache > 82: subvolumes gvol0-open-behind > 83: end-volume > 84: > 85: volume gvol0 > 86: type debug/io-stats > 87: option log-level INFO > 88: option latency-measurement off > 89: option count-fop-hits off > 90: subvolumes gvol0-md-cache > 91: end-volume > 92: > 93: volume meta-autoload > 94: type meta > 95: subvolumes gvol0 > 96: end-volume > 97: > > +------------------------------------------------------------------------------+ > [2018-09-12 13:58:07.387891] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:07.388118] I [rpc-clnt.c:2105:rpc_clnt_reconfig] > 0-gvol0-client-1: changing port to 49152 (from 0) > [2018-09-12 13:58:07.388701] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:07.389814] 
W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:07.390371] I [MSGID: 114046] > [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-1: Connected > to gvol0-client-1, attached to remote volume '/glusterdata/brick2/gvol0'. > [2018-09-12 13:58:07.390390] I [MSGID: 108005] > [afr-common.c:5240:__afr_handle_child_up_event] 0-gvol0-replicate-0: > Subvolume 'gvol0-client-1' came back up; going online. > [2018-09-12 13:58:07.391330] I [fuse-bridge.c:4294:fuse_init] > 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel > 7.23 > [2018-09-12 13:58:07.391346] I [fuse-bridge.c:4927:fuse_graph_sync] > 0-fuse: switched to graph 0 > [2018-09-12 13:58:07.393037] I [MSGID: 109005] > [dht-selfheal.c:2342:dht_selfheal_directory] 0-gvol0-dht: Directory > selfheal failed: Unable to form layout for directory / > [2018-09-12 13:58:10.534498] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:10.534637] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:10.534727] I [rpc-clnt.c:2105:rpc_clnt_reconfig] > 0-gvol0-client-0: changing port to 49152 (from 0) > [2018-09-12 13:58:10.535015] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:10.535155] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:10.536297] I [MSGID: 114046] > [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-0: Connected > to gvol0-client-0, attached to remote volume '/glusterdata/brick1/gvol0'. 
> [2018-09-12 13:59:05.215073] W [socket.c:592:__socket_rwv] > 0-gvol0-client-1: readv on 192.168.4.85:49152 failed (No data available) > [2018-09-12 13:59:05.215112] I [MSGID: 114018] > [client.c:2254:client_rpc_notify] 0-gvol0-client-1: disconnected from > gvol0-client-1. Client process will keep trying to connect to glusterd > until brick's port is available > [2018-09-12 13:59:18.861826] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:59:19.505060] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:59:19.517843] E [MSGID: 114058] > [client-handshake.c:1523:client_query_portmap_cbk] 0-gvol0-client-1: failed > to get the port number for remote subvolume. Please run 'gluster volume > status' on server to see if brick process is running. > [2018-09-12 13:59:19.517934] I [MSGID: 114018] > [client.c:2254:client_rpc_notify] 0-gvol0-client-1: disconnected from > gvol0-client-1. 
Client process will keep trying to connect to glusterd > until brick's port is available > [2018-09-12 13:59:21.860457] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:59:21.860727] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:59:21.860903] I [rpc-clnt.c:2105:rpc_clnt_reconfig] > 0-gvol0-client-1: changing port to 49152 (from 0) > [2018-09-12 13:59:21.861333] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:59:21.861588] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:59:21.862134] I [MSGID: 114046] > [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-1: Connected > to gvol0-client-1, attached to remote volume '/glusterdata/brick2/gvol0'. 
> --- > > --- > gfs02 glustershd.log > > [2018-09-12 13:29:24.440044] W [socket.c:592:__socket_rwv] 0-glusterfs: > readv on 127.0.0.1:24007 failed (No data available) > [2018-09-12 13:29:24.440066] I [glusterfsd-mgmt.c:2341:mgmt_rpc_notify] > 0-glusterfsd-mgmt: disconnected from remote-host: localhost > [2018-09-12 13:29:35.300684] E [socket.c:2517:socket_connect_finish] > 0-glusterfs: connection to 127.0.0.1:24007 failed (Connection refused); > disconnecting socket > [2018-09-12 13:29:35.300719] I [glusterfsd-mgmt.c:2362:mgmt_rpc_notify] > 0-glusterfsd-mgmt: Exhausted all volfile servers > [2018-09-12 13:30:28.718734] W [glusterfsd.c:1514:cleanup_and_exit] > (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f671aa8f6ba] > -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xed) [0x55d18aa3670d] > -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x55d18aa36524 > ] ) 0-: received signum (15), shutting down > [2018-09-12 13:30:29.721210] I [MSGID: 100030] [glusterfsd.c:2741:main] > 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.4 > (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p > /var/run/gluster/glustershd/glustershd.pid > -l /var/log/glusterfs/glustershd.log -S > /var/run/gluster/3c69308176cfc594.socket --xlator-option > *replicate*.node-uuid=44192eee-3f26-4e14-84d5-be847d66df7b --process-name > glustershd) > [2018-09-12 13:30:29.724100] I [MSGID: 101190] > [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread > with index 1 > [2018-09-12 13:30:30.748354] I [MSGID: 101190] > [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread > with index 2 > [2018-09-12 13:30:30.752656] I [MSGID: 114020] [client.c:2328:notify] > 0-gvol0-client-0: parent translators are ready, attempting connect on > transport > [2018-09-12 13:30:30.752794] I [MSGID: 114020] [client.c:2328:notify] > 0-gvol0-client-1: parent translators are ready, attempting connect on > transport > [2018-09-12 13:30:30.753009] W 
[rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > Final graph: > > +------------------------------------------------------------------------------+ > 1: volume gvol0-client-0 > 2: type protocol/client > 3: option ping-timeout 42 > 4: option remote-host gfs01 > 5: option remote-subvolume /glusterdata/brick1/gvol0 > 6: option transport-type socket > 7: option transport.address-family inet > 8: option username d5e3e173-156f-46c1-9eb7-a35b201fc311 > [2018-09-12 13:30:30.754060] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > 9: option password 8d3c3564-cef3-4261-90bd-c64e85c6d267 > 10: option transport.tcp-user-timeout 0 > 11: option transport.socket.keepalive-time 20 > 12: option transport.socket.keepalive-interval 2 > 13: option transport.socket.keepalive-count 9 > 14: end-volume > 15: > 16: volume gvol0-client-1 > 17: type protocol/client > 18: option ping-timeout 42 > 19: option remote-host gfs02 > 20: option remote-subvolume /glusterdata/brick2/gvol0 > 21: option transport-type socket > 22: option transport.address-family inet > 23: option username d5e3e173-156f-46c1-9eb7-a35b201fc311 > 24: option password 8d3c3564-cef3-4261-90bd-c64e85c6d267 > 25: option transport.tcp-user-timeout 0 > 26: option transport.socket.keepalive-time 20 > 27: option transport.socket.keepalive-interval 2 > 28: option transport.socket.keepalive-count 9 > 29: end-volume > 30: > 31: volume gvol0-replicate-0 > 32: type cluster/replicate > 33: option node-uuid 44192eee-3f26-4e14-84d5-be847d66df7b > 34: option afr-pending-xattr gvol0-client-0,gvol0-client-1 > 35: option background-self-heal-count 0 > 36: option metadata-self-heal on > 37: option data-self-heal on > 38: option entry-self-heal on > 39: option self-heal-daemon enable > 40: option use-compound-fops off > 41: option iam-self-heal-daemon yes > 42: subvolumes gvol0-client-0 
gvol0-client-1 > 43: end-volume > 44: > 45: volume glustershd > 46: type debug/io-stats > 47: option log-level INFO > 48: subvolumes gvol0-replicate-0 > 49: end-volume > 50: > > +------------------------------------------------------------------------------+ > [2018-09-12 13:30:30.763395] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:30:30.765518] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:30:30.765727] I [rpc-clnt.c:2105:rpc_clnt_reconfig] > 0-gvol0-client-0: changing port to 49152 (from 0) > [2018-09-12 13:30:30.766021] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:30:30.766308] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:30:30.767339] I [MSGID: 114046] > [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-0: Connected > to gvol0-client-0, attached to remote volume '/glusterdata/brick1/gvol0'. > [2018-09-12 13:30:30.767362] I [MSGID: 108005] > [afr-common.c:5240:__afr_handle_child_up_event] 0-gvol0-replicate-0: > Subvolume 'gvol0-client-0' came back up; going online. 
> [2018-09-12 13:30:30.772846] I [rpc-clnt.c:2105:rpc_clnt_reconfig] > 0-gvol0-client-1: changing port to 49152 (from 0) > [2018-09-12 13:30:30.773011] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:30:30.773125] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-1: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:30:30.773472] I [MSGID: 114046] > [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-1: Connected > to gvol0-client-1, attached to remote volume '/glusterdata/brick2/gvol0'. > [2018-09-12 13:58:05.409172] W [socket.c:592:__socket_rwv] > 0-gvol0-client-0: readv on 192.168.4.84:49152 failed (Connection reset by > peer) > [2018-09-12 13:58:05.409219] I [MSGID: 114018] > [client.c:2254:client_rpc_notify] 0-gvol0-client-0: disconnected from > gvol0-client-0. Client process will keep trying to connect to glusterd > until brick's port is available > [2018-09-12 13:58:15.871815] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:15.872066] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:15.872229] I [rpc-clnt.c:2105:rpc_clnt_reconfig] > 0-gvol0-client-0: changing port to 49152 (from 0) > [2018-09-12 13:58:15.872457] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:15.872704] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-gvol0-client-0: error returned while attempting to connect to > host:(null), port:0 > [2018-09-12 13:58:15.873272] I [MSGID: 114046] > [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-0: Connected > to gvol0-client-0, attached to remote volume '/glusterdata/brick1/gvol0'. 
> [2018-09-12 13:58:54.575838] W [socket.c:592:__socket_rwv] 0-glusterfs: readv on 127.0.0.1:24007 failed (No data available)
> [2018-09-12 13:58:54.575873] I [glusterfsd-mgmt.c:2348:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: localhost
> [2018-09-12 13:59:04.876731] E [socket.c:2517:socket_connect_finish] 0-glusterfs: connection to 127.0.0.1:24007 failed (Connection refused); disconnecting socket
> [2018-09-12 13:59:04.876764] I [glusterfsd-mgmt.c:2369:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
> [2018-09-12 13:59:05.213422] W [glusterfsd.c:1514:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f995004b6ba] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xed) [0x55c76d21470d] -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x55c76d214524] ) 0-: received signum (15), shutting down
> [2018-09-12 13:59:25.843013] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.4 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/3c69308176cfc594.socket --xlator-option *replicate*.node-uuid=44192eee-3f26-4e14-84d5-be847d66df7b --process-name glustershd)
> [2018-09-12 13:59:25.847197] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
> [2018-09-12 13:59:26.945403] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
> [2018-09-12 13:59:26.945824] I [MSGID: 114020] [client.c:2328:notify] 0-gvol0-client-0: parent translators are ready, attempting connect on transport
> [2018-09-12 13:59:26.946110] I [MSGID: 114020] [client.c:2328:notify] 0-gvol0-client-1: parent translators are ready, attempting connect on transport
> [2018-09-12 13:59:26.946384] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-0: error returned while attempting to connect to host:(null), port:0
> Final graph:
>
> +------------------------------------------------------------------------------+
> 1: volume gvol0-client-0
> 2: type protocol/client
> 3: option ping-timeout 42
> 4: option remote-host gfs01
> 5: option remote-subvolume /glusterdata/brick1/gvol0
> 6: option transport-type socket
> 7: option transport.address-family inet
> 8: option username d5e3e173-156f-46c1-9eb7-a35b201fc311
> 9: option password 8d3c3564-cef3-4261-90bd-c64e85c6d267
> 10: option transport.tcp-user-timeout 0
> 11: option transport.socket.keepalive-time 20
> 12: option transport.socket.keepalive-interval 2
> 13: option transport.socket.keepalive-count 9
> 14: end-volume
> 15:
> 16: volume gvol0-client-1
> 17: type protocol/client
> 18: option ping-timeout 42
> 19: option remote-host gfs02
> 20: option remote-subvolume /glusterdata/brick2/gvol0
> 21: option transport-type socket
> 22: option transport.address-family inet
> 23: option username d5e3e173-156f-46c1-9eb7-a35b201fc311
> 24: option password 8d3c3564-cef3-4261-90bd-c64e85c6d267
> 25: option transport.tcp-user-timeout 0
> 26: option transport.socket.keepalive-time 20
> 27: option transport.socket.keepalive-interval 2
> 28: option transport.socket.keepalive-count 9
> 29: end-volume
> 30:
> 31: volume gvol0-replicate-0
> 32: type cluster/replicate
> 33: option node-uuid 44192eee-3f26-4e14-84d5-be847d66df7b
> 34: option afr-pending-xattr gvol0-client-0,gvol0-client-1
> 35: option background-self-heal-count 0
> 36: option metadata-self-heal on
> 37: option data-self-heal on
> 38: option entry-self-heal on
> 39: option self-heal-daemon enable
> 40: option use-compound-fops off
> 41: option iam-self-heal-daemon yes
> 42: subvolumes gvol0-client-0 gvol0-client-1
> 43: end-volume
> 44:
> 45: volume glustershd
> 46: type debug/io-stats
> 47: option log-level INFO
> 48: subvolumes gvol0-replicate-0
> 49: end-volume
> 50:
>
> +------------------------------------------------------------------------------+
> [2018-09-12 13:59:26.946860] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-1: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:26.946961] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-0: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:26.946966] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-1: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:26.947054] E [MSGID: 114058] [client-handshake.c:1523:client_query_portmap_cbk] 0-gvol0-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
> [2018-09-12 13:59:26.947165] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-gvol0-client-0: changing port to 49152 (from 0)
> [2018-09-12 13:59:26.947213] I [MSGID: 114018] [client.c:2254:client_rpc_notify] 0-gvol0-client-1: disconnected from gvol0-client-1. Client process will keep trying to connect to glusterd until brick's port is available
> [2018-09-12 13:59:26.947233] E [MSGID: 108006] [afr-common.c:5317:__afr_handle_child_down_event] 0-gvol0-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
> [2018-09-12 13:59:26.947557] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-0: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:26.947796] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-0: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:26.948355] I [MSGID: 114046] [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-0: Connected to gvol0-client-0, attached to remote volume '/glusterdata/brick1/gvol0'.
> [2018-09-12 13:59:26.948368] I [MSGID: 108005] [afr-common.c:5240:__afr_handle_child_up_event] 0-gvol0-replicate-0: Subvolume 'gvol0-client-0' came back up; going online.
> [2018-09-12 13:59:30.845313] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-1: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:30.845467] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-1: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:30.845537] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-gvol0-client-1: changing port to 49152 (from 0)
> [2018-09-12 13:59:30.845785] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-1: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:30.845953] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-1: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:30.846293] I [MSGID: 114046] [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-1: Connected to gvol0-client-1, attached to remote volume '/glusterdata/brick2/gvol0'.
> ---
>
> ---
> gfs02 mountpoint log:
>
> [2018-09-12 13:59:26.116762] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.4 (args: /usr/sbin/glusterfs --process-name fuse --volfile-server=gfs02 --volfile-id=/gvol0 /tss/filestore)
> [2018-09-12 13:59:26.142136] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
> [2018-09-12 13:59:27.029834] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
> [2018-09-12 13:59:27.034636] I [MSGID: 114020] [client.c:2328:notify] 0-gvol0-client-0: parent translators are ready, attempting connect on transport
> [2018-09-12 13:59:27.034977] I [MSGID: 114020] [client.c:2328:notify] 0-gvol0-client-1: parent translators are ready, attempting connect on transport
> Final graph:
>
> +------------------------------------------------------------------------------+
> 1: volume gvol0-client-0
> [2018-09-12 13:59:27.035277] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-0: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:27.035328] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-1: error returned while attempting to connect to host:(null), port:0
> 2: type protocol/client
> 3: option ping-timeout 42
> 4: option remote-host gfs01
> 5: option remote-subvolume /glusterdata/brick1/gvol0
> 6: option transport-type socket
> 7: option transport.address-family inet
> 8: option username d5e3e173-156f-46c1-9eb7-a35b201fc311
> 9: option password 8d3c3564-cef3-4261-90bd-c64e85c6d267
> 10: option transport.tcp-user-timeout 0
> 11: option transport.socket.keepalive-time 20
> 12: option transport.socket.keepalive-interval 2
> 13: option transport.socket.keepalive-count 9
> 14: option send-gids true
> 15: end-volume
> 16:
> 17: volume gvol0-client-1
> 18: type protocol/client
> 19: option ping-timeout 42
> 20: option remote-host gfs02
> 21: option remote-subvolume /glusterdata/brick2/gvol0
> 22: option transport-type socket
> 23: option transport.address-family inet
> 24: option username d5e3e173-156f-46c1-9eb7-a35b201fc311
> 25: option password 8d3c3564-cef3-4261-90bd-c64e85c6d267
> 26: option transport.tcp-user-timeout 0
> 27: option transport.socket.keepalive-time 20
> 28: option transport.socket.keepalive-interval 2
> 29: option transport.socket.keepalive-count 9
> 30: option send-gids true
> 31: end-volume
> 32:
> 33: volume gvol0-replicate-0
> 34: type cluster/replicate
> 35: option afr-pending-xattr gvol0-client-0,gvol0-client-1
> 36: option use-compound-fops off
> 37: subvolumes gvol0-client-0 gvol0-client-1
> 38: end-volume
> 39:
> 40: volume gvol0-dht
> 41: type cluster/distribute
> 42: option lock-migration off
> 43: option force-migration off
> 44: subvolumes gvol0-replicate-0
> 45: end-volume
> 46:
> 47: volume gvol0-write-behind
> 48: type performance/write-behind
> 49: subvolumes gvol0-dht
> 50: end-volume
> 51:
> 52: volume gvol0-read-ahead
> 53: type performance/read-ahead
> 54: subvolumes gvol0-write-behind
> 55: end-volume
> 56:
> 57: volume gvol0-readdir-ahead
> 58: type performance/readdir-ahead
> 59: option parallel-readdir off
> 60: option rda-request-size 131072
> 61: option rda-cache-limit 10MB
> 62: subvolumes gvol0-read-ahead
> 63: end-volume
> 64:
> 65: volume gvol0-io-cache
> 66: type performance/io-cache
> 67: subvolumes gvol0-readdir-ahead
> 68: end-volume
> 69:
> 70: volume gvol0-quick-read
> 71: type performance/quick-read
> 72: subvolumes gvol0-io-cache
> 73: end-volume
> 74:
> 75: volume gvol0-open-behind
> 76: type performance/open-behind
> 77: subvolumes gvol0-quick-read
> 78: end-volume
> [2018-09-12 13:59:27.035568] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-0: error returned while attempting to connect to host:(null), port:0
> 79:
> [2018-09-12 13:59:27.035672] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-1: error returned while attempting to connect to host:(null), port:0
> 80: volume gvol0-md-cache
> 81: type performance/md-cache
> 82: subvolumes gvol0-open-behind
> 83: end-volume
> 84:
> 85: volume gvol0
> 86: type debug/io-stats
> 87: option log-level INFO
> 88: option latency-measurement off
> 89: option count-fop-hits off
> 90: subvolumes gvol0-md-cache
> 91: end-volume
> 92:
> 93: volume meta-autoload
> 94: type meta
> 95: subvolumes gvol0
> 96: end-volume
> 97:
>
> +------------------------------------------------------------------------------+
> [2018-09-12 13:59:27.035769] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-gvol0-client-0: changing port to 49152 (from 0)
> [2018-09-12 13:59:27.036156] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-0: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:27.036187] E [MSGID: 114058] [client-handshake.c:1523:client_query_portmap_cbk] 0-gvol0-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
> [2018-09-12 13:59:27.036230] I [MSGID: 114018] [client.c:2254:client_rpc_notify] 0-gvol0-client-1: disconnected from gvol0-client-1. Client process will keep trying to connect to glusterd until brick's port is available
> [2018-09-12 13:59:27.036240] E [MSGID: 108006] [afr-common.c:5317:__afr_handle_child_down_event] 0-gvol0-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
> [2018-09-12 13:59:27.036411] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-0: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:27.036967] I [MSGID: 114046] [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-0: Connected to gvol0-client-0, attached to remote volume '/glusterdata/brick1/gvol0'.
> [2018-09-12 13:59:27.036979] I [MSGID: 108005] [afr-common.c:5240:__afr_handle_child_up_event] 0-gvol0-replicate-0: Subvolume 'gvol0-client-0' came back up; going online.
> [2018-09-12 13:59:27.037684] I [fuse-bridge.c:4294:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23
> [2018-09-12 13:59:27.037696] I [fuse-bridge.c:4927:fuse_graph_sync] 0-fuse: switched to graph 0
> [2018-09-12 13:59:27.038866] I [MSGID: 109005] [dht-selfheal.c:2342:dht_selfheal_directory] 0-gvol0-dht: Directory selfheal failed: Unable to form layout for directory /
> [2018-09-12 13:59:30.139072] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-1: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:30.139208] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-1: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:30.139282] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-gvol0-client-1: changing port to 49152 (from 0)
> [2018-09-12 13:59:30.139537] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-1: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:30.139650] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-gvol0-client-1: error returned while attempting to connect to host:(null), port:0
> [2018-09-12 13:59:30.139981] I [MSGID: 114046] [client-handshake.c:1176:client_setvolume_cbk] 0-gvol0-client-1: Connected to gvol0-client-1, attached to remote volume '/glusterdata/brick2/gvol0'.
>
> Regards,
>
> Johan Karlsson
>
> ------------------------------
> *From:* Pranith Kumar Karampuri <[email protected]>
> *Sent:* Thursday, September 20, 2018 8:13:47 AM
> *To:* Gowdappa, Raghavendra
> *Cc:* Johan Karlsson; gluster-users; Ravishankar Narayanankutty
> *Subject:* Re: [Gluster-users] Data on gluster volume gone
>
> Please also attach the logs for the mount points and the glustershd.logs
>
> On Thu, Sep 20, 2018 at 11:41 AM Pranith Kumar Karampuri <[email protected]> wrote:
>
> How did you do the upgrade?
>
> On Thu, Sep 20, 2018 at 11:01 AM Raghavendra Gowdappa <[email protected]> wrote:
>
> On Thu, Sep 20, 2018 at 1:29 AM, Raghavendra Gowdappa <[email protected]> wrote:
>
> Can you give volume info? Looks like you are using 2 way replica.
>
> Yes indeed.
> gluster volume create gvol0 replica 2 gfs01:/glusterdata/brick1/gvol0 gfs02:/glusterdata/brick2/gvol0
>
> +Pranith. +Ravi.
>
> Not sure whether 2 way replication has caused this. From what I understand we need either 3 way replication or an arbiter for correct resolution of heals.
>
> On Wed, Sep 19, 2018 at 9:39 AM, Johan Karlsson <[email protected]> wrote:
>
> I have two servers set up with GlusterFS in replica mode, with a single volume exposed via a mountpoint. The servers are running Ubuntu 16.04 LTS.
>
> After a package upgrade + reboot of both servers, it was discovered that the data was completely gone. New data written on the volume via the mountpoint is replicated correctly, and gluster status/info commands state that everything is ok (no split-brain scenario or any healing needed etc). But the previous data is completely gone, not even present on any of the bricks.
>
> The following upgrade was done:
>
> glusterfs-server:amd64 (4.1.0-ubuntu1~xenial3 -> 4.1.4-ubuntu1~xenial1)
> glusterfs-client:amd64 (4.1.0-ubuntu1~xenial3 -> 4.1.4-ubuntu1~xenial1)
> glusterfs-common:amd64 (4.1.0-ubuntu1~xenial3 -> 4.1.4-ubuntu1~xenial1)
>
> The logs only show that the connection between the servers was lost, which is expected.
>
> I can't even determine whether it was the package upgrade or the reboot that caused this issue, and I've tried to recreate the issue without success.
>
> Any idea what could have gone wrong, or whether I have done something wrong during the setup? For reference, this is how I've done the setup:
>
> ---
> Add a separate disk with a single partition on both servers (/dev/sdb1)
>
> Add gfs hostnames for direct communication without DNS, on both servers:
>
> /etc/hosts
>
> 192.168.4.45 gfs01
> 192.168.4.46 gfs02
>
> On gfs01, create a new LVM Volume Group:
> vgcreate gfs01-vg /dev/sdb1
>
> And on gfs02:
> vgcreate gfs02-vg /dev/sdb1
>
> Create logical volumes named "brick" on the servers:
>
> gfs01:
> lvcreate -l 100%VG -n brick1 gfs01-vg
> gfs02:
> lvcreate -l 100%VG -n brick2 gfs02-vg
>
> Format the volumes with ext4 filesystem:
>
> gfs01:
> mkfs.ext4 /dev/gfs01-vg/brick1
> gfs02:
> mkfs.ext4 /dev/gfs02-vg/brick2
>
> Create a mountpoint for the bricks on the servers:
>
> gfs01:
> mkdir -p /glusterdata/brick1
> gfs02:
> mkdir -p /glusterdata/brick2
>
> Make a permanent mount on the servers:
>
> gfs01:
> /dev/gfs01-vg/brick1 /glusterdata/brick1 ext4 defaults 0 0
> gfs02:
> /dev/gfs02-vg/brick2 /glusterdata/brick2 ext4 defaults 0 0
>
> Mount it:
> mount -a
>
> Create a gluster volume mount point on the bricks on the servers:
>
> gfs01:
> mkdir -p /glusterdata/brick1/gvol0
> gfs02:
> mkdir -p /glusterdata/brick2/gvol0
>
> From each server, peer probe the other one:
>
> gluster peer probe gfs01
> peer probe: success
>
> gluster peer probe gfs02
> peer probe: success
>
> From any single server, create the gluster volume as a "replica" with two nodes, gfs01 and gfs02:
>
> gluster volume create gvol0 replica 2 gfs01:/glusterdata/brick1/gvol0 gfs02:/glusterdata/brick2/gvol0
>
> Start the volume:
>
> gluster volume start gvol0
>
> On each server, mount the gluster filesystem on the /filestore mount point:
>
> gfs01:
> mount -t glusterfs gfs01:/gvol0 /filestore
> gfs02:
> mount -t glusterfs gfs02:/gvol0 /filestore
>
> Make the mount permanent on the servers:
>
> /etc/fstab
>
> gfs01:
> gfs01:/gvol0 /filestore glusterfs defaults,_netdev 0 0
> gfs02:
> gfs02:/gvol0 /filestore glusterfs defaults,_netdev 0 0
> ---
>
> Regards,
>
> Johan Karlsson
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
> --
> Pranith
>
> --
> Pranith

--
Pranith
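P.S. For collecting the logs I asked for above, something along these lines on each node should do it (a sketch, not a definitive command; the fallback to a temp dir is only there so the snippet also runs on a machine without gluster installed — on gfs01/gfs02 it will pick up /var/log/glusterfs):

```shell
# Run as root on each node (gfs01 and gfs02).
# LOGDIR defaults to the glusterfs log directory; if it does not exist
# (e.g. on a test machine), fall back to an empty temp dir so the
# sketch still runs end to end.
LOGDIR="${LOGDIR:-/var/log/glusterfs}"
[ -d "$LOGDIR" ] || LOGDIR="$(mktemp -d)"

# One archive per node, named after the host so the two files are
# distinguishable when attached to the thread.
OUT="/tmp/glusterfs-logs-$(hostname -s).tar.gz"

# -C keeps the archive paths relative instead of absolute.
tar -czf "$OUT" -C "$(dirname "$LOGDIR")" "$(basename "$LOGDIR")"
echo "archive written to $OUT"
```

Then upload the two archives somewhere downloadable and reply with the links.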
_______________________________________________
Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
