Hello Atin, I've tried restarting glusterd on each host, one after another, but I still see the same result.
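For reference, the restart sequence I used on each host, one at a time, was roughly the following. This is only a sketch: it assumes the Debian "glusterfs-server" service name, which may differ on other distributions.

# service glusterfs-server restart
# gluster peer status

I waited for the peers to show "Connected" again before moving on to the next host.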
On Tue, May 30, 2017 at 10:40 AM, Atin Mukherjee <[email protected]> wrote:

> Pawan - I couldn't reach any conclusive analysis so far. But, looking at
> the client (nfs) & glusterd log files, it does look like there is an issue
> w.r.t. peer connections. Does restarting all the glusterd instances one by
> one solve this?
>
> On Mon, May 29, 2017 at 4:50 PM, Pawan Alwandi <[email protected]> wrote:
>
>> Sorry for the big attachment in the previous mail... the last 1000 lines
>> of those logs are attached now.
>>
>> On Mon, May 29, 2017 at 4:44 PM, Pawan Alwandi <[email protected]> wrote:
>>
>>> On Thu, May 25, 2017 at 9:54 PM, Atin Mukherjee <[email protected]> wrote:
>>>
>>>> On Thu, 25 May 2017 at 19:11, Pawan Alwandi <[email protected]> wrote:
>>>>
>>>>> Hello Atin,
>>>>>
>>>>> Yes, glusterd on the other instances is up and running. Below is the
>>>>> requested output on all three hosts.
>>>>>
>>>>> Host 1
>>>>>
>>>>> # gluster peer status
>>>>> Number of Peers: 2
>>>>>
>>>>> Hostname: 192.168.0.7
>>>>> Uuid: 5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>> State: Peer in Cluster (Disconnected)
>>>>
>>>> Glusterd is disconnected here.
>>>>
>>>>> Hostname: 192.168.0.6
>>>>> Uuid: 83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>> State: Peer in Cluster (Disconnected)
>>>>
>>>> Same as above.
>>>>
>>>> Can you please check what the glusterd log has to say about these
>>>> disconnects?
>>>
>>> glusterd keeps logging this every 3s:
>>>
>>> [2017-05-29 11:04:52.182782] W [socket.c:852:__socket_keepalive] 0-socket: failed to set keep idle -1 on socket 5, Invalid argument
>>> [2017-05-29 11:04:52.182808] E [socket.c:2966:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
>>> [2017-05-29 11:04:52.183032] W [socket.c:852:__socket_keepalive] 0-socket: failed to set keep idle -1 on socket 20, Invalid argument
>>> [2017-05-29 11:04:52.183052] E [socket.c:2966:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
>>> [2017-05-29 11:04:52.183622] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-29 11:04:52.183210 (xid=0x23419)
>>> [2017-05-29 11:04:52.183735] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] ) 0-management: Lock for vol shared not held
>>> [2017-05-29 11:04:52.183928] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-29 11:04:52.183422 (xid=0x23419)
>>> [2017-05-29 11:04:52.184027] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] ) 0-management: Lock for vol shared not held
>>>
>>>>> # gluster volume status
>>>>> Status of volume: shared
>>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>>> ------------------------------------------------------------------------------
>>>>> Brick 192.168.0.5:/data/exports/shared      49152     0          Y       2105
>>>>> NFS Server on localhost                     2049      0          Y       2089
>>>>> Self-heal Daemon on localhost               N/A       N/A        Y       2097
>>>>
>>>> Volume status output does show all the bricks are up, so I'm not sure
>>>> why you are seeing the volume as read-only. Can you please provide the
>>>> mount log?
>>>
>>> The attached tar has nfs.log, etc-glusterfs-glusterd.vol.log, and
>>> glustershd.log from host1.
>>>
>>>>> Task Status of Volume shared
>>>>> ------------------------------------------------------------------------------
>>>>> There are no active volume tasks
>>>>>
>>>>> Host 2
>>>>>
>>>>> # gluster peer status
>>>>> Number of Peers: 2
>>>>>
>>>>> Hostname: 192.168.0.7
>>>>> Uuid: 5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>> State: Peer in Cluster (Connected)
>>>>>
>>>>> Hostname: 192.168.0.5
>>>>> Uuid: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>> State: Peer in Cluster (Connected)
>>>>>
>>>>> # gluster volume status
>>>>> Status of volume: shared
>>>>> Gluster process                             Port   Online  Pid
>>>>> ------------------------------------------------------------------------------
>>>>> Brick 192.168.0.5:/data/exports/shared      49152  Y       2105
>>>>> Brick 192.168.0.6:/data/exports/shared      49152  Y       2188
>>>>> Brick 192.168.0.7:/data/exports/shared      49152  Y       2453
>>>>> NFS Server on localhost                     2049   Y       2194
>>>>> Self-heal Daemon on localhost               N/A    Y       2199
>>>>> NFS Server on 192.168.0.5                   2049   Y       2089
>>>>> Self-heal Daemon on 192.168.0.5             N/A    Y       2097
>>>>> NFS Server on 192.168.0.7                   2049   Y       2458
>>>>> Self-heal Daemon on 192.168.0.7             N/A    Y       2463
>>>>>
>>>>> Task Status of Volume shared
>>>>> ------------------------------------------------------------------------------
>>>>> There are no active volume tasks
>>>>>
>>>>> Host 3
>>>>>
>>>>> # gluster peer status
>>>>> Number of Peers: 2
>>>>>
>>>>> Hostname: 192.168.0.5
>>>>> Uuid: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>> State: Peer in Cluster (Connected)
>>>>>
>>>>> Hostname: 192.168.0.6
>>>>> Uuid: 83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>> State: Peer in Cluster (Connected)
>>>>>
>>>>> # gluster volume status
>>>>> Status of volume: shared
>>>>> Gluster process                             Port   Online  Pid
>>>>> ------------------------------------------------------------------------------
>>>>> Brick 192.168.0.5:/data/exports/shared      49152  Y       2105
>>>>> Brick 192.168.0.6:/data/exports/shared      49152  Y       2188
>>>>> Brick 192.168.0.7:/data/exports/shared      49152  Y       2453
>>>>> NFS Server on localhost                     2049   Y       2458
>>>>> Self-heal Daemon on localhost               N/A    Y       2463
>>>>> NFS Server on 192.168.0.6                   2049   Y       2194
>>>>> Self-heal Daemon on 192.168.0.6             N/A    Y       2199
>>>>> NFS Server on 192.168.0.5                   2049   Y       2089
>>>>> Self-heal Daemon on 192.168.0.5             N/A    Y       2097
>>>>>
>>>>> Task Status of Volume shared
>>>>> ------------------------------------------------------------------------------
>>>>> There are no active volume tasks
>>>>>
>>>>> On Wed, May 24, 2017 at 8:32 PM, Atin Mukherjee <[email protected]> wrote:
>>>>>
>>>>>> Are the other glusterd instances up? Output of gluster peer status &
>>>>>> gluster volume status, please?
>>>>>>
>>>>>> On Wed, May 24, 2017 at 4:20 PM, Pawan Alwandi <[email protected]> wrote:
>>>>>>
>>>>>>> Thanks Atin,
>>>>>>>
>>>>>>> So I got gluster downgraded to 3.7.9 on host 1 and now have the
>>>>>>> glusterfs and glusterfsd processes coming up. But I see the volume is
>>>>>>> mounted read-only.
>>>>>>>
>>>>>>> I see these being logged every 3s:
>>>>>>>
>>>>>>> [2017-05-24 10:45:44.440435] W [socket.c:852:__socket_keepalive] 0-socket: failed to set keep idle -1 on socket 17, Invalid argument
>>>>>>> [2017-05-24 10:45:44.440475] E [socket.c:2966:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
>>>>>>> [2017-05-24 10:45:44.440734] W [socket.c:852:__socket_keepalive] 0-socket: failed to set keep idle -1 on socket 20, Invalid argument
>>>>>>> [2017-05-24 10:45:44.440754] E [socket.c:2966:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
>>>>>>> [2017-05-24 10:45:44.441354] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-24 10:45:44.440945 (xid=0xbf)
>>>>>>> [2017-05-24 10:45:44.441505] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] ) 0-management: Lock for vol shared not held
>>>>>>> [2017-05-24 10:45:44.441660] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-24 10:45:44.441086 (xid=0xbf)
>>>>>>> [2017-05-24 10:45:44.441790] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] ) 0-management: Lock for vol shared not held
>>>>>>>
>>>>>>> The heal info says this:
>>>>>>>
>>>>>>> # gluster volume heal shared info
>>>>>>> Brick 192.168.0.5:/data/exports/shared
>>>>>>> Number of entries: 0
>>>>>>>
>>>>>>> Brick 192.168.0.6:/data/exports/shared
>>>>>>> Status: Transport endpoint is not connected
>>>>>>>
>>>>>>> Brick 192.168.0.7:/data/exports/shared
>>>>>>> Status: Transport endpoint is not connected
>>>>>>>
>>>>>>> Any idea what's up here?
>>>>>>>
>>>>>>> Pawan
>>>>>>>
>>>>>>> On Mon, May 22, 2017 at 9:42 PM, Atin Mukherjee <[email protected]> wrote:
>>>>>>>
>>>>>>>> On Mon, May 22, 2017 at 9:05 PM, Pawan Alwandi <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> On Mon, May 22, 2017 at 8:36 PM, Atin Mukherjee <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> On Mon, May 22, 2017 at 7:51 PM, Atin Mukherjee <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Sorry Pawan, I did miss the other part of the attachments. Looking
>>>>>>>>>>> at the glusterd.info file from all the hosts, it looks like host2
>>>>>>>>>>> and host3 do not have the correct op-version. Can you please set
>>>>>>>>>>> the op-version as "operating-version=30702" in host2 and host3 and
>>>>>>>>>>> restart the glusterd instances one by one on all the nodes?
>>>>>>>>>>
>>>>>>>>>> Please ensure that all the hosts are upgraded to the same bits
>>>>>>>>>> before doing this change.
>>>>>>>>>
>>>>>>>>> Having to upgrade all 3 hosts to the newer version before gluster
>>>>>>>>> can work successfully on any of them means application downtime. The
>>>>>>>>> applications running on these hosts are expected to be highly
>>>>>>>>> available. So, with the way things are right now, is an online
>>>>>>>>> upgrade possible? My upgrade steps are: (1) stop the applications,
>>>>>>>>> (2) umount the gluster volume, and then (3) upgrade gluster one host
>>>>>>>>> at a time.
>>>>>>>>
>>>>>>>> One way to mitigate this is to first do an online upgrade to
>>>>>>>> glusterfs-3.7.9 (op-version: 30707), given this bug was introduced in
>>>>>>>> 3.7.10, and then come to 3.11.
>>>>>>>>
>>>>>>>>> Our goal is to get gluster upgraded to 3.11 from 3.6.9, and to make
>>>>>>>>> this an online upgrade we are okay to take two steps: 3.6.9 -> 3.7
>>>>>>>>> and then 3.7 to 3.11.
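A minimal sketch of the per-host online upgrade sequence being discussed above, following steps (1)-(3) and the suggestion to go through 3.7.9 first. The mount point and the exact package pin are hypothetical placeholders, and the service and package names assume Debian:

# umount /mnt/shared
# service glusterfs-server stop
# apt-get install glusterfs-server=3.7.9-1
# service glusterfs-server start
# gluster volume heal shared info

The umount happens after stopping the applications that use the mount, and the next host is only touched once "gluster volume heal shared info" reports "Number of entries: 0" for all bricks.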
>>>>>>>>>>> Apparently it looks like there is a bug which you have uncovered.
>>>>>>>>>>> During peer handshaking, if one of the glusterd instances is
>>>>>>>>>>> running with old bits, then while validating the handshake request
>>>>>>>>>>> there is a possibility that the uuid received will be blank, and
>>>>>>>>>>> that used to be ignored. However, patch
>>>>>>>>>>> http://review.gluster.org/13519 had some additional changes which
>>>>>>>>>>> always look at this field and do some extra checks, and that was
>>>>>>>>>>> causing the handshake to fail. For now, the above workaround
>>>>>>>>>>> should suffice. I'll be sending a patch pretty soon.
>>>>>>>>>>
>>>>>>>>>> Posted a patch: https://review.gluster.org/#/c/17358 .
>>>>>>>>>>
>>>>>>>>>>> On Mon, May 22, 2017 at 11:35 AM, Pawan Alwandi <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>>
>>>>>>>>>>>> The tars have the content of /var/lib/glusterd too, for all 3
>>>>>>>>>>>> nodes; please check again.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, May 22, 2017 at 11:32 AM, Atin Mukherjee <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Pawan,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I see you have provided the log files from the nodes; however,
>>>>>>>>>>>>> it'd be really helpful if you can provide me the content of
>>>>>>>>>>>>> /var/lib/glusterd from all the nodes to get to the root cause
>>>>>>>>>>>>> of this issue.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, May 19, 2017 at 12:09 PM, Pawan Alwandi <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for the continued support. I've attached the requested
>>>>>>>>>>>>>> files from all 3 nodes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (I think we already verified the UUIDs to be correct; anyway,
>>>>>>>>>>>>>> let us know if you find any more info in the logs.)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Pawan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, May 18, 2017 at 11:45 PM, Atin Mukherjee <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, 18 May 2017 at 23:40, Atin Mukherjee <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, 17 May 2017 at 12:47, Pawan Alwandi <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I realized that these
>>>>>>>>>>>>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>>>>>>>>>>>>> instructions only work for upgrades from 3.7, while we are
>>>>>>>>>>>>>>>>> running 3.6.2. Are there instructions/suggestions you have
>>>>>>>>>>>>>>>>> for us to upgrade from the 3.6 version?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I believe an upgrade from 3.6 to 3.7 and then to 3.10 would
>>>>>>>>>>>>>>>>> work, but I see similar errors reported when I upgraded to
>>>>>>>>>>>>>>>>> 3.7 too.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For what it's worth, I was able to set the op-version
>>>>>>>>>>>>>>>>> (gluster v set all cluster.op-version 30702) but that
>>>>>>>>>>>>>>>>> doesn't seem to help.
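To make the two op-version changes discussed here concrete: the cluster-wide CLI bump is

# gluster v set all cluster.op-version 30702

and the file-level workaround suggested above is to edit /var/lib/glusterd/glusterd.info on host2 and host3 so that it reads "operating-version=30702", then restart glusterd on each node. One way to make that edit (a sketch; the sed one-liner and the Debian service name are assumptions, and any editor works as well):

# sed -i 's/^operating-version=.*/operating-version=30702/' /var/lib/glusterd/glusterd.info
# service glusterfs-server restart

Note that the CLI route only succeeds once every node runs bits that support the requested op-version, which is why the 31001 attempt further down this thread failed on the mixed 3.10.1/3.6.2 cluster.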
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:33.700014] I [MSGID: 100030] [glusterfsd.c:2338:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.20 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:33.703808] I [MSGID: 106478] [glusterd.c:1383:init] 0-management: Maximum allowed open file descriptors set to 65536
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:33.703836] I [MSGID: 106479] [glusterd.c:1432:init] 0-management: Using /var/lib/glusterd as working directory
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:33.708866] W [MSGID: 103071] [rdma.c:4594:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:33.709011] W [MSGID: 103055] [rdma.c:4901:init] 0-rdma.management: Failed to initialize IB Device
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:33.709033] W [rpc-transport.c:359:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:33.709088] W [rpcsvc.c:1642:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:33.709105] E [MSGID: 106243] [glusterd.c:1656:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.480043] I [MSGID: 106513] [glusterd-store.c:2068:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30600
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.605779] I [MSGID: 106498] [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.607059] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.607670] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.607025] I [MSGID: 106498] [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.608125] I [MSGID: 106544] [glusterd.c:159:glusterd_uuid_init] 0-management: retrieved UUID: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Final graph:
>>>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>>>   1: volume management
>>>>>>>>>>>>>>>>>   2:     type mgmt/glusterd
>>>>>>>>>>>>>>>>>   3:     option rpc-auth.auth-glusterfs on
>>>>>>>>>>>>>>>>>   4:     option rpc-auth.auth-unix on
>>>>>>>>>>>>>>>>>   5:     option rpc-auth.auth-null on
>>>>>>>>>>>>>>>>>   6:     option rpc-auth-allow-insecure on
>>>>>>>>>>>>>>>>>   7:     option transport.socket.listen-backlog 128
>>>>>>>>>>>>>>>>>   8:     option event-threads 1
>>>>>>>>>>>>>>>>>   9:     option ping-timeout 0
>>>>>>>>>>>>>>>>>  10:     option transport.socket.read-fail-log off
>>>>>>>>>>>>>>>>>  11:     option transport.socket.keepalive-interval 2
>>>>>>>>>>>>>>>>>  12:     option transport.socket.keepalive-time 10
>>>>>>>>>>>>>>>>>  13:     option transport-type rdma
>>>>>>>>>>>>>>>>>  14:     option working-directory /var/lib/glusterd
>>>>>>>>>>>>>>>>>  15: end-volume
>>>>>>>>>>>>>>>>>  16:
>>>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.609868] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.610839] W [socket.c:596:__socket_rwv] 0-management: readv on 192.168.0.7:24007 failed (No data available)
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.611907] E [rpc-clnt.c:370:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-17 06:48:35.609965 (xid=0x1)
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.611928] E [MSGID: 106167] [glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.611944] I [MSGID: 106004] [glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612024] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612039] W [MSGID: 106118] [glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612079] W [socket.c:596:__socket_rwv] 0-management: readv on 192.168.0.6:24007 failed (No data available)
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612179] E [rpc-clnt.c:370:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-17 06:48:35.610007 (xid=0x1)
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612197] E [MSGID: 106167] [glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612211] I [MSGID: 106004] [glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612292] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.613432] W [MSGID: 106118] [glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.614317] E [MSGID: 106170] [glusterd-handshake.c:1051:gd_validate_mgmt_hndsk_req] 0-management: Request from peer 192.168.0.6:991 has an entry in peerinfo, but uuid does not match
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Apologies for the delay. My initial suspicion was correct: you
>>>>>>>>>>>>>>>> have an incorrect UUID in the peer file, which is causing this.
>>>>>>>>>>>>>>>> Can you please provide me the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Clicked the send button accidentally!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can you please send me the content of /var/lib/glusterd & the
>>>>>>>>>>>>>>> glusterd log from all the nodes?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, May 15, 2017 at 10:31 PM, Atin Mukherjee <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, 15 May 2017 at 11:58, Pawan Alwandi <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi Atin,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I see the below error. Do I require gluster to be upgraded
>>>>>>>>>>>>>>>>>>> on all 3 hosts for this to work? Right now I have host 1
>>>>>>>>>>>>>>>>>>> running 3.10.1 and hosts 2 & 3 running 3.6.2.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> # gluster v set all cluster.op-version 31001
>>>>>>>>>>>>>>>>>>> volume set: failed: Required op_version (31001) is not supported
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yes, you should, given the 3.6 version is EOLed.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, May 15, 2017 at 3:32 AM, Atin Mukherjee <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, 14 May 2017 at 21:43, Atin Mukherjee <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Alright, I see that you haven't bumped up the op-version.
>>>>>>>>>>>>>>>>>>>>> Can you please execute:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> gluster v set all cluster.op-version 30101
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> and then restart glusterd on all the nodes and check the
>>>>>>>>>>>>>>>>>>>>> brick status?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> s/30101/31001
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sun, May 14, 2017 at 8:55 PM, Pawan Alwandi <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks for looking at this. Below is the output you
>>>>>>>>>>>>>>>>>>>>>> requested.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Again, I'm seeing those errors after upgrading gluster
>>>>>>>>>>>>>>>>>>>>>> on host 1.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Host 1
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>>>>>>>>> UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.7
>>>>>>>>>>>>>>>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.6
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>>>>>>>>>>> glusterfs 3.10.1
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Host 2
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>>>>>>>>> UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.7
>>>>>>>>>>>>>>>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.5
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>>>>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Host 3
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>>>>>>>>> UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.5
>>>>>>>>>>>>>>>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.6
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>>>>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sat, May 13, 2017 at 6:28 PM, Atin Mukherjee <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I have already asked for the following earlier:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Can you please provide the output of the following from
>>>>>>>>>>>>>>>>>>>>>>> all the nodes:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>>>>>>>>>> cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sat, 13 May 2017 at 12:22, Pawan Alwandi <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hello folks,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Does anyone have any idea what's going on here?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>> Pawan
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 10, 2017 at 5:02 PM, Pawan Alwandi <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I'm trying to upgrade gluster from 3.6.2 to 3.10.1 but
>>>>>>>>>>>>>>>>>>>>>>>>> don't see the glusterfsd and glusterfs processes coming up.
>>>>>>>>>>>>>>>>>>>>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>>>>>>>>>>>>>>>>>>>>> is the process that I'm trying to follow.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> This is a 3-node server setup with a replicated volume
>>>>>>>>>>>>>>>>>>>>>>>>> having a replica count of 3.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Logs below:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.507959] I [MSGID: 100030] [glusterfsd.c:2460:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.10.1 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.512827] I [MSGID: 106478] [glusterd.c:1449:init] 0-management: Maximum allowed open file descriptors set to 65536
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.512855] I [MSGID: 106479] [glusterd.c:1496:init] 0-management: Using /var/lib/glusterd as working directory
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520426] W [MSGID: 103071] [rdma.c:4590:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520452] W [MSGID: 103055] [rdma.c:4897:init] 0-rdma.management: Failed to initialize IB Device
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520465] W [rpc-transport.c:350:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520518] W [rpcsvc.c:1661:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520534] E [MSGID: 106243] [glusterd.c:1720:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.931764] I [MSGID: 106513] [glusterd-store.c:2197:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30600
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.964354] I [MSGID: 106544] [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.993944] I [MSGID: 106498] [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995864] I [MSGID: 106498] [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995879] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995903] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.996325] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Final graph:
>>>>>>>>>>>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>>>>>>>>>>>   1: volume management
>>>>>>>>>>>>>>>>>>>>>>>>>   2:     type mgmt/glusterd
>>>>>>>>>>>>>>>>>>>>>>>>>   3:     option rpc-auth.auth-glusterfs on
>>>>>>>>>>>>>>>>>>>>>>>>>   4:     option rpc-auth.auth-unix on
>>>>>>>>>>>>>>>>>>>>>>>>>   5:     option rpc-auth.auth-null on
>>>>>>>>>>>>>>>>>>>>>>>>>   6:     option rpc-auth-allow-insecure on
>>>>>>>>>>>>>>>>>>>>>>>>>   7:     option transport.socket.listen-backlog 128
>>>>>>>>>>>>>>>>>>>>>>>>>   8:     option event-threads 1
>>>>>>>>>>>>>>>>>>>>>>>>>   9:     option ping-timeout 0
>>>>>>>>>>>>>>>>>>>>>>>>>  10:     option transport.socket.read-fail-log off
>>>>>>>>>>>>>>>>>>>>>>>>>  11:     option transport.socket.keepalive-interval 2
>>>>>>>>>>>>>>>>>>>>>>>>>  12:     option transport.socket.keepalive-time 10
>>>>>>>>>>>>>>>>>>>>>>>>>  13:     option transport-type rdma
>>>>>>>>>>>>>>>>>>>>>>>>>  14:     option working-directory /var/lib/glusterd
>>>>>>>>>>>>>>>>>>>>>>>>>  15: end-volume
>>>>>>>>>>>>>>>>>>>>>>>>>  16:
>>>>>>>>>>>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.996310] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.000461] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001493] W [socket.c:593:__socket_rwv] 0-management: readv on 192.168.0.7:24007 failed (No data available)
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001513] I [MSGID: 106004] [glusterd-handler.c:5882:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001677] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) [0x7f0bf9d74559] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) [0x7f0bf9d7dcf0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) [0x7f0bf9e29ba3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001696] W [MSGID: 106118] [glusterd-handler.c:5907:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003099] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13c)[0x7f0bfeeca73c] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f0bfec904bf] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f0bfec91c21] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-10 09:07:05.000627 (xid=0x1)
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003129] E [MSGID: 106167] [glusterd-handshake.c:2181:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003251] W [socket.c:593:__socket_rwv] 0-management: readv on 192.168.0.6:24007 failed (No data available)
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003267] I [MSGID: 106004] [glusterd-handler.c:5882:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003318] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) [0x7f0bf9d74559] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) [0x7f0bf9d7dcf0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) [0x7f0bf9e29ba3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003329] W [MSGID: 106118] [glusterd-handler.c:5907:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003457] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13c)[0x7f0bfeeca73c] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f0bfec904bf] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f0bfec91c21] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-10 09:07:05.001407 (xid=0x1)
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> There are a bunch of errors reported but I'm not sure
>>>>>>>>>>>>>>>>>>>>>>>>> which is signal and which is noise. Does anyone have
>>>>>>>>>>>>>>>>>>>>>>>>> any idea what's going on here?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>> Pawan
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> - Atin (atinm)
>>>>
>>>> --
>>>> - Atin (atinm)
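A quick consistency check that falls out of this thread: every host's own UUID (the UUID= line in its glusterd.info) should appear as a uuid= entry in the peers files of the other two hosts, and all three hosts should agree on operating-version. On each host:

# cat /var/lib/glusterd/glusterd.info
# cat /var/lib/glusterd/peers/*

Then cross-check the three outputs against each other, as was done above with the 7f2a..., 83e9..., and 5ec5... UUIDs.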
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
