Technically, if only one node is pumping all these status commands, you shouldn't get into this situation. Could you please share the latest cmd_history and glusterd log files from all the nodes?
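
For reference, a minimal sketch of how to gather them, assuming the default /var/log/glusterfs locations used by the CentOS packages (adjust the paths if your installation logs elsewhere):

    # run as root on each of the three nodes
    tar czf /tmp/$(hostname -s)-gluster-logs.tar.gz \
        /var/log/glusterfs/cmd_history.log* \
        /var/log/glusterfs/etc-glusterfs-glusterd.vol.log*
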
On Wed, Jul 26, 2017 at 1:41 PM, Paolo Margara <[email protected]> wrote:

> Hi Atin,
>
> I initially disabled the gluster status check on all nodes except one in
> my nagios instance, as you recommended, but the issue happened again.
>
> So I've disabled it on every node, but the error still occurs; currently
> only oVirt is monitoring gluster.
>
> I cannot modify this behaviour in the oVirt GUI; is there anything I
> could do from the gluster perspective to solve this issue? Considering
> that 3.8 is near EOL, upgrading to 3.10 could also be an option.
>
>
> Greetings,
>
> Paolo
>
> On 20/07/2017 15:37, Paolo Margara wrote:
>
> OK, on my nagios instance I've disabled the gluster status check on all
> nodes except one; I'll check if this is enough.
>
> Thanks,
>
> Paolo
>
> On 20/07/2017 13:50, Atin Mukherjee wrote:
>
> So from the cmd_history.log files across all the nodes it's evident that
> multiple commands on the same volume are run simultaneously, which can
> result in transaction collisions: you can end up with one command
> succeeding and the others failing. If you are running the volume status
> command for monitoring, it is recommended to run it from only one node.
>
> On Thu, Jul 20, 2017 at 3:54 PM, Paolo Margara <[email protected]>
> wrote:
>
>> Attached are the requested logs for all three nodes.
>>
>> Thanks,
>>
>> Paolo
>>
>> On 20/07/2017 11:38, Atin Mukherjee wrote:
>>
>> Please share the cmd_history.log file from all the storage nodes.
>>
>> On Thu, Jul 20, 2017 at 2:34 PM, Paolo Margara <[email protected]>
>> wrote:
>>
>>> Hi list,
>>>
>>> recently I've noticed a strange behaviour in my gluster storage:
>>> sometimes, while executing a simple command like "gluster volume status
>>> vm-images-repo", I get the response "Another transaction is in progress
>>> for vm-images-repo. Please try again after sometime.". The situation is
>>> not resolved simply by waiting; I have to restart glusterd on the node
>>> that holds (and does not release) the lock. This occurs randomly every
>>> few days. In the meantime, before and after the issue appears,
>>> everything works as expected.
>>>
>>> I'm using gluster 3.8.12 on CentOS 7.3; the only relevant information I
>>> found in the log file (etc-glusterfs-glusterd.vol.log) of my three
>>> nodes is the following:
>>>
>>> * node1, at the moment the issue begins:
>>>
>>> [2017-07-19 15:07:43.130203] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f373f25f00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f373f250a25] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f373f2f548f] ) 0-management: Lock for vm-images-repo held by 2c6f154f-efe3-4479-addc-b2021aa9d5df
>>> [2017-07-19 15:07:43.128242] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vm-images-repo
>>> [2017-07-19 15:07:43.130244] E [MSGID: 106119] [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vm-images-repo
>>> [2017-07-19 15:07:43.130320] E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
>>> [2017-07-19 15:07:43.130665] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on virtnode-0-1-gluster. Please check log file for details.
>>> [2017-07-19 15:07:43.131293] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on virtnode-0-2-gluster. Please check log file for details.
>>> [2017-07-19 15:07:43.131360] E [MSGID: 106151] [glusterd-syncop.c:1884:gd_sync_task_begin] 0-management: Locking Peers Failed.
>>> [2017-07-19 15:07:43.132005] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on virtnode-0-2-gluster. Please check log file for details.
>>> [2017-07-19 15:07:43.132182] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on virtnode-0-1-gluster. Please check log file for details.
>>>
>>> * node2, at the moment the issue begins:
>>>
>>> [2017-07-19 15:07:43.131975] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f17b5b9e00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f17b5b8fa25] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f17b5c3448f] ) 0-management: Lock for vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>> [2017-07-19 15:07:43.132019] E [MSGID: 106119] [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vm-images-repo
>>> [2017-07-19 15:07:43.133568] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f17b5b9e00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712) [0x7f17b5b8f712] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a) [0x7f17b5c3482a] ) 0-management: Lock owner mismatch. Lock for vol vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>> [2017-07-19 15:07:43.133597] E [MSGID: 106118] [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to release lock for vm-images-repo
>>> The message "E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1" repeated 3 times between [2017-07-19 15:07:42.976193] and [2017-07-19 15:07:43.133646]
>>>
>>> * node3, at the moment the issue begins:
>>>
>>> [2017-07-19 15:07:42.976593] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vm-images-repo
>>> [2017-07-19 15:07:43.129941] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f6133f5b00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f6133f4ca25] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f6133ff148f] ) 0-management: Lock for vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>> [2017-07-19 15:07:43.129981] E [MSGID: 106119] [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vm-images-repo
>>> [2017-07-19 15:07:43.130034] E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
>>> [2017-07-19 15:07:43.130131] E [MSGID: 106275] [glusterd-rpc-ops.c:876:glusterd_mgmt_v3_lock_peers_cbk_fn] 0-management: Received mgmt_v3 lock RJT from uuid: 2c6f154f-efe3-4479-addc-b2021aa9d5df
>>> [2017-07-19 15:07:43.130710] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f6133f5b00f] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712) [0x7f6133f4c712] -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a) [0x7f6133ff182a] ) 0-management: Lock owner mismatch. Lock for vol vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>> [2017-07-19 15:07:43.130733] E [MSGID: 106118] [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to release lock for vm-images-repo
>>> [2017-07-19 15:07:43.130771] E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
>>>
>>> The really strange thing is that in this case the uuid holding the
>>> lock, d9047ecd-26b5-467b-8e91-50f76a0c4d16, is node3's own uuid!
>>>
>>> The nodename-to-uuid mapping is:
>>>
>>> * (node1) virtnode-0-0-gluster: 2c6f154f-efe3-4479-addc-b2021aa9d5df
>>>
>>> * (node2) virtnode-0-1-gluster: e93ebee7-5d95-4100-a9df-4a3e60134b73
>>>
>>> * (node3) virtnode-0-2-gluster: d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>>
>>> In this case restarting glusterd on node3 usually solves the issue.
>>>
>>> What could be the root cause of this behavior? How can I fix it once
>>> and for all?
>>>
>>> If needed, I can provide the full log files.
>>>
>>>
>>> Greetings,
>>>
>>> Paolo Margara
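
A quick way to confirm the collision described above, using only the command already mentioned in the thread, is to fire the same status query from two nodes at roughly the same moment (a sketch; the node names are taken from Paolo's mapping):

    # run at (nearly) the same time on two different nodes,
    # e.g. virtnode-0-0-gluster and virtnode-0-2-gluster:
    gluster volume status vm-images-repo

One of the two invocations will typically fail with "Another transaction is in progress for vm-images-repo. Please try again after sometime." while the other completes. That transient failure clears as soon as the winning command finishes; the case reported here is the lock never being released afterwards.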
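
Until the root cause is found, a minimal sketch of the workaround Paolo describes, assuming the stock glusterd systemd unit on CentOS 7 and that "gluster pool list" is available in this 3.8 build: match the uuid from the "Lock for vm-images-repo held by <uuid>" message against the peers, then restart glusterd on that node.

    # on any node: list peer uuids and hostnames
    gluster pool list
    # the local node's own uuid, if needed
    cat /var/lib/glusterd/glusterd.info
    # on the node whose uuid matches the lock holder reported in the log:
    systemctl restart glusterd
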
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
