Thought I would share an experience with the community.  We have RHEL 7.4 
clusters that use the heartbeat LVM resource (HA-LVM volume group).  The LVM 
resource runs a "vgscan --cache" command as part of its monitoring routine.
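
For reference, the resource in question is the ocf:heartbeat:LVM agent.  A 
minimal definition looks roughly like this (resource and volume group names 
are illustrative rather than our exact configuration; the 60s/90s monitor 
values match what shows up in the logs below):

    pcs resource create share1_vg ocf:heartbeat:LVM \
        volgrpname=share1vg exclusive=true \
        op monitor interval=60s timeout=90s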

We have found that the pvmove option "-i0" will block the vgscan command 
(and most likely any other LVM command).  The pvmove only needs to be run 
against any physical volume, not specifically one being managed by RHCS.  In 
our case, the node where the pvmove was running was evicted from the cluster.

Blocking command:

    pvmove -v -i0 -n /dev/testvg/testlv00 /dev/mapper/mpathd1 /dev/mapper/mpaths1

When we tested without the -i0 option, or with -iX where X is non-zero, 
pvmove did not block vgscan commands.
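
A simple way to see the difference outside the cluster (device and LV paths 
are from our test volume group; adjust for your environment):

    # terminal 1: start the move with the progress interval disabled
    pvmove -v -i0 -n /dev/testvg/testlv00 /dev/mapper/mpathd1 /dev/mapper/mpaths1

    # terminal 2: with -i0 above, this hangs until the pvmove finishes
    time vgscan --cache

    # with a non-zero interval the scan returns normally
    pvmove -v -i10 -n /dev/testvg/testlv00 /dev/mapper/mpathd1 /dev/mapper/mpaths1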

Associated errors in /var/log/messages:

Mar 26 14:03:27 nodeapp1 lvmpolld: W: LVMPOLLD: polling for output of the lvm 
cmd (PID 74134) has timed out

<skipped lines>
Mar 26 14:04:27 nodeapp1 lvmpolld: W: LVMPOLLD: polling for output of the lvm 
cmd (PID 74134) has timed out
Mar 26 14:04:32 nodeapp1 lrmd[81636]: warning: share1_vg_monitor_60000 process 
(PID 77254) timed out
Mar 26 14:04:32 nodeapp1 lrmd[81636]: warning: share1_vg_monitor_60000:77254 - 
timed out after 90000ms
Mar 26 14:04:32 nodeapp1 crmd[81641]:   error: Result of monitor operation for 
share1_vg on nodeapp1: Timed Out
Mar 26 14:04:32 nodeapp1 crmd[81641]:  notice: State transition S_IDLE -> 
S_POLICY_ENGINE

<skipped lines>
Mar 26 14:05:27 nodeapp1 LVM(share1_vg)[88723]: INFO: 0 logical volume(s) in 
volume group "share1vg" now active
Mar 26 14:05:27 nodeapp1 lvmpolld: W: LVMPOLLD: polling for output of the lvm 
cmd (PID 74134) has timed out
Mar 26 14:05:27 nodeapp1 lvmpolld[74130]: LVMPOLLD: LVM2 cmd is unresponsive 
too long (PID 74134) (no output for 180 seconds)

<skipped lines>
Mar 26 14:05:55 nodeapp1 lrmd[81636]: warning: share1_vg_stop_0 process (PID 
88723) timed out
Mar 26 14:05:55 nodeapp1 lrmd[81636]: warning: share1_vg_stop_0:88723 - timed 
out after 30000ms
Mar 26 14:05:55 nodeapp1 crmd[81641]:   error: Result of stop operation for 
share1_vg on nodeapp1: Timed Out
Mar 26 14:05:55 nodeapp1 crmd[81641]: warning: Action 6 (share1_vg_stop_0) on 
nodeapp1 failed (target: 0 vs. rc: 1): Error
Mar 26 14:05:55 nodeapp1 crmd[81641]:  notice: Transition aborted by operation 
share1_vg_stop_0 'modify' on nodeapp1: Event failed
Mar 26 14:05:55 nodeapp1 crmd[81641]: warning: Action 6 (share1_vg_stop_0) on 
nodeapp1 failed (target: 0 vs. rc: 1): Error

<skipped lines>
Mar 26 14:05:55 nodeapp1 pengine[81639]: warning: Processing failed op stop for 
share1_vg on nodeapp1: unknown error (1)
Mar 26 14:05:55 nodeapp1 pengine[81639]: warning: Processing failed op stop for 
share1_vg on nodeapp1: unknown error (1)
Mar 26 14:05:55 nodeapp1 pengine[81639]: warning: Cluster node nodeapp1 will be 
fenced: share1_vg failed there
Mar 26 14:05:55 nodeapp1 pengine[81639]: warning: Forcing share1_vg away from 
nodeapp1 after 1000000 failures (max=1000000)
Mar 26 14:05:55 nodeapp1 pengine[81639]: warning: Scheduling Node nodeapp1 for 
STONITH
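
If a pvmove with -i0 really has to be run on a cluster node, one precaution 
worth considering is to keep Pacemaker from acting on the stalled monitor 
while the move is in flight.  This is a general suggestion rather than 
something we have exercised in production:

    # let Pacemaker ignore resource state while the pvmove runs
    pcs property set maintenance-mode=true
    pvmove -v -i0 -n /dev/testvg/testlv00 /dev/mapper/mpathd1 /dev/mapper/mpaths1
    pcs property set maintenance-mode=false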

Hope this helps someone down the line.


Robert


