> On 4 Aug 2015, at 7:36 pm, renayama19661...@ybb.ne.jp wrote:
>
> Hi Andrew,
>
> Thank you for comments.
>
>>> However, a trap of crm_mon is sent to an SNMP manager.
>>
>> Are you using the built-in SNMP logic or using -E to give crm_mon a script which
>> is then producing the trap?
>> (I’m trying to figure out who could be turning the monitor action into a start)
>
>
> I used the built-in SNMP.
> I started as a daemon with -d option.

Is it running on both nodes or just snmp1?
I ask because there is no logic in crm_mon that would have remapped the monitor to start, so my working theory is that it's a duplicate of an old event.
Can you tell which node the trap is being sent from?

>
>
> Best Regards,
> Hideo Yamauchi.
>
>
> ----- Original Message -----
>> From: Andrew Beekhof <and...@beekhof.net>
>> To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related to open-source clustering welcomed <users@clusterlabs.org>
>> Cc:
>> Date: 2015/8/4, Tue 14:15
>> Subject: Re: [ClusterLabs] [Problem] The SNMP trap which has been already started is transmitted.
>>
>>
>>> On 27 Jul 2015, at 4:18 pm, renayama19661...@ybb.ne.jp wrote:
>>>
>>> Hi All,
>>>
>>> The transmission of the SNMP trap of crm_mon seems to have a problem.
>>> I identified a problem on latest Pacemaker and Pacemaker1.1.13.
>>>
>>>
>>> Step 1) I constitute a cluster and send simple CLI file.
>>>
>>> [root@snmp1 ~]# crm_mon -1
>>> Last updated: Mon Jul 27 14:40:37 2015 Last change: Mon Jul 27 14:40:29 2015 by root via cibadmin on snmp1
>>> Stack: corosync
>>> Current DC: snmp1 (version 1.1.13-3d781d3) - partition with quorum
>>> 2 nodes and 1 resource configured
>>>
>>> Online: [ snmp1 snmp2 ]
>>>
>>> prmDummy (ocf::heartbeat:Dummy): Started snmp1
>>>
>>> Step 2) I stop a node of the standby once.
>>>
>>> [root@snmp2 ~]# stop pacemaker
>>> pacemaker stop/waiting
>>>
>>>
>>> Step 3) I start a node of the standby again.
>>> [root@snmp2 ~]# start pacemaker
>>> pacemaker start/running, process 2284
>>>
>>> Step 4) The indication of crm_mon does not change in particular.
>>> [root@snmp1 ~]# crm_mon -1
>>> Last updated: Mon Jul 27 14:45:12 2015 Last change: Mon Jul 27 14:40:29 2015 by root via cibadmin on snmp1
>>> Stack: corosync
>>> Current DC: snmp1 (version 1.1.13-3d781d3) - partition with quorum
>>> 2 nodes and 1 resource configured
>>>
>>> Online: [ snmp1 snmp2 ]
>>>
>>> prmDummy (ocf::heartbeat:Dummy): Started snmp1
>>>
>>>
>>> In addition, as for the resource that started in snmp1 node, nothing changes.
>>>
>>> -------
>>> Jul 27 14:41:39 snmp1 crmd[29116]: notice: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
>>> Jul 27 14:41:39 snmp1 cib[29111]: info: Completed cib_modify operation for section status: OK (rc=0, origin=snmp1/attrd/11, version=0.4.20)
>>> Jul 27 14:41:39 snmp1 attrd[29114]: info: Update 11 for probe_complete: OK (0)
>>> Jul 27 14:41:39 snmp1 attrd[29114]: info: Update 11 for probe_complete[snmp1]=true: OK (0)
>>> Jul 27 14:41:39 snmp1 attrd[29114]: info: Update 11 for probe_complete[snmp2]=true: OK (0)
>>> Jul 27 14:41:39 snmp1 cib[29202]: info: Wrote version 0.4.0 of the CIB to disk (digest: a1f1920279fe0b1466a79cab09fa77d6)
>>> Jul 27 14:41:39 snmp1 pengine[29115]: notice: On loss of CCM Quorum: Ignore
>>> Jul 27 14:41:39 snmp1 pengine[29115]: info: Node snmp2 is online
>>> Jul 27 14:41:39 snmp1 pengine[29115]: info: Node snmp1 is online
>>> Jul 27 14:41:39 snmp1 pengine[29115]: info: prmDummy#011(ocf::heartbeat:Dummy):#011Started snmp1
>>> Jul 27 14:41:39 snmp1 pengine[29115]: info: Leave prmDummy#011(Started snmp1)
>>> -------
>>>
>>> However, a trap of crm_mon is sent to an SNMP manager.
>>
>> Are you using the built-in SNMP logic or using -E to give crm_mon a script which
>> is then producing the trap?
>> (I’m trying to figure out who could be turning the monitor action into a start)
>>
>>> The resource does not reboot, but the SNMP trap which a resource started is sent.
>>>
>>> -------
>>> Jul 27 14:41:39 SNMP-MANAGER snmptrapd[4521]: 2015-07-27 14:41:39 snmp1 [UDP: [192.168.40.100]:35265->[192.168.40.2]]:#012DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (1437975699) 166 days, 10:22:36.99#011SNMPv2-MIB::snmpTrapOID.0 = OID: PACEMAKER-MIB::pacemakerNotification#011PACEMAKER-MIB::pacemakerNotificationResource = STRING: "prmDummy"#011PACEMAKER-MIB::pacemakerNotificationNode = STRING: "snmp1"#011PACEMAKER-MIB::pacemakerNotificationOperation = STRING: "start"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING: "OK"#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER: 0
>>> Jul 27 14:41:39 SNMP-MANAGER snmptrapd[4521]: 2015-07-27 14:41:39 snmp1 [UDP: [192.168.40.100]:35265->[192.168.40.2]]:#012DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (1437975699) 166 days, 10:22:36.99#011SNMPv2-MIB::snmpTrapOID.0 = OID: PACEMAKER-MIB::pacemakerNotification#011PACEMAKER-MIB::pacemakerNotificationResource = STRING: "prmDummy"#011PACEMAKER-MIB::pacemakerNotificationNode = STRING: "snmp1"#011PACEMAKER-MIB::pacemakerNotificationOperation = STRING: "monitor"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING: "OK"#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER: 0
>>> -------
>>>
>>> A difference of CIB occurring by the start stop of the node seems to have a problem.
>>> By this difference, crm_mon transmits an unnecessary SNMP trap.
>>> -------
>>> Jul 27 14:41:39 snmp1 cib[29111]: info: + /cib: @num_updates=19
>>> Jul 27 14:41:39 snmp1 cib[29111]: info: + /cib/status/node_state[@id='3232238190']: @crm-debug-origin=do_update_resource
>>> Jul 27 14:41:39 snmp1 cib[29111]: info: ++ /cib/status/node_state[@id='3232238190']/lrm[@id='3232238190']/lrm_resources: <lrm_resource id="prmDummy" type="Dummy" class="ocf" provider="heartbeat"/>
>>> Jul 27 14:41:39 snmp1 cib[29111]: info: ++ <lrm_rsc_op id="prmDummy_last_0" operation_key="prmDummy_monitor_0" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="6:6:7:34187f48-1f81-49c8-846e-ff3ed4c8f787" transition-magic="0:7;6:6:7:34187f48-1f81-49c8-846e-ff3ed4c8f787" on_node="snmp2" call-id="5" rc-code="7" op-status="0" interval="0" last-run="1437975699" last-rc-change="1437975699" exec-time="18" queue-ti
>>> Jul 27 14:41:39 snmp1 cib[29111]: info: ++ </lrm_resource>
>>> -------
>>>
>>> I registered this problem with Bugzilla.
>>> * http://bugs.clusterlabs.org/show_bug.cgi?id=5245
>>> * The log attached it to Bugzilla.
>>>
>>> Best Regards,
>>> Hideo Yamauchi.
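
To help answer which host a trap was actually sent from, here is a minimal Python sketch (not taken from the thread) that decodes one snmptrapd syslog line in the format quoted above. It assumes syslog escaped newline and tab as #012 and #011, as in the quoted log. The names parse_trap and SAMPLE are hypothetical, and SAMPLE is an abbreviated copy of the first quoted trap.

```python
import re

# Abbreviated copy of the first trap line quoted in the thread
# (some varbinds omitted for brevity).
SAMPLE = (
    'Jul 27 14:41:39 SNMP-MANAGER snmptrapd[4521]: 2015-07-27 14:41:39 snmp1 '
    '[UDP: [192.168.40.100]:35265->[192.168.40.2]]:'
    '#012DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (1437975699) 166 days, 10:22:36.99'
    '#011SNMPv2-MIB::snmpTrapOID.0 = OID: PACEMAKER-MIB::pacemakerNotification'
    '#011PACEMAKER-MIB::pacemakerNotificationResource = STRING: "prmDummy"'
    '#011PACEMAKER-MIB::pacemakerNotificationNode = STRING: "snmp1"'
    '#011PACEMAKER-MIB::pacemakerNotificationOperation = STRING: "start"'
    '#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER: 0'
)

# Source address of the UDP packet, as logged by snmptrapd.
SENDER_RE = re.compile(r"\[UDP: \[([^\]]+)\]:\d+->")
# One PACEMAKER-MIB varbind: suffix of the object name, then its value.
VARBIND_RE = re.compile(r"PACEMAKER-MIB::pacemakerNotification(\w+) = \w+: (.*)")


def parse_trap(line):
    """Return the sender address and PACEMAKER-MIB varbinds of one log line."""
    # Undo the syslog escaping of newline (#012) and tab (#011).
    text = line.replace("#012", "\n").replace("#011", "\t")
    sender = SENDER_RE.search(text)
    fields = {"sender": sender.group(1) if sender else None}
    for part in re.split(r"[\n\t]", text):
        match = VARBIND_RE.match(part.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip().strip('"')
    return fields


if __name__ == "__main__":
    for key, value in parse_trap(SAMPLE).items():
        print(f"{key}: {value}")
```

Comparing the extracted sender address with the addresses of snmp1 and snmp2 would show which node emitted the trap, and grouping by Resource, Node and Operation would make a repeated "start" event easy to spot.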
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org