>>> Aj Revelino <[email protected]> wrote on 08.04.2022 at 23:27 in message
>>> <cajy7vkc27h7lf_fay1duf2dklnvtmtzyw8b61vk+6_ijvxs...@mail.gmail.com>:
> Hi Ulrich,
> I set the cluster in maintenance mode due to the consistent logging of the
> error messages in the system log.
>
> Pacemaker has attempted to execute the monitor operation of the resource
> agent here. Is there a way to find out why pacemaker says 'No such device
> or address'?
> hanapopdb1-rsc_SAPHana_HPN_HDB00_monitor_60000:195 [ Error performing
> operation: No such device or address]

If inspecting the RA or turning on debugging for the RA does not help, you
could try adding a line like "exec >log_file 2>&1; set -x" to the beginning of
the RA, so that both the agent's output and the shell trace end up in log_file.
I know some of those SAP RAs are hard to understand.

Regards,
Ulrich

>
> Regards,
> Aj
>
> On Fri, Apr 8, 2022 at 8:23 PM Ulrich Windl <[email protected]> wrote:
>
>> "maintenance-mode=true"? Why?
>>
>>
>> >>> Aj Revelino <[email protected]> wrote on 08.04.2022 at 11:17 in message
>> <CAJY7vkA=SfaJngsfJnREkFMnMJ0hn=ppkec7cyuci32cr3r...@mail.gmail.com>:
>> > Hello All,
>> > I have a 2-node SAP HANA cluster (hanapopdb1 and hanapopdb2). Pacemaker
>> > monitors the data replication between the primary and the secondary node.
>> > The issue is that crm status shows that everything is okay but the system
>> > log shows the following error message.
>> >
>> >
>> > *pacemaker-controld[3582]: notice:
>> > hanapopdb1-rsc_SAPHana_HPN_HDB00_monitor_60000:195 [ Error performing
>> > operation: No such device or address]*
>> > I am unable to identify the cause of the error message and resolve it.
>> >
>> > And due to the above, the data replication between the 2 nodes is
>> > recorded as failed (SFAIL).
>> > Pls see the excerpt from the CIB below:
>> >
>> > <node_state id="2" in_ccm="true" crmd="online" crm-debug-origin="do_update_resource" uname="zhanapopdb2" join="member" expected="member">
>> >   <transient_attributes id="2">
>> >     <instance_attributes id="status-2">
>> >       *<nvpair id="status-2-hana_hpn_clone_state" name="hana_hpn_clone_state" value="WAITING4PRIM"/>*
>> >       <nvpair id="status-2-hana_hpn_version" name="hana_hpn_version" value="2.00.056.00.1624618329"/>
>> >       <nvpair id="status-2-master-rsc_SAPHana_HPN_HDB00" name="master-rsc_SAPHana_HPN_HDB00" value="-INFINITY"/>
>> >       *<nvpair id="status-2-hana_hpn_sync_state" name="hana_hpn_sync_state" value="SFAIL"/>*
>> >       <nvpair id="status-2-hana_hpn_roles" name="hana_hpn_roles" value="4:S:master1:master:worker:master"/>
>> >     </instance_attributes>
>> >   </transient_attributes>
>> >
>> > Pacemaker is able to failover the resources from the primary to the
>> > secondary but they all fail back to the primary, the moment I clean up
>> > the failure in the primary node.
>> > I deleted and recreated the entire configuration and reconfigured the
>> > hana data replication but it hasn't helped.
>> >
>> >
>> > *Cluster configuration:*
>> > hanapopdb1:~ # crm configure show
>> > node 1: hanapopdb1 \
>> >     attributes hana_hpn_vhost=hanapopdb1 hana_hpn_site=SITE1PO hana_hpn_op_mode=logreplay_readaccess hana_hpn_srmode=sync lpa_hpn_lpt=1649393239 hana_hpn_remoteHost=hanapopdb2
>> > node 2: hanapopdb2 \
>> >     attributes lpa_hpn_lpt=10 hana_hpn_op_mode=logreplay_readaccess hana_hpn_vhost=hanapopdb2 hana_hpn_remoteHost=hanapopdb1 hana_hpn_site=SITE2PO hana_hpn_srmode=sync
>> > primitive rsc_SAPHanaTopology_HPN_HDB00 ocf:suse:SAPHanaTopology \
>> >     operations $id=rsc_sap2_HPN_HDB00-operations \
>> >     op monitor interval=10 timeout=600 \
>> >     op start interval=0 timeout=600 \
>> >     op stop interval=0 timeout=300 \
>> >     params SID=HPN InstanceNumber=00
>> > primitive rsc_SAPHana_HPN_HDB00 ocf:suse:SAPHana \
>> >     operations $id=rsc_sap_HPN_HDB00-operations \
>> >     op start interval=0 timeout=3600 \
>> >     op stop interval=0 timeout=3600 \
>> >     op promote interval=0 timeout=3600 \
>> >     op monitor interval=60 role=Master timeout=700 \
>> >     op monitor interval=61 role=Slave timeout=700 \
>> >     params SID=HPN InstanceNumber=00 PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false
>> > primitive rsc_ip_HPN_HDB00 IPaddr2 \
>> >     meta target-role=Started \
>> >     operations $id=rsc_ip_HPN_HDB00-operations \
>> >     op monitor interval=10s timeout=20s \
>> >     params ip=10.10.1.60
>> > primitive rsc_nc_HPN_HDB00 azure-lb \
>> >     params port=62506
>> > primitive stonith-sbd stonith:external/sbd \
>> >     params pcmk_delay_max=30 \
>> >     op monitor interval=30 timeout=30
>> > group g_ip_HPN_HDB00 rsc_ip_HPN_HDB00 rsc_nc_HPN_HDB00
>> > ms msl_SAPHana_HPN_HDB00 rsc_SAPHana_HPN_HDB00 \
>> >     meta is-managed=true notify=true clone-max=2 clone-node-max=1 target-role=Started interleave=true
>> > clone cln_SAPHanaTopology_HPN_HDB00 rsc_SAPHanaTopology_HPN_HDB00 \
>> >     meta clone-node-max=1 target-role=Started interleave=true
>> > colocation col_saphana_ip_HPN_HDB00 4000: g_ip_HPN_HDB00:Started msl_SAPHana_HPN_HDB00:Master
>> > order ord_SAPHana_HPN_HDB00 Optional: cln_SAPHanaTopology_HPN_HDB00 msl_SAPHana_HPN_HDB00
>> > property cib-bootstrap-options: \
>> >     last-lrm-refresh=1649387935 \
>> >     maintenance-mode=true
>> >
>> > Regards,
>> >
>> > Aj
>>
>>
>>
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users
>>
>> ClusterLabs home: https://www.clusterlabs.org/
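
Regarding the suggestion earlier in this thread to put an "exec ...; set -x" line at the top of the RA: the following is a minimal sketch of how that could look in a shell-based agent, not the actual SAPHana RA code. TRACE_LOG and fake_monitor are hypothetical names used only for illustration. In a POSIX shell, redirections are processed left to right, so ">file 2>&1" sends both stdout and stderr (where "set -x" writes its trace) to the file, whereas "2>&1 >file" leaves stderr pointing at the original stdout.

```shell
#!/bin/sh
# Sketch only: stands in for the first lines of a shell-based resource agent.
# TRACE_LOG is a hypothetical path; choose one writable by the user running the RA.
TRACE_LOG="${TRACE_LOG:-/tmp/ra-trace.$$.log}"

# Save the original stdout/stderr on fds 3 and 4 so they can be restored later.
exec 3>&1 4>&2

# Order matters: point stdout at the file first, then duplicate it onto stderr.
exec >>"$TRACE_LOG" 2>&1
set -x

# Hypothetical monitor step standing in for the agent's real logic.
fake_monitor() {
    echo "monitor called"
    # Fails on purpose; the error text goes to stderr and therefore to the log.
    ls /nonexistent-path-for-demo || true
    return 0
}
fake_monitor

# Stop tracing and restore the original descriptors.
set +x
exec 1>&3 2>&4 3>&- 4>&-
echo "trace written to $TRACE_LOG"
```

Because this diverts everything the agent would normally print on stdout, it is probably best treated as a temporary debugging aid and removed once the failing command has been identified in the trace.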
