On Fri, 2022-04-08 at 17:17 +0800, Aj Revelino wrote:
> Hello All,
> I've a 2 node SAP Hana cluster (hanapodb1 and hanapodb2). Pacemaker
> monitors the data replication between the primary and the secondary
> node. The issue is that crm status shows that everything is okay but
> the system log shows the following error log.
>
> pacemaker-controld[3582]: notice: hanapopdb1-rsc_SAPHana_HPN_HDB00_monitor_60000:195 [ Error performing operation: No such device or address]
>
> I am unable to identify the cause of the error message and resolve it
>
> And due to the above, the data replication between the 2 nodes is
> recorded as failed (SFAIL). Pls see the excerpt from the CIB below:
>
> <node_state id="2" in_ccm="true" crmd="online" crm-debug-origin="do_update_resource" uname="zhanapopdb2" join="member" expected="member">
>   <transient_attributes id="2">
>     <instance_attributes id="status-2">
>       <nvpair id="status-2-hana_hpn_clone_state" name="hana_hpn_clone_state" value="WAITING4PRIM"/>
>       <nvpair id="status-2-hana_hpn_version" name="hana_hpn_version" value="2.00.056.00.1624618329"/>
>       <nvpair id="status-2-master-rsc_SAPHana_HPN_HDB00" name="master-rsc_SAPHana_HPN_HDB00" value="-INFINITY"/>
>       <nvpair id="status-2-hana_hpn_sync_state" name="hana_hpn_sync_state" value="SFAIL"/>
>       <nvpair id="status-2-hana_hpn_roles" name="hana_hpn_roles" value="4:S:master1:master:worker:master"/>
>     </instance_attributes>
>   </transient_attributes>
>
> Pacemaker is able to failover the resources from the primary to the
> secondary but they all fail back to the primary, the moment I clean
> up the failure in the primary node.

I'm not familiar enough with SAP to speak to that side of things, but
the behavior after clean-up is normal. If you don't want resources to
go back to their preferred node after a failure is cleaned up, set the
resource-stickiness meta-attribute to a positive number (either on the
resource itself, or in resource defaults if you want it to apply to
everything).

> I deleted and recreated the entire configuration and reconfigured the
> hana data replication but it hasn't helped.
>
>
> Cluster configuration:
> hanapopdb1:~ # crm configure show
> node 1: hanapopdb1 \
>         attributes hana_hpn_vhost=hanapopdb1 hana_hpn_site=SITE1PO
>         hana_hpn_op_mode=logreplay_readaccess hana_hpn_srmode=sync
>         lpa_hpn_lpt=1649393239 hana_hpn_remoteHost=hanapopdb2
> node 2: hanapopdb2 \
>         attributes lpa_hpn_lpt=10
>         hana_hpn_op_mode=logreplay_readaccess hana_hpn_vhost=hanapopdb2
>         hana_hpn_remoteHost=hanapopdb1 hana_hpn_site=SITE2PO
>         hana_hpn_srmode=sync
> primitive rsc_SAPHanaTopology_HPN_HDB00 ocf:suse:SAPHanaTopology \
>         operations $id=rsc_sap2_HPN_HDB00-operations \
>         op monitor interval=10 timeout=600 \
>         op start interval=0 timeout=600 \
>         op stop interval=0 timeout=300 \
>         params SID=HPN InstanceNumber=00
> primitive rsc_SAPHana_HPN_HDB00 ocf:suse:SAPHana \
>         operations $id=rsc_sap_HPN_HDB00-operations \
>         op start interval=0 timeout=3600 \
>         op stop interval=0 timeout=3600 \
>         op promote interval=0 timeout=3600 \
>         op monitor interval=60 role=Master timeout=700 \
>         op monitor interval=61 role=Slave timeout=700 \
>         params SID=HPN InstanceNumber=00 PREFER_SITE_TAKEOVER=true
>         DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false
> primitive rsc_ip_HPN_HDB00 IPaddr2 \
>         meta target-role=Started \
>         operations $id=rsc_ip_HPN_HDB00-operations \
>         op monitor interval=10s timeout=20s \
>         params ip=10.10.1.60
> primitive rsc_nc_HPN_HDB00 azure-lb \
>         params port=62506
> primitive stonith-sbd stonith:external/sbd \
>         params pcmk_delay_max=30 \
>         op monitor interval=30 timeout=30
> group g_ip_HPN_HDB00 rsc_ip_HPN_HDB00 rsc_nc_HPN_HDB00
> ms msl_SAPHana_HPN_HDB00 rsc_SAPHana_HPN_HDB00 \
>         meta is-managed=true notify=true clone-max=2 clone-node-max=1
>         target-role=Started interleave=true
> clone cln_SAPHanaTopology_HPN_HDB00 rsc_SAPHanaTopology_HPN_HDB00 \
>         meta clone-node-max=1 target-role=Started interleave=true
> colocation col_saphana_ip_HPN_HDB00 4000: g_ip_HPN_HDB00:Started
>         msl_SAPHana_HPN_HDB00:Master
> order ord_SAPHana_HPN_HDB00 Optional: cln_SAPHanaTopology_HPN_HDB00
>         msl_SAPHana_HPN_HDB00
> property cib-bootstrap-options: \
>         last-lrm-refresh=1649387935 \
>         maintenance-mode=true
>
> Regards,
>
> Aj
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot <kgail...@redhat.com>
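
For illustration, the resource-stickiness suggestion in the reply above
could look roughly like the fragment below, written in the same crmsh
syntax as the quoted "crm configure show" output. This is only a sketch:
the value 1000 is an arbitrary positive number chosen as an example, not
a figure from the thread, and the appropriate value for a SAP HANA setup
is not something the reply addresses.

```
rsc_defaults rsc-options: \
        resource-stickiness=1000
```

This form sets the default for every resource. To limit the setting to a
single resource, the same resource-stickiness=1000 pair would instead go
on that resource's own meta line, for example the meta line of
msl_SAPHana_HPN_HDB00 in the quoted configuration.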