It shows id=number:

# cibadmin -Ql | grep node.*xen
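For reference, the nodes section of the CIB can also be dumped on its own to see exactly how the entries are stored; this is only a sketch using standard cibadmin options, and the XML in the comment is an illustration of the expected shape, not output captured from this cluster:

  # Query just the <nodes> section of the CIB
  cibadmin -Q -o nodes
  # On a corosync 2.x / Fedora 17 stack the entries carry numeric ids,
  # e.g. something like <node id="100" uname="xen01"/>, matching the
  # node $id="100" xen01 lines in the crm configure show output further down.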
Fri, 17 Aug 2012 04:34:10 +0400, Andrew Beekhof wrote:

On Tue, Jun 19, 2012 at 7:06 PM,  wrote:
> Hi, Andrew
>
> Here is the console log and crm_report; maybe they help. (I found this because
> the LCMC console has a bug in Fedora 17 and I tried to test migration from the
> CLI; I have since switched LCMC from ptest to crm_simulate and it is working,
> but I think the problem still exists.)
>
> [root@xen01 cluster]# crm resource status
>  Resource Group: rg_ISCSI_0
>      p_lvm_vg_0 (ocf::heartbeat:LVM) Started
>      p_target_ISCSI_0 (ocf::heartbeat:iSCSITarget) Started
>      p_ISCSI_0_lun1 (ocf::heartbeat:iSCSILogicalUnit) Started
>      p_ISCSI_0_lun2 (ocf::heartbeat:iSCSILogicalUnit) Started
>      p_ISCSI_0_lun3 (ocf::heartbeat:iSCSILogicalUnit) Started
>      ClusterIP1 (ocf::heartbeat:IPaddr2) Started
>  Resource Group: rg_ISCSI_1
>      p_lvm_vg_1 (ocf::heartbeat:LVM) Started
>      p_target_ISCSI_1 (ocf::heartbeat:iSCSITarget) Started
>      p_ISCSI_1_lun1 (ocf::heartbeat:iSCSILogicalUnit) Started
>      p_ISCSI_1_lun2 (ocf::heartbeat:iSCSILogicalUnit) Started
>      ClusterIP2 (ocf::heartbeat:IPaddr2) Started
>  Master/Slave Set: ms_drbd_iscsi_0 [p_drbd_iscsi_0]
>      Masters: [ xen01 ]
>      Slaves: [ xen02 ]
>  Master/Slave Set: ms_drbd_iscsi_1 [p_drbd_iscsi_1]
>      Masters: [ xen02 ]
>      Slaves: [ xen01 ]
>  res_Filesystem_p_fs_export (ocf::heartbeat:Filesystem) Started
>  res_exportfs_p_nfs_export (ocf::heartbeat:exportfs) Started
>
> [root@xen01 cluster]# crm status
> ============
> Last updated: Tue Jun 19 11:41:16 2012
> Last change: Mon Jun 18 17:56:56 2012 via crm_resource on xen01
> Stack: corosync
> Current DC: xen01 (100) - partition with quorum
> Version: 1.1.7-2.fc17-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, unknown expected votes
> 17 Resources configured.
> ============
>
> Online: [ xen01 xen02 ]
>
>  Resource Group: rg_ISCSI_0
>      p_lvm_vg_0 (ocf::heartbeat:LVM): Started xen01
>      p_target_ISCSI_0 (ocf::heartbeat:iSCSITarget): Started xen01
>      p_ISCSI_0_lun1 (ocf::heartbeat:iSCSILogicalUnit): Started xen01
>      p_ISCSI_0_lun2 (ocf::heartbeat:iSCSILogicalUnit): Started xen01
>      p_ISCSI_0_lun3 (ocf::heartbeat:iSCSILogicalUnit): Started xen01
>      ClusterIP1 (ocf::heartbeat:IPaddr2): Started xen01
>  Resource Group: rg_ISCSI_1
>      p_lvm_vg_1 (ocf::heartbeat:LVM): Started xen02
>      p_target_ISCSI_1 (ocf::heartbeat:iSCSITarget): Started xen02
>      p_ISCSI_1_lun1 (ocf::heartbeat:iSCSILogicalUnit): Started xen02
>      p_ISCSI_1_lun2 (ocf::heartbeat:iSCSILogicalUnit): Started xen02
>      ClusterIP2 (ocf::heartbeat:IPaddr2): Started xen02
>  Master/Slave Set: ms_drbd_iscsi_0 [p_drbd_iscsi_0]
>      Masters: [ xen01 ]
>      Slaves: [ xen02 ]
>  Master/Slave Set: ms_drbd_iscsi_1 [p_drbd_iscsi_1]
>      Masters: [ xen02 ]
>      Slaves: [ xen01 ]
>  res_Filesystem_p_fs_export (ocf::heartbeat:Filesystem): Started xen02
>  res_exportfs_p_nfs_export (ocf::heartbeat:exportfs): Started xen02
>
> [root@xen01 cluster]# crm resource migrate ms_drbd_iscsi_0 xen02
> Error performing operation: ms_drbd_iscsi_0 is already active on xen02

I assume you want to move the master instance here, but that's not what the
command does. It is trying to move the resource, which is why you get that error.

> [root@xen01 cluster]# crm configure 'location cli-standby-ms_drbd_iscsi_0
>     ms_drbd_iscsi_0 rule $id="cli-standby-rule-ms_drbd_iscsi_0" inf: #uname eq xen02'
> WARNING: cli-standby-ms_drbd_iscsi_0: referenced node xen02 does not exist

If you run

  cibadmin -Ql | grep node.*xen02

does it show id=xen02 or a number?
I suspect there is a number there (new in Fedora 17) and this is confusing the shell.
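Since the intent above is to move the master instance rather than the whole master/slave set, here is a minimal sketch of placing just the Master role with an explicit role-based location constraint (crm shell syntax; the constraint id loc_drbd0_master is an example name, not something taken from this cluster):

  # Pin only the Master role of ms_drbd_iscsi_0 to xen02; quoting keeps the
  # shell from treating "#uname" as a comment:
  crm configure 'location loc_drbd0_master ms_drbd_iscsi_0 rule $role="Master" inf: #uname eq xen02'
  # Drop the constraint again once the promotion has completed:
  crm configure delete loc_drbd0_master

On this setup the shell may print the same "referenced node xen02 does not exist" warning when the constraint is added; whether that warning is purely cosmetic is exactly what the node-id question above is meant to establish.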
Just force it to accept the change.

> [root@xen01 /]# crm status
> ============
> Last updated: Tue Jun 19 12:04:17 2012
> Last change: Tue Jun 19 12:03:57 2012 via cibadmin on xen01
> Stack: corosync
> Current DC: xen01 (100) - partition with quorum
> Version: 1.1.7-2.fc17-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, unknown expected votes
> 17 Resources configured.
> ============
>
> Online: [ xen01 xen02 ]
>
>  Resource Group: rg_ISCSI_0
>      p_lvm_vg_0 (ocf::heartbeat:LVM): Started xen02
>      p_target_ISCSI_0 (ocf::heartbeat:iSCSITarget): Started xen02
>      p_ISCSI_0_lun1 (ocf::heartbeat:iSCSILogicalUnit): Started xen02
>      p_ISCSI_0_lun2 (ocf::heartbeat:iSCSILogicalUnit): Started xen02
>      p_ISCSI_0_lun3 (ocf::heartbeat:iSCSILogicalUnit): Started xen02
>      ClusterIP1 (ocf::heartbeat:IPaddr2): Started xen02
>  Resource Group: rg_ISCSI_1
>      p_lvm_vg_1 (ocf::heartbeat:LVM): Started xen02
>      p_target_ISCSI_1 (ocf::heartbeat:iSCSITarget): Started xen02
>      p_ISCSI_1_lun1 (ocf::heartbeat:iSCSILogicalUnit): Started xen02
>      p_ISCSI_1_lun2 (ocf::heartbeat:iSCSILogicalUnit): Started xen02
>      ClusterIP2 (ocf::heartbeat:IPaddr2): Started xen02
>  Master/Slave Set: ms_drbd_iscsi_0 [p_drbd_iscsi_0]
>      Masters: [ xen02 ]
>      Slaves: [ xen01 ]
>  Master/Slave Set: ms_drbd_iscsi_1 [p_drbd_iscsi_1]
>      Masters: [ xen02 ]
>      Slaves: [ xen01 ]
>  res_Filesystem_p_fs_export (ocf::heartbeat:Filesystem): Started xen02
>  res_exportfs_p_nfs_export (ocf::heartbeat:exportfs): Started xen02
>
> [root@xen01 ~]# crm_report -V -f "2012-06-19 09:00:00"
> xen01: WARNING: The tarball produced by this program may contain
> xen01: sensitive information such as passwords.
> xen01:
> xen01: We will attempt to remove such information if you use the
> xen01: -p option. For example: -p "pass.*" -p "user.*"
> xen01:
> xen01: However, doing this may reduce the ability for the recipients
> xen01: to diagnose issues and generally provide assistance.
> xen01:
> xen01: IT IS YOUR RESPONSIBILITY TO PROTECT SENSITIVE DATA FROM EXPOSURE
> xen01:
> xen01: WARN: Unknown cluster type: any
> xen01: Debug: Querying CIB for nodes
> xen01: Calculated node list: xen01 xen02
> xen01: Debug: We are a cluster node
> xen01: Collecting data from xen01 xen02 (06/19/2012 09:00:00 AM to 06/19/2012 12:06:18 PM)
> xen01: Debug: Using full path to working directory: /root/pcmk-Tue-19-Jun-2012
> xen01: Debug: Machine state directory: /var
> xen01: Debug: State files located in: /var/run/crm
> xen01: Debug: PE files located in: /var/lib/pengine
> xen01: Debug: Heartbeat state files located in: /var/lib/heartbeat
> xen01: Debug: Core files located under: /var/lib/heartbeat/cores /var/lib/corosync
> xen01: Debug: Pacemaker daemons located under:
> xen01: Debug: Initializing xen01 subdir
> xen01: Debug: Detected the 'corosync' cluster stack
> xen01: Debug: Could not determine logd.cf location
> xen01: Debug: Reading corosync log settings
> xen01: Debug: Reading log settings from /etc/corosync/corosync.conf
> xen01: Debug: Pattern 'Mark:pcmk:1340093178' not found anywhere
> xen01: Debug: Config: corosync /etc/corosync/corosync.conf /var/log/cluster/corosync.log
> xen01: Debug: Found /etc/fedora-release /etc/os-release /etc/redhat-release /etc/system-release distribution release file(s)
> xen01: Debug: The package manager is: rpm
> xen01: Debug: Verifying installation of: pacemaker
> xen01: Debug: Verifying installation of: pacemaker-libs
> xen01: Debug: Verifying installation of: corosync
> xen01: Debug: Verifying installation of: resource-agents
> xen01: Debug: Verifying installation of: cluster-glue-libs
> xen01: Debug: Verifying installation of: cluster-glue
> xen01: Debug: Verifying installation of: drbd-pacemaker
> xen01: Debug: Verifying installation of: drbd-utils
> xen01: Debug: Verifying installation of: lvm2
> xen01: Debug: Verifying installation of: glibc
> xen01: Debug: found 17 pengine input files in /var/lib/pengine
> xen01: Debug: Looking for backtraces: 1340081990 1340093188
> xen01: Debug: Sanitizing files:
> xen01: Debug: Found log /var/log/cluster/corosync.log
> xen01: Including segment [138159-139505] from /var/log/cluster/corosync.log
> xen01: Debug: Removing empty file: -rw-r--r-- 1 root root 0 Jun 19 12:06 backtraces.txt
> xen01: Debug: Removing empty file: -rw-r--r-- 1 root root 0 Jun 19 12:06 crm_verify.txt
> xen01: Debug: Removing empty file: -rw-r--r-- 1 root root 0 Jun 19 12:06 dlm_dump.txt
> xen01: Debug: Removing empty file: -rw-r--r-- 1 root root 0 Jun 19 12:06 members.txt
> xen01: Debug: Removing empty file: -rw-r--r-- 1 root root 0 Jun 19 12:06 permissions.txt
> xen02: Debug: Canonicalizing working directory path: /root/pcmk-Tue-19-Jun-2012
> xen02: Debug: Machine state directory: /var
> xen02: Debug: State files located in: /var/run/crm
> xen02: Debug: PE files located in: /var/lib/pengine
> xen02: Debug: Heartbeat state files located in: /var/lib/heartbeat
> xen02: Debug: Core files located under: /var/lib/heartbeat/cores /var/lib/corosync
> xen02: Debug: Pacemaker daemons located under:
> xen02: Debug: Initializing xen02 subdir
> xen02: Debug: Detected the 'corosync' cluster stack
> xen02: Debug: Could not determine logd.cf location
> xen02: Debug: Reading corosync log settings
> xen02: Debug: Reading log settings from /etc/corosync/corosync.conf
> xen02: Debug: Pattern 'Mark:pcmk:1340093186' not found anywhere
> xen02: Debug: Config: corosync /etc/corosync/corosync.conf /var/log/cluster/corosync.log
> xen02: Debug: Found /etc/fedora-release /etc/os-release /etc/redhat-release /etc/system-release distribution release file(s)
> xen02: Debug: The package manager is: rpm
> xen02: Debug: Verifying installation of: pacemaker
> xen02: Debug: Verifying installation of: pacemaker-libs
> xen02: Debug: Verifying installation of: corosync
> xen02: Debug: Verifying installation of: resource-agents
> xen02: Debug: Verifying installation of: cluster-glue-libs
> xen02: Debug: Verifying installation of: cluster-glue
> xen02: Debug: Verifying installation of: drbd-pacemaker
> xen02: Debug: Verifying installation of: drbd-utils
> xen02: Debug: Verifying installation of: lvm2
> xen02: Debug: Verifying installation of: glibc
> xen02: Debug: Looking for backtraces: 1340081990 1340093188
> xen02: Debug: Sanitizing files:
> xen02: Debug: Removing empty file: -rw-r--r--. 1 root root 0 Jun 19 12:06 backtraces.txt
> xen02: Debug: Removing empty file: -rw-r--r--. 1 root root 0 Jun 19 12:06 corosync.log
> xen02: Debug: Removing empty file: -rw-r--r--. 1 root root 0 Jun 19 12:06 crm_verify.txt
> xen02: Debug: Removing empty file: -rw-r--r--. 1 root root 0 Jun 19 12:06 dlm_dump.txt
> xen02: Debug: Removing empty file: -rw-r--r--. 1 root root 0 Jun 19 12:06 members.txt
> xen02: Debug: Removing empty file: -rw-r--r--. 1 root root 0 Jun 19 12:06 permissions.txt
> xen02: Debug: Streaming report back to xen01
> xen01:
> xen01: Collected results are available in /root/pcmk-Tue-19-Jun-2012.tar.gz
> xen01:
> xen01: Please create a bug entry at
> xen01: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> xen01: Include a description of your problem and attach this tarball
> xen01:
> xen01: Thank you for taking time to create this report.
>
>
> Bye,
> Vadim
>
>
> Tue, 19 Jun 2012 04:36:30 +0400, Andrew Beekhof wrote:
>
> Not enough information, I'm afraid.
> We need more than descriptions of the events; can you run crm_report
> for the period covered by your test?
>
> On Mon, Jun 18, 2012 at 6:29 PM,  wrote:
>> Environment:
>>
>> fedora17 + corosync-2.0.1-1.fc17.x86_64 + pacemaker-1.1.7-2.fc17.x86_64
>>
>> two-node cluster:
>>
>> # corosync-quorumtool -l
>> Membership information
>> ----------------------
>>     Nodeid      Votes Name
>>        200          1 xen02
>>        100          1 xen01
>>
>> # crm status
>> ============
>> Last updated: Mon Jun 18 11:51:42 2012
>> Last change: Mon Jun 18 11:51:39 2012 via cibadmin on xen01
>> Stack: corosync
>> Current DC: xen02 (200) - partition with quorum
>> Version: 1.1.7-2.fc17-ee0730e13d124c3d58f00016c3376a1de5323cff
>> 2 Nodes configured, unknown expected votes
>> 3 Resources configured.
>> ============
>>
>> Online: [ xen01 xen02 ]
>>
>> BUT
>>
>> # crm node list
>> xen01(100): normal
>> xen02(200): normal
>>
>> With a single ClusterIP (ocf:heartbeat:IPaddr2) resource the cluster works and
>> migration completes with no problems, but for the ms_drbd resource migration
>> works only once.
>>
>> If I do crm resource migrate (the first migration finishes correctly), then
>> delete the location constraint and then do the back-migration:
>>
>> 1. using crm resource migrate, I get the error
>>
>>    Error performing operation: ms_drbd_iscsi_0 is already active on xen02
>>
>> 2. if I create the location constraint using crm configure:
>>
>> # crm configure
>> # location cli-standby-ms_drbd_iscsi_0 ms_drbd_iscsi_0 rule
>>     $id="cli-standby-rule-ms_drbd_iscsi_0" inf: #uname eq xen02
>> WARNING: cli-standby-ms_drbd_iscsi_0: referenced node xen02 does not exist
>> # commit
>> WARNING: cli-standby-ms_drbd_iscsi_0: referenced node xen02 does not exist
>>
>> the migration is DONE, with the warning,
>>
>> and
>>
>> # crm_verify -L
>> #
>>
>> # crm configure show
>> node $id="100" xen01
>> node $id="200" xen02
>> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>>     params ip="172.16.0.4" cidr_netmask="32" \
>>     op monitor interval="30s" \
>>     meta is-managed="true" target-role="Stopped"
>> primitive p_drbd_iscsi_0 ocf:linbit:drbd \
>>     params drbd_resource="iscsi0" \
>>     op start interval="0" timeout="240s" \
>>     op stop interval="0" timeout="100s" \
>>     op monitor interval="10s" role="Master" \
>>     op monitor interval="60s" role="Slave"
>> ms ms_drbd_iscsi_0 p_drbd_iscsi_0 \
>>     meta master-max="1" master-node-max="1" clone-max="2" \
>>     clone-node-max="1" notify="true" target-role="Started" is-managed="true"
>> location cli-standby-ms_drbd_iscsi_0 ms_drbd_iscsi_0 \
>>     rule $id="cli-standby-rule-ms_drbd_iscsi_0" inf: #uname eq xen02
>> property $id="cib-bootstrap-options" \
>>     dc-version="1.1.7-2.fc17-ee0730e13d124c3d58f00016c3376a1de5323cff" \
>>     cluster-infrastructure="corosync" \
>>     stonith-enabled="false" \
>>     no-quorum-policy="ignore"
>> rsc_defaults $id="rsc-options" \
>>     resource-stickiness="100"
>>
>> I think that the problem is in the node names:
>>
>> # crm node list
>> xen01(100): normal
>> xen02(200): normal
>>
>> because in Fedora 16 with pacemaker-1.1.7-2.fc16.x86_64 +
>> corosync-1.4.3-1.fc16.x86_64
>>
>> # crm node list
>> xen01: normal
>>     standby: off
>> xen02: normal
>>     standby: off
>>

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
