What does /var/log/vdsm/vdsm.log show on the two nodes? It looks like
VDSM has a problem.
On 2012-5-28 11:04, T-Sinjon wrote:
1. On node1, vdsm looks strange: it's sleeping.
[root@ovirt-node-1 ~]# systemctl status vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
Loaded: loaded (/lib/systemd/system/vdsmd.service; enabled)
Active: active (running) since Mon, 28 May 2012 02:43:22 +0000; 9min ago
Process: 1157 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited,
status=0/SUCCESS)
Main PID: 2228 (respawn)
CGroup: name=systemd:/system/vdsmd.service
├ 2228 /bin/bash -e /usr/share/vdsm/respawn --minlifetime...
└ 3573 sleep 900
2. No firewall is blocking traffic.
3. The network is OK; I can ssh into node1 from the engine.
I used the fence option (confirm host has been rebooted), but the SPM
role did not move to the other node. Below is the engine.log from when
I performed this action:
2012-05-28 10:49:51,846 INFO
[org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand]
(pool-5-thread-49) [72d88732] Lock Acquired to object EngineLock
[exclusiveLocks= key:
org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value:
ae567034-5d8e-11e1-bdc9-a7168ad4d39f
, sharedLocks= ]
2012-05-28 10:49:51,847 INFO
[org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand]
(pool-5-thread-49) [72d88732] Running command: FenceVdsManualyCommand
internal: false. Entities affected : ID:
ae567034-5d8e-11e1-bdc9-a7168ad4d39f Type: VDS
2012-05-28 10:49:51,927 INFO
[org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand]
(pool-5-thread-49) [72d88732] Trying to fence spm ovirt-node-1.local
via vds ovirt-node-2.local
2012-05-28 10:49:51,933 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand]
(pool-5-thread-49) [72d88732] START, FenceSpmStorageVDSCommand(vdsId =
a522a6a6-a72e-11e1-baa3-bba876a88ef4, storagePoolId =
524a7003-edec-4f52-a38e-b15cadfbe3ef, prevId=1, prevLVER=17), log id:
530cb694
2012-05-28 10:49:51,965 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase]
(pool-5-thread-49) [72d88732] Command
org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand
return value
Class Name:
org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
mStatus Class Name:
org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
mCode 654
mMessage Not SPM
2012-05-28 10:49:51,966 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase]
(pool-5-thread-49) [72d88732] Vds: ovirt-node-2.local
2012-05-28 10:49:51,966 ERROR
[org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-5-thread-49)
[72d88732] Command FenceSpmStorageVDS execution failed. Exception:
IRSNonOperationalException: IRSGenericException: IRSErrorException:
IRSNonOperationalException: Not SPM
2012-05-28 10:49:51,966 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand]
(pool-5-thread-49) [72d88732] FINISH, FenceSpmStorageVDSCommand, log
id: 530cb694
2012-05-28 10:49:51,967 WARN
[org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand]
(pool-5-thread-49) [72d88732] Could not fence spm on vds
ovirt-node-2.local
2012-05-28 10:49:51,971 ERROR
[org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand]
(pool-5-thread-49) [72d88732] Transaction rolled-back for command:
org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand.
2012-05-28 10:49:51,971 INFO
[org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand]
(pool-5-thread-49) [72d88732] Lock freed to object EngineLock
[exclusiveLocks= key:
org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value:
ae567034-5d8e-11e1-bdc9-a7168ad4d39f
, sharedLocks= ]
2012-05-28 10:49:57,457 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-79) hostFromVds::selectedVds -
ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:49:57,461 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-79) SPM Init: could not find reported vds or
not up - pool:BLC vds_spm_id: 1
2012-05-28 10:49:57,466 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-79) SPM selection - vds seems as spm
ovirt-node-1.local
2012-05-28 10:49:57,466 WARN
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-79) spm vds is non responsive, stopping spm
selection.
2012-05-28 10:50:00,002 INFO
[org.ovirt.engine.core.bll.AutoRecoveryManager]
(QuartzScheduler_Worker-87) Checking autorecoverable hosts
2012-05-28 10:50:00,004 INFO
[org.ovirt.engine.core.bll.AutoRecoveryManager]
(QuartzScheduler_Worker-87) Checking autorecoverable hosts done
2012-05-28 10:50:00,004 INFO
[org.ovirt.engine.core.bll.AutoRecoveryManager]
(QuartzScheduler_Worker-87) Checking autorecoverable storage domains
2012-05-28 10:50:00,006 INFO
[org.ovirt.engine.core.bll.AutoRecoveryManager]
(QuartzScheduler_Worker-87) Checking autorecoverable storage domains done
2012-05-28 10:50:07,502 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-93) hostFromVds::selectedVds -
ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:50:07,505 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-93) SPM Init: could not find reported vds or
not up - pool:BLC vds_spm_id: 1
2012-05-28 10:50:07,510 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-93) SPM selection - vds seems as spm
ovirt-node-1.local
2012-05-28 10:50:07,510 WARN
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-93) spm vds is non responsive, stopping spm
selection.
2012-05-28 10:50:17,551 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-34) hostFromVds::selectedVds -
ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:50:17,554 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-34) SPM Init: could not find reported vds or
not up - pool:BLC vds_spm_id: 1
2012-05-28 10:50:17,559 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-34) SPM selection - vds seems as spm
ovirt-node-1.local
2012-05-28 10:50:17,559 WARN
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-34) spm vds is non responsive, stopping spm
selection.
2012-05-28 10:50:27,609 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-92) hostFromVds::selectedVds -
ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:50:27,612 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-92) SPM Init: could not find reported vds or
not up - pool:BLC vds_spm_id: 1
2012-05-28 10:50:27,617 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-92) SPM selection - vds seems as spm
ovirt-node-1.local
2012-05-28 10:50:27,618 WARN
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-92) spm vds is non responsive, stopping spm
selection.
2012-05-28 10:50:37,652 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-67) hostFromVds::selectedVds -
ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:50:37,656 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-67) SPM Init: could not find reported vds or
not up - pool:BLC vds_spm_id: 1
2012-05-28 10:50:37,661 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-67) SPM selection - vds seems as spm
ovirt-node-1.local
2012-05-28 10:50:37,662 WARN
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-67) spm vds is non responsive, stopping spm
selection.
2012-05-28 10:50:47,709 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-34) hostFromVds::selectedVds -
ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:50:47,712 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-34) SPM Init: could not find reported vds or
not up - pool:BLC vds_spm_id: 1
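The excerpt above is wrapped by the mail client, which makes the WARN and ERROR records hard to pick out. A minimal Python sketch for rejoining the wrapped lines and keeping only the problem records follows. The names `RECORD_START`, `parse_records` and `problems` are hypothetical helpers written for this illustration, not part of oVirt, and the record layout (timestamp, level, rest) is assumed from the excerpt only.

```python
import re

# Assumption: a record starts with a timestamp followed by a log level,
# as in the engine.log excerpt quoted in this thread.
RECORD_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(INFO|WARN|ERROR|DEBUG)\b\s*(.*)$"
)


def parse_records(lines):
    """Join wrapped lines into (timestamp, level, message) tuples."""
    records = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = RECORD_START.match(line)
        if match:
            # A new record begins here.
            records.append([match.group(1), match.group(2), match.group(3)])
        elif records:
            # Continuation of the previous record (a mail-wrapped line).
            records[-1][2] = (records[-1][2] + " " + line).strip()
    return [tuple(record) for record in records]


def problems(records):
    """Keep only the WARN and ERROR records."""
    return [record for record in records if record[1] in ("WARN", "ERROR")]
```

Passing an open engine.log file (or the pasted lines) to `parse_records` and then to `problems` should leave only entries such as the "Not SPM" failure and the repeated "spm vds is non responsive" warnings.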
On 28 May, 2012, at 12:08 AM, Haim Ateya wrote:
Hi, the first question that comes to mind is: why is the host in a
non-responsive state?
Please check the following:
1. The vdsmd service is running on the host side
2. No firewall is blocking communication in or out
3. There is no network issue between host and manager
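Checks 2 and 3 can be roughly combined into one TCP probe run from the engine machine. The sketch below is only an illustration: `can_connect` is a hypothetical helper, and it assumes VDSM listens on its default port 54321, which should be verified for your installation. A successful connection only shows that the port is reachable, not that vdsm is answering requests correctly.

```python
import socket

# Assumption: 54321 is the VDSM management port on your hosts; adjust if not.
VDSM_PORT = 54321


def can_connect(host, port=VDSM_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, `can_connect("ovirt-node-1.local")` returning False while ssh to the same host works would point at vdsm not listening or at a firewall rule for that port.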
Now, for your question: you can use the manual fence option (confirm
host has been rebooted), which will free the SPM role from the faulty
host, and the engine will elect a new SPM.
Haim
On May 27, 2012, at 18:32, T-Sinjon wrote:
Description of problem:
I have 2 nodes:
ovirt-node1.local Non Responsive SPM
ovirt-node2.local Up None
The SPM node is stuck in Non-Responsive status and can't be activated;
all VMs on the node went into Unknown status and the master VM
domain became inactive.
When I apply the "Maintenance" action to node1, it says:
Error: Cannot switch Host to Maintenance mode.
Host still has running VMs on it and is in Non-Responsive state.
But there are no VMs running on node1; it only has 2 VMs in Unknown
status.
Because I can't activate the SPM host, I can't activate the VM
storage domain.
1. How can I migrate the SPM role to another host in my data center,
such as node2?
2. How can I bring node1 back to Up status? (I performed the 'confirm
the host has been Rebooted' action and rebooted node1, but it had no
effect.)
_______________________________________________
Users mailing list
[email protected] <mailto:[email protected]>
http://lists.ovirt.org/mailman/listinfo/users
--
Shu Ming<[email protected]>
IBM China Systems and Technology Laboratory