Re: [Users] SPM host in unknown status
What does /var/log/vdsm.log show on the two nodes? It looks like VDSM is having problems.

On 2012-5-28 11:04, T-Sinjon wrote:
> 1. On node1, vdsm looks strange; it is sleeping:
>
> [root@ovirt-node-1 ~]# systemctl status vdsmd.service
> vdsmd.service - Virtual Desktop Server Manager
>     Loaded: loaded (/lib/systemd/system/vdsmd.service; enabled)
>     Active: active (running) since Mon, 28 May 2012 02:43:22 +; 9min ago
>     Process: 1157 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited, status=0/SUCCESS)
>     Main PID: 2228 (respawn)
>     CGroup: name=systemd:/system/vdsmd.service
>         ├ 2228 /bin/bash -e /usr/share/vdsm/respawn --minlifetime...
>         └ 3573 sleep 900
>
> 2. No firewall is blocking traffic.
> 3. The network is OK; I can ssh into node1 from the engine.
>
> I have used the fence option (confirm host has been rebooted), but the SPM role did not move to the other node. Below is the engine.log from when I performed this action:
>
> 2012-05-28 10:49:51,846 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Lock Acquired to object EngineLock [exclusiveLocks= key: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value: ae567034-5d8e-11e1-bdc9-a7168ad4d39f , sharedLocks= ]
> 2012-05-28 10:49:51,847 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Running command: FenceVdsManualyCommand internal: false. Entities affected : ID: ae567034-5d8e-11e1-bdc9-a7168ad4d39f Type: VDS
> 2012-05-28 10:49:51,927 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Trying to fence spm ovirt-node-1.local via vds ovirt-node-2.local
> 2012-05-28 10:49:51,933 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] (pool-5-thread-49) [72d88732] START, FenceSpmStorageVDSCommand(vdsId = a522a6a6-a72e-11e1-baa3-bba876a88ef4, storagePoolId = 524a7003-edec-4f52-a38e-b15cadfbe3ef, prevId=1, prevLVER=17), log id: 530cb694
> 2012-05-28 10:49:51,965 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-5-thread-49) [72d88732] Command org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand return value
>     Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
>     mStatus Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
>     mCode 654
>     mMessage Not SPM
> 2012-05-28 10:49:51,966 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-5-thread-49) [72d88732] Vds: ovirt-node-2.local
> 2012-05-28 10:49:51,966 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-5-thread-49) [72d88732] Command FenceSpmStorageVDS execution failed. Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM
> 2012-05-28 10:49:51,966 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] (pool-5-thread-49) [72d88732] FINISH, FenceSpmStorageVDSCommand, log id: 530cb694
> 2012-05-28 10:49:51,967 WARN [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Could not fence spm on vds ovirt-node-2.local
> 2012-05-28 10:49:51,971 ERROR [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Transaction rolled-back for command: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand.
> 2012-05-28 10:49:51,971 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Lock freed to object EngineLock [exclusiveLocks= key: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value: ae567034-5d8e-11e1-bdc9-a7168ad4d39f , sharedLocks= ]
> 2012-05-28 10:49:57,457 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC
> 2012-05-28 10:49:57,461 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1
> 2012-05-28 10:49:57,466 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) SPM selection - vds seems as spm ovirt-node-1.local
> 2012-05-28 10:49:57,466 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) spm vds is non responsive, stopping spm selection.
> 2012-05-28 10:50:00,002 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable hosts
> 2012-05-28 10:50:00,004 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable hosts done
> 2012-05-28 10:50:00,004 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable storage domains
> 2012-05-28 10:50:00,006 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable
Re: [Users] SPM host in unknown status
1. On node1, vdsm looks strange; it is sleeping:

[root@ovirt-node-1 ~]# systemctl status vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
    Loaded: loaded (/lib/systemd/system/vdsmd.service; enabled)
    Active: active (running) since Mon, 28 May 2012 02:43:22 +; 9min ago
    Process: 1157 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited, status=0/SUCCESS)
    Main PID: 2228 (respawn)
    CGroup: name=systemd:/system/vdsmd.service
        ├ 2228 /bin/bash -e /usr/share/vdsm/respawn --minlifetime...
        └ 3573 sleep 900

2. No firewall is blocking traffic.
3. The network is OK; I can ssh into node1 from the engine.

I have used the fence option (confirm host has been rebooted), but the SPM role did not move to the other node. Below is the engine.log from when I performed this action:

2012-05-28 10:49:51,846 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Lock Acquired to object EngineLock [exclusiveLocks= key: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value: ae567034-5d8e-11e1-bdc9-a7168ad4d39f , sharedLocks= ]
2012-05-28 10:49:51,847 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Running command: FenceVdsManualyCommand internal: false. Entities affected : ID: ae567034-5d8e-11e1-bdc9-a7168ad4d39f Type: VDS
2012-05-28 10:49:51,927 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Trying to fence spm ovirt-node-1.local via vds ovirt-node-2.local
2012-05-28 10:49:51,933 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] (pool-5-thread-49) [72d88732] START, FenceSpmStorageVDSCommand(vdsId = a522a6a6-a72e-11e1-baa3-bba876a88ef4, storagePoolId = 524a7003-edec-4f52-a38e-b15cadfbe3ef, prevId=1, prevLVER=17), log id: 530cb694
2012-05-28 10:49:51,965 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-5-thread-49) [72d88732] Command org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand return value
    Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
    mStatus Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
    mCode 654
    mMessage Not SPM
2012-05-28 10:49:51,966 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-5-thread-49) [72d88732] Vds: ovirt-node-2.local
2012-05-28 10:49:51,966 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-5-thread-49) [72d88732] Command FenceSpmStorageVDS execution failed. Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM
2012-05-28 10:49:51,966 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] (pool-5-thread-49) [72d88732] FINISH, FenceSpmStorageVDSCommand, log id: 530cb694
2012-05-28 10:49:51,967 WARN [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Could not fence spm on vds ovirt-node-2.local
2012-05-28 10:49:51,971 ERROR [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Transaction rolled-back for command: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand.
2012-05-28 10:49:51,971 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Lock freed to object EngineLock [exclusiveLocks= key: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value: ae567034-5d8e-11e1-bdc9-a7168ad4d39f , sharedLocks= ]
2012-05-28 10:49:57,457 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:49:57,461 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1
2012-05-28 10:49:57,466 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) SPM selection - vds seems as spm ovirt-node-1.local
2012-05-28 10:49:57,466 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) spm vds is non responsive, stopping spm selection.
2012-05-28 10:50:00,002 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable hosts
2012-05-28 10:50:00,004 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable hosts done
2012-05-28 10:50:00,004 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable storage domains
2012-05-28 10:50:00,006 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable storage domains done
2012-05-28 10:50:07,502 INFO [org.ovirt.engine.c
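The engine.log excerpt above is mostly INFO noise around a handful of WARN and ERROR entries (the "Not SPM" failure and the "spm vds is non responsive" warning). As a rough sketch only, a short Python filter can pull those out. The regular expression is inferred from the entries quoted in this message, not from any oVirt specification, and the names `ENTRY_RE`, `SAMPLE`, and `problem_entries` are hypothetical helpers made up for illustration.

```python
import re

# Layout assumed from the quoted entries:
#   <date> <time>,<ms> <LEVEL> [<logger>] (<thread>) <message>
ENTRY_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\] "
    r"\((?P<thread>[^)]+)\) "
    r"(?P<message>.*)$"
)

# Three entries copied from the log in this thread, used as sample input.
SAMPLE = [
    "2012-05-28 10:49:51,966 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] "
    "(pool-5-thread-49) [72d88732] Vds: ovirt-node-2.local",
    "2012-05-28 10:49:51,966 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] "
    "(pool-5-thread-49) [72d88732] Command FenceSpmStorageVDS execution failed. "
    "Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: "
    "IRSNonOperationalException: Not SPM",
    "2012-05-28 10:49:57,466 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] "
    "(QuartzScheduler_Worker-79) spm vds is non responsive, stopping spm selection.",
]


def problem_entries(lines, levels=("WARN", "ERROR")):
    """Return (timestamp, level, short logger name, message) for matching entries.

    Lines that do not start with a timestamp (continuation lines such as
    'mCode 654') are skipped rather than attached to the previous entry.
    """
    found = []
    for line in lines:
        match = ENTRY_RE.match(line.strip())
        if match and match.group("level") in levels:
            short_logger = match.group("logger").rsplit(".", 1)[-1]
            found.append(
                (match.group("ts"), match.group("level"), short_logger, match.group("message"))
            )
    return found


for ts, level, logger, message in problem_entries(SAMPLE):
    print(ts, level, logger, message)
```

To run it against a real log, pass an open file object instead of `SAMPLE`; the location of engine.log on the engine machine depends on the installation.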
Re: [Users] SPM host in unknown status
Hi,

The first question that comes to mind is: why is the host in a non-responsive state? Please check the following:

1. The vdsmd service is running on the host side.
2. No firewall is blocking communication in or out.
3. There is no network issue between the host and the manager.

Now, for your question: you can use the manual fence option (confirm host has been rebooted), which will free the SPM role from the faulty host, and the engine will then elect a new SPM.

Haim

On May 27, 2012, at 18:32, T-Sinjon wrote:

> Description of problem:
>
> I have 2 nodes:
>
> ovirt-node1.local    Non Responsive    SPM
> ovirt-node2.local    Up                None
>
> The SPM node is stuck in Non-Responsive status and cannot be activated.
> All VMs on the node went into Unknown status and the master VM domain
> became inactive.
>
> When I run the "Maintenance" action on node1, it says:
> Error: Cannot switch Host to Maintenance mode.
> Host still has running VMs on it and is in Non-Responsive state.
>
> But there is no VM running on node1; it only has 2 VMs in Unknown status.
>
> Because I can't activate the SPM host, I can't activate the VM storage
> domain.
>
> 1. How can I migrate the SPM role to another host in my data center, such as
> node2?
> 2. How can I bring node1 back to Up status? (I have done the 'confirm the host
> has been Rebooted' action and rebooted node1, but it made no difference.)

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
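Checks 2 and 3 in Haim's list (firewall and network between host and manager) can be roughly approximated from the engine machine with a plain TCP connection test. The following is only a sketch: `can_connect` is a hypothetical helper, and port 54321 as the VDSM listening port is an assumption about a default installation that should be verified locally.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    A refused, filtered, or timed-out connection returns False. This only
    shows that something accepts connections on the port; it does not prove
    that vdsmd itself is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_host(host, ports=(22, 54321)):
    """Map each port to its reachability. 22 is ssh; 54321 is assumed to be VDSM."""
    return {port: can_connect(host, port) for port in ports}
```

Calling `check_host("ovirt-node-1.local")` from the engine would return a dict such as `{22: ..., 54321: ...}`. If ssh is reachable but the VDSM port is not, that would point at the vdsmd service or a firewall rule rather than at the network itself.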
[Users] SPM host in unknown status
Description of problem:

I have 2 nodes:

ovirt-node1.local    Non Responsive    SPM
ovirt-node2.local    Up                None

The SPM node is stuck in Non-Responsive status and cannot be activated. All VMs on the node went into Unknown status and the master VM domain became inactive.

When I run the "Maintenance" action on node1, it says:
Error: Cannot switch Host to Maintenance mode.
Host still has running VMs on it and is in Non-Responsive state.

But there is no VM running on node1; it only has 2 VMs in Unknown status.

Because I can't activate the SPM host, I can't activate the VM storage domain.

1. How can I migrate the SPM role to another host in my data center, such as node2?
2. How can I bring node1 back to Up status? (I have done the 'confirm the host has been Rebooted' action and rebooted node1, but it made no difference.)