Re: [Users] SPM host in unknown status

2012-05-27 Thread Shu Ming
What does /var/log/vdsm.log show on the two nodes? It seems that VDSM
has some problems.



Re: [Users] SPM host in unknown status

2012-05-27 Thread T-Sinjon
1. On node1, vdsm looks strange; it appears to be sleeping:
[root@ovirt-node-1 ~]# systemctl status vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
  Loaded: loaded (/lib/systemd/system/vdsmd.service; enabled)
  Active: active (running) since Mon, 28 May 2012 02:43:22 +; 9min ago
 Process: 1157 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited, status=0/SUCCESS)
Main PID: 2228 (respawn)
  CGroup: name=systemd:/system/vdsmd.service
  ├ 2228 /bin/bash -e /usr/share/vdsm/respawn --minlifetime...
  └ 3573 sleep 900
2. No firewall is blocking traffic.
3. The network is OK; I can ssh into node1 from the engine.

I have used the fence option (confirm host has been rebooted), but the SPM role
did not move to the other node. Below is the engine.log from when I performed
this action:

2012-05-28 10:49:51,846 INFO  
[org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) 
[72d88732] Lock Acquired to object EngineLock [exclusiveLocks= key: 
org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value: 
ae567034-5d8e-11e1-bdc9-a7168ad4d39f
, sharedLocks= ]
2012-05-28 10:49:51,847 INFO  
[org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) 
[72d88732] Running command: FenceVdsManualyCommand internal: false. Entities 
affected :  ID: ae567034-5d8e-11e1-bdc9-a7168ad4d39f Type: VDS
2012-05-28 10:49:51,927 INFO  
[org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) 
[72d88732] Trying to fence spm ovirt-node-1.local via vds ovirt-node-2.local
2012-05-28 10:49:51,933 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] 
(pool-5-thread-49) [72d88732] START, FenceSpmStorageVDSCommand(vdsId = 
a522a6a6-a72e-11e1-baa3-bba876a88ef4, storagePoolId = 
524a7003-edec-4f52-a38e-b15cadfbe3ef, prevId=1, prevLVER=17), log id: 530cb694
2012-05-28 10:49:51,965 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] 
(pool-5-thread-49) [72d88732] Command 
org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand return 
value 
 Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
mStatus   Class Name: 
org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
mCode 654
mMessage  Not SPM


2012-05-28 10:49:51,966 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] 
(pool-5-thread-49) [72d88732] Vds: ovirt-node-2.local
2012-05-28 10:49:51,966 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] 
(pool-5-thread-49) [72d88732] Command FenceSpmStorageVDS execution failed. 
Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: 
IRSNonOperationalException: Not SPM
2012-05-28 10:49:51,966 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] 
(pool-5-thread-49) [72d88732] FINISH, FenceSpmStorageVDSCommand, log id: 
530cb694
2012-05-28 10:49:51,967 WARN  
[org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) 
[72d88732] Could not fence spm on vds ovirt-node-2.local
2012-05-28 10:49:51,971 ERROR 
[org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) 
[72d88732] Transaction rolled-back for command: 
org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand.
2012-05-28 10:49:51,971 INFO  
[org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) 
[72d88732] Lock freed to object EngineLock [exclusiveLocks= key: 
org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value: 
ae567034-5d8e-11e1-bdc9-a7168ad4d39f
, sharedLocks= ]
2012-05-28 10:49:57,457 INFO  
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(QuartzScheduler_Worker-79) hostFromVds::selectedVds - ovirt-node-2.local, 
spmStatus Free, storage pool BLC
2012-05-28 10:49:57,461 ERROR 
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(QuartzScheduler_Worker-79) SPM Init: could not find reported vds or not up - 
pool:BLC vds_spm_id: 1
2012-05-28 10:49:57,466 INFO  
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(QuartzScheduler_Worker-79) SPM selection - vds seems as spm ovirt-node-1.local
2012-05-28 10:49:57,466 WARN  
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(QuartzScheduler_Worker-79) spm vds is non responsive, stopping spm selection.
2012-05-28 10:50:00,002 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] 
(QuartzScheduler_Worker-87) Checking autorecoverable hosts
2012-05-28 10:50:00,004 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] 
(QuartzScheduler_Worker-87) Checking autorecoverable hosts done
2012-05-28 10:50:00,004 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] 
(QuartzScheduler_Worker-87) Checking autorecoverable storage domains
2012-05-28 10:50:00,006 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] 
(QuartzScheduler_Worker-87) Checking autorecoverable storage domains done
2012-05-28 10:50:07,502 INFO  
[org.ovirt.engine.c
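
The engine.log excerpt above is wrapped by the mail client, so each entry
spans several lines. A minimal sketch, assuming every entry begins with a
timestamp of the form shown in the excerpt, of how the wrapped lines could be
rejoined and filtered down to the WARN and ERROR entries. The helper names
join_entries and problems are hypothetical and not part of oVirt:

```python
import re

# Assumption: each log entry starts with a timestamp like "2012-05-28 10:49:51,966".
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\b")
# Date, time, then the log level as the third whitespace-separated field.
LEVEL = re.compile(r"^\S+ \S+ (INFO|WARN|ERROR|DEBUG)\b")


def join_entries(lines):
    """Merge wrapped continuation lines into one string per log entry."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines inside the excerpt carry no information
        if ENTRY_START.match(line) or not entries:
            entries.append(line)        # a timestamp opens a new entry
        else:
            entries[-1] += " " + line   # anything else continues the last one
    return entries


def problems(entries):
    """Return only the entries logged at WARN or ERROR level."""
    selected = []
    for entry in entries:
        match = LEVEL.match(entry)
        if match and match.group(1) in ("WARN", "ERROR"):
            selected.append(entry)
    return selected
```

Calling problems(join_entries(open("engine.log"))) on a saved copy of the log
would list entries such as the "Command FenceSpmStorageVDS execution failed"
error and the "spm vds is non responsive" warning on one line each.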

Re: [Users] SPM host in unknown status

2012-05-27 Thread Haim Ateya
Hi, the first question that comes to mind is why the host is in a
non-responsive state. Please check the following:
1. The vdsmd service is running on the host side
2. No firewall is blocking communication in or out
3. There is no network issue between the host and the manager
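
A minimal sketch of how the checks above could be scripted in Python. It
assumes VDSM listens on TCP port 54321 (its usual default; adjust if your
setup differs). The helper names port_reachable and service_active are
hypothetical:

```python
import socket
import subprocess

VDSM_PORT = 54321  # assumption: VDSM's default management port


def port_reachable(host, port=VDSM_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Run from the engine machine, this covers the firewall and network checks.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def service_active(unit="vdsmd.service"):
    """Return True if systemd reports the unit as active (run on the host)."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", unit], check=False
        )
    except FileNotFoundError:
        return False  # systemctl is not available on this machine
    return result.returncode == 0
```

For example, port_reachable("ovirt-node-1.local") could be run from the
engine and service_active() on the node itself. An "active" unit alone is not
conclusive: systemd can report vdsmd as running while only the respawn wrapper
is alive, so the port check is the more telling of the two.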

Now, for your question: you can use the manual fence option (confirm host has
been rebooted), which will free the SPM role from the faulty host, and the
engine will then elect a new SPM.

Haim

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[Users] SPM host in unknown status

2012-05-27 Thread T-Sinjon
Description of problem:

I have 2 nodes:
ovirt-node1.local   Non Responsive  SPM
ovirt-node2.local   Up  None

The SPM node is stuck in Non Responsive status and cannot be activated.
All VMs on the node went into Unknown status, and the master VM domain became
inactive.

When I try the "Maintenance" action on node1, it says:
Error: Cannot switch Host to Maintenance mode.
Host still has running VMs on it and is in Non-Responsive state.

But there is no VM running on node1; it only has 2 VMs in Unknown status.

Because I can't activate the SPM host, I can't activate the VM storage domain.

1. How can I migrate the SPM role to another host in my data center, such as
node2?
2. How can I bring node1 back to Up status? (I have done the 'confirm the host
has been Rebooted' action and rebooted node1, but it had no effect.)
