Hi,

On our second oVirt setup, running 3.4.0-1.el6 (and running fine until then), I did a yum upgrade on the engine (...sigh...).
Then I rebooted the engine.
This machine also hosts the NFS export domain.
Though the VMs are still running, the storage domain is in invalid status. You'll find the engine.log below.

At first sight, I thought it was the same issue as:
http://lists.ovirt.org/pipermail/users/2014-March/022161.html
because it looked very similar.
But the NFS export domain connection seemed OK (tested).
I tried every trick I could think of: restarting, checking everything...
Our cluster stayed in a broken state.

On a second look, I saw that after the engine rebooted, the NFS export domain was not mounted correctly (I had written a static /dev/sd-something in fstab, and the iSCSI manager changed the letter. Next time, I'll use LVM or a label).
So the NFS share being served was void/empty/a black hole.
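
A check along these lines could flag such fstab entries before a reboot does. It is only a sketch: the function name, the regular expression and the sample fstab content are hypothetical, and it only looks for raw /dev/sdX names, nothing else.

```python
import re

# Kernel-assigned disk names such as /dev/sdb or /dev/sdc1.
# These can change between boots (for example when iSCSI devices are
# discovered in a different order).
UNSTABLE_DEVICE = re.compile(r"^/dev/sd[a-z]+\d*$")


def unstable_fstab_entries(fstab_text):
    """Return (device, mount_point) pairs whose device is a raw /dev/sdX name."""
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[1]
        if UNSTABLE_DEVICE.match(device):
            flagged.append((device, mount_point))
    return flagged


# Hypothetical fstab content, for illustration only.
SAMPLE = """\
# /etc/fstab
/dev/sdb1             /exports/ovirt  ext4  defaults  0 0
LABEL=export          /exports/other  ext4  defaults  0 0
/dev/mapper/vg0-root  /               ext4  defaults  1 1
"""

if __name__ == "__main__":
    for device, mount_point in unstable_fstab_entries(SAMPLE):
        print(f"{mount_point}: {device} may change after a reboot; "
              "consider LABEL=, UUID= or an LVM volume")
```

Entries that use LABEL=, UUID= or a /dev/mapper path are left alone, since those do not depend on the order in which disks are detected.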

I just realized all of the above, and spent my afternoon in a cold sweat.
Correcting the NFS mount and restarting the engine did the trick.
What still disturbs me is that the unavailability of the NFS export domain should NOT be a reason for the MASTER storage domain to break!

Following the URL above and the BZ opened by the user (https://bugzilla.redhat.com/show_bug.cgi?id=1072900), I see this has been corrected in 3.4.1. But what about an NFS export domain that is perfectly connected, but empty?

PS: I see no 3.4.1 update in the CentOS repo.

Regards,

--------------------------


The engine log:


2014-05-09 14:40:37,767 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] spmStart polling started: taskId = 6d612398-fdad-49f2-9874-5f32a9bf87e2
2014-05-09 14:40:40,848 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] Failed in HSMGetTaskStatusVDS method
2014-05-09 14:40:40,850 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] spmStart polling ended: taskId = 6d612398-fdad-49f2-9874-5f32a9bf87e2 task status = finished
2014-05-09 14:40:40,850 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] Start SPM Task failed - result: cleanSuccess, message: VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Storage domain does not exist, code = 358
2014-05-09 14:40:40,913 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] spmStart polling ended, spm status: Free
2014-05-09 14:40:40,932 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] START, HSMClearTaskVDSCommand(HostName = serv-vm-adm17, HostId = 049943eb-2bcc-4167-a780-7ef76a1f95e9, taskId=6d612398-fdad-49f2-9874-5f32a9bf87e2), log id: 5cfdc8ce
2014-05-09 14:40:40,982 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] FINISH, HSMClearTaskVDSCommand, log id: 5cfdc8ce
2014-05-09 14:40:40,983 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] FINISH, SpmStartVDSCommand, return: org.ovirt.engine.core.common.businessentities.SpmStatusResult@39471ba9, log id: 58ec77ee
2014-05-09 14:40:40,985 INFO [org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] Running command: SetStoragePoolStatusCommand internal: true. Entities affected : ID: 5849b030-626e-47cb-ad90-3ce782d831b3 Type: StoragePool
2014-05-09 14:40:41,009 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-28) [6b69119f] Correlation ID: 6b69119f, Call Stack: null, Custom Event ID: -1, Message: Invalid status on Data Center Etat-Major3. Setting status to Non Responsive.
2014-05-09 14:40:41,017 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed
2014-05-09 14:40:41,112 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] Irs placed on server 049943eb-2bcc-4167-a780-7ef76a1f95e9 failed. Proceed Failover
2014-05-09 14:40:41,206 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] hostFromVds::selectedVds - serv-vm-adm16, spmStatus Free, storage pool Etat-Major3
2014-05-09 14:40:41,209 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] starting spm on vds serv-vm-adm16, storage pool Etat-Major3, prevId -1, LVER -1
2014-05-09 14:40:41,227 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] START, SpmStartVDSCommand(HostName = serv-vm-adm16, HostId = 13a2bc0a-979a-4fcd-8597-06131030d9a0, storagePoolId = 5849b030-626e-47cb-ad90-3ce782d831b3, prevId=-1, prevLVER=-1, storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=false), log id: 67d013a4
2014-05-09 14:40:41,292 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] spmStart polling started: taskId = 1046fd3e-71e4-4fcd-bbd0-f17cd6dc08e4
2014-05-09 14:40:44,438 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] Failed in HSMGetTaskStatusVDS method


--
Nicolas Ecarnot
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
