Hello oVirt Users Community,

I've been working with Red Hat, RHEL, and its clones for about 11 years, though I still consider myself an amateur, mostly because I'm more of a networking guy. :) I'm a one-man IT department, so I get very little time to tinker.

I'm evaluating oVirt (because the boss said no to VMware) and will likely begin implementation soon to virtualize our datacenter. For now I have a SuperMicro Twin2 (4-node) system and a cheap managed L2+ switch to work with: dual 6-core Xeons and 24GB of RAM per node. The two on-board 82574Ls are bonded with 802.3ad, and there have been no issues there (so far). Each node currently has two 1TB WD RE4 SATA drives configured as RAID1 using the Intel RAID BIOS, which I understand is software RAID. That's all working fine; I set it up this way so that if a drive dies I can still boot the machine(s). Each node has a 500MB ext4 partition for /boot, a 48GB ext4 root, 24GB of swap, and the rest (800-something GB) on LVM formatted XFS for Gluster.
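
For what it's worth, the Gluster portion on each node was laid down roughly like this (the device name, VG/LV names, and brick mount point below are placeholders from memory, not exact copies of what's on the boxes):

    # leftover space on the mirrored disks, set aside for the Gluster brick
    pvcreate /dev/md126p5                 # placeholder device name
    vgcreate vg_gluster /dev/md126p5
    lvcreate -l 100%FREE -n lv_brick vg_gluster
    mkfs.xfs -i size=512 /dev/vg_gluster/lv_brick   # 512-byte inodes, as generally recommended for Gluster bricks
    mkdir -p /gluster/brick1
    echo '/dev/vg_gluster/lv_brick /gluster/brick1 xfs defaults 0 0' >> /etc/fstab
    mount /gluster/brick1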

I've been following Jason Brooks' "Up and Running with oVirt" guides (which are great, BTW!). I have the cluster up and running with CentOS 7 and oVirt 3.5, the hosted engine on CentOS 6.6, and CTDB hosting a virtual IP for the engine NFS mount. There are a couple of test VMs running along with the engine on various nodes. I found it interesting that I was able to upload a ripped ISO of Win 2k3 Enterprise (not SP2) and boot it successfully, after which I promptly installed SP2 and the oVirt guest tools. I do very little with Windows, but there's always that one remaining customer that needs IIS, and we're not about to buy a new Windows Server 2012 license just for them.
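
For the CTDB piece, the relevant config looks roughly like this (the IPs, interface name, and lock path are placeholders; I'm going from memory rather than pasting the real files):

    # /etc/ctdb/nodes -- one private IP per node
    10.0.0.1
    10.0.0.2
    10.0.0.3
    10.0.0.4

    # /etc/ctdb/public_addresses -- the floating VIP that the engine's NFS mount points at
    10.0.0.100/24 ovirtmgmt

    # /etc/sysconfig/ctdb (excerpt)
    CTDB_RECOVERY_LOCK=/gluster/lock/lockfile
    CTDB_NODES=/etc/ctdb/nodes
    CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses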

So anyway, I'm having a problem with node reboots. They simply will not shut down and reboot cleanly. Instead, it looks like they hang after all processes are shut down, or at least after shutdown has been attempted. Then, after a couple of minutes, the hardware watchdog resets the system. I've come to the conclusion that sanlock and/or wdmd is causing the hang-up. I'm guessing an active but non-responsive NFS mount is the culprit, possibly the ISO domain NFS mount, which lives on the engine. I've tried manually shutting down all the oVirt, VDSM, etc. processes and unmounting all NFS shares, but it seems sanlock still has a hold on something under /rhev/.. I've Googled a bit and have come across posts about this as well. Any tips here?
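
To be specific, the sequence I've been trying by hand before a reboot looks roughly like this (typed from memory, so treat the exact commands and mount path as approximate):

    # put the host into hosted-engine local maintenance so the HA agent stands down
    hosted-engine --set-maintenance --mode=local

    # stop the oVirt/VDSM services
    systemctl stop ovirt-ha-agent ovirt-ha-broker vdsmd

    # see which lockspaces/resources sanlock still holds
    sanlock client status

    # lazy-unmount the NFS domains (a hung NFS server won't release a normal umount)
    umount -l /rhev/data-center/mnt/<iso-domain-server>:_path_to_iso

    # then try to shut sanlock down cleanly
    sanlock client shutdown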

Then I experienced something else odd yesterday. I did a yum update for the glibc vulnerability stuff. Gluster was updated as well, which really threw a wrench into things because I wasn't paying attention and quorum broke, etc.; I got that fixed. I then rebooted all nodes (which is when I found the sanlock/watchdog problem). Nodes 2, 3, and 4 came back up, but node1 did not. I logged into the IP-KVM console and found that it had no network configuration: all of the /etc/sysconfig/network-scripts/ifcfg-* files were gone. I was able to manually reconfigure the physical interfaces, set the bonding back up, and add the ovirtmgmt bridge. The engine then reported the host as non-operational due to '..does not comply with cluster default networks... ovirtmgmt missing', which I resolved by reconfiguring the host's network in the engine GUI, and all is now well. I'm just curious how/why the ifcfg files were wiped out? I haven't touched the network config on any host since running hosted-engine --deploy.
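
In case it matters, what I put back by hand was essentially this (NIC names and addresses below are placeholders, not the real values):

    # /etc/sysconfig/network-scripts/ifcfg-eth0 (and the same for ifcfg-eth1)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    BOOTPROTO=none
    ONBOOT=yes
    NM_CONTROLLED=no

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    BONDING_OPTS="mode=802.3ad miimon=100"
    BRIDGE=ovirtmgmt
    ONBOOT=yes
    NM_CONTROLLED=no

    # /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt
    DEVICE=ovirtmgmt
    TYPE=Bridge
    DELAY=0
    BOOTPROTO=none
    IPADDR=10.0.0.11
    PREFIX=24
    GATEWAY=10.0.0.254
    ONBOOT=yes
    NM_CONTROLLED=no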

Please forgive my ignorance and point me to the correct place if these issues have been discussed and/or resolved already.

And overall I'm very much liking oVirt, especially as a viable and cost-effective alternative to vSphere.

Thanks,
George