On Thu, Dec 29, 2016 at 10:53 AM, Jim Kusznir <[email protected]> wrote:
> Hello:
>
> I've been involved in virtualization from its very early days, and have
> been running Linux virtualization solutions off and on for a decade.
> Previously, I was always frustrated by the long feature lists offered by
> many Linux virtualization systems with no reasonable way to manage it
> all. It seemed that I had to spend an inordinate amount of time doing
> everything by hand. Thus, when I found oVirt, I was ecstatic!
> Unfortunately, at that time I changed employment (or rather left
> employment and became self-employed), and didn't have any reason to
> build my own virt cluster...until now!
>
> So I'm back with oVirt, and actually deploying a small 3-node cluster.
> I intend to run on it:
>
> VoIP server
> Web server
> Business backend server
> UniFi management server
> Monitoring server (Zabbix)
>
> Not a heavy load, and 3 servers is probably overkill, but I need this
> to work, and it sounds like 3 is the magic entry level for all the
> cluster/failover stuff to work. For now, my intent is to use a single
> SSD on each node with Gluster for the storage backend. I figure that if
> all the failover stuff actually works, losing a node to disk failure is
> not the end of the world; I can rebuild it, reconnect Gluster, and
> restart everything. As this is for a startup business, funds are thin
> at the moment, so I'm trying to cut a couple of corners that don't
> affect overall reliability. If this side of the business grows, I would
> likely invest in some dedicated servers.

Welcome back to oVirt :)

> So far, I've based my efforts on this guide from oVirt's website:
> http://www.ovirt.org/blog/2016/08/up-and-running-with-ovirt-4-0-and-gluster-storage/
>
> My cluster is currently functioning, but not entirely correctly. Some
> of that is gut feel, some of it is specific test cases (more to
> follow). First, some areas that lacked clarity, and the choices I made
> in them:
>
> Early on, Jason talks about using a dedicated gluster network for
> gluster storage syncing. I liked that idea, and as I had 4 NICs on each
> machine, I thought dedicating one or two to gluster would be fine. So,
> on my clean, bare machines, I set up another network on private NICs
> and put it on a standalone switch. I added hostnames with a designator
> (-g on the end) for the private IPs of all three nodes into /etc/hosts
> on all three nodes, so now each node can resolve itself and the other
> nodes by the -g name (and private IP) as well as by their main hostname
> and "more public" (but not public) IP.
>
> Then, for gdeploy, I put the hostnames in as the -g hostnames, as I
> didn't see anywhere to tell gluster to use the private network. I think
> this is a place I went wrong, but didn't realize it until the end....

The -g hostnames are the right ones to put in for gdeploy. gdeploy peer
probes the cluster and creates the gluster volumes, so it needs the
gluster-specific IP addresses.

> I set up the gdeploy script (it took a few tries, and a few OS
> rebuilds, to get it just right...), and ran it, and it was successful!
> When complete, I had a working gluster cluster and the right software
> installed on each node!

Were these errors specific to the gdeploy configuration? With the latest
release of gdeploy, there's an option "skip_<section-name>_errors". This
could help avoid the OS rebuilds, I think.
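To illustrate (all names and addresses below are made up, not taken from
your actual setup): the /etc/hosts entries on each node would look
something like

    # management network ("more public", used by oVirt)
    192.168.1.11   ovirt1
    192.168.1.12   ovirt2
    192.168.1.13   ovirt3
    # private gluster network
    10.0.0.11      ovirt1-g
    10.0.0.12      ovirt2-g
    10.0.0.13      ovirt3-g

and the gdeploy configuration would then carry the -g names in its
[hosts] section, so the peer probe and the bricks are tied to the
private network. A minimal sketch (the volume section here is
illustrative only):

    [hosts]
    ovirt1-g
    ovirt2-g
    ovirt3-g

    [volume]
    action=create
    volname=engine
    brick_dirs=/gluster/brick1/engine
    replica=yes
    replica_count=3
    force=yes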
> I set up the engine on node1, and that worked, and I was able to log in
> to the web GUI. I mistakenly skipped the web GUI "enable gluster
> service" step before doing the engine VM reboot to complete the engine
> setup process, but I did go back in after the reboot and do that. After
> doing that, I was notified in the GUI that there were additional nodes,
> and asked whether I wanted to add them. Initially, I skipped that and
> went back to the command line as Jason suggests. Unfortunately, it
> could not find any other nodes through his method, and it didn't work.
> Combine that with the warnings that I should not be using the
> command-line method and that it would be removed in the next release,
> and I went back to the GUI and attempted to add the nodes that way.
>
> Here's where things appeared to go wrong... It showed me two additional
> nodes, but ONLY by their -g (private gluster) hostname. And the SSH
> fingerprints were not populated, so it would not let me proceed. After
> messing with this for a bit, I realized that the engine cannot get to
> the nodes via the gluster interface (and as far as I knew, it
> shouldn't). Working late at night, I let myself "hack it up" a bit, and
> on the engine VM, I added /etc/hosts entries for the -g hostnames
> pointing to the main IPs. It then populated the SSH host keys and let
> me add them in. OK, so things appear to be working...kinda. I noticed
> at this point that ALL aspects of the GUI became VERY slow. Clicking in
> and typing in any field felt like I was on ssh over a satellite link.
> Everything felt a bit worse than the early days of vSphere...
> Painfully slow. But it was still working, so I pressed on.

The Import Host flow lists the peers as gluster understands them, hence
the -g (private gluster) hostnames. Rather than importing the hosts, you
should add the additional hosts using the Add Host flow, and specify the
non "-g" hostname. This ensures that oVirt knows each host by its
non-private hostname. Once the hosts are added, mark the gluster network
on the gluster interface so that the bricks are correctly identified via
the -g hostname.

> I configured gluster storage. Eventually I was successful, but
> initially it would only let me add a "Data" storage domain; the
> drop-down menu did NOT contain ISO, Export, or anything else...
> Somehow, on its own, after leaving and re-entering that tab a few
> times, ISO and Export materialized in the menu, so I was able to
> finish that setup.
>
> OK, all looks good. I wanted to try out his little tip on adding a VM,
> too. I saw "ovirt-image-repository" in the "external providers"
> section, but he mentioned it in the storage section. It wasn't there on
> mine, and in external providers, I couldn't find any way to do anything
> useful. I tried and fumbled with this, and still I have not figured out
> how to use this feature. It would be nice....
>
> Anyway, I moved on for now. As I was skeptical that things were set up
> correctly, I tried putting node 1 (which was running my engine, and was
> NOT set up with the -g hostname) into maintenance mode, to see if it
> really did fail over smoothly. It failed to go into maintenance mode (I
> left it for 12 hours, too!). I suspect it's because of the
> hostnames/networks in use.
>
> Oh, I forgot to mention... I did follow the instructions in Jason's
> guide to set up the gluster network in oVirt and map it to the right
> physical interface on all 3 nodes. I also moved migration from the main
> network to the gluster network, as Jason had suggested.
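Once the hosts are added by their main hostnames and the gluster network
is marked, you can cross-check from any node which names gluster is
actually using for the peers and bricks. A quick sanity check (the
volume name "engine" is just an example):

    # Peer names as gluster knows them -- these should be the -g names
    gluster peer status

    # Brick list for a volume; the host part of each brick should also
    # be the -g name
    gluster volume info engine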
> So... How badly did I do? How do I fix the issues? (I'm not opposed to
> starting from scratch again, either... I've already done that 3-4
> times in the early phases of getting the gdeploy script down, and I
> already have kickstart files set up with a network environment... I
> was rebuilding that often! I just need to know how to fix my setup
> this time....)
>
> I do greatly appreciate others' help and insight. I am in the IRC
> channel as kusznir currently, too.
>
> --Jim
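On the maintenance question: if node 1 is the host currently running the
engine VM, it cannot finish entering maintenance until the engine VM can
migrate to another hosted-engine-capable host. Assuming this is the
hosted-engine deployment from the guide, the HA state can be checked
from any node:

    # Shows, per host, the hosted-engine HA score and which host is
    # currently running the engine VM
    hosted-engine --vm-status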

