Re: [ovirt-users] [Users] Migrate cluster 3.3 -> 3.4 hosted on existing hosts
On 4/2/2014 1:58 AM, Yedidyah Bar David wrote:

> ----- Original Message -----
> From: "Ted Miller"
> To: "users"
> Sent: Tuesday, April 1, 2014 10:40:38 PM
> Subject: [Users] Migrate cluster 3.3 -> 3.4 hosted on existing hosts
>
> Current setup:
> * 3 identical hosts running on HP GL180 g5 servers
> * gluster running 5 volumes in replica 3
> * engine running on VMWare Server on another computer (that computer is NOT
>   available to convert to a host)
>
> Where I want to end up:
> * 3 identical hosted-engine hosts running on HP GL180 g5 servers
> * gluster running 6 volumes in replica 3
>   * new volume will be nfs storage for engine VM
> * hosted engine in oVirt VM
> * as few changes to current setup as possible
>
> The two pages I found on the wiki are: Hosted Engine Howto and Migrate to
> Hosted Engine. Both were written during the testing process, and have not
> been updated to reflect production status. I don't know if anything in the
> process has changed since they were written.

Basically things remained the same, with some details changing perhaps.

> Process outlined in above two pages (as I understand it):
> * have nfs file store ready to hold VM
> * Do minimal install (not clear if ovirt node, Centos, or Fedora was
>   used -- I am Centos-based)

Fedora/Centos/RHEL are supposed to work. ovirt node is currently not supported - iirc it's planned to be supported soon, not sure.

> * # yum install ovirt-hosted-engine-setup
> * # hosted-engine --deploy
> * Install OS on VM
> * return to host console at "Please install the engine in the VM" prompt on host
> * on VM console: # yum install ovirt-engine
> * on old engine:
>   service ovirt-engine stop
>   chkconfig ovirt-engine off
> * set up dns for new engine
> * # engine-backup --mode=backup --file=backup1 --log=backup1.log
> * scp backup file to new engine VM
> * on new VM:

Please see [1]. Specifically, if you had a local db, you'll first have to create it yourself.
[1] http://www.ovirt.org/Ovirt-engine-backup#Howto

> # engine-backup --mode=restore --file=backup1 --log=backup1-restore.log --change-db-credentials --db-host=didi-lap --db-user=engine --db-password --db-name=engine

The above assumes a db was already created and ready to use (access etc.) using the supplied credentials. You'll naturally have to provide your own.

> * # engine-setup
> * on host: run script until: "The system will wait until the VM is down."
> * on new VM: # reboot
> * on Host: finish script
>
> My questions:
> 1. Is the above still the recommended way to do a hosted-engine install?

Yes.

> 2. Will it blow up at me if I use my existing host (with glusterfs all set
>    up, etc) as the starting point, instead of a clean install?

a. Probably yes, for now. I did not hear much about testing such a migration using an existing host - ovirt or gluster or both. I did not test that myself either. If at all possible, you should use a new clean host. Do plan well and test. Also see discussions on the mailing lists, e.g. this one:
http://lists.ovirt.org/pipermail/users/2014-March/thread.html#22441

Good luck, and please report back!

I have good news and bad news. I migrated the 3 host cluster from 3.4 to 3.4 hosted. The process went fairly smoothly. Engine ran, I was able to add the three hosts to the engine's domain, etc. That was all working about Thursday. (I did not get fencing set up.)

Friday, at the end of the day, I shut down the entire system (it is not yet in production) because I was leaving for a week's vacation/holiday. I am fairly certain that I put the system into global maintenance mode before shutting down. I know I shut down the engine before shutting down the hosts.

Monday (10 days later) I came back from vacation and powered up the three machines. The hosts came up fine, but the engine will not start. (I found some gluster split-brain errors, and chased that for a couple of days, until I realized that the split-brain was not the fundamental problem.)
During bootup /var/log/messages shows:

May 21 19:22:00 s2 ovirt-ha-broker mgmt_bridge.MgmtBridge ERROR Failed to getVdsCapabilities: VDSM initialization timeout
May 21 19:22:00 s2 ovirt-ha-broker mem_free.MemFree ERROR Failed to getVdsStats: VDSM initialization timeout
May 21 19:22:00 s2 ovirt-ha-broker cpu_load_no_engine.EngineHealth ERROR Failed to getVmStats: VDSM initialization timeout
May 21 19:22:00 s2 ovirt-ha-broker engine_health.CpuLoadNoEngine ERROR Failed to getVmStats: VDSM initialization timeout
May 21 19:22:03 s2 vdsm vds WARNING Unable to load the json rpc server module. Please make sure it is installed.

and then /var/log/ovirt-hosted-engine-ha/agent.log shows:

MainThread::ERROR::2014-05-21 19:22:04,198::hosted_engine::414::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Failed trying to connect storage:
MainThread::CRITICAL::2014-05-21 19:22:04,199::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::
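For readers following this thread: the shutdown sequence described above (global maintenance, then the engine, then the hosts) corresponds roughly to the hosted-engine commands below. This is a hedged sketch, not an official procedure; run the commands on one of the hosts.

```shell
# Sketch of the shutdown/startup order described in this thread.
hosted-engine --set-maintenance --mode=global   # stop the HA agents from restarting the engine
hosted-engine --vm-shutdown                     # cleanly shut down the engine VM
# ...power off the hosts; later, after powering them back on:
hosted-engine --set-maintenance --mode=none     # let the HA agents manage the engine again
hosted-engine --vm-status                       # check what the agents think is going on
```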
Re: [ovirt-users] sanlock + gluster recovery -- RFE
Vijay, I am not a member of the developer list, so my comments are at the end.

On 5/23/2014 6:55 AM, Vijay Bellur wrote:
On 05/21/2014 10:22 PM, Federico Simoncelli wrote:

----- Original Message -----
From: "Giuseppe Ragusa"
To: fsimo...@redhat.com
Cc: users@ovirt.org
Sent: Wednesday, May 21, 2014 5:15:30 PM
Subject: sanlock + gluster recovery -- RFE

Hi,

----- Original Message -----
From: "Ted Miller"
To: "users"
Sent: Tuesday, May 20, 2014 11:31:42 PM
Subject: [ovirt-users] sanlock + gluster recovery -- RFE

As you are aware, there is an ongoing split-brain problem with running sanlock on replicated gluster storage. Personally, I believe that this is the 5th time that I have been bitten by this sanlock+gluster problem.

I believe that the following are true (if not, my entire request is probably off base).

* ovirt uses sanlock in such a way that when the sanlock storage is on a replicated gluster file system, very small storage disruptions can result in a gluster split-brain on the sanlock space

Although this is possible (at the moment) we are working hard to avoid it. The hardest part here is to ensure that the gluster volume is properly configured. The suggested configuration for a volume to be used with ovirt is:

Volume Name: (...)
Type: Replicate
Volume ID: (...)
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
(...three bricks...)
Options Reconfigured:
network.ping-timeout: 10
cluster.quorum-type: auto

The two options ping-timeout and quorum-type are really important. You would also need a build where this bug is fixed in order to avoid any chance of a split-brain:

https://bugzilla.redhat.com/show_bug.cgi?id=1066996

It seems that the aforementioned bug is peculiar to 3-brick setups. I understand that a 3-brick setup can allow proper quorum formation without resorting to the "first-configured-brick-has-more-weight" convention used with only 2 bricks and quorum "auto" (which makes one node "special", so not properly any-single-fault tolerant).

Correct.
But, since we are on ovirt-users, is there a similar suggested configuration for a 2-host oVirt+GlusterFS setup with oVirt-side power management properly configured and tested working? I mean a configuration where "any" host can go south and oVirt (through the other one) fences it (forcibly powering it off with confirmation from IPMI or similar), then restarts HA-marked VMs that were running there, all the while keeping the underlying GlusterFS-based storage domains responsive and readable/writeable (maybe apart from a lapse between detected other-node unresponsiveness and confirmed fencing)?

We already had a discussion with gluster asking if it was possible to add fencing to the replica 2 quorum/consistency mechanism. The idea is that as soon as you can't replicate a write you have to freeze all I/O until either the connection is re-established or you know that the other host has been killed. Adding Vijay.

There is a related thread on gluster-devel [1] to have a better behavior in GlusterFS for prevention of split-brains with sanlock and 2-way replicated gluster volumes. Please feel free to comment on the proposal there.

Thanks,
Vijay

[1] http://supercolony.gluster.org/pipermail/gluster-devel/2014-May/040751.html

One quick note before my main comment: I see references to quorum being "N/2 + 1". Isn't it more accurate to say that quorum is "(N + 1)/2" or "N/2 + 0.5"?

Now to my main comment. I see a case that is not being addressed. I have no proof of how often this use-case occurs, but I believe that it does occur. (It could (theoretically) occur in any situation where multiple bricks are writing to different parts of the same file.)

Use-case: sanlock via fuse client.
Steps to produce originally (not tested for reproducibility, because I was unable to recover the ovirt cluster after the occurrence and had to rebuild from scratch; time frame was late 2013 or early 2014):

* 2 node ovirt cluster using replicated gluster storage
* ovirt cluster up and running VMs
* remove power from network switch
* restore power to network switch after a few minutes

Result:

* both copies of .../dom_md/ids file accused the other of being out of sync

Hypothesis of cause (servers -- ovirt nodes and gluster bricks -- are called A and B). At the moment when network communication was lost, or just a moment after:

* A had written to local ids file
* A had started process to send write to B
* A had not received write confirmation from B

and

* B had written to local ids file
* B had started process to send write to A
* B had not received write confirmation from A

Thus, each file had a segment that had been written to the local file, but had not been confirmed written on the remote file. Each file correctly accused the other of being out of sync.
Re: [ovirt-users] sanlock + gluster recovery -- RFE
On 5/21/2014 11:15 AM, Giuseppe Ragusa wrote:

Hi,

> ----- Original Message -----
> > From: "Ted Miller"
> > To: "users"
> > Sent: Tuesday, May 20, 2014 11:31:42 PM
> > Subject: [ovirt-users] sanlock + gluster recovery -- RFE
> >
> > As you are aware, there is an ongoing split-brain problem with running
> > sanlock on replicated gluster storage. Personally, I believe that this is
> > the 5th time that I have been bitten by this sanlock+gluster problem.
> >
> > I believe that the following are true (if not, my entire request is
> > probably off base).
> >
> > * ovirt uses sanlock in such a way that when the sanlock storage is on a
> >   replicated gluster file system, very small storage disruptions can
> >   result in a gluster split-brain on the sanlock space
>
> Although this is possible (at the moment) we are working hard to avoid it.
> The hardest part here is to ensure that the gluster volume is properly
> configured.
>
> The suggested configuration for a volume to be used with ovirt is:
>
> Volume Name: (...)
> Type: Replicate
> Volume ID: (...)
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> (...three bricks...)
> Options Reconfigured:
> network.ping-timeout: 10
> cluster.quorum-type: auto
>
> The two options ping-timeout and quorum-type are really important.
>
> You would also need a build where this bug is fixed in order to avoid any
> chance of a split-brain:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1066996

It seems that the aforementioned bug is peculiar to 3-brick setups. I understand that a 3-brick setup can allow proper quorum formation without resorting to the "first-configured-brick-has-more-weight" convention used with only 2 bricks and quorum "auto" (which makes one node "special", so not properly any-single-fault tolerant).

But, since we are on ovirt-users, is there a similar suggested configuration for a 2-host oVirt+GlusterFS setup with oVirt-side power management properly configured and tested working?
I mean a configuration where "any" host can go south and oVirt (through the other one) fences it (forcibly powering it off with confirmation from IPMI or similar), then restarts HA-marked VMs that were running there, all the while keeping the underlying GlusterFS-based storage domains responsive and readable/writeable (maybe apart from a lapse between detected other-node unresponsiveness and confirmed fencing)?

Furthermore: is such a suggested configuration possible in a self-hosted-engine scenario?

Regards,
Giuseppe

> > How did I get into this mess?
> >
> > ...
> >
> > What I would like to see in ovirt to help me (and others like me).
> > Alternates listed in order from most desirable (automatic) to least
> > desirable (set of commands to type, with lots of variables to figure out).
>
> The real solution is to avoid the split-brain altogether. At the moment it
> seems that using the suggested configurations and the bug fix we shouldn't
> hit a split-brain.
>
> > 1. automagic recovery
> > 2. recovery subcommand
> > 3. script
> > 4. commands
>
> I think that the commands to resolve a split-brain should be documented.
> I just started a page here:
>
> http://www.ovirt.org/Gluster_Storage_Domain_Reference

I suggest you add these lines to the Gluster configuration, as I have seen this come up multiple times on the Users list:

storage.owner-uid: 36
storage.owner-gid: 36

Ted Miller
Elkhart, IN, USA

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
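Taken together, the options recommended in this thread could be applied with `gluster volume set`. A sketch only -- "engVM1" is an illustrative volume name, and uid/gid 36 correspond to the vdsm user and kvm group on oVirt hosts:

```shell
# Sketch: apply the thread's recommended options to a replica volume.
gluster volume set engVM1 network.ping-timeout 10    # give up on a dead peer quickly
gluster volume set engVM1 cluster.quorum-type auto   # enforce write quorum on the replica set
gluster volume set engVM1 storage.owner-uid 36       # vdsm user
gluster volume set engVM1 storage.owner-gid 36       # kvm group
```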
[ovirt-users] sanlock + gluster recovery -- RFE
Itamar, I am addressing this to you because one of your assignments seems to be to coordinate other oVirt contributors when dealing with issues that are raised on the ovirt-users email list.

As you are aware, there is an ongoing split-brain problem with running sanlock on replicated gluster storage. Personally, I believe that this is the 5th time that I have been bitten by this sanlock+gluster problem.

I believe that the following are true (if not, my entire request is probably off base).

* ovirt uses sanlock in such a way that when the sanlock storage is on a replicated gluster file system, very small storage disruptions can result in a gluster split-brain on the sanlock space
  o gluster is aware of the problem, and is working on a different way of replicating data, which will reduce these problems.
* most (maybe all) of the sanlock locks have a short duration, measured in seconds
* there are only a couple of things that a user can safely do from the command line when a file is in split-brain
  o delete the file
  o rename (mv) the file

_How did I get into this mess?_

* had 3 hosts running ovirt 3.3
* each hosted VMs
* gluster replica 3 storage
* engine was external to cluster
* upgraded 3 hosts from ovirt 3.3 to 3.4
* hosted-engine deploy
  o used new gluster volume (accessed via nfs) for storage
  o storage was accessed using localhost:engVM1 link (localhost was probably a poor choice)
* created new engine on VM (did not transfer any data from old engine)
* added 3 hosts to new engine via web-gui
* ran above setup for 3 days
* shut entire system down before I left on vacation (holiday)
* came back from vacation, powered on hosts
* found that iptables did not have rules for gluster access (a continuing problem if host installation is allowed to set up firewall)
* added rules for gluster
* glusterfs now up and running
* added storage manually
* tried "hosted-engine --vm-start"; vm did not start, logs show sanlock errors
* ran "gluster volume heal engVM1 full"
* "gluster volume heal engVM1 info split-brain" showed 6
files in split-brain, all 5 prefixed by /rhev/data-center/mnt/localhost\:_engVM1

* UUID/dom_md/ids
* UUID/images/UUID/UUID (VM hard disk)
* UUID/images/UUID/UUID.lease
* UUID/ha_agent/hosted-engine.lockspace
* UUID/ha_agent/hosted-engine.metadata

* I copied each of the above files off of each of the three bricks to a safe place (15 files copied)
* I renamed the 5 files on /rhev/
* I copied the 5 files from one of the bricks to /rhev/
* files can now be read OK (e.g. cat ids)

sanlock.log shows error sets like these:

2014-05-20 03:23:39-0400 36199 [2843]: s3358 lockspace 5ebb3b40-a394-405b-bbac-4c0e21ccd659:1:/rhev/data-center/mnt/localhost:_engVM1/5ebb3b40-a394-405b-bbac-4c0e21ccd659/dom_md/ids:0
2014-05-20 03:23:39-0400 36199 [18873]: open error -5 /rhev/data-center/mnt/localhost:_engVM1/5ebb3b40-a394-405b-bbac-4c0e21ccd659/dom_md/ids
2014-05-20 03:23:39-0400 36199 [18873]: s3358 open_disk /rhev/data-center/mnt/localhost:_engVM1/5ebb3b40-a394-405b-bbac-4c0e21ccd659/dom_md/ids error -5
2014-05-20 03:23:40-0400 36200 [2843]: s3358 add_lockspace fail result -19

I am now stuck.

What I would like to see in ovirt to help me (and others like me). Alternates listed in order from most desirable (automatic) to least desirable (set of commands to type, with lots of variables to figure out).

1. automagic recovery
 * When a host is not able to access sanlock, it writes a small "problem" text file into the shared storage
   o the host-ID as part of the name (so only one host ever accesses that file)
   o a status number for the error causing problems
   o time stamp
   o time stamp when last sanlock lease will expire
   o if sanlock is able to access the file, the "problem" file is deleted
 * when time passes for its last sanlock lease to be expired, highest number host does a survey
   o did all other hosts create "problem" files?
   o do all "problem" files show same (or compatible) error codes related to file access problems?
   o are all hosts communicating by network?
   o if yes to all above:
     * delete all sanlock storage space
     * initialize sanlock from scratch
     * restart whatever may have given up because of sanlock
     * restart VM if necessary

2. recovery subcommand
 * add "hosted-engine --lock-initialize" command that would delete sanlock, start over from scratch

3. script
 * publish a script (in ovirt packages or available on web) which, when run, does all (or most) of the recovery process needed.

4. commands
 * publish on the web a "recipe" for dealing with files that commonly go split-brain
   o ids
   o *.lease
   o *.lockspace

Any chance of any help on any of the above levels?

Ted Miller
Elkhart, IN, USA
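As a starting point for level 4 ("commands"), the manual workaround used earlier in this thread -- back up every brick's copy, move the split-brain file aside on the mount, copy one brick's version back in -- might be sketched like this. A hedged illustration, not an official recipe: the function name and paths are made up, and it assumes you have already backed up all replicas and decided which brick holds the good copy.

```shell
# Illustrative only: resolve one split-brain file by declaring one brick's
# copy the good one. MOUNT is the FUSE mountpoint, GOOD_BRICK the brick
# directory you trust, REL the file's path relative to both.
recover_split_brain() {
    local mount="$1" good_brick="$2" rel="$3"
    # keep the mount's copy around in case this was the wrong choice
    mv "$mount/$rel" "$mount/$rel.split-brain.bak"
    # replace it with the copy from the chosen brick
    cp "$good_brick/$rel" "$mount/$rel"
}
```

On a real volume you would run this once per file reported by "gluster volume heal VOLNAME info split-brain", after first copying every brick's version somewhere safe.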
Re: [ovirt-users] SPM error
On 4/12/2014 12:23 PM, Itamar Heim wrote:
On 04/12/2014 03:40 PM, Maurice James wrote:
What did you do to try to fix the sanlock? Anything is better than nothing at this point

My thread is at http://lists.ovirt.org/pipermail/users/2014-January/020394.html

----- Original Message -----
From: "Ted Miller"
To: "Maurice James"
Sent: Friday, April 11, 2014 7:27:24 PM
Subject: Re: [ovirt-users] SPM error

I did receive some help on one stage of rebuilding my sanlock, but there were too many other things wrong to get it started again. Only advice I have is -- look at your sanlock logs, and see if you can find anything there that is helpful.

On 4/11/2014 7:23 PM, Maurice James wrote:
Nooo.

Sent from my Galaxy S®III

-------- Original message --------
From: Ted Miller
Date: 04/11/2014 7:08 PM (GMT-05:00)
To: Maurice James
Subject: Re: [ovirt-users] SPM error

I didn't, really. I did something wrong along the way, and ended up having to rebuild the engine and hosts. (My problems were due to a glusterfs split-brain.)

Ted Miller

On 4/11/2014 6:03 PM, Maurice James wrote:
How did you fix it?

Sent from my Galaxy S®III

-------- Original message --------
From: Ted Miller
Date: 04/11/2014 6:00 PM (GMT-05:00)
To: users@ovirt.org
Subject: Re: [ovirt-users] SPM error

On 4/11/2014 2:05 PM, Maurice James wrote:
I have an error trying to bring the master DC back online. After several reboots, no luck. I took the other cluster members offline to try to troubleshoot. The remaining host is constantly in contention with itself for SPM.

ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-40) [38d400ea] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed

I'm no expert, but the last time I beat my head on that rock, something was wrong with my sanlock storage. YMMV.

Ted Miller
Elkhart, IN, USA

Maurice - which type of storage is this?

-- "He is no fool who gives what he cannot keep, to gain what he cannot lose."
- Jim Elliot

For more information about Jim Elliot and his unusual life, see http://www.christianliteratureandliving.com/march2003/carolyn.html.

Ted Miller
Design Engineer
HCJB Global Technology Center, a ministry of Reach Beyond
2830 South 17th St
Elkhart, IN 46517
574-970-4272 my desk
574-970-4252 receptionist
Re: [ovirt-users] SPM error
On 4/11/2014 2:05 PM, Maurice James wrote:
I have an error trying to bring the master DC back online. After several reboots, no luck. I took the other cluster members offline to try to troubleshoot. The remaining host is constantly in contention with itself for SPM.

ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-40) [38d400ea] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed

I'm no expert, but the last time I beat my head on that rock, something was wrong with my sanlock storage. YMMV.

Ted Miller
Elkhart, IN, USA
[Users] Migrate cluster 3.3 -> 3.4 hosted on existing hosts
Current setup:

* 3 identical hosts running on HP GL180 g5 servers
  o gluster running 5 volumes in replica 3
* engine running on VMWare Server on another computer (that computer is NOT available to convert to a host)

Where I want to end up:

* 3 identical hosted-engine hosts running on HP GL180 g5 servers
  o gluster running 6 volumes in replica 3
    + new volume will be nfs storage for engine VM
* hosted engine in oVirt VM
* as few changes to current setup as possible

The two pages I found on the wiki are: Hosted Engine Howto <http://www.ovirt.org/Hosted_Engine_Howto> and Migrate to Hosted Engine <http://www.ovirt.org/Migrate_to_Hosted_Engine>. Both were written during the testing process, and have not been updated to reflect production status. I don't know if anything in the process has changed since they were written.

Process outlined in above two pages (as I understand it):

* have nfs file store ready to hold VM
* Do minimal install (not clear if ovirt node, Centos, or Fedora was used -- I am Centos-based)
* # yum install ovirt-hosted-engine-setup
* # hosted-engine --deploy
* Install OS on VM
* return to host console at "Please install the engine in the VM" prompt on host
* on VM console: # yum install ovirt-engine
* on old engine:
  service ovirt-engine stop
  chkconfig ovirt-engine off
* set up dns for new engine
* # engine-backup --mode=backup --file=backup1 --log=backup1.log
* scp backup file to new engine VM
* on new VM:
  # engine-backup --mode=restore --file=backup1 --log=backup1-restore.log --change-db-credentials --db-host=didi-lap --db-user=engine --db-password --db-name=engine
  # engine-setup
* on host: run script until: "The system will wait until the VM is down."
* on new VM: # reboot
* on Host: finish script

My questions:

1. Is the above still the recommended way to do a hosted-engine install?
2. Will it blow up at me if I use my existing host (with glusterfs all set up, etc) as the starting point, instead of a clean install?
Thank you for letting me benefit from your experience,

Ted Miller
Elkhart, IN, USA
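One step in the process above that trips people up is the restore on the new VM: engine-backup restores into an existing database, so with a local DB you create it first. A hedged sketch only -- the role name, database name, and password are illustrative, and --db-host/--file values are the example ones from the list above; substitute your own:

```shell
# Hedged sketch: prepare an empty engine DB on the new engine VM before restoring.
# Names and credentials are illustrative, not defaults.
su - postgres -c "psql -d template1 -c \"create role engine with login encrypted password 'engine';\""
su - postgres -c "psql -d template1 -c \"create database engine owner engine;\""

# Then restore into it and run setup, as in the process above:
engine-backup --mode=restore --file=backup1 --log=backup1-restore.log \
    --change-db-credentials --db-host=didi-lap --db-user=engine \
    --db-password --db-name=engine
engine-setup
```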
Re: [Users] Force certain VMs to be on different hosts
Scott,

On 3/19/2014 10:30 AM, Scott Ocken wrote:
Ted,
Yes! This is exactly what I was looking for. I think you described it better than I did. This feature would be really nice.
Thanks
Scott

Quoting Ted Miller:

I think what the OP is asking for is a designation as a "redundant group 1". He may have 10 hosts and 3 VMs in "redundant group 1". He doesn't care which hosts they run on, as long as they are three separate hosts.

I can see this as being fairly widely applicable. If you have multiple web servers for load sharing, you don't want them all running on the same host, because VM load is going to peak on them at the same times. oVirt has no way of knowing that unless you give oVirt a hint to spread things around. The web group might also want to split up the server that spreads the jobs around, and a database server used by all the web hosts. I can see easily ending up with a group of 5 machines (3 web servers, a load sharing controller, and a database server) that you want spread across any 5 of the 15 servers in a cluster, because their loads are all going to spike together. You don't want oVirt having to try to migrate some of them during a load spike, because oVirt noticed that a host with 3 of the 5 is overloaded.

Not my situation, but one I can see the usefulness of.

Ted Miller
Elkhart, IN, USA

On 3/18/2014 11:54 AM, Meital Bourvine wrote:
Hi Scott,
Click on a vm -> Edit -> Show Advanced Options -> Host -> "Start Running on"

----- Original Message -----
From: "Scott Ocken"
To: Users@ovirt.org
Sent: Tuesday, March 18, 2014 5:08:24 PM
Subject: [Users] Force certain VMs to be on different hosts

Is there a way to have certain VMs to be on different hosts? (assuming there are enough hosts) IE. I have a db cluster of 3 VMs. I would like each one to always be on different hosts. That way if a host goes down my db cluster is still happy while migration happens. Or if migration fails I am still good.
Thanks
Scott

Ted Miller:

It looks like the Negative Affinity/Anti-Affinity feature that Itamar Heim pointed out in his email, with a feature page at http://www.ovirt.org/Features/VM-Affinity, includes what you are trying to do. This is in 3.4, which is in the QA process now.

Ted Miller
Elkhart, IN, USA
Re: [Users] Force certain VMs to be on different hosts
I think what the OP is asking for is a designation as a "redundant group 1". He may have 10 hosts and 3 VMs in "redundant group 1". He doesn't care which hosts they run on, as long as they are three separate hosts.

I can see this as being fairly widely applicable. If you have multiple web servers for load sharing, you don't want them all running on the same host, because VM load is going to peak on them at the same times. oVirt has no way of knowing that unless you give oVirt a hint to spread things around. The web group might also want to split up the server that spreads the jobs around, and a database server used by all the web hosts. I can see easily ending up with a group of 5 machines (3 web servers, a load sharing controller, and a database server) that you want spread across any 5 of the 15 servers in a cluster, because their loads are all going to spike together. You don't want oVirt having to try to migrate some of them during a load spike, because oVirt noticed that a host with 3 of the 5 is overloaded.

Not my situation, but one I can see the usefulness of.

Ted Miller
Elkhart, IN, USA

On 3/18/2014 11:54 AM, Meital Bourvine wrote:
Hi Scott,
Click on a vm -> Edit -> Show Advanced Options -> Host -> "Start Running on"

----- Original Message -----
From: "Scott Ocken"
To: Users@ovirt.org
Sent: Tuesday, March 18, 2014 5:08:24 PM
Subject: [Users] Force certain VMs to be on different hosts

Is there a way to have certain VMs to be on different hosts? (assuming there are enough hosts) IE. I have a db cluster of 3 VMs. I would like each one to always be on different hosts. That way if a host goes down my db cluster is still happy while migration happens. Or if migration fails I am still good.

Thanks
Scott

-- "He is no fool who gives what he cannot keep, to gain what he cannot lose."
- Jim Elliot
Re: [Users] [RFI] GUI Changes for oVirt 4.0
On 3/18/2014 6:29 AM, Itamar Heim wrote:
> we are brainstorming on what should we change in the oVirt UI for 4.0. for
> current brainstorming phase, "anything goes" - i.e., I'd like us to ignore
> current limitations and flows, and envision/fantasize the "perfect
> solution". SO - what do YOU think we should consider for 4.0 UI concept,
> flows, etc.

I have an idea, though it may be relevant only to smaller setups. I would like one place that I can go and see the health of my system. Right now I am running a cluster in test mode, and I have to look at several places before I have confidence that all is well:

* Data Center
* Storage (I am using gluster)
* Hosts
* VMs

The natural place for me to look would seem to be the left bar. If the icons there had a color change to reflect status, I could just hit "Expand All" and the color would immediately tell me the system status. The same icons would work, with just a background or a little square of color.

I realize that my little 3-host system is the exception, because (so far) I can hit "Expand All" and still see the whole thing. I will not have to deploy many more VMs before it will not all fit. There may be a better way to do this, e.g. another choice between "Expand All" and "Collapse All" that would expand all except the lowest level. It would then show me categories like "Storage", "Networks", "Hosts", "Volumes" and "VMs" with a health color indication for each cluster. If I see anything I am not expecting, I can expand that heading and see the status of the individual items.

Whatever the mechanism, I know that it is somewhat frustrating to check first thing in the morning and have to do so many clicks before I have confidence that nothing bad happened overnight. I have actually found it faster to click on the "Events" tab and see if there are any nasty messages there, rather than checking current status.

I look forward to the insights of others as to how they monitor cluster status.
Ted Miller
Elkhart, IN, USA
Re: [Users] SPICE causes migration failure?
On 3/3/2014 12:26 PM, Dafna Ron wrote:
> I don't see a reason why an open monitor would fail migration - at most, if
> there is a problem, I would close the spice session on src and restart it at
> the dst. Can you please attach vdsm/libvirt/qemu logs from both hosts and
> engine logs so that we can see the migration failure reason?
> Thanks,
> Dafna

On 03/03/2014 05:16 PM, Ted Miller wrote:

I just got my Data Center running again, and am proceeding with some setup & testing. I created a VM (not doing anything useful). I clicked on the "Console" and had a SPICE console up (viewed in Win7). I had it printing the time on the screen once per second (while date; do sleep 1; done).

I tried to migrate the VM to another host and got in the GUI:

Migration started (VM: web1, Source: s1, Destination: s3, User: admin@internal).
Migration failed due to Error: Fatal error during migration (VM: web1, Source: s1, Destination: s3).

As I started the migration I happened to think "I wonder how they handle the SPICE console, since I think that is a link from the host to my machine, letting me see the VM's screen." After the failure, I tried shutting down the SPICE console, and found that the migration succeeded. I again opened SPICE and had a migration fail. Closed SPICE, migration failed.

I can understand how migrating SPICE is a problem, but at least could we give the victim of this condition a meaningful error message? I have seen a lot of questions about failed migrations (mostly due to attached CDs), but I have never seen this discussed. If I had not had that particular thought cross my brain at that particular time, I doubt that SPICE would have been where I went looking for a solution.

If this is the first time this issue has been raised, I am willing to file a bug.
Ted Miller
Elkhart, IN, USA

In finding the right one-minute slice of the logs, I saw something that makes me think this is due to a missing method in the glusterfs support. Others who understand more of what the logs are saying can verify or correct my hunch. I was trying to migrate from s2 to s1. Logs on fpaste.org:

http://ur1.ca/gr48c
http://ur1.ca/gr48r
http://ur1.ca/gr493
http://ur1.ca/gr49e
http://ur1.ca/gr49i
http://ur1.ca/gr49x
http://ur1.ca/gr4a6

Ted Miller
Elkhart, IN, USA
Re: [Users] Opinions needed: 3 node gluster replica 3 | NFS async | snapshots for consistency
The kernel NFS server does not get along with gluster. You need to run gluster's own NFS server and turn off the kernel NFS server. Gluster's own NFS server is gluster-aware, so I think some of the problems you envision may be covered by that server.

Ted Miller
Elkhart, IN, USA
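Ted's advice - kernel NFS off, Gluster's built-in NFS on - can be sketched as a short script. The volume name VM2 is a placeholder borrowed from elsewhere in this archive, the service/chkconfig calls assume an EL6-era host, and DRY_RUN=1 (the default) only prints the commands instead of running them on a live gluster node.

```shell
#!/bin/sh
# Sketch: serve a Gluster volume via Gluster's built-in NFS server instead of
# the kernel NFS server. VOL is a placeholder volume name.
VOL=${VOL:-VM2}
DRY_RUN=${DRY_RUN:-1}
LOG=""
run() {
  LOG="$LOG$*
"
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

run service nfs stop                             # stop the kernel NFS server (EL6 init script)
run chkconfig nfs off                            # keep it off across reboots
run gluster volume set "$VOL" nfs.disable off    # enable Gluster's own NFS export
run showmount -e localhost                       # verify the volume is now exported
```

The two servers both want the NFS RPC ports, which is why they cannot coexist on one host.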
Re: [Users] vmware image conversion
On 2/19/2014 8:54 PM, Bob Doolittle wrote:
Yes. So: VMware non-ESX -> ESX, using VMware's tool, then ESX -> RHEV using virt-v2v, no?

NO!!! You apparently have not done this, only talked about it? The last step (ESX -> RHEV using virt-v2v) doesn't work. It had problems with some images, and they pulled it completely from current versions of virt-v2v. See https://rhn.redhat.com/errata/RHBA-2013-1749.html
Ted Miller

If this is viable, it's easy to understand why nobody wants to put effort into supporting a bevy of VMware VM formats, when there's a tool already available to convert to one and they can focus on it. -Bob

On 02/19/2014 05:52 PM, Maurice James wrote:
I want to change it from VMware to RHEV/oVirt

-Original Message-
From: Bob Doolittle [mailto:b...@doolittle.us.com]
Sent: Wednesday, February 19, 2014 8:51 PM
To: Maurice James; 'Ted Miller'; users@ovirt.org
Subject: Re: [Users] vmware image conversion

My recollection is that VMware provides a converter to change your VMware non-ESX VMs into ESX format. Do you have to buy ESX to gain access to it? -Bob

On 02/19/2014 05:46 PM, Maurice James wrote:
I even opened a feature request that they closed pretty quickly with WONTFIX: https://bugzilla.redhat.com/show_bug.cgi?id=1062910 . Why is this such a touchy issue?

-Original Message-
From: users-boun...@ovirt.org [mailto:users-boun...@ovirt.org] On Behalf Of Ted Miller
Sent: Wednesday, February 19, 2014 7:28 PM
To: users@ovirt.org
Subject: Re: [Users] vmware image conversion

On 2/9/2014 4:27 PM, Itamar Heim wrote:
On 02/09/2014 10:28 PM, Maurice James wrote:
The instructions assume that I have an ESX instance to connect to. How do I do this with an already exported vmware image with no esx available to connect to?
I have a turnkey drupal vm in ovf format

-Original Message-
From: Itamar Heim [mailto:ih...@redhat.com]
Sent: Sunday, February 09, 2014 2:24 PM
To: Maurice James; 'users'
Subject: Re: [Users] vmware image conversion

On 02/09/2014 07:02 PM, Maurice James wrote:
According to this https://rhn.redhat.com/errata/RHBA-2013-1749.html it does not do it

please review: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.3/html-single/V2V_Guide/index.html

-Original Message-
From: Itamar Heim [mailto:ih...@redhat.com]
Sent: Sunday, February 09, 2014 4:52 AM
To: Maurice James; 'users'
Subject: Re: [Users] vmware image conversion

On 02/08/2014 04:18 PM, Maurice James wrote:
I submitted an RFE to have vmware image conversion added to 3.5. I think that is a key feature that is lacking. I'm just trying to get some eyes on it here. https://bugzilla.redhat.com/show_bug.cgi?id=1062910

can you comment on the gaps from virt-v2v which does this today (in the bug as well). thanks, Itamar

iirc, you need an ESX currently.

Some of us are stuck without a way to try out ovirt because of this. ESX is not the only platform that people run VMWare on. I am trying to bring over VMs from an old VMWare Server setup on Centos 5. Works fine, but there is no migration path. Other people may have VMs on VMWare Workstation or other, older products. We just get told to go fly a kite? If the only choice is to bring up a full-blown, working ESX instance, I may bring up ESXi and stay there.

Ted Miller
Elkhart, IN

--
"He is no fool who gives what he cannot keep, to gain what he cannot lose."
-- Jim Elliot

For more information about Jim Elliot and his unusual life, see http://www.christianliteratureandliving.com/march2003/carolyn.html.

Ted Miller
Design Engineer
HCJB Global Technology Center
2830 South 17th St
Elkhart, IN 46517
574-970-4272 my desk
574-970-4252 receptionist
Re: [Users] Asking for advice on hosted engine
On 2/17/2014 4:20 AM, Giorgio Bersano wrote:
Hello everybody, I discovered oVirt a couple of months ago when I was looking for the best way to manage our small infrastructure. I have read every document I considered useful, but I would like to receive advice from the many experts on this list. I think it warrants an introduction (I hope it doesn't bore you).

I work in a small local government entity and I try to manage our limited resources effectively. We have many years of experience with Linux and especially with CentOS, which we have deployed on PCs (e.g. for use as firewalls in remote locations) and moreover on servers. We have been using Xen virtualization from the early days of CentOS 5 and we have built positive experience with KVM too. I have to say that libvirt in a small environment like ours is really a nice tool. So nothing to regret.

Trying to go a little further, as already said, I stumbled upon oVirt and found the project intriguing. At the moment we are thinking of deploying it on a small environment of four very similar servers, each having:
- a couple of Xeon E5504
- 6 x 1Gb ethernet interfaces
- 40 GB of RAM
Two of them have 72 GB of disk (mirrored); two of them have almost 500 GB of useful RAID array. Moreover we have an HP iSCSI storage that should easily satisfy our current storage requirements.

So, given our small server pool, the necessity of another host just to run the supervisor seems too high a requirement. Enter "hosted engine" and the picture takes on brighter colors. Well, I'm usually not the adventurous guy, but after experimenting a little with oVirt 3.4 I developed better confidence. We would want to install the engine over the two hosts with smaller disks. As far as I know, installing hosted engine mandates NFS storage. But we want this to be highly available too, and possibly to have it on the very same hosts. Here is my solution: make a gluster replicated volume across the two hosts and take advantage of gluster's NFS server.
Then I put 127.0.0.1 as the address of the NFS server in hosted-engine-setup, so the host is always able to reach the storage server (itself). GlusterFS configuration is done outside of oVirt, which, regarding the engine's storage, doesn't even know that it's a gluster thing.

Relax, we've finally reached the point where I'm asking advice :-) Storage and virtualization experts, do you see in this configuration any pitfall that I've overlooked, given my inexperience in oVirt, Gluster, NFS or clustered filesystems? Do you think that not only is it feasible (I know it is, I made it and it's working now) but it's also reliable and dependable, and I'm not risking my neck on this setup? I've obviously made some tests, but I'm not at the confidence level of saying that all is right in the way it is designed. OK, I think I've already written too much; better I stop and humbly wait for your opinion, but I'm obviously here if any clarification on my part is needed. Thank you very much for reading to this point. Best Regards, Giorgio.

Giorgio,
Gluster on two hosts only is not a good idea. When installed for high reliability (quorum enabled), gluster requires that >50% of the nodes be working before anything can be written. When you have only two nodes, that means both nodes must be up before anything can happen. You can turn off quorum, but then you are almost guaranteeing yourself a split-brain headache the first time communication between the two hosts is interrupted, even briefly (been there, done that). oVirt is constantly writing to the storage, so if the hosts are not communicating you WILL get different things written to the same files on both servers, especially the sanlock files. This is called split-brain, and it will give you a splitting headache.

For replicated gluster to work well, you need a minimum of three gluster nodes in replica mode. Two nodes is a recipe for unhappiness: it is either low availability (quorum on) or a split-brain waiting to spring on you (quorum off).
You don't want either one. Figure out how to use some storage on a third computer to provide a third gluster node. That way only two of the three have to be working for things to keep working.

Ted Miller
Elkhart, IN
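Ted's quorum arithmetic reduces to a simple majority test: with server-side quorum, writes are allowed only while strictly more than half of the nodes are up. A minimal sketch (a simplified model of gluster's server-quorum rule, not the gluster code itself):

```shell
#!/bin/sh
# Sketch: majority-quorum check. Writes are permitted only when
# nodes_up * 2 > nodes_total, i.e. strictly more than half are alive.
quorum_ok() {  # usage: quorum_ok <nodes_up> <nodes_total>
  up=$1; total=$2
  if [ $(( up * 2 )) -gt "$total" ]; then echo yes; else echo no; fi
}

quorum_ok 1 2   # replica 2, one node down: no quorum, volume goes read-only
quorum_ok 2 3   # replica 3, one node down: still a majority, writes continue
```

This is exactly why a 2-node replica cannot tolerate any failure while a 3-node replica can lose one node and keep going.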
Re: [Users] Import VMware
From: Maurice James
Do you know of the best way to get a vmware guest into ovirt without virt-v2v, by chance?

From: users-boun...@ovirt.org [mailto:users-boun...@ovirt.org] On Behalf Of Ted Miller
>> On 2/4/2014 10:49 AM, Maurice James wrote:
>>> Is it possible to import vmware images into ovirt 3.3, or is a running Esx instance still required?
>> This bug https://rhn.redhat.com/errata/RHBA-2013-1749.html officially withdrew support for importing image files directly, because it didn't always work.
>> Ted Miller

> From: Maurice James
> Do you know of the best way to get a vmware guest into ovirt without virt-v2v by chance?

No. I had a gluster + sanlock problem take out my ovirt cluster (2 hosts), and I only have it partially back up. My dozen VMs are currently available only when my (dual boot) hardware isn't running oVirt. Or, to put it the other way, I can only run oVirt when I can take down the VMWare group, because I don't have spare hardware. I am working on rebuilding one VM in KVM today (the VMWare copy had a problem).

The only ways I have heard succeed are to use ESX/ESXi or the "hollow pig" method:
1. Create a VM in ovirt, including the hard drive.
2. Replace the hard drive file with the file from VMWare (or otherwise get the data into the file).
3. Fiddle with the VM hardware & settings until it runs.

Ted Miller
Elkhart, IN
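The "hollow pig" method above can be sketched with qemu-img. Every path here is a hypothetical placeholder (locate the real disk image of the new VM under the storage domain before overwriting anything), and DRY_RUN=1 (the default) only prints the commands.

```shell
#!/bin/sh
# Sketch of the "hollow pig" method: create a VM in oVirt, then overwrite its
# empty disk volume with the converted VMware image. SRC and DST are
# hypothetical placeholders, not real paths from this thread.
SRC=${SRC:-/vmware/guest/disk-flat.vmdk}
DST=${DST:-/rhev/data-center/mnt/example/images/IMAGE-UUID/VOLUME-UUID}
DRY_RUN=${DRY_RUN:-1}
LOG=""
run() {
  LOG="$LOG$*
"
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Step 1 (in the GUI): create a VM whose disk is at least as large as SRC.
# Step 2: convert the VMware disk directly over the empty oVirt volume:
run qemu-img convert -f vmdk -O raw "$SRC" "$DST"
# Step 3: restore the ownership vdsm expects, then adjust the VM's virtual
# hardware until the guest boots:
run chown 36:36 "$DST"
```

The target format (raw vs. qcow2) must match whatever oVirt allocated for the disk; check with `qemu-img info` on the empty volume first.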
Re: [Users] Data Center stuck between "Non Responsive" and "Contending"
Federico, thank you for your help so far. Lots more information below.

On 1/27/2014 4:46 PM, Federico Simoncelli wrote:
- Original Message -
From: "Ted Miller"

On 1/27/2014 3:47 AM, Federico Simoncelli wrote:
Maybe someone from gluster can identify easily what happened. Meanwhile, if you just want to repair your data-center you could try with:

$ cd /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/
$ touch ids
$ sanlock direct init -s 0322a407-2b16-40dc-ac67-13d387c6eb4c:0:ids:1048576

I tried your suggestion, and it helped, but it was not enough.

[root@office4a ~]$ cd /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/
[root@office4a dom_md]$ touch ids
[root@office4a dom_md]$ sanlock direct init -s 0322a407-2b16-40dc-ac67-13d387c6eb4c:0:ids:1048576
init done 0

Let me explain a little. When the problem originally happened, the sanlock.log started showing -223 error messages. Ten seconds later the log switched from -223 messages to -90 messages. Running your little script changed the error from -90 back to -223. I hope you can send me another script that will get rid of the -223 messages.
Here is the sanlock.log as I ran your script:

2014-01-27 19:40:41-0500 39281 [3803]: s13 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
2014-01-27 19:40:41-0500 39281 [22751]: 0322a407 aio collect 0 0x7f54240008c0:0x7f54240008d0:0x7f5424101000 result 0:0 match len 512
2014-01-27 19:40:41-0500 39281 [22751]: read_sectors delta_leader offset 512 rv -90 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
2014-01-27 19:40:42-0500 39282 [3803]: s13 add_lockspace fail result -90
2014-01-27 19:40:47-0500 39287 [3803]: s14 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
2014-01-27 19:40:47-0500 39287 [22795]: 0322a407 aio collect 0 0x7f54240008c0:0x7f54240008d0:0x7f5424101000 result 0:0 match len 512
2014-01-27 19:40:47-0500 39287 [22795]: read_sectors delta_leader offset 512 rv -90 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
2014-01-27 19:40:48-0500 39288 [3803]: s14 add_lockspace fail result -90
2014-01-27 19:40:56-0500 39296 [3802]: s15 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
2014-01-27 19:40:56-0500 39296 [22866]: verify_leader 2 wrong magic 0 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
2014-01-27 19:40:56-0500 39296 [22866]: leader1 delta_acquire_begin error -223 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c host_id 2
2014-01-27 19:40:56-0500 39296 [22866]: leader2 path /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids offset 0
2014-01-27 19:40:56-0500 39296 [22866]: leader3 m 0 v 0 ss 0 nh 0 mh 0 oi 0 og 0 lv 0
2014-01-27 19:40:56-0500 39296 [22866]: leader4 sn rn ts 0 cs 0
2014-01-27 19:40:57-0500 39297 [3802]: s15 add_lockspace fail result -223
2014-01-27 19:40:57-0500 39297 [3802]: s16 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
2014-01-27 19:40:57-0500 39297 [22870]: verify_leader 2 wrong magic 0 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
2014-01-27 19:40:57-0500 39297 [22870]: leader1 delta_acquire_begin error -223 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c host_id 2
2014-01-27 19:40:57-0500 39297 [22870]: leader2 path /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids offset 0
2014-01-27 19:40:57-0500 39297 [22870]: leader3 m 0 v 0 ss 0 nh 0 mh 0 oi 0 og 0 lv 0
2014-01-27 19:40:57-0500 39297 [22870]: leader4 sn rn ts 0 cs 0
2014-01-27 19:40:58-0500 39298 [3802]: s16 add_lockspace fail result -223
2014-01-27 19:41:07-0500 39307 [3802]: s17 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0

Unfortunately, I think the error looks about the same to vdsm, because /var/log/messages shows the same two lines in the calling scripts on the callback lists (66 & 425, if I remember right). When I get up in the morning, I will be looking for another magic potion from your pen. :)

Federico, I won't be able to do anything to the ovirt setup for another 5 hours or so (it is a trial system I am working on at home, I am at work), but I will try your repair script and report
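The repair sequence from this exchange, wrapped in a dry-run guard so it can be reviewed before touching a live domain. The lockspace name, mount path, and offset are the ones used in the thread; note that here the commands only traded the -90 ("Message too long") error for a -223, so treat this as a first step rather than a complete fix.

```shell
#!/bin/sh
# Sketch: recreate a truncated sanlock 'ids' file and reinitialize its
# delta lease, as suggested in the thread above. SD_UUID and DOM_MD are the
# thread's values; substitute your own storage domain's.
SD_UUID=0322a407-2b16-40dc-ac67-13d387c6eb4c
DOM_MD="/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/$SD_UUID/dom_md"
DRY_RUN=${DRY_RUN:-1}
LOG=""
run() {
  LOG="$LOG$*
"
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

run touch "$DOM_MD/ids"                                   # recreate the (empty) ids file
run sanlock direct init -s "$SD_UUID:0:$DOM_MD/ids:1048576"   # rewrite the lockspace area
```

Host ID 0 in the `-s` argument tells sanlock to initialize the whole lockspace rather than a single host's lease slot.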
Re: [Users] Data Center stuck between "Non Responsive" and "Contending"
On 1/27/2014 3:47 AM, Federico Simoncelli wrote:
- Original Message -
From: "Itamar Heim"
To: "Ted Miller" , users@ovirt.org, "Federico Simoncelli"
Cc: "Allon Mureinik"
Sent: Sunday, January 26, 2014 11:17:04 PM
Subject: Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

On 01/27/2014 12:00 AM, Ted Miller wrote:
On 1/26/2014 4:00 PM, Itamar Heim wrote:
On 01/26/2014 10:51 PM, Ted Miller wrote:
On 1/26/2014 3:10 PM, Itamar Heim wrote:
On 01/26/2014 10:08 PM, Ted Miller wrote:

is this gluster storage (guessing since you mentioned a 'volume')

yes (mentioned under "setup" above)

does it have a quorum?

Volume Name: VM2
Type: Replicate
Volume ID: 7bea8d3b-ec2a-4939-8da8-a82e6bda841e
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.41.65.2:/bricks/01/VM2
Brick2: 10.41.65.4:/bricks/01/VM2
Brick3: 10.41.65.4:/bricks/101/VM2
Options Reconfigured:
cluster.server-quorum-type: server
storage.owner-gid: 36
storage.owner-uid: 36
auth.allow: *
user.cifs: off
nfs.disa

(there were reports of split brain on the domain metadata before when no quorum exists for gluster)

after full heal:
[root@office4a ~]$ gluster volume heal VM2 info
Gathering Heal info on volume VM2 has been successful
Brick 10.41.65.2:/bricks/01/VM2
Number of entries: 0
Brick 10.41.65.4:/bricks/01/VM2
Number of entries: 0
Brick 10.41.65.4:/bricks/101/VM2
Number of entries: 0
[root@office4a ~]$ gluster volume heal VM2 info split-brain
Gathering Heal info on volume VM2 has been successful
Brick 10.41.65.2:/bricks/01/VM2
Number of entries: 0
Brick 10.41.65.4:/bricks/01/VM2
Number of entries: 0
Brick 10.41.65.4:/bricks/101/VM2
Number of entries: 0

noticed this in host /var/log/messages (while looking for something else). Loop seems to repeat over and over.
Jan 26 15:35:52 office4a sanlock[3763]: 2014-01-26 15:35:52-0500 14678 [30419]: read_sectors delta_leader offset 512 rv -90 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
Jan 26 15:35:53 office4a sanlock[3763]: 2014-01-26 15:35:53-0500 14679 [3771]: s1997 add_lockspace fail result -90
Jan 26 15:35:58 office4a vdsm TaskManager.Task ERROR Task=`89885661-88eb-4ea3-8793-00438735e4ab`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2111, in getAllTasksStatuses
    allTasksStatus = sp.getAllTasksStatuses()
  File "/usr/share/vdsm/storage/securable.py", line 66, in wrapper
    raise SecureError()
SecureError
Jan 26 15:35:59 office4a sanlock[3763]: 2014-01-26 15:35:59-0500 14686 [30495]: read_sectors delta_leader offset 512 rv -90 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
Jan 26 15:36:00 office4a sanlock[3763]: 2014-01-26 15:36:00-0500 14687 [3772]: s1998 add_lockspace fail result -90
Jan 26 15:36:00 office4a vdsm TaskManager.Task ERROR Task=`8db9ff1a-2894-407a-915a-279f6a7eb205`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 318, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/sp.py", line 273, in startSpm
    self.masterDomain.acquireHostId(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 458, in acquireHostId
    self._clusterLock.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/clusterlock.py", line 189, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: ('0322a407-2b16-40dc-ac67-13d387c6eb4c', SanlockException(90, 'Sanlock lockspace add failure', 'Message too long'))

fede - thoughts on above? (vojtech reported something similar, but it sorted out for him after some retries)

Something truncated the ids file, as also reported by:

[root@office4a ~]$ ls /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ -l
total 1029
-rw-rw 1 vdsm kvm       0 Jan 22 00:44 ids
-rw-rw 1 vdsm kvm       0 Jan 16 18:50 inbox
-rw-rw 1 vdsm kvm 2097152 Jan 21 18:20 leases
-rw-r--r-- 1 vdsm kvm     491 Jan 21 18:20 metadata
-rw-rw 1 vdsm kvm       0 Jan 16 18:50 outbox

In the past I saw that happening because of a glusterfs bug: https://bugzilla.redhat.com/show_bug.cgi?id=862975

Anyway in general it seems that glusterfs is not always able to reconcile the ids file (as it's written by all the hosts at the same time). Maybe someone from gluster can
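A self-contained way to check for the truncated-ids symptom described above: scan each brick's copy of dom_md and flag zero-length ids files. Temporary directories stand in for real brick paths here so the sketch can run anywhere; on a real cluster you would point the glob at the actual brick directories on each host.

```shell
#!/bin/sh
# Sketch: detect zero-length sanlock 'ids' files across brick copies of a
# storage domain. Uses throwaway temp dirs as stand-ins for real bricks.
WORK=$(mktemp -d)
mkdir -p "$WORK/brick1/dom_md" "$WORK/brick2/dom_md"
printf 'lease data' > "$WORK/brick1/dom_md/ids"   # healthy copy (non-empty)
: > "$WORK/brick2/dom_md/ids"                     # truncated copy (zero bytes)

BAD=""
for f in "$WORK"/brick*/dom_md/ids; do
  # -s is true only if the file exists AND has size > 0
  [ -s "$f" ] || BAD="$BAD $f"
done
echo "truncated ids files:$BAD"
rm -rf "$WORK"
```

Running the same loop on each gluster host (against the brick paths, not the FUSE mount) shows whether the truncation is on one replica or all of them, which matters when deciding whether self-heal can recover the file.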
Re: [Users] Data Center stuck between "Non Responsive" and "Contending"
On 1/26/2014 6:24 PM, Ted Miller wrote:
On 1/26/2014 5:17 PM, Itamar Heim wrote:
On 01/27/2014 12:00 AM, Ted Miller wrote:
On 1/26/2014 4:00 PM, Itamar Heim wrote:
On 01/26/2014 10:51 PM, Ted Miller wrote:
On 1/26/2014 3:10 PM, Itamar Heim wrote:
On 01/26/2014 10:08 PM, Ted Miller wrote:

My Data Center is down, and won't come back up. Data Center Status on the GUI flips between "Non Responsive" and "Contending". Also noted: Host sometimes seen flipping between "Low" and "Contending" in the SPM column. Storage VM2 "Data (Master)" is in "Cross Data-Center Status" = Unknown. VM2 is "up" under the "Volumes" tab. Created another volume for VM storage. It shows up in the "Volumes" tab, but when I try to add "New Domain" in the Storage tab, it says that "There are No Data Centers to which the Storage Domain can be attached".

Setup:
2 hosts w/ glusterfs storage
1 engine
all 3 computers Centos 6.5, just updated
ovirt-engine 3.3.0.1-1.el6
ovirt-engine-lib 3.3.2-1.el6
ovirt-host-deploy.noarch 1.1.3-1.el6
glusterfs.x86_64 3.4.2-1.el6

This loop seems to repeat in the ovirt-engine log (grep of log showing only the DefaultQuartzScheduler_Worker-79 thread):

2014-01-26 14:44:58,416 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-79) Irs placed on server 9a591103-83be-4ca9-b207-06929223b541 failed. Proceed Failover
2014-01-26 14:44:58,511 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-79) hostFromVds::selectedVds - office4a, spmStatus Free, storage pool mill
2014-01-26 14:44:58,550 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-79) SpmStatus on vds 127ed939-34af-41a8-87a0-e2f6174b1877: Free
2014-01-26 14:44:58,571 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-79) starting spm on vds office4a, storage pool mill, prevId 2, LVER 15
2014-01-26 14:44:58,579 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-79) START, SpmStartVDSCommand(HostName = office4a, HostId = 127ed939-34af-41a8-87a0-e2f6174b1877, storagePoolId = 536a864d-83aa-473a-a675-e38aafdd9071, prevId=2, prevLVER=15, storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=false), log id: 74c38eb7
2014-01-26 14:44:58,617 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-79) spmStart polling started: taskId = e8986753-fc80-4b11-a11d-6d3470b1728c
2014-01-26 14:45:00,662 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-79) Failed in HSMGetTaskStatusVDS method
2014-01-26 14:45:00,664 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-79) Error code AcquireHostIdFailure and error message VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-01-26 14:45:00,665 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-79) spmStart polling ended: taskId = e8986753-fc80-4b11-a11d-6d3470b1728c task status = finished
2014-01-26 14:45:00,666 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-79) Start SPM Task failed - result: cleanSuccess, message: VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-01-26 14:45:00,695 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-79) spmStart polling ended, spm status: Free
2014-01-26 14:45:00,702 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-79) START, HSMClearTaskVDSCommand(HostName = office4a, HostId = 127ed939-34af-41a8-87a0-e2f6174b1877, taskId=e8986753-fc80-4b11-a11d-6d3470b1728c), log id: 336ec5a6
2014-01-26 14:45:00,722 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-79) FINISH, HSMClearTaskVDSCommand, log id: 336ec5a6
2014-01-26 14:45:00,724 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-79) FINISH, SpmStartVDSCommand, return: org.ovirt.engine.core.common.businessentities.SpmStatusResult@13652652, log id: 74c38eb7
2014-01-26 14:45:00,733 INFO [org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand] (DefaultQuartzScheduler_Worker-79) Running command: SetStoragePoolStatusCommand internal: true. Entities affected : ID: 536a864d-83aa-473a-a675-e38aafdd9071 Type: StoragePool
2014-01-26 14:45:00,778 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-79) IrsBroker::Failed::GetStoragePoolInfoVDS due t
[Users] sanlock can't read empty 'ids' file
What I can figure is that something is supposed to be in dom_md/ids, but that file is empty:

ls /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ -l
total 1029
-rw-rw 1 vdsm kvm       0 Jan 22 00:44 ids
-rw-rw 1 vdsm kvm       0 Jan 16 18:50 inbox
-rw-rw 1 vdsm kvm 2097152 Jan 21 18:20 leases
-rw-r--r-- 1 vdsm kvm     491 Jan 21 18:20 metadata
-rw-rw 1 vdsm kvm       0 Jan 16 18:50 outbox

Any hints as to how to put whatever is needed into 'ids', or reinitialize the sanlock system - or a better diagnosis and solution - gladly accepted.

Ted Miller
Elkhart, IN, USA
[Users] Data Center stuck between "Non Responsive" and "Contending"
My Data Center is down, and won't come back up. Data Center status on the GUI flips between "Non Responsive" and "Contending". Also noted:

* Host sometimes seen flipping between "Low" and "Contending" in the SPM column.
* Storage VM2 "Data (Master)" shows "Cross Data-Center Status" = Unknown.
* VM2 is "up" under the "Volumes" tab.
* Created another volume for VM storage. It shows up in the "Volumes" tab, but when I try to add a "New Domain" in the Storage tab, it says "There are No Data Centers to which the Storage Domain can be attached".

Setup:

* 2 hosts w/ glusterfs storage
* 1 engine
* all 3 computers CentOS 6.5, just updated
* ovirt-engine 3.3.0.1-1.el6
* ovirt-engine-lib 3.3.2-1.el6
* ovirt-host-deploy.noarch 1.1.3-1.el6
* glusterfs.x86_64 3.4.2-1.el6

This loop seems to repeat in the ovirt-engine log (grep of the log showing only the DefaultQuartzScheduler_Worker-79 thread):

2014-01-26 14:44:58,416 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-79) Irs placed on server 9a591103-83be-4ca9-b207-06929223b541 failed. Proceed Failover
2014-01-26 14:44:58,511 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-79) hostFromVds::selectedVds - office4a, spmStatus Free, storage pool mill
2014-01-26 14:44:58,550 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-79) SpmStatus on vds 127ed939-34af-41a8-87a0-e2f6174b1877: Free
2014-01-26 14:44:58,571 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-79) starting spm on vds office4a, storage pool mill, prevId 2, LVER 15
2014-01-26 14:44:58,579 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-79) START, SpmStartVDSCommand(HostName = office4a, HostId = 127ed939-34af-41a8-87a0-e2f6174b1877, storagePoolId = 536a864d-83aa-473a-a675-e38aafdd9071, prevId=2, prevLVER=15, storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=false), log id: 74c38eb7
2014-01-26 14:44:58,617 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-79) spmStart polling started: taskId = e8986753-fc80-4b11-a11d-6d3470b1728c
2014-01-26 14:45:00,662 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-79) Failed in HSMGetTaskStatusVDS method
2014-01-26 14:45:00,664 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-79) Error code AcquireHostIdFailure and error message VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-01-26 14:45:00,665 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-79) spmStart polling ended: taskId = e8986753-fc80-4b11-a11d-6d3470b1728c task status = finished
2014-01-26 14:45:00,666 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-79) Start SPM Task failed - result: cleanSuccess, message: VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-01-26 14:45:00,695 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-79) spmStart polling ended, spm status: Free
2014-01-26 14:45:00,702 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-79) START, HSMClearTaskVDSCommand(HostName = office4a, HostId = 127ed939-34af-41a8-87a0-e2f6174b1877, taskId=e8986753-fc80-4b11-a11d-6d3470b1728c), log id: 336ec5a6
2014-01-26 14:45:00,722 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-79) FINISH, HSMClearTaskVDSCommand, log id: 336ec5a6
2014-01-26 14:45:00,724 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-79) FINISH, SpmStartVDSCommand, return: org.ovirt.engine.core.common.businessentities.SpmStatusResult@13652652, log id: 74c38eb7
2014-01-26 14:45:00,733 INFO [org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand] (DefaultQuartzScheduler_Worker-79) Running command: SetStoragePoolStatusCommand internal: true. Entities affected : ID: 536a864d-83aa-473a-a675-e38aafdd9071 Type: StoragePool
2014-01-26 14:45:00,778 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-79) IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed

Ted Miller
Elkhart, IN, USA

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
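The per-thread grep described above can be reproduced with a short pipeline. A minimal sketch, using fabricated sample lines so it can be run anywhere; on a real engine the input would be /var/log/ovirt-engine/engine.log:

```shell
# Filter an ovirt-engine log down to a single Quartz worker thread.
# The sample lines below are stand-ins for real engine.log content.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
2014-01-26 14:44:58,416 INFO  [IrsBrokerCommand] (DefaultQuartzScheduler_Worker-79) Irs placed on server failed. Proceed Failover
2014-01-26 14:44:59,001 INFO  [VdsManager] (DefaultQuartzScheduler_Worker-12) refreshing host stats
2014-01-26 14:45:00,664 ERROR [HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-79) Cannot acquire host id
EOF
# Keep only the Worker-79 thread, as in the excerpt above
grep 'DefaultQuartzScheduler_Worker-79' "$LOG"
rm -f "$LOG"
```

The same filter works on the live log, e.g. `grep 'DefaultQuartzScheduler_Worker-79' /var/log/ovirt-engine/engine.log`.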
Re: [Users] fencing: HP ilo100 status does NMI, reboots computer
On 1/22/2014 2:44 PM, Joop wrote:

Ted Miller wrote:
I am having trouble getting fencing to work on my HP DL180 g6 servers. They have ilo100 controllers. The documentation mentions IPMI compliance, but there are problems. The ipmilan driver gets a response, but it is the wrong response. A status request results in the NMI line being asserted, which (in standard PC architecture) is the same as pressing the reset button (which these servers don't have).

That's weird. I have two ML110 G6 desktop servers, used as storage servers, and those have ilo100 controllers too. I just checked, and they are set up as ipmilan in the engine. I have used the Test button more than once and never had problems. My Summary page says:

IPMI Version: 2.0
Firmware Version: 4.23
Hardware Version: 1.0
Description: ProLiant ML110 G6
System GUID: 33221100-5544-7766-8899-AABBCCDDEEFF

Maybe that helps you track down the problem. If you have questions, please ask.

Joop

Joop, thanks for the info. That tells me I was not totally off track when I was trying the ipmilan driver. I have firmware 4.21 in my controllers; I'll have to see about updating that.

One thing I have figured out: this problem would not have been so noticeable, except that something is causing host s1 to go "non responsive" every few hours. That provokes a Restart -> Stop -> Status -> Start sequence from the fencing system. I will have to deal with whatever is causing the "non responsive" condition, but first I want to work through the fencing problem. I tried the ilo2 driver, but the test of that produced even more convulsive messages from the computer.

Ted Miller
[Users] fencing: HP ilo100 status does NMI, reboots computer
I am having trouble getting fencing to work on my HP DL180 g6 servers. They have ilo100 controllers. The documentation mentions IPMI compliance, but there are problems. The ipmilan driver gets a response, but it is the wrong response. A status request results in the NMI line being asserted, which (in standard PC architecture) is the same as pressing the reset button (which these servers don't have). Here are some log excerpts:

16:33, just after re-running re-install from the engine, which ended:

From oVirt GUI "Events" tab:
Host s1 installed
State was set to up for host s1.
Host s3 from cluster Default was chosen as a proxy to execute Status command on Host s1
Host s1 power management was verified successfully

16:34, on ssh screen:

Message from syslogd@s1 at Jan 21 16:34:14 ... kernel:Uhhuh. NMI received for unknown reason 31 on CPU 0.
Message from syslogd@s1 at Jan 21 16:34:14 ... kernel:Do you have a strange power saving mode enabled?
Message from syslogd@s1 at Jan 21 16:34:14 ... kernel:Dazed and confused, but trying to continue

From IPMI web interface event log:

Generic 01/21/2014 21:34:15 Gen ID 0x21 Bus Uncorrectable Error Assertion
Generic 01/21/2014 21:34:15 IOH_NMI_DETECT State Asserted Assertion

From oVirt GUI "Events" tab:

Host s1 is non responsive
Host s3 from cluster Default was chosen as a proxy to execute Restart command on Host s1
Host s3 from cluster Default was chosen as a proxy to execute Stop command on Host s1
Host s3 from cluster Default was chosen as a proxy to execute Status command on Host s1
Host s1 was stopped by engine
Manual fence for host s1 was started
Host s3 from cluster Default was chosen as a proxy to execute Status command on Host s1
Host s3 from cluster Default was chosen as a proxy to execute Start command on Host s1
Host s3 from cluster Default was chosen as a proxy to execute Status command on Host s1
Host s1 was started by engine
Host s1 is rebooting
State was set to up for host s1.
Host s3 from cluster Default was chosen as a proxy to execute Status command on Host s1

16:41: saw kernel panic output on the remote KVM terminal; the computer rebooted itself.

I have searched for ilo100, but find nothing related to oVirt, so I am clueless as to what the "correct" driver for this hardware is. So far I have seen this mostly on server1 (s1), but that is also the one I have cycled up and down most often. I have also seen cases where the commands are apparently issued too fast (these servers are fairly slow to boot). For example, I found that one server was powered down when the boot process had only gotten to the stage where the RAID controller screen was up, so it had not had time to complete the boot that was already in progress.

Ted Miller
Elkhart, IN, USA
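One generic way to avoid the "commands issued too fast" problem when driving power actions by hand is to poll a status check with a timeout instead of firing the next command immediately. A sketch only: `wait_for` and `probe` are hypothetical names, and the probe here is a dummy that stands in for a real fence-agent or IPMI status query.

```shell
# Poll a command until it succeeds or the retry budget runs out.
# Replace 'probe' with a real status check (e.g. a fence-agent query).
wait_for() {
  tries=$1; delay=$2; shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then return 0; fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Demonstration: a dummy probe that succeeds on its third invocation,
# standing in for a slow-booting server becoming reachable.
STATE=$(mktemp)
echo 0 > "$STATE"
probe() {
  n=$(cat "$STATE")
  n=$((n + 1))
  echo "$n" > "$STATE"
  [ "$n" -ge 3 ]
}
wait_for 5 0 probe && echo "host is up"
rm -f "$STATE"
```

Wrapping the Stop/Status/Start steps in such a loop (with a realistic delay) gives the server time to finish booting before the next action is issued.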
Re: [Users] online storage domain resize
On 1/20/2014 9:57 AM, Jiří Sléžka wrote:

Hello, I'm just curious; I didn't try it yet. I'm using FC storage (Dell MD3620f) with some logical disks on it. I should be able to increase virtual disk capacity online using storage management (I have some free capacity in the disk group). Is there any way to extend, online, the volume group used for VM image storage without breaking anything? I just found this hint by Eduardo from the list:

1. Shut down all VMs
2. Manually connect iSCSI on the SPM host
3. Run pvresize on the LUN
4. Put the domain in maintenance
5. Activate the domains

Is it possible to do this online, without shutting down all VMs? If not, it would be a really nice feature for upcoming releases. Thanks in advance, Jiri

I don't know if oVirt is "breaking the rules" for using LVM, but in regular Linux all you have to do is:

1. pvcreate to create a new PV on the available space
2. vgextend to add the new PV to the existing VG
3. enjoy the additional space

Others will have to chime in on whether oVirt breaks this process somehow (I am using gluster for my storage), but I doubt it.

Ted Miller
Elkhart, IN, USA
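The LVM side of Eduardo's hint can be wrapped in a small script. This is a sketch only, not verified against a live oVirt setup: `LUN_DEV` is a placeholder device, and `DRY_RUN` defaults to on (it just prints the commands) because running pvresize against a real LUN needs root and the correct multipath/iSCSI device. Steps 1, 2, 4, and 5 of the hint are oVirt-side actions and are not scripted here.

```shell
# Sketch of the LVM portion of the grow sequence from the thread.
# DRY_RUN=1 only prints the commands; set DRY_RUN=0 and point LUN_DEV
# at the real device before using it for real.
DRY_RUN=${DRY_RUN:-1}
LUN_DEV=${LUN_DEV:-/dev/mapper/EXAMPLE_LUN}   # placeholder device name

run() {
  if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

run pvresize "$LUN_DEV"   # step 3: make LVM see the grown LUN
run vgs                   # confirm the VG gained free extents
```

With `DRY_RUN=1` (the default) this prints the two commands instead of executing them, which makes the sequence safe to review first.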
Re: [Users] Making v2v easier?
Unfortunately I have no ESX or ESXi server; these VMs were running on VMware Server.

Ted

On 1/20/2014 4:23 AM, Sander Grendelman wrote:

FWIW, importing directly from an ESX server still works:

virt-v2v host:
- RHEL/CentOS 6.5 physical host (virt-v2v uses qemu-kvm, so it is extra slow on a VM)
- Packages:
  virt-v2v-0.9.1-5.el6_5.x86_64
  libguestfs-winsupport-1.0-7.el6.x86_64
  libguestfs-tools-c-1.20.11-2.el6.x86_64
  libguestfs-tools-1.20.11-2.el6.x86_64
  libguestfs-1.20.11-2.el6.x86_64
  virtio-win-1.6.7-2.el6.noarch (RHEL only?)
- Network access to:
  the oVirt export domain (NFS)
  the ESX host(s) to import from (HTTPS)
- virt-v2v has to run as root to mount the oVirt NFS export domain
- Edit ~/.netrc and add a line for the ESX host(s) to import from (change the <> parts):
  machine <esx_host> login <user> password <password>
- Fix permissions on the netrc file:
  chmod 600 ~/.netrc
- Run virt-v2v (again, change the <> parts; ?no_verify=1 is needed when ESX uses self-signed certs):
  LIBGUESTFS_DEBUG=1 virt-v2v -ic esx://<esx_host>/?no_verify=1 -o rhev -os <export_domain> --network <network> <vm_name>

Conversion can take quite some time after the disk copy, especially when virt-v2v removes the VMware tools. Running on a physical host (or using nested virtualization) helps.

On Mon, Jan 20, 2014 at 8:59 AM, Sander Grendelman wrote:
https://rhn.redhat.com/errata/RHBA-2013-1749.html
"""
This update fixes the following bug:
* An update to virt-v2v included upstream support for the import of OVA images exported by VMware servers. Unfortunately, testing has shown that VMDK images created by recent versions of VMware ESX cannot be reliably supported, thus this feature has been withdrawn. (BZ#1028983)
Users of virt-v2v are advised to upgrade to this updated package, which fixes this bug.
"""

-- "He is no fool who gives what he cannot keep, to gain what he cannot lose." -- Jim Elliot
For more information about Jim Elliot and his unusual life, see http://www.christianliteratureandliving.com/march2003/carolyn.html.
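The .netrc steps above are easy to script. A minimal sketch: the host and credential values are placeholders, and HOME is redirected to a scratch directory here so the demonstration touches nothing real.

```shell
# Create a ~/.netrc entry for an ESX host and lock its permissions,
# as described in the steps above. Values are placeholders; HOME is
# pointed at a throwaway directory so nothing real is modified.
HOME=$(mktemp -d)
cat >> "$HOME/.netrc" <<'EOF'
machine esx.example.com login v2vuser password secret
EOF
chmod 600 "$HOME/.netrc"
ls -l "$HOME/.netrc" | cut -c1-10   # -rw-------
```

On a real conversion host you would drop the `HOME=` override and use the actual ESX hostname and credentials; virt-v2v refuses world-readable netrc files, hence the chmod 600.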
Ted Miller
Design Engineer
HCJB Global Technology Center
2830 South 17th St
Elkhart, IN 46517
574-970-4272 my desk
574-970-4252 receptionist
Re: [Users] Making v2v easier?
I am wide open to suggestions (see discussion at bottom, as usual).

On 1/19/2014 5:25 AM, Gianluca Cecchi wrote:

On Sun, Jan 19, 2014 at 7:13 AM, Ted Miller wrote:
* BEAT HEAD AGAINST WALL because virt-v2v.x86_64 0.9.1-5.el6_5 from CentOS updates doesn't seem to know about .ova files. I was following the instructions in the Red_Hat_Enterprise_Virtualization-3.3-Beta-V2V_Guide-en-US.pdf guide, but I figured out that the v2v they are talking about has an "-i ova" option, while the help for the version I am using does not list ova as an option for -i. If I try to use it, it tells me it is an invalid option, and if I leave it off, it goes off looking for a qemu:///system to attach to. The help for v2v says nothing at all about .ova files. I am wondering where to find a v2v program that knows about .ova files, or else am I going to have to import all my VMware files to my (non-oVirt) KVM host, and then drag them into oVirt from libvirt?

I did a bit of research about this. Strange: I just updated a CentOS 6.4 VM to the latest 6.5 and see that there (also matching RHEL 6.5, I think) there is indeed, as you wrote:
virt-v2v-0.9.1-5.el6_5.x86_64
and it seems ova is missing as an option. Instead, on a Fedora 19 system with virt-v2v-0.9.0-3.fc19.x86_64, I have it. So was it removed in newer packages for some reason? It also seems strange to see a Fedora package (even if F19 and not F20) older than a RHEL 6 one. The RHEL 6 version bumped this way, skipping 0.9.0:

* Wed Jun 12 2013 Matthew Booth - 0.9.1-1 - Rebase to new upstream release
* Mon Oct 22 2012 Matthew Booth - 0.8.9-2

while Fedora 19 has currently stopped at:

* Wed Jul 03 2013 Richard W.M. Jones - 0.9.0-3
- Default to using the appliance backend, since in Fedora >= 18 the libvirt backend doesn't support the 'iface' parameter which virt-v2v requires.
- Add BR perl(Sys::Syslog), required to run the tests.
- Remove some cruft from the spec file.
BTW, in F20 we do have ova too: virt-v2v-0.9.0-5.fc20.x86_64 -- and in fact it has the older version... For RHEL 6 I remained here:
http://lists.ovirt.org/pipermail/users/2013-May/014457.html
And there is no particular virt-v2v package in the RHEV source repo:
http://ftp.redhat.com/redhat/linux/enterprise/6Server/en/RHEV/SRPMS/
For sure the RHEV 3.3 beta guide is incorrect at the moment:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.2/html-single/V2V_Guide/index.html#chap-V2V_Guide-Installing_virt_v2v
because it says:
"virt-v2v is available on Red Hat Network (RHN) in the Red Hat Enterprise Linux Server (v.6 for 64-bit x86_64) or Red Hat Enterprise Linux Workstation (v.6 for x86_64) channel. Ensure the system is subscribed to the appropriate channel before installing virt-v2v."
and some lines below:
"7.1. virt-v2v Parameters
The following parameters can be used with virt-v2v:
-i input
Specifies the input method to obtain the guest for conversion. The default is libvirt. Supported options are:
libvirt -- Guest argument is the name of a libvirt domain.
libvirtxml -- Guest argument is the path to an XML file containing a libvirt domain.
ova"
Any light to shed on this? Thanks, Gianluca

I think I have some light, but you'd better get out your rose-colored glasses, because that is the only way the light will look good. ;(

I spun up a Fedora 20 64-bit VM (in my brand-new oVirt environment) to take advantage of your wonderful discovery. I did a minimal install, then "yum upgrade", then "yum install virt-v2v", which brought in 489 MB of dependencies! This is what I found (not necessarily in the order I found it):

1. virt-v2v has a missing dependency, perl-Archive-Tar:
yum install perl-Archive-Tar

2. virt-v2v would error out fairly early in the conversion process:
Error extracting archive '/media/VMold/vmware.dud/Fedora13A.ova': /usr/bin/tar: Fedora13A-disk1.vmdk: Wrote only 6144 of 10240 bytes
That error went away when I increased the VM's memory from 1G to 4G (see below).

3. The *.ova files produced by vmware-vdiskmanager build 835872 (downloaded yesterday as part of VDDK 5.0) give the following error messages when run through virt-v2v:
Use of uninitialized value $file in hash element at /usr/share/perl5/vendor_perl/Sys/VirtConvert/Connection/VMwareOVASource.pm line 261, <$manifest> line 2.
Reading from filehandle failed at /usr/share/perl5/vendor_perl/Sys/VirtConvert/Connection/VMwareOVASource.pm line 271.
Though I am not a Perl programmer, I took a look at the code, stuck in some debugging "print" statements, and came to this conclusion: the *.ova files produced by my version of vmware-vdiskmanager contain a "blank" line at the end of the manifest, with about 62 spaces in it and nothing else. "sub _verify_manifest" in VMwareOVASource.pm chokes on that line and throws the error.
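One workaround for that trailing whitespace-only manifest line is to strip it before handing the file to virt-v2v. A sketch with a fabricated sample manifest (the checksum lines are made up); on a real OVA you would untar the archive, fix the .mf file, and re-tar it:

```shell
# Remove whitespace-only lines from an OVF manifest -- the padded
# "blank" line described above that trips up _verify_manifest.
# Sample manifest content is fabricated for demonstration.
MF=$(mktemp)
printf 'SHA1(Fedora13A.ovf)= 0123abcd\nSHA1(Fedora13A-disk1.vmdk)= 4567cdef\n%62s\n' '' > "$MF"
sed -i '/^[[:space:]]*$/d' "$MF"   # delete empty/whitespace-only lines
wc -l < "$MF"
```

After the sed pass only the two checksum lines remain, so the Perl parser never sees the bogus entry.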
Re: [Users] Making v2v easier?
On 01/17/2014 10:19 AM, Itamar Heim wrote:

I see a lot of threads about v2v pains (mostly from ESX?). I'm interested to see if we can make this simpler/easier. If you have experience with this, please describe the steps you are using (also the source platform), and how you would like to see this made simpler (I'm assuming that would start from somewhere in the webadmin, probably).

I have spent most of the day trying to do this, and so far have failed.

Source: VMware Server 2.0 disk files (.vmx, .vmdk, etc.), about 10 VMs to transfer.

Eliminating all the false starts and detours along the way, this is what I have done so far:

* Copied my tree of VMware files to local storage, in case I goof up or get fumble-fingered and need to start over clean.
* Set up a 32-bit VM (running CentOS 6), because vmware-vdiskmanager only seems to come in 32-bit form in the VDDK package I found to download. I was planning to do this anyway, to run Google Earth and other software that doesn't come in pure 64-bit form.
* Ran "vmware-vdiskmanager -R" to clean up errors that kept the next step from happening on about 1/3 of the *.vmdk files.
* Ran "ovftool <vm>.vmx <vm>.ova" to turn the .vmx and .vmdk files into .ova files -- a long process.
* BEAT HEAD AGAINST WALL because virt-v2v.x86_64 0.9.1-5.el6_5 from CentOS updates doesn't seem to know about .ova files. I was following the instructions in the Red_Hat_Enterprise_Virtualization-3.3-Beta-V2V_Guide-en-US.pdf guide, but I figured out that the v2v they are talking about has an "-i ova" option, while the help for the version I am using does not list ova as an option for -i. If I try to use it, it tells me it is an invalid option, and if I leave it off, it goes off looking for a qemu:///system to attach to. The help for v2v says nothing at all about .ova files. I am wondering where to find a v2v program that knows about .ova files, or else am I going to have to import all my VMware files to my (non-oVirt) KVM host and then drag them into oVirt from libvirt?

My setup:

* All hosts running CentOS 6.5, fully up to date
* 2 hosts
* engine running in a KVM VM, hosted on a non-oVirt KVM host
* gluster replica 3 file system across the 2 oVirt hosts and the KVM host

Going to bed now to give head some rest.

Ted Miller
Elkhart, IN, USA
[Users] engine-iso-uploader -- REST API not usable?
I ran into this problem when I tried to use engine-iso-uploader, but reading the lists makes it sound like it may be a more general problem. There was a bug that caused this, but that was back in the 3.0/3.1 days, and it doesn't seem common since then. Back on Dec 24 I was able to upload an ISO file OK, so I am not sure what has changed since then.

I am running a test setup, fully up to date:

* office2a -- host w/ glusterfs, CentOS 6
* office4a -- host w/ glusterfs, CentOS 6
* ov-eng01 -- engine on a CentOS 6 VM (not hosted on oVirt)
* office9 -- KVM host (not oVirt) for ov-eng01

Whether I log in to ov-eng01 by ssh or execute the command from the console, I get:

# engine-iso-uploader list -v
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to abort):
ERROR: Problem connecting to the REST API. Is the service available and does the CA certificate exist?

Checking on some things suggested in a thread about engine-iso-uploader back in March, I get:

# ls -la /etc/pki/ovirt-engine/ca.pem
-rw-r--r--. 1 root root 4569 Nov 10 15:13 /etc/pki/ovirt-engine/ca.pem
# cat /var/log/ovirt-engine/ovirt-iso-uploader/ovirt-iso-uploader/20140117112938.log
2014-01-17 11:29:44::ERROR::engine-iso-uploader::512::root:: Problem connecting to the REST API. Is the service available and does the CA certificate exist?

The thread back in March gave a workaround to upload ISO images directly, so I am not "blocked" from uploading images, but I would like to get things working "right", as I am afraid the problem will "turn around and bite me" down the road.

Ted Miller
[Users] tuned profile for Centos hosts -- new Bugzilla or Regression
I posted a script (a while back) to get oVirt running on CentOS hosts. One of the items in it has to do with which "tuned" profile to use. When I first ran into this, it was a fatal error. It is now just a warning, so it does not prevent installing a host; but, as a warning, a lot of people are probably missing it.

When using CentOS 6 as the host OS, the install script tries to apply an "rhs-virtualization" profile. That profile is not included in CentOS. I substituted the "virtual-host" profile.

I believe this may be a regression resulting from Bugzilla 987293 <https://bugzilla.redhat.com/show_bug.cgi?id=987293>, where "rhs-virtualization" was substituted for "virtual-host" for RHEV + RHS. I am guessing that whatever is used as a switch to detect RHEV + RHS is also shoving CentOS down that same path, which is not appropriate.

My suggestion would be to write the script so that it uses "rhs-virtualization" when present, and falls back to "virtual-host" when it is not. (I don't know what differences, if any, there are between the two profiles.)

Should I open a new bug, make a comment on 987293, or take some other path?

Ted Miller
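The suggested fallback amounts to a two-line existence check. A sketch, with the profile directory parameterized (`TUNE_DIR` is an assumption introduced here) so the logic can be exercised outside a real host:

```shell
# Prefer rhs-virtualization when the profile exists, otherwise fall
# back to virtual-host -- the fallback proposed above. TUNE_DIR
# defaults to the CentOS 6 tuned profile location.
TUNE_DIR=${TUNE_DIR:-/etc/tune-profiles}
if [ -d "$TUNE_DIR/rhs-virtualization" ]; then
  PROFILE=rhs-virtualization
else
  PROFILE=virtual-host
fi
echo "$PROFILE"
# tuned-adm profile "$PROFILE"   # would apply it on a real host
```

On a stock CentOS 6 host this selects virtual-host; on a box where the RHS profile has been installed (or copied in), it selects rhs-virtualization.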
Re: [Users] Centos 6.5 host configuration script -- a tale begun
On 12/13/2013 2:59 AM, Sven Kieske wrote:

Hi, first, thanks for this script! I'll have to set up some CentOS 6.5 machines too; maybe it will help. Here are some questions/improvements from me: There's no need to use "localinstall" anymore, "install" is fine with yum :-) Then you install "virt-manager" -- for what purpose, may I ask? I also never needed to manually create the ovirt-management bridge; has this behaviour changed in recent vdsm/CentOS releases?

Am 13.12.2013 05:48, schrieb Ted Miller:

# script to prepare Centos 6.5 for ovirt host install process
echo "= Ted's personal preferences--early "
yum -y install nano deltarpm yum-plugin-priorities yum-presto mlocate
echo "=== end of Ted's personal preferences--early "
yum -y upgrade
echo "install some repos (if not already done).."
cd /etc/yum.repos.d
if [ ! -f glusterfs-epel.repo ] ; then
    echo "..installing gluster repo..."
    yum -y install wget
    wget http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/glusterfs-epel.repo
    echo "..done installing gluster repo.."
fi
if [ ! -f el6-ovirt.repo ] ; then
    echo "..installing ovirt repo..."
    yum -y localinstall http://ovirt.org/releases/ovirt-release-el.noarch.rpm
    echo "..done installing ovirt repo.."
fi
if [ ! -f epel.repo ] ; then
    echo "..installing epel repo..."
    yum -y localinstall http://mirror.us.leaseweb.net/epel/6/i386/epel-release-6-8.noarch.rpm
    echo "..done installing epel repo.."
fi
echo "install libvirt"
yum -y install libvirt qemu-kvm tuned
echo "install virt-manager"
# with unlisted dependencies
yum -y install virt-manager xorg-x11-xauth dejavu-lgc-sans-mono-fonts
# create the ovirtmgmt bridge
if [ ! -f /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt ]; then
    echo "creating ovirtmgmt bridge.."
    service libvirtd start
    service libvirtd status
    virsh net-destroy default
    virsh net-undefine default
    virsh iface-bridge eth0 ovirtmgmt
    service network restart
    service libvirtd stop
    service libvirtd status
fi
echo ".copy tuned profile..."
# copy virtual-host --> rhs-virtualization so ovirt is happy
cp -r /etc/tune-profiles/virtual-host /etc/tune-profiles/rhs-virtualization
yum -y install vdsm
echo "= Ted's personal preferences--late ="
#add lines to send messages to TTY12
cat /etc/rsyslog.conf | grep tty12
if [ ! $? -eq 0 ] ; then
    echo "...Adding for tty12"
    echo " " >> /etc/rsyslog.conf
    echo "# Log everything to tty12" >> /etc/rsyslog.conf
    echo "*.* /dev/tty12" >> /etc/rsyslog.conf
    service rsyslog restart
fi
echo "...install gkrellm.."
yum -y install gkrellm
#add poll=0 to kill noveau messages
cat /boot/grub/grub.conf | grep poll=0
if [ ! $? -eq 0 ] ; then
    echo "Adding poll=0.."
    sed -i '/^[ \t]kernel*/ s/$/ drm-kms-helper.poll=0/g' /boot/grub/grub.conf
fi

I followed your instructions (and similar suggestions by others). I was installing the host only, so I commented out everything in my file except installing the EPEL and oVirt repos. I'm not interested in hearing about what "should" happen, or what happened on 6.4. These are the (unfixed) problems with 6.5, and they are real.

INSTALL FAILED with message:

Failed to install Host office4a. Yum [u'glusterfs-server-3.4.0-8.el6.x86_64 requires glusterfs-libs = 3.4.0-8.el6', u'glusterfs-server-3.4.0-8.el6.x86_64 requires glusterfs = 3.4.0-8.el6', u'glusterfs-server-3.4.0-8.el6.x86_64 requires glusterfs-fuse = 3.4.0-8.el6', u'glusterfs-cli-3.4.0-8.el6.x86_64 requires glusterfs-libs = 3.4.0-8.el6'].

# yum list gluster
base: mirror.oss.ou.edu
epel: mirrors.servercentral.net
extras: mirror.dattobackup.com
updates: centos.sonn.com
Available Packages
glusterfs.x86_64             3.4.0.36rhs-1.el6    base
glusterfs-api.x86_64         3.4.0.36rhs-1.el6    base
glusterfs-api-devel.x86_64   3.4.0.36rhs-1.el6    base
glusterfs-cli.x86_64         3.4.0-8.el6          glusterfs-epel
glusterfs-debuginfo.x86_64   3.4.0-8.el6          glusterfs-epel
glusterfs-devel.x86_64       3.4.0.36rhs-1.el6    base
glusterfs-fuse.x
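A side note on the script itself: the repeated `cat file | grep pattern; if [ ! $? -eq 0 ]` dance (used for the tty12 and poll=0 additions) can be collapsed into a single idempotent helper. A sketch, demonstrated on a scratch file rather than the real /etc/rsyslog.conf; `append_once` is a name introduced here:

```shell
# Append a line to a file only if it is not already present, replacing
# the script's "cat | grep; test $?" pattern. Demonstrated on a temp
# file so the real config is untouched.
CONF=$(mktemp)
append_once() {
  grep -qF "$1" "$2" || printf '%s\n' "$1" >> "$2"
}
append_once '*.* /dev/tty12' "$CONF"
append_once '*.* /dev/tty12' "$CONF"   # second call is a no-op
grep -c 'tty12' "$CONF"
```

Because `grep -qF` matches the line as a fixed string, rerunning the whole prep script never duplicates the rsyslog or grub entries.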
Re: [Users] simple networking? [SOLVED] mostly
On 12/13/2013 7:56 AM, Bob Doolittle wrote:
On 12/12/2013 11:04 PM, Ted Miller wrote:
From: users-boun...@ovirt.org on behalf of Ted Miller
Sent: Wednesday, November 27, 2013 12:18 PM
To: users@ovirt.org
Subject: [Users] simple networking?
I am trying to set up a testing network using o-virt, but the networking is refusing to cooperate. I am testing for possible use in two different production setups. My previous experience has been with VMWare. I have always set up a single bridged network on each host. All my hosts, VMs, and non-VM computers were peers on the LAN. They could all talk to each other, and things worked very well. There was a firewall/gateway that provided access to the Internet, and hosts, VMs, and non-VMs could all communicate with the Internet as needed. o-virt seems to be compartmentalizing things beyond all reason. Is there any way to set up simple networking, so ALL computers can see each other? Is there anywhere that describes the philosophy behind the networking setup? What reason is there that networks are so divided? After banging my head against the wall trying to configure just one host, I am very frustrated. I have spent several HOURS Googling for a coherent explanation of how/why networking is supposed to work, but only find obscure references like "letting non-VMs see VM traffic would be a huge security violation". I have no concept of what kind of an installation the o-virt designers have in mind, but it is obviously worlds different from what I am trying to do. The best I can tell, o-virt networking works like this (at least when you have only one NIC):
there must be an ovirtmgt network, which cannot be combined with any other network.
the ovirtmgt network cannot talk to VMs (unless that VM is running the engine)
the ovirtmgt network can only talk to hosts, not to other non-VM computers
a VM network can talk only to VMs
  cannot talk to hosts
  cannot talk to non-VMs
hosts cannot talk to my LAN
hosts cannot talk to VMs
VMs cannot talk to my LAN
All of the above are enforced by a boatload of firewall rules that o-virt puts into every host and VM under its jurisdiction. All of the above is inferred from things I Googled, because I can't find anywhere that explains what or how things are supposed to work--only things telling people WHAT THEY CAN'T DO. All I see on the mailing lists is people getting their hands slapped because they are trying to do SIMPLE SETUPS that should work, but don't (due to either design restrictions or software bugs).
My use case A:
* My (2 or 3) hosts have only one physical NIC.
* My VMs exist to provide services to non-VM computers.
* The VMs do not run X-windows, but they provide GUI programs to non-VMs via "ssh -X" connections.
* MY VMs need access to storage that is shared with hosts and non-VMs on the LAN.
Is there some way to TURN OFF network control in o-virt? My systems are small and static. I can hand-configure the networking a whole lot easier than I can deal with o-virt (as I have used it so far). Mostly I would need to be able to turn off the firewall rules on both hosts and VMs.
banging head against wall, Ted
* I have spent the last three days getting a Centos 6.5 host running under O-virt. Since the networking was just a small part of this, I am going to open a new thread to discuss the Centos 6.5 host setup process. Look for a thread titled something like "Centos 6.5 host configuration" if you want the gory details, or want to try it for yourself. My biggest problem is that the o-virt GUI is apparently incapable of setting up a bridge in Centos, which turned out to be what I needed. I had to set up the bridge BEFORE adding the host to the ovirt cluster.
If the bridge was not set up ahead of time, the whole installation failed completely. The bridge was only one of a list of things that had to be done ahead of time, in order for the process to complete correctly.
Ted, I have RHEL 6.5 running in a VM, and it can talk to all my VMs and hosts on my LAN, and I didn't have to do anything special. I didn't define any new networks or bridges or anything of the sort, either in oVirt or on my host or engine. It just worked. I am running RHEL 6.5 on both my engine and my host, as well as in this particular VM. -Bob
Do you have the Engine on a separate machine, or did you set up the host as an All-In-One? Did you install 6.5 or upgrade to 6.5? Ted
___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[Users] Centos 6.5 host configuration script -- a tale begun
I have been working since Monday to get a Centos 6.5 host node added to ovirt. Since 6.5 is just out, I figured I might as well use the latest and greatest to build my host. Today I succeeded (I think). I have not actually added a VM to the host, but at least ovirt is willing to accept that the host is part of the default cluster. My next task will be bringing up gluster on that same node. Since I will be doing at least three more hosts, I wrote a shell script to do the setup that is needed to make the ovirt process succeed. No, I do not have a similar script for the engine. I got that running under Centos 6.4 in a VM, without too much problem. That VM is temporarily running on a KVM host (but that host is not under ovirt). Feel welcome to stare at or run my script and make any comments or observations.
* The script is run after a clean install of Centos 6.5 from the "minimal" ISO.
* I will try to remember what each element was there for, if anything is not clear.
* There are probably a few (not many) things there that are not needed
* Mostly they result in doing something ahead of time that ovirt was going to do later anyway.
* Feel free to point out a better way to do whatever needs to be done.
* Bits and pieces of the script were stolen from googling here and there.
* Parts of the script were cooked up by stewing logs over low heat until something useful bubbled to the top.
* The number of clean reinstalls to test the script is beyond count.
* I almost broke down and learned how to write a kickstart file (but didn't).
* No guarantees or representations.
* So far this script has been tested on exactly one set of bare-metal hardware
* That hardware is not server-grade. (ovirt keeps complaining because I have not configured Power Management :)
* There are a few things that are personal preferences (things I install on all my Linux machines)
* I believe those preferences are clearly marked.
* I am leaving them in because they may (incidentally) be installing some dependencies that influence the outcome of the process.
I hope to see a day when a similar script is either not needed, or is available and maintained as part of the Centos distro, or as part of ovirt. Meanwhile we try to muddle through. I will copy my script into this webmail interface (OWA) (since I am writing at home and this is all I have to work with) and see how bad it mangles it. You'll probably need a wide window so that lines don't wrap, as Microsoft thinks this OWA interface doesn't ever need to let me specify text as "preformat". I called my script ov_host-start.sh

# script to prepare Centos 6.5 for ovirt host install process
echo "= Ted's personal preferences--early "
yum -y install nano deltarpm yum-plugin-priorities yum-presto mlocate
echo "=== end of Ted's personal preferences--early "
yum -y upgrade
echo "install some repos (if not already done).."
cd /etc/yum.repos.d
if [ ! -f glusterfs-epel.repo ] ; then
  echo "..installing gluster repo..."
  yum -y install wget
  wget http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/glusterfs-epel.repo
  echo "..done installing gluster repo.."
fi
if [ ! -f el6-ovirt.repo ] ; then
  echo "..installing ovirt repo..."
  yum -y localinstall http://ovirt.org/releases/ovirt-release-el.noarch.rpm
  echo "..done installing ovirt repo.."
fi
if [ ! -f epel.repo ] ; then
  echo "..installing epel repo..."
  yum -y localinstall http://mirror.us.leaseweb.net/epel/6/i386/epel-release-6-8.noarch.rpm
  echo "..done installing epel repo.."
fi
echo "install libvirt"
yum -y install libvirt qemu-kvm tuned
echo "install virt-manager"
# with unlisted dependencies
yum -y install virt-manager xorg-x11-xauth dejavu-lgc-sans-mono-fonts
# create the ovirtmgmt bridge
if [ ! -f /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt ]; then
  echo "creating ovirtmgmt bridge.."
  service libvirtd start
  service libvirtd status
  virsh net-destroy default
  virsh net-undefine default
  virsh iface-bridge eth0 ovirtmgmt
  service network restart
  service libvirtd stop
  service libvirtd status
fi
echo ".copy tuned profile..."
# copy virtual-host --> rhs-virtualization so ovirt is happy
cp -r /etc/tune-profiles/virtual-host /etc/tune-profiles/rhs-virtualization
yum -y install vdsm
echo "= Ted's personal preferences--late ="
#add lines to send messages to TTY12
cat /etc/rsyslog.conf | grep tty12
if [ ! $? -eq 0 ] ; then ech
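One small note on the script itself: the `cat file | grep pattern` followed by `if [ ! $? -eq 0 ]` guard (used for the rsyslog and grub edits) can be spelled `if ! grep -q`, which makes the append-once behavior easier to see and to test. A self-contained sketch against a scratch file rather than the real /etc/rsyslog.conf:

```shell
# Append the tty12 logging lines only if not already present; running
# the guarded block twice must not duplicate them.
CONF=/tmp/demo-rsyslog.conf
: > "$CONF"
for run in 1 2; do
  if ! grep -q 'tty12' "$CONF"; then
    echo "# Log everything to tty12" >> "$CONF"
    echo "*.* /dev/tty12" >> "$CONF"
  fi
done
grep -c 'tty12' "$CONF"   # prints 2: both lines mention tty12, appended exactly once
```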
[Users] Re: simple networking? [SOLVED] mostly
From: users-boun...@ovirt.org on behalf of Ted Miller
Sent: Wednesday, November 27, 2013 12:18 PM
To: users@ovirt.org
Subject: [Users] simple networking?
I am trying to set up a testing network using o-virt, but the networking is refusing to cooperate. I am testing for possible use in two different production setups. My previous experience has been with VMWare. I have always set up a single bridged network on each host. All my hosts, VMs, and non-VM computers were peers on the LAN. They could all talk to each other, and things worked very well. There was a firewall/gateway that provided access to the Internet, and hosts, VMs, and non-VMs could all communicate with the Internet as needed. o-virt seems to be compartmentalizing things beyond all reason. Is there any way to set up simple networking, so ALL computers can see each other? Is there anywhere that describes the philosophy behind the networking setup? What reason is there that networks are so divided? After banging my head against the wall trying to configure just one host, I am very frustrated. I have spent several HOURS Googling for a coherent explanation of how/why networking is supposed to work, but only find obscure references like "letting non-VMs see VM traffic would be a huge security violation". I have no concept of what kind of an installation the o-virt designers have in mind, but it is obviously worlds different from what I am trying to do. The best I can tell, o-virt networking works like this (at least when you have only one NIC):
there must be an ovirtmgt network, which cannot be combined with any other network.
the ovirtmgt network cannot talk to VMs (unless that VM is running the engine)
the ovirtmgt network can only talk to hosts, not to other non-VM computers
a VM network can talk only to VMs
  cannot talk to hosts
  cannot talk to non-VMs
hosts cannot talk to my LAN
hosts cannot talk to VMs
VMs cannot talk to my LAN
All of the above are enforced by a boatload of firewall rules that o-virt puts into every host and VM under its jurisdiction. All of the above is inferred from things I Googled, because I can't find anywhere that explains what or how things are supposed to work--only things telling people WHAT THEY CAN'T DO. All I see on the mailing lists is people getting their hands slapped because they are trying to do SIMPLE SETUPS that should work, but don't (due to either design restrictions or software bugs).
My use case A:
* My (2 or 3) hosts have only one physical NIC.
* My VMs exist to provide services to non-VM computers.
* The VMs do not run X-windows, but they provide GUI programs to non-VMs via "ssh -X" connections.
* MY VMs need access to storage that is shared with hosts and non-VMs on the LAN.
Is there some way to TURN OFF network control in o-virt? My systems are small and static. I can hand-configure the networking a whole lot easier than I can deal with o-virt (as I have used it so far). Mostly I would need to be able to turn off the firewall rules on both hosts and VMs.
banging head against wall, Ted
* I have spent the last three days getting a Centos 6.5 host running under O-virt. Since the networking was just a small part of this, I am going to open a new thread to discuss the Centos 6.5 host setup process. Look for a thread titled something like "Centos 6.5 host configuration" if you want the gory details, or want to try it for yourself. My biggest problem is that the o-virt GUI is apparently incapable of setting up a bridge in Centos, which turned out to be what I needed. I had to set up the bridge BEFORE adding the host to the ovirt cluster.
If the bridge was not set up ahead of time, the whole installation failed completely. The bridge was only one of a list of things that had to be done ahead of time, in order for the process to complete correctly. Ted Miller
Re: [Users] simple networking?
Thank you for your response, Mike. I am slow answering because of the American Thanksgiving holiday. Answers are below.
On 11/28/2013 1:41 AM, Mike Kolesnik wrote:
- Original Message -
I am trying to set up a testing network using o-virt, but the networking is refusing to cooperate. I am testing for possible use in two different production setups. My previous experience has been with VMWare. I have always set up a single bridged network on each host. All my hosts, VMs, and non-VM computers were peers on the LAN. They could all talk to each other, and things worked very well. There was a firewall/gateway that provided access to the Internet, and hosts, VMs, and non-VMs could all communicate with the Internet as needed. o-virt seems to be compartmentalizing things beyond all reason. Is there any way to set up simple networking, so ALL computers can see each other? Is there anywhere that describes the philosophy behind the networking setup? What reason is there that networks are so divided?
Yes, there is a lack of documentation in this area; it's a shame, but given it's an open source project with an open wiki, everyone is invited to contribute and improve this. I'll see if I can get a page started..
Please post a link if you succeed.
After banging my head against the wall trying to configure just one host, I am very frustrated. I have spent several HOURS Googling for a coherent explanation of how/why networking is supposed to work, but only find obscure references like "letting non-VMs see VM traffic would be a huge security violation". I have no concept of what kind of an installation the o-virt designers have in mind, but it is obviously worlds different from what I am trying to do. The best I can tell, o-virt networking works like this (at least when you have only one NIC):
there must be an ovirtmgt network, which cannot be combined with any other network.
the ovirtmgt network cannot talk to VMs (unless that VM is running the engine)
the ovirtmgt network can only talk to hosts, not to other non-VM computers
a VM network can talk only to VMs
  cannot talk to hosts
  cannot talk to non-VMs
hosts cannot talk to my LAN
hosts cannot talk to VMs
VMs cannot talk to my LAN
All of the above are enforced by a boatload of firewall rules that o-virt puts into every host and VM under its jurisdiction.
Not sure what you mean by all these "restrictions", from what I know the firewall rules that are set on each host are to allow host to talk to engine (ssh, vdsm, VM consoles traffic, etc) no more no less.. Usually the default behavior of firewall is to block almost all communication so when you add a host and check the "Configure firewall" box it modifies it so that your host can function properly.
I need my host to be on my LAN (for multiple reasons). Ovirtmgt "stole" the LAN connection, and cut off the host from the LAN, a connection which worked fine until then.
oVirt has no sense of firewall otherwise. For all it cares you can turn it off completely, or configure it by yourself (manually or via puppet/chef/foreman/etc) and not use the capability of the system to configure it for you.
How do I keep the engine from reconfiguring the firewall again if I change it manually? I saw a blog post that mentioned being able to uncheck a box (on the o-virt web GUI) called "configure IPTables". That /might/ be what I need. I didn't see that box, but I wasn't looking for it (and at the moment I don't have o-virt available to me).
You can also change it so that it uses the rules you want by modifying IPTablesConfig via engine-config tool.
Where can I find documentation on changing firewall rules using engine-config? From what I understand, I want my LAN to be my non-VLAN bridge. Can I move the ovirtmgt functionality to run over the LAN, or can I/will I have to put ovirt-mgt onto a VLAN?
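For the engine-config route mentioned above, the usual get/set shape is sketched below. This is from memory of the EL6-era tool, and the option name comes from this thread, so verify it with `engine-config --list` on your own engine first; the rules file path is a placeholder:

```shell
# Show the iptables rules template that the engine deploys to a host
# when "Configure firewall" is checked:
engine-config -g IPTablesConfig

# Replace the template with your own rules (placeholder path), then
# restart the engine so newly added hosts get the custom rules:
engine-config -s IPTablesConfig="$(cat /root/my-host-iptables.rules)"
service ovirt-engine restart
```

Note this only changes what the engine pushes at host-deploy time; rules already written on an existing host stay as they are until the host is reinstalled.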
All of the above is inferred from things I Googled, because I can't find anywhere that explains what or how things are supposed to work--only things telling people WHAT THEY CAN'T DO. All I see on the mailing lists is people getting their hands slapped because they are trying to do SIMPLE SETUPS that should work, but don't (due to either design restrictions or software bugs).
My use case A:
* My (2 or 3) hosts have only one physical NIC.
* My VMs exist to provide services to non-VM computers.
* The VMs do not run X-windows, but they provide GUI programs to non-VMs via "ssh -X" connections.
* MY VMs need access to storage that is shared with hosts and non-VMs on the LAN.
Your VMs will be sitting on the ovirtmgmt network, or on a VLAN?
I want them to sit on the LAN (which may be ovirtmgt, if I can get the IP filtering turned off). If they have to be on something else too, that is OK, as long as it does not interfere with them being on the LAN. FYI, the LANs on both of my applications are fairly small. One of them less than 10 nodes, the other less than
Re: [Users] simple networking?
On 11/28/2013 3:54 AM, noc wrote:
On 27-11-2013 18:18, Ted Miller wrote:
I am trying to set up a testing network using o-virt, but the networking is refusing to cooperate. I am testing for possible use in two different production setups. My previous experience has been with VMWare. I have always set up a single bridged network on each host. All my hosts, VMs, and non-VM computers were peers on the LAN. They could all talk to each other, and things worked very well. There was a firewall/gateway that provided access to the Internet, and hosts, VMs, and non-VMs could all communicate with the Internet as needed. o-virt seems to be compartmentalizing things beyond all reason.
That is a way to use oVirt, but the following simple setup should work and give you a way to check against your setup. I have two setups, one at home and one at work. The one at home is a setup of 2 hosts and one of those is a hacked up host/engine.
engine/host1: standard fedora19 kde install, static ip (192.168.1.11) configured with my NAS (192.168.1.16) as dhcp/dns server and my internet router (192.168.1.254) as gateway. Just make sure that NetworkManager is off, that your interfaces are not NM-managed, and that the network service is on. This was an all-in-one setup but I got a NAS with NFS so I turned my aio setup into an engine/host system. It has problems with that but nothing network related.
Host2: same as above but without the engine install, ip:192.168.1.22, gw 192.168.1.254 DNS:192.168.1.16.
How does it all come together? Well in your case, and mine if I were to start over, start with a static network which is NOT managed by NetworkManager. Use either Fedora or Centos, whichever you are more comfortable with; it also depends on whether you want to test/use all the features in oVirt. Currently, there are a few features not available in Centos because the versions of libvirt/kvm/qemu/gluster are too old in Centos.
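A hedged sketch of what "static ip, not NM-managed" looks like as an EL6 ifcfg file, using the example addresses from Host2 above. It is written to a scratch path here; on a real host the file would be /etc/sysconfig/network-scripts/ifcfg-eth0:

```shell
# Scratch copy of a static, non-NetworkManager interface config.
IFCFG=/tmp/demo-ifcfg-eth0
cat > "$IFCFG" <<'EOF'
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.168.1.22
NETMASK=255.255.255.0
GATEWAY=192.168.1.254
DNS1=192.168.1.16
NM_CONTROLLED=no
EOF
grep -q '^NM_CONTROLLED=no' "$IFCFG" && echo "NM will leave eth0 alone"
```

The NM_CONTROLLED=no line is what keeps NetworkManager's hands off the interface so that the classic network service (and later vdsm's bridge) owns it.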
Install ovirt-engine on your first 'server', probably choose NFS as your storage domain, either on your engine server or from somewhere else on your network. Make sure it's NFS v3 and not v4! (the local default is v4!) Make sure that ip addresses on your network are resolvable, either through /etc/hosts or through DNS! Engine-setup will complain if this doesn't work; using localhost will not work either! On the engine server there will be no bridge and nothing will change the network config.
Next the first host. Prepare the host in a similar way to the engine server. You can choose a minimal install of either Centos or Fedora or install a full desktop, but make sure that ips are static and NOT managed by NetworkManager, the hostname is resolvable, and the ovirt repo is available. From the webui add your prepared host and if everything went OK you'll see that on that host you will now have a bridge, ovirtmgmt, which acts as the primary interface. Create a VM and choose ovirtmgmt as the network for its nics (you can't choose anything else). Either give the VMs a static address or use a dhcp server, but the VMs should be able to talk to each other, to the host(s), the engine and to the internet. Every host that you add after the first will also have its network turned into a bridge, ovirtmgmt, and communication/migration/display/etc will take place over this network. One caveat: storage domain mapping is from the host to the storage; the engine, if it is NOT the NFS server, doesn't have to have access to the storage.
If you have servers with more than 1 nic then you can create additional networks using the webui of oVirt and assign these to clusters and to VMs. If you need vlans to coexist with ovirtmgmt on the same physical nic, I think that is possible but haven't tried it myself. In theory you need to set up the network first outside of oVirt, including your vlan structure, and then install ovirt.
Some concepts: oVirt engine: is just the manager, does 'nothing' related to running VMs itself.
You can turn it off and all hosts with their VMs will keep running. You just can't start new ones; in short, manage them.
oVirt host: is the real workhorse and is managed using oVirt-engine. Runs VDSM, which communicates with engine and starts/manages the VMs on the host on behalf of engine.
oVirt node: is a special slimmed-down Fedora distro that includes VDSM and a small setup so that it can be used as an oVirt host.
People tend to mix and match ovirt-host and ovirt-node, which makes for nice communication problems :-) If you haven't done so, there is an irc channel, ovirt, on irc.oftc.net with helpful people, if they are awake.
Joop -- #irc jvandewege
When I get another project out of the way (hopefully this week), I will be able to get back to my test setup and try again. Between your info, something I stumbled onto on a blog, and the info from Mike, I hope to have enough to make some progress when I take another stab at it. Ted Miller
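On the NFS storage-domain advice earlier in this message (v3, resolvable addresses), an export that oVirt hosts can use generally also needs to be readable and writable by vdsm:kvm (uid/gid 36). This is a sketch written to a scratch file, with placeholder path and network, not a tested export for any particular setup:

```shell
# Scratch copy of an /etc/exports line for an oVirt storage domain.
EXPORTS=/tmp/demo-exports
cat > "$EXPORTS" <<'EOF'
/exports/data 192.168.1.0/24(rw,sync,no_subtree_check,anonuid=36,anongid=36)
EOF
# On the real server the directory itself also needs: chown 36:36 /exports/data
# A client can force v3 with: mount -t nfs -o vers=3 server:/exports/data /mnt
grep -q 'anonuid=36' "$EXPORTS" && echo "export line written"
```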
Re: [Users] simple networking?
On 11/27/2013 4:35 PM, Thomas Suckow wrote:
On 11/27/2013 01:00 PM, Ted Miller wrote:
I am not using an all-in-one. Do you have more than one host? If not, that is a very different story, because it only has to "talk to itself". I have the engine on a VM (at the moment on a KVM host not managed by ovirt). I was trying to bring up one host, but couldn't get past that point. I will then have to add another host, and migrate the engine to running on one of those two hosts. Ted Miller
I don't currently; I had dabbled with adding another host but found out the other server had a different processor and removed it. That said, my vms can talk to each other and the host can talk to vms and vice versa.
That still doesn't offer what I need: VMs and host all talking on the LAN to all other LAN residents.
It works better than when I just used virt-manager. After setting up the bridge on the host does it lose all network connectivity?
No, it could still talk to ovirt-engine. It seemed to work the way o-virt wanted it to, just not the way I need it to.
If so it may be the same issue I was having where I had to manually manipulate the network configuration to fix the bridge.
Thanks for the answer, Ted Miller
[Users] simple networking?
I am trying to set up a testing network using o-virt, but the networking is refusing to cooperate. I am testing for possible use in two different production setups. My previous experience has been with VMWare. I have always set up a single bridged network on each host. All my hosts, VMs, and non-VM computers were peers on the LAN. They could all talk to each other, and things worked very well. There was a firewall/gateway that provided access to the Internet, and hosts, VMs, and non-VMs could all communicate with the Internet as needed. o-virt seems to be compartmentalizing things beyond all reason. Is there any way to set up simple networking, so ALL computers can see each other? Is there anywhere that describes the philosophy behind the networking setup? What reason is there that networks are so divided? After banging my head against the wall trying to configure just one host, I am very frustrated. I have spent several HOURS Googling for a coherent explanation of how/why networking is supposed to work, but only find obscure references like "letting non-VMs see VM traffic would be a huge security violation". I have no concept of what kind of an installation the o-virt designers have in mind, but it is obviously worlds different from what I am trying to do. The best I can tell, o-virt networking works like this (at least when you have only one NIC):
there must be an ovirtmgt network, which cannot be combined with any other network.
the ovirtmgt network cannot talk to VMs (unless that VM is running the engine)
the ovirtmgt network can only talk to hosts, not to other non-VM computers
a VM network can talk only to VMs
  cannot talk to hosts
  cannot talk to non-VMs
hosts cannot talk to my LAN
hosts cannot talk to VMs
VMs cannot talk to my LAN
All of the above are enforced by a boatload of firewall rules that o-virt puts into every host and VM under its jurisdiction.
All of the above is inferred from things I Googled, because I can't find anywhere that explains what or how things are supposed to work--only things telling people WHAT THEY CAN'T DO. All I see on the mailing lists is people getting their hands slapped because they are trying to do SIMPLE SETUPS that should work, but don't (due to either design restrictions or software bugs).
My use case A:
* My (2 or 3) hosts have only one physical NIC.
* My VMs exist to provide services to non-VM computers.
* The VMs do not run X-windows, but they provide GUI programs to non-VMs via "ssh -X" connections.
* MY VMs need access to storage that is shared with hosts and non-VMs on the LAN.
Is there some way to TURN OFF network control in o-virt? My systems are small and static. I can hand-configure the networking a whole lot easier than I can deal with o-virt (as I have used it so far). Mostly I would need to be able to turn off the firewall rules on both hosts and VMs.
banging head against wall, Ted