Re: [ovirt-users] Storage latency message
Once upon a time, Nir Soffer said:
> Ovirt is reading 4k from the metadata special volume every 10 seconds. If
> the read takes more than 5 seconds, you will see this warning in engine
> event log.
>
> Maybe your storage or the host was overloaded at that time (e.g. vm backup)?

I don't see any evidence that the storage was having any problem. The times the message gets logged are not high-load times either (neither scheduled backups nor just high demand).

I wrote a perl script to replicate the check, and I ran it on a node in maintenance mode (so no other traffic on the node). My script opens a block device with O_DIRECT, reads the first 4K, and closes it, reporting the time.

I do see some latency jumps with that check, but not on the raw block device, just the LV. By that I mean I'm running it on two devices: the multipath device that is the PV and the metadata LV. The multipath device latency is pretty stable, running around 0.3 to 0.5ms. The LV latency is higher (just a little normally) but has a higher variability and spikes to 50-125ms (at the same time that reading the multipath device took under 0.5ms).

Seems like this might be a problem somewhere in the Linux logical volume layer, not the block or network layer (or with the network/storage itself).
-- 
Chris Adams
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
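The perl script itself was not posted. As a rough illustration only, the check described above (open with O_DIRECT, read the first 4K, close, report the elapsed time) could be sketched in Python as follows. The function name `timed_read` and its `direct` switch are hypothetical, and this is not the code oVirt itself uses.

```python
import mmap
import os
import time


def timed_read(path, size=4096, direct=True):
    """Open path, read `size` bytes from offset 0, close it.

    Returns (elapsed_seconds, bytes_read). The timing covers the open,
    the read, and the close, as in the check described in the message.
    """
    flags = os.O_RDONLY
    if direct:
        # O_DIRECT bypasses the page cache so the read goes to the device.
        flags |= os.O_DIRECT
    # An anonymous mmap is page-aligned, which O_DIRECT needs for its buffer.
    buf = mmap.mmap(-1, size)
    try:
        start = time.monotonic()
        fd = os.open(path, flags)
        try:
            nread = os.readv(fd, [buf])
        finally:
            os.close(fd)
        elapsed = time.monotonic() - start
    finally:
        buf.close()
    return elapsed, nread
```

Calling it in a loop against both the multipath device and the metadata LV, and comparing the two series of timings, would mirror the comparison made in the message. The `direct=False` switch exists only so the function can be tried on an ordinary file.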
Re: [ovirt-users] LACP Bonding issue
Once upon a time, Bryan Sockel said:
> It seems that there is some disconnect between my network bridge, the bond and my
> interfaces. I would like to somehow get my bond to use all 4 interfaces.
> On reboot, it always seems to reset consistently to EM1.

Are you sure the switch side is all the same LACP group? Sounds like one port may accidentally be in a separate group, and that happens to be em1.

You might try swapping wires between em1 and another port and reboot and see which ports come up - if all but the port with the wire formerly in em1 come up, it points to the switch config.
-- 
Chris Adams
Re: [ovirt-users] LACP Bonding issue
Once upon a time, Bryan Sockel said:
> We checked the port groups, and servers are cabled correctly.
>
> After server is rebooted, em1 is the only interface passing traffic.
> Other 3 nics sitting idle. We can down each port on the switch and
> confirm it is down on the server.
>
> I am pretty sure it is related to the bridge that was created to pass
> vm-host-altn traffic when the appliance was first installed.
>
> Original message
> From: Chris Adams
> Date: 4/20/17 5:40 PM (GMT-06:00)
> To: users@ovirt.org
> Subject: Re: [ovirt-users] LACP Bonding issue
>
> Once upon a time, Bryan Sockel said:
> > It seems that is some disconnect between my network bridge, the bond and my
> > interfaces. I would like to some how get my bond to use all 4 interfaces.
> > On reboot, it always seems to reset consistently to EM1.
>
> Are you sure the switch side is all the same LACP group? Sounds like
> one port may accidentally be in a separate group, and that happens to be
> em1.
>
> You might try swapping wires between em1 and another port and reboot and
> see which ports come up - if all but the port with the wire formerly in
> em1 come up, it points to the switch config.
>
> --
> Chris Adams
-- 
Chris Adams
Re: [ovirt-users] LACP Bonding issue
Sorry about the message with nothing new...

Once upon a time, Bryan Sockel said:
> We checked the port groups, and servers are cabled correctly.
>
> After server is rebooted, em1 is the only interface passing traffic.
> Other 3 nics sitting idle. We can down each port on the switch and
> confirm it is down on the server.
>
> I am pretty sure it is related to the bridge that was created to pass
> vm-host-altn traffic when the appliance was first installed.

Well, I don't have any problem with that setup on multiple oVirt clusters (including a bunch of R610 servers), so I don't think that's it. I configure oVirt for "custom" bonding options; I use:

mode=802.3ad lacp_rate=1 xmit_hash_policy=layer2+3

Is it possible to move the wires around temporarily, so different server ports are connected to different switch ports? It would be interesting to see if the "solo" behavior stayed with the port or the wire.
-- 
Chris Adams
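One way to see from the host which ports LACP has actually grouped together is the bonding driver's status file (/proc/net/bonding/<bond name>), which in 802.3ad mode reports an "Aggregator ID" for the active aggregator and for each slave. The sketch below is only an illustration: the function name `aggregators` is hypothetical, and the field labels it looks for are assumed to match the running kernel's output.

```python
from collections import defaultdict


def aggregators(bonding_text):
    """Group slave interfaces by the Aggregator ID reported for each.

    Returns (active_aggregator_id, {aggregator_id: [slave, ...]}).
    An "Aggregator ID" line seen before any "Slave Interface" line is
    taken to be the bond's active aggregator.
    """
    active = None
    groups = defaultdict(list)
    slave = None
    for raw in bonding_text.splitlines():
        line = raw.strip()
        if line.startswith("Slave Interface:"):
            slave = line.split(":", 1)[1].strip()
        elif line.startswith("Aggregator ID:"):
            agg = line.split(":", 1)[1].strip()
            if slave is None:
                active = agg
            else:
                groups[agg].append(slave)
    return active, dict(groups)
```

If one interface lands in an aggregator of its own while the others share a different one, that would fit the theory that a switch port is in a separate LACP group.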
Re: [ovirt-users] Seamless SAN HA failovers with oVirt?
Once upon a time, Sven Achtelik said: > I was failing over by rebooting one of the TrueNas nodes and this took some > time for the other node to take over. I was thinking about asking the TN guys > if there is a command or procedure to speed up the failover. That's the way TrueNAS failover works; there is no "graceful" failover, you just reboot the active node. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Seamless SAN HA failovers with oVirt?
Once upon a time, Juan Pablo said: > I think its not related to something on the trueNAS side. if you are using > iscsi multipath you should be using round-robin TrueNAS HA is active/standby, so multipath has nothing to do with rebooting/upgrading a TrueNAS. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Seamless SAN HA failovers with oVirt?
Once upon a time, Juan Pablo said: > Im saying you can do it with multipath and not rely on truenas/freenas. > with an active/active configuration on the virt side...instead of > active/passive on the storage side. But there's still only one active system (the active TrueNAS node) connected to the hard drives, and the only way to upgrade is to reboot it. Multipath doesn't bypass that. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Seamless SAN HA failovers with oVirt?
Once upon a time, Juan Pablo said: > Chris, if you have active-active with multipath: you upgrade one system, > reboot it, check it came active again, then upgrade the other. Yes, but that's still not how a TrueNAS (and most other low- to mid-range SANs) works, so is not relevant. The TrueNAS only has a single active node talking to the hard drives at a time, because having two nodes talking to the same storage at the same time is a hard problem to solve (typically requires custom hardware with active cache coherency and such). You can (and should) use multipath between servers and a TrueNAS, and that protects against NIC, cable, and switch failures, but does not help with a controller failure/reboot/upgrade. Multipath is also used to provide better bandwidth sharing between links than ethernet LAGs. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Add disk image from node command line?
I have a qcow2 disk image sitting on the local filesystem of one node. Is there a way to copy this image to oVirt (into an iSCSI storage domain) without copying it to my desktop and uploading through the web UI? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Software RAID on oVirt Node
Once upon a time, Vinícius Ferrão said: > On typical deployment scenarios of oVirt which is the recommended RAID > technologies for oVirt Node installation? Should I use controller based RAID > or mdadm can be used instead? Is this recommended? > > I’m asking this because other vendors requires hardware RAID, even those 100% > based on CentOS, like XenServer. There’s not even a way to install it with > mdadm (Software Raid). I use Linux software RAID under oVirt just fine. I'm not using oVirt Node though (I just installed CentOS and then installed oVirt). Note that I have an iSCSI SAN for VM storage - things might be different if you are planning to use the local disks for VMs (local storage or Gluster). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Replacing engine SSL cert
I'm writing a script to install a new SSL key/cert pair (from Let's Encrypt) for the engine web UI on oVirt 4.1. I'm looking at this, but it's a little confusing.

https://www.ovirt.org/documentation/admin-guide/appe-oVirt_and_SSL/

It sounds like steps 1 and 3 are referring to the CA-supplied intermediate cert(s), not the actual issued cert for the server. Is that right?

Does anything actually use the PKCS12 format file referred to in step 4? I don't normally see that format from regular CAs; they usually provide cert+intermediate(s) in PEM format.

With Apache 2.4, it is normal to just put the cert+intermediate(s) chain in one file and configure Apache with SSLCertificateFile. You aren't supposed to put the CA-supplied cert in the SSLCACertificateFile like oVirt appears to do; that's intended to be used for validating client certs, not the intermediate(s) for the server cert.

It really just looks like the cert+intermediate(s) should go in /etc/pki/ovirt-engine/certs/apache.cer, the corresponding key put in /etc/pki/ovirt-engine/keys/apache.key.nopass, and then Apache needs to be restarted. Since oVirt doesn't use the engine web UI cert for anything internally (right?), do any of the other steps on the above page matter?
-- 
Chris Adams
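For the "cert+intermediate(s) in one file" layout described above, a script could build the bundle along these lines. This is a minimal sketch, not oVirt's documented procedure: `build_chain` is a hypothetical helper, and whether the engine needs any of the other steps is exactly the open question in the message.

```python
PEM_MARKER = "-----BEGIN CERTIFICATE-----"


def build_chain(server_cert_pem, intermediate_pems):
    """Return one PEM bundle: the server cert first, then the intermediates.

    Each input is the text of a PEM certificate. The order (server cert,
    then intermediates) is the one expected for a combined file given to
    Apache's SSLCertificateFile.
    """
    parts = [server_cert_pem] + list(intermediate_pems)
    for pem in parts:
        if PEM_MARKER not in pem:
            raise ValueError("input does not look like a PEM certificate")
    # Normalize so every certificate ends with exactly one newline.
    return "".join(part.strip() + "\n" for part in parts)
```

The result would then be written to the certificate path named in the message, with the key going to the key path, followed by an Apache restart.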
[ovirt-users] Different link speeds in LACP LAG?
I have a small oVirt setup for one customer, with two servers each connected to a two-switch stack with 1G links. Now the customer would like to upgrade the server links to 10G. My question is this: can I add a 10G NIC and do this with minimal "fuss" by just adding the 10G links to the same LAG, then removing the 1G links? I would have the host in maintenance mode no matter what. I haven't checked the switch to see if it'll support that yet, figured I'd start on the oVirt side. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Question about cold start
I have an oVirt cluster that was hard shutdown last night (fire is bad, and firemen killed the generators for their safety). When it came back up, it did not start any VMs other than the hosted engine. Is that expected? I know this is not a normal use case, but is there a way to set VMs to start on cluster boot? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Question about cold start
Once upon a time, Charles Kozler said: > I believe you would accomplish this by setting a VM to be highly available > (like the engine). Then engine makes sure this VM is up on at least one > node through lease agreements (IIRC). In either case, I think this is what > you want That keeps VMs up as long as the cluster is up, but does not bring them back if the whole cluster goes down (unless there's some other setting I'm missing). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Question about cold start
Once upon a time, Martin Sivak said: > Can you please describe your use-case there to make sure we do not > forget and to make it obvious there is a need for this feature? Thanks, added. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Multiple NICs on hosted engine?
I have installed the first node of a new oVirt 3.5 setup with a hosted engine VM. I have multiple networks: one public-accessible and one private (with storage, iDRAC/IPMI, etc.). I set the engine VM up on the public LAN, but now realize that it can't access the power control. I tried to add a second NIC to the engine VM through the web interface, but of course that doesn't work (because it isn't really managed there). How can I add a second NIC to the hosted engine VM? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Multiple NICs on hosted engine?
Once upon a time, Simone Tiraboschi said: > Sorry, I forgot you cannot add that nic on the engine VM from the engine UI. > Please try what I explained plus Darrel's trick. It worked. I added the network in the UI, added it to the host (so it got the bridge set up on that interface) in the UI, and then edited the vm.conf file on the host. Migrated back and forth and all appears well. Thanks. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Multiple NICs on hosted engine?
Once upon a time, Darrell Budic said:
> Glad it worked. Make sure you add it to the vm.conf file on all your ha
> hosts, otherwise you’ll drop it if ha-agent restarts it as opposed to a
> migration. Wasn’t clear if you’d done that or not.

Based on some other notes I found via Google, here's what I did (for the archives):

- Created the network in the UI
- hosted-engine --set-maintenance --mode=global
- edited /etc/ovirt-hosted-engine/vm.conf; duplicated the existing network line, changing the MAC, UUID, and network name (changed on all hosted-engine nodes)
- hosted-engine --vm-shutdown
- hosted-engine --vm-start
- hosted-engine --set-maintenance --mode=none

That appears to be working correctly.

I did then figure out that I probably didn't need it, at least for what I thought: power management. I didn't realize that the engine doesn't talk to the IPMI devices directly, that it instead proxies through a node.
-- 
Chris Adams
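The command-line part of the steps above can be wrapped in a small script. The sketch below uses only the hosted-engine invocations quoted in the message; the names `BEFORE_EDIT`, `AFTER_EDIT`, and `run_steps` are hypothetical, the vm.conf edit stays manual, and it defaults to a dry run that only prints the commands.

```python
import subprocess

# Commands exactly as listed in the message. The vm.conf edit happens
# between the two groups and is done by hand, not by this sketch.
BEFORE_EDIT = [["hosted-engine", "--set-maintenance", "--mode=global"]]
AFTER_EDIT = [
    ["hosted-engine", "--vm-shutdown"],
    ["hosted-engine", "--vm-start"],
    ["hosted-engine", "--set-maintenance", "--mode=none"],
]


def run_steps(steps, dry_run=True):
    """Print (dry run) or execute each command; return the rendered lines."""
    rendered = []
    for cmd in steps:
        line = " ".join(cmd)
        rendered.append(line)
        if dry_run:
            print(line)
        else:
            # check=True stops the sequence at the first failing command.
            subprocess.run(cmd, check=True)
    return rendered
```

A real run would presumably also need to wait for the engine VM to finish shutting down before starting it again; that wait is an assumption and is not handled here.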
[ovirt-users] iptables management
During setup, I allowed the script to change iptables rules. Is this necessary? Also, is it an "active" management (where oVirt will make changes), or just a one-time thing? I ask because I have some other iptables setup I want (such as limited SSH access), and I don't want to make changes to iptables that oVirt will override later or anything like that. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] iptables management
Once upon a time, Alon Bar-Lev said: > I guess you mean engine setup, right? Yes, that and hosted-engine --deploy. > Each time you run engine-setup you will be prompt if you want to override > iptables settings. > If you choose to override, the current settings will be backed up and you can > diff and re-apply your own. > If you choose to keep your settings, setup will write the iptables rules into > own location and you can diff and apply the changes manually. Okay, so that's the only time iptables are changed? That makes sense, and I can work with that. Thanks. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Gluster access tied to one node?
So this may be a dumb question, but here goes... I set up a replicated Gluster volume for VM disk image storage. I specified the path as node9:gluster1 (where node9 is one of the two nodes with a brick). However, when I shut down node9, the VM using that storage automatically gets paused. I thought the replicated storage was supposed to work even if one node was down. Is there some other way to specify the path (did I do that wrong)? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Gluster access tied to one node?
Once upon a time, Gabi C said: > tried localhost:gluster1? No, I haven't - is that expected to work? Also, as it turns out, I can't try it. I can edit the storage domain and change the path, but the change is ignored. I cannot remove the storage domain either; that's greyed out. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Hosted engine: sending ioctl 5401 to a partition!
I have set up oVirt with hosted engine, on an iSCSI volume. On both nodes, the kernel logs the following about every 10 seconds: Nov 21 15:27:49 node8 kernel: ovirt-ha-broker: sending ioctl 5401 to a partition! Is this a known bug, something that I need to address, etc.? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Hosted engine: sending ioctl 5401 to a partition!
Once upon a time, Federico Simoncelli said: > > I have set up oVirt with hosted engine, on an iSCSI volume. On both > > nodes, the kernel logs the following about every 10 seconds: > > > > Nov 21 15:27:49 node8 kernel: ovirt-ha-broker: sending ioctl 5401 to a > > partition! > > > > Is this a known bug, something that I need to address, etc.? > > Is this on centos or fedora? Oops, sorry to leave that out. CentOS 7 and oVirt 3.5 (all up-to-date). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] how to see new iSCSI lun added
Once upon a time, Gianluca Cecchi said:
> Am I missing anything simple?

Yep...

> On the server offering iSCSI target:
> # tgtadm --lld iscsi --mode target --op show
> Target 1: iqn.2014-07.local.localdomain:store1
>     LUN: 1
>         SCSI ID: p_iscsi_store1_l
>     LUN: 2
>         SCSI ID: p_iscsi_store1_l

Both LUNs have the same SCSI ID, which confuses discovery (I've done the same thing before). This is a 16-character string, so make sure the IDs are distinct/unique within those 16 characters. It is annoying that scsi-target-utils lets you do this.
-- 
Chris Adams
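A quick way to catch this before it confuses discovery is to compare the IDs after cutting them to 16 characters. This is a small sketch; the function name `scsi_id_collisions` and the longer ID names in the usage note are hypothetical.

```python
from collections import defaultdict


def scsi_id_collisions(lun_ids, width=16):
    """Return {truncated_id: [lun, ...]} for IDs that clash within `width` chars.

    lun_ids maps a LUN number to the SCSI ID string configured for it.
    """
    seen = defaultdict(list)
    for lun, scsi_id in lun_ids.items():
        seen[scsi_id[:width]].append(lun)
    return {short: luns for short, luns in seen.items() if len(luns) > 1}
```

For example, two intended IDs such as "p_iscsi_store1_lun1" and "p_iscsi_store1_lun2" (made-up names) differ only after the 16th character, so both are reported under "p_iscsi_store1_l".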
[ovirt-users] vdsm losing connection to libvirt
I have an oVirt setup that has three nodes, all running CentOS 7, with a hosted engine running CentOS 6. Two of the nodes (node8 and node9) are configured for hosted engine, and the third (node2) is just a "regular" node (as you might guess from the names, more nodes are coming as I migrate VMs to oVirt).

On one node, node8, vdsm periodically loses its connection to libvirt, which causes vdsm to restart. There doesn't appear to be any trigger that I can see (not time of day, load, etc. related). The engine VM is up and running on node8 (don't know if that has anything to do with it).

I get some entries in /var/log/messages repeated continuously; the "ovirt-ha-broker: sending ioctl 5401 to a partition" I mentioned before, and the following:

Dec 15 20:56:23 node8 journal: User record for user '107' was not found: No such file or directory
Dec 15 20:56:23 node8 journal: Group record for user '107' was not found: No such file or directory

I don't think those have any relevance (don't know where they come from); filtering those out, I see:

Dec 15 20:56:33 node8 journal: End of file while reading data: Input/output error
Dec 15 20:56:33 node8 journal: Tried to close invalid fd 0
Dec 15 20:56:38 node8 journal: vdsm root WARNING connection to libvirt broken. ecode: 1 edom: 7
Dec 15 20:56:38 node8 journal: vdsm root CRITICAL taking calling process down.
Dec 15 20:56:38 node8 journal: vdsm vds ERROR libvirt error
Dec 15 20:56:38 node8 journal: ovirt-ha-broker mgmt_bridge.MgmtBridge ERROR Failed to getVdsCapabilities: Error 16 from getVdsCapabilities: Unexpected exception
Dec 15 20:56:45 node8 journal: End of file while reading data: Input/output error
Dec 15 20:56:45 node8 vdsmd_init_common.sh: vdsm: Running run_final_hooks
Dec 15 20:56:45 node8 systemd: Starting Virtual Desktop Server Manager...

It is happening about once a day, but not at any regular interval or time (was 02:23 Sunday, then 20:56 Monday).

vdsm.log has this at that time:

Thread-601576::DEBUG::2014-12-15 20:56:38,715::BindingXMLRPC::1132::vds::(wrapper) client [127.0.0.1]::call getCapabilities with () {}
Thread-601576::DEBUG::2014-12-15 20:56:38,718::utils::738::root::(execCmd) /sbin/ip route show to 0.0.0.0/0 table all (cwd None)
Thread-601576::DEBUG::2014-12-15 20:56:38,746::utils::758::root::(execCmd) SUCCESS: = ''; = 0
Thread-601576::WARNING::2014-12-15 20:56:38,754::libvirtconnection::135::root::(wrapper) connection to libvirt broken. ecode: 1 edom: 7
Thread-601576::CRITICAL::2014-12-15 20:56:38,754::libvirtconnection::137::root::(wrapper) taking calling process down.
MainThread::DEBUG::2014-12-15 20:56:38,754::vdsm::58::vds::(sigtermHandler) Received signal 15
Thread-601576::DEBUG::2014-12-15 20:56:38,755::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 1 edom: 7 level: 2 message: internal error: client socket is closed
MainThread::DEBUG::2014-12-15 20:56:38,755::protocoldetector::135::vds.MultiProtocolAcceptor::(stop) Stopping Acceptor
MainThread::INFO::2014-12-15 20:56:38,755::__init__::563::jsonrpc.JsonRpcServer::(stop) Stopping JsonRPC Server
Detector thread::DEBUG::2014-12-15 20:56:38,756::protocoldetector::106::vds.MultiProtocolAcceptor::(_cleanup) Cleaning Acceptor
MainThread::INFO::2014-12-15 20:56:38,757::vmchannels::188::vds::(stop) VM channels listener was stopped.
MainThread::INFO::2014-12-15 20:56:38,758::momIF::91::MOM::(stop) Shutting down MOM
MainThread::DEBUG::2014-12-15 20:56:38,759::task::595::Storage.TaskManager.Task::(_updateState) Task=`26c7680c-23e2-42bb-964c-272e778a168a`::moving from state init -> state preparing
MainThread::INFO::2014-12-15 20:56:38,759::logUtils::44::dispatcher::(wrapper) Run and protect: prepareForShutdown(options=None)
Thread-601576::ERROR::2014-12-15 20:56:38,755::BindingXMLRPC::1142::vds::(wrapper) libvirt error
Traceback (most recent call last):
  File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 1135, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 463, in getCapabilities
    ret = api.getCapabilities()
  File "/usr/share/vdsm/API.py", line 1245, in getCapabilities
    c = caps.get()
  File "/usr/share/vdsm/caps.py", line 615, in get
    caps.update(netinfo.get())
  File "/usr/lib/python2.7/site-packages/vdsm/netinfo.py", line 812, in get
    nets = networks()
  File "/usr/lib/python2.7/site-packages/vdsm/netinfo.py", line 119, in networks
    allNets = ((net, net.name()) for net in conn.listAllNetworks(0))
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 129, in wrapper
    __connections.get(id(target)).pingLibvirt()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3642, in getLibVersion
    if ret == -1: raise libvirtError ('virConnectGetLibVersion() failed', conn=self)
libvirtError: i
Re: [ovirt-users] 3. vdsm losing connection to libvirt (Chris Adams)
Once upon a time, Nikolai Sednev said:
> Can I get engine, libvirt, vdsm, mom, logs from host8 and connectivity log?
> Have you tried installing clean OSs on hosts, especially on problematic host?
> I'd also try to disable JSONRPC on hosts, by putting them to maintenance and
> then removing JSONRPC from the check box on all hosts, just to compare if it
> resolves the issue.

Just to follow up... (tl;dr: issues may be just my own fault)

I tried to put node8 into maintenance mode, but then vdsm died while migrating active VMs and the node rebooted. At that point, ovirt-ha-agent.service would exit and sanlock logged errors. I finally realized sanlock was logging "-13" (would be nice to strerror() here, as -13 is not intuitive), which is EACCES aka permission denied.

I realized I didn't have the latest SELinux policy, but had enabled enforcing mode since the last reboot (from permissive, so no relabel needed). The latest CentOS 7 policy includes this in the changelog:

* Mon Nov 10 2014 Miroslav Grepl 3.12.1-153.el7_0.13
- Add support for vdsm.
Resolves:#1172146
- ALlow sanlock to send a signal to virtd_t.
- ALlow sanlock_t to read sysfs.
Resolves:#1172147

* Tue Nov 04 2014 Miroslav Grepl 3.12.1-153.el7_0.12
- Allow logrotate to manage virt_cache_t type
Resolves:#1159834

So, this may have all just been self-inflicted. I've switched back to permissive mode until I next apply updates; hopefully that'll fix my other issues as well.
-- 
Chris Adams
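Decoding a bare negative errno like the "-13" mentioned above takes one lookup. The following is a small sketch using Python's standard errno table; `describe_errno` is a hypothetical helper name and has nothing to do with sanlock's own code.

```python
import errno
import os


def describe_errno(code):
    """Turn a raw (possibly negated) errno value into 'NAME: message'."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"
```

On Linux, `describe_errno(-13)` yields the EACCES name together with the system's "permission denied" text, which is the mapping the message had to work out by hand.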
[ovirt-users] NUMA and non-NUMA nodes and migration
So, new problem (I'm good at breaking things I guess?). Same setup, CentOS 7 + oVirt 3.5.0.

Some of my nodes have 2 four-core CPUs, and some have 1 eight-core CPU (same number of available cores); all Intel Xeons of Nehalem or newer type. The systems with 2 CPUs apparently have NUMA support, although I haven't configured anything related to it.

The problem: I am unable to live migrate a VM from a node with NUMA to a node without NUMA (haven't tried the other direction). I get messages like:

Dec 16 15:36:05 node8 journal: internal error: Process exited prior to exec: libvirt: error : internal error: NUMA node 1 is out of range

I see this mentioned in RHBZ 1147644, but there is no clear resolution to this issue there (multiple issues came up in the same ticket). Is this something that is supposed to be fixed already, will be fixed in 3.5.1 (or a later release), or has fallen through the cracks?
-- 
Chris Adams
Re: [ovirt-users] NUMA and non-NUMA nodes and migration
Once upon a time, Gilad Chaplik said: > Hi Chris, > > The fix didn't make it to 3.5, will be available in 3.5.1 Okay, thanks. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Changing the engine HA ping address?
I have an up-to-date hosted-engine 3.5.1 setup (CentOS 7 for the nodes, CentOS 6 for the engine), and the engine keeps jumping between the two nodes running the hosted-engine HA (sometimes after just 10-20 minutes, sometimes after a day or two). I figured out that it is failing on pinging the gateway sometimes. The gateway IP is a layer-3 switch, and I think sometimes it just is not responding to ICMP echo request in a timely fashion (traffic is routing just fine though). How is the HA ping implemented? How many requests does it send (and how many responses are required to be considered "good")? If I can't tweak the sensitivity of the ping, I'd like to ping a different IP (on a HA load balancer setup). The oVirt HA config refers to it as "gateway" though; is it really used as a gateway in any case, or is that just the recommended IP? Can I just edit /etc/ovirt-hosted-engine/hosted-engine.conf on the two nodes and restart the ovirt-ha-broker service? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Changing the engine HA ping address?
Once upon a time, Chris Adams said:
> The gateway IP is a layer-3 switch, and I think sometimes it just is not
> responding to ICMP echo request in a timely fashion (traffic is routing
> just fine though). How is the HA ping implemented? How many requests
> does it send (and how many responses are required to be considered
> "good")?

I see in ovirt_hosted_engine_ha/broker/submonitors/ping.py that only one packet is sent. That's probably not a great way to do things; there are a number of routers/firewalls/etc. that put ICMP echo requests to the device (as opposed to through the device) at the very lowest priority, and drop them under any load.

A better way would be to send multiple requests, with only one answer required. "ping -c 1 -i 0.2 -w <deadline> -W <timeout> <address>" should do that.
-- 
Chris Adams
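The "several requests, one answer is enough" idea can be sketched with the same ping options. With iputils ping, combining -c 1 with a -w deadline keeps probes going (every -i seconds) until one reply arrives or the deadline passes. The values for the deadline and per-reply timeout were not given in the message, so the defaults below are assumptions, as are the names `ping_command` and `is_reachable`; this is not the broker's actual code.

```python
import subprocess


def ping_command(host, deadline=2, interval=0.2, timeout=1):
    """Build a ping invocation that succeeds as soon as one reply arrives.

    deadline and timeout are assumed example values, in seconds.
    """
    return [
        "ping",
        "-c", "1",             # stop after one reply...
        "-i", str(interval),   # ...sending a probe this often...
        "-w", str(deadline),   # ...until this overall deadline
        "-W", str(timeout),    # time to wait for a reply
        host,
    ]


def is_reachable(host, **options):
    """Return True if ping exits 0, meaning at least one reply was received."""
    result = subprocess.run(
        ping_command(host, **options),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

With these example values, roughly ten probes go out over two seconds and a single reply is enough, so a gateway that drops the occasional low-priority echo request would no longer be counted as down.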
[ovirt-users] VDSM hook for setting DSCP bits?
Is there a VDSM hook available that can set DSCP bits on a VM's network interface? I want to do some QoS for some traffic across my network, and it would be easier if I could set DSCP bits outside the VM. I see vdsm-hook-qos, but that appears to just set bandwidth control in the Linux host, not DSCP on packets for the rest of the network. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] VDSM hook for setting DSCP bits?
Once upon a time, Dan Yasny said:
> shouldn't be hard to do. Can you provide the details of what you need to
> happen to the VM exactly?
> - domxml changes
> - other host level changes
> - whether the VM should be able to live migrate

It looks like libvirt supports setting up DSCP bits with nwfilter, per:

https://libvirt.org/formatnwfilter.html

I will play with this some to see exactly how to use it (haven't tried it before). If that's the case, there shouldn't be any host-level changes required. I would want the VM to be able to live migrate still (with the DSCP still applied).

I'll test this out on a bare libvirt VM and see if that'll do the job, and report back with what XML is needed. Thanks.

> On Tue, Feb 10, 2015 at 2:34 PM, Chris Adams wrote:
>
> > Is there a VDSM hook available that can set DSCP bits on a VM's network
> > interface? I want to do some QoS for some traffic across my network,
> > and it would be easier if I could set DSCP bits outside the VM.
> >
> > I see vdsm-hook-qos, but that appears to just set bandwidth control in
> > the Linux host, not DSCP on packets for the rest of the network.
> > --
> > Chris Adams
-- 
Chris Adams
Re: [ovirt-users] VDSM hook for setting DSCP bits?
Once upon a time, Chris Adams said:
> Once upon a time, Dan Yasny said:
> > shouldn't be hard to do. Can you provide the details of what you need to
> > happen to the VM exactly?
> > - domxml changes
> > - other host level changes
> > - whether the VM should be able to live migrate
>
> It looks like libvirt supports setting up DSCP bits with nwfilter, per:
>
> https://libvirt.org/formatnwfilter.html

Oh, on reading this, nwfilter can only match, not set, so that won't help. It doesn't look like libvirt has a way to set something like that.

Do VDSM hooks only act on the XML, or is there a way to configure things outside of libvirt?
-- 
Chris Adams
[ovirt-users] Port mirroring outside traffic into a VM?
I have a network traffic monitor that is on a physical machine right now. It has two network interfaces: one with an IP on a regular switch port, and one without an IP on a switch port that is the target of a port mirror/monitor session for the desired VLAN.

I'd like to move this system to an oVirt VM (I'm running 3.5.1). Is this the right way to go about it (and still have the VM migratable)?

- I have several hosts with extra network interfaces; pick at least a couple, connect them to switch ports that are configured for mirror/monitor session.
- In oVirt admin console, choose the Networks tab, click New. Give the network a name (like "monitor"), leave VLAN tagging de-selected and VM Network selected. Under the Cluster section, de-select Required (because the mirror won't go to all hosts). Click OK to create.
- Click on the network, select the vNIC Profiles tab, edit the default profile and select Port Mirroring.
- Go to the Hosts tab. For each host with a port mirror, click on the host, then choose the Network Interfaces tab and Setup Host Networks. Drag the new network to its attached port, click the pencil, and set Boot Protocol to None.
- Go to the Virtual Machines tab. Click on the VM, choose the Network Interfaces tab, and click New. Choose the monitor network in the Profile.
-- 
Chris Adams
Re: [ovirt-users] Port mirroring outside traffic into a VM?
Once upon a time, Genadi Chereshnya said: > If I understand you correctly you are trying to replace the physical device > mirroring with VM? Yes, that is correct. > If this is the case I don't think it's possible to do it with port mirroring > oVIRT feature. > The existing oVIRT port mirroing feature is for mirroring traffic between VM > devices for specific Network. > So if you have 3 VMs with network you can monitor on 1 VM that specific > network that is used between 2 other VMs. Ah, I see. Is there a way to get an external network interface (that happens to be a target of an external switch's port mirror/monitor session) to pass through to a VM? A way that still allows for live migration would be best of course, but even without that would be a start. Thanks. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] HA of VMs
Once upon a time, Matt Wells said: > I've been poking around for a better way to perform HA. With VM's like IPA > or even HA web sites behind an HAProxy; how do I ensure that they are never > on the same host? You just need to set up affinity groups. Negative affinity means keep VMs away from each other, and enforcing means _never_ put them on the same host (even if it means shutting down a VM rather than moving it). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] VDSM memory consumption
Once upon a time, Federico Alberto Sayd said: > I am experiencing troubles with VDSM memory consuption. > > I am running > > Engine: ovirt 3.5.1 > > Nodes: > > Centos 6.6 > VDSM 4.16.10-8 > Libvirt: libvirt-0.10.2-46 > Kernel: 2.6.32 > > When the host boots, memory consuption is normal, but after 2 or 3 > days running, VDSM memory consuption grows and it consumes more > memory that all vm's running in the host. If I restart the vdsm > service, memory consuption normalizes, but then it start growing > again. > > I have seen some BZ about vdsm and supervdsm about memory leaks, but > I don't know if VDSM 4.6.10.8 is still affected by a related bug. Can't help, but I see the same thing with CentOS 7 nodes and the same version of vdsm. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] oVirt / ROM images / PXE
Once upon a time, Paul Heinlein said: > Summary: To get oVirt-managed VMs to boot using PXE, I had to > replace the rhel6-*.rom files with their ipxe equivalents. I'm PXE booting oVirt VMs with no trouble. I have CentOS 7 nodes, running oVirt 3.5.1 (hosted engine on CentOS 6). Each node has a pair of NICs in a LACP bond to a switch stack, running 802.1q on top of that, with several VLANs (only one VLAN has a DHCP server and a local CentOS repo, so I put VMs on that VLAN for install). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] oVirt / ROM images / PXE
Once upon a time, Paul Heinlein said: > Good data point! Can you tell me the compatibility version of your > data center and its cluster(s)? How about the cluster CPU type? One DC, one cluster, version 3.5. Intel Nehalem CPU. I've PXE booted CentOS 5, 6, and 7 VMs (64 bit for all and 32 bit for 5/6). I'd suspect something in the network setup. I have VLANs on an 802.1q trunk on an LACP bond (with oVirt bridging the VLANs to VMs). My DHCP server (separate physical CentOS 6 box) is also running VLANs on 802.1q on LACP bond, with dnsmasq listening on one VLAN. I'd look at traffic coming out of the VM on the node, and coming into the DHCP server, and see who sees what (are the requests coming out of the VM, is the DHCP server seeing them, is it replying, does the VM get the reply). If the DHCP requests are making it to the server, the next thing to see is if there is any difference in the DHCP options requested between the different ROM images (maybe your DHCP config isn't matching up correctly in some case that works on mine?). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] oVirt / ROM images / PXE
Once upon a time, Paul Heinlein said: > So it might be helpful to look at the DHCP options, but the server > is making OFFERs, so I'm not really sure what bits might be suspect. Do you see a difference between the DHCP options with the "bad" and "good" ROMs? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] VDSM memory consumption
Once upon a time, Dan Kenigsberg said: > I'm afraid that we are yet to find a solution for this issue, which is > completly different from the horrible leak of supervdsm < 4.16.7. > > Could you corroborate the claim of > Bug 1147148 - M2Crypto usage in vdsm leaks memory > ? Does the leak disappear once you start using plaintext transport? So, to confirm, it looks like to do that, the steps would be: - In the [vars] section of /etc/vdsm/vdsm.conf, set "ssl = false". - Restart the vdsmd service. Is that all that is needed? Is it safe to restart vdsmd on a node with active VMs? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
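A minimal sketch of the first step above, assuming the two listed steps really are the whole procedure (the thread does not confirm that). The helper name `disable_ssl` is made up for illustration; it rewrites the text of a vdsm.conf-style file so that the [vars] section has "ssl = false". Note that Python's configparser drops comments when it writes the file back, so keep a copy of the original.

```python
import configparser
import io


def disable_ssl(conf_text):
    """Return conf_text with 'ssl = false' set in the [vars] section.

    Hypothetical helper: it only edits text and does not restart vdsmd.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section("vars"):
        parser.add_section("vars")
    parser.set("vars", "ssl", "false")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # Example input only; a real /etc/vdsm/vdsm.conf has more settings.
    print(disable_ssl("[vars]\nssl = true\n"))
```

The vdsmd restart still has to be done separately, and whether that is safe with active VMs is the open question above.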
[ovirt-users] Communication errors between engine and nodes?
Setup: oVirt 3.5.1 w/hosted engine, nodes: CentOS 7, engine: CentOS 6 I am periodically seeing errors like this in my engine web UI: 2015-Mar-10, 04:42 Host node5 is not responding. It will stay in Connecting state for a grace period of 89 seconds and after that an attempt to fence the host will be issued. 2015-Mar-10, 04:42 Host node3 from cluster c1 was chosen as a proxy to execute Status command on Host node5. 2015-Mar-10, 04:42 Status of host node5 was set to Up. 2015-Mar-10, 04:42 Host node5 power management was verified successfully. The engine.log file has this: 2015-03-10 04:42:23,310 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.ListVDSCommand] (DefaultQuartzScheduler_Worker-40) [75b9e6d9] Command ListVDSCommand(HostName = node5, HostId = 8dfd0195-f386-4e16-9379-a5287221d5bd, vds=Host[node5,8dfd0195-f386-4e16-9379-a5287221d5bd]) execution failed. Exception: VDSNetworkException: VDSGenericException: VDSNetworkException: Heartbeat exeeded This seems to happen with a random node sometimes. The VMs on the node stay up and don't appear to experience any problem. I can't find any sign of a network problem on either the node, the engine, the node hosting the engine, or the switches. I don't see anything obvious in the logs on any of the systems involved either. The node network setup is VLANs on top of a bond of two NICs, each connected to a different switch in a two-switch stack. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Communication errors between engine and nodes?
Once upon a time, Chris Adams said: > 2015-03-10 04:42:23,310 ERROR > [org.ovirt.engine.core.vdsbroker.vdsbroker.ListVDSCommand] > (DefaultQuartzScheduler_Worker-40) [75b9e6d9] Command ListVDSCommand(HostName > = node5, HostId = 8dfd0195-f386-4e16-9379-a5287221d5bd, > vds=Host[node5,8dfd0195-f386-4e16-9379-a5287221d5bd]) execution failed. > Exception: VDSNetworkException: VDSGenericException: VDSNetworkException: > Heartbeat exeeded I'm trying to dig into this some on my own (without knowing about oVirt's internals); can somebody tell me the timeout for the dispatching of commands to vdsm? I get different things happening when the engine thinks a node has "gone away", but they all start with the same org.ovirt.engine.core.vdsbroker.vdsbroker bit (and have a network timeout of some type). I don't see anything in common in any of the logs at the time of the error, so I'm trying to roll back to when the request was sent (but I don't know how long it took for the engine to time out before the error was logged). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
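To look for a pattern in when these errors occur, something like the following could pull the timestamp and host name out of each matching engine.log entry. This is only a sketch: it assumes each entry is on a single line in the format quoted above, and the function name `network_errors` is made up for illustration.

```python
import re
from datetime import datetime

# Matches ERROR entries that name a host and mention VDSNetworkException,
# based only on the single log entry quoted in this thread.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} ERROR .*?"
    r"HostName = (?P<host>[^,]+),.*VDSNetworkException"
)


def network_errors(lines):
    """Yield (timestamp, host) for each VDSNetworkException entry."""
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            when = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            yield when, match.group("host")


if __name__ == "__main__":
    import sys
    # Usage: python this_script.py /path/to/engine.log
    with open(sys.argv[1]) as log:
        for when, host in network_errors(log):
            print(when.isoformat(), host)
```

Sorting or grouping the output by host or by hour would show whether the failures cluster anywhere.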
Re: [ovirt-users] Communication errors between engine and nodes?
Once upon a time, Lior Vernia said: > If I'm not mistaken, heartbeat intervals are configured to 10 seconds by > default. Okay, thanks. > The command times out queries for the status of VMs on a host - any > reason to suspect why that's taking long? Does it happen on specific hosts? No idea. It seemed to happen on node5 a bunch over a week, but then there were errors on other nodes as well. It isn't always "Heartbeat exeeded"; sometimes it is "VDSNetworkException: Message timeout which can be caused by communication issues". I haven't been able to find any network issues that could cause this (no errors logged anywhere). There doesn't seem to be any pattern to when it happens either. The log entry I posted was from 04:42 local time, and a bunch of the VMs are CentOS 5, which does log rotation at 04:00 by default (which can spike the CPU and disk I/O), but they are all done long before 04:42. It happened in the middle of the afternoon a couple of days ago, while I was logged in to the web UI, and I didn't notice any unusual behavior. One other odd thing: I have also been experiencing an issue where I randomly get logged out of the web UI. Usually nothing else was going on, but a couple of times it seemed to correspond with one of the node errors (hard to tell). It looked like the same error as BZ 1198493 (I'd see a bunch of "Failed to log User null@N/A out" messages). I don't know if these issues are related or that was just coincidence. To try to rule out any unseen network issues, I started an fping to all seven nodes and the engine from another physical system on the same VLAN. It is sending one ping to each of the eight hosts every 0.2 seconds. That has not shown a dropped packet since I started yesterday afternoon. However, during that time, I also have not seen any engine/vdsm timeouts. I was going to say I had not been logged out of the web UI, but that just happened while I was typing the previous sentence. 
-- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] VDSM memory consumption
Once upon a time, Sven Kieske said: > On 13/03/15 12:29, Kapetanakis Giannis wrote: > > We also face this problem since 3.5 in two different installations... > > Hope it's fixed soon > > Nothing will get fixed if no one bothers to > open BZs and send relevants log files to help > track down the problems. There's already an open BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1158108 I'm not sure if that is exactly the same problem I'm seeing or not; my vdsm process seems to be growing faster (RSS grew 952K in a 5 minute period just now; VSZ didn't change). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
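For anyone who wants to compare growth rates against the BZ, a rough sketch of the measurement: read VmRSS from /proc/<pid>/status twice and scale the difference to an hourly rate. The function names are made up for illustration, and the pid has to be supplied by hand.

```python
import sys
import time


def rss_kib(status_text):
    """Extract the VmRSS value (in kB) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


def growth_kib_per_hour(first_kib, second_kib, seconds):
    """Scale the RSS difference between two samples to a per-hour rate."""
    return (second_kib - first_kib) * 3600.0 / seconds


def sample(pid):
    with open("/proc/%d/status" % pid) as status:
        return rss_kib(status.read())


if __name__ == "__main__":
    # Usage: python this_script.py <pid of vdsm> [interval in seconds]
    pid = int(sys.argv[1])
    interval = float(sys.argv[2]) if len(sys.argv) > 2 else 300.0
    before = sample(pid)
    time.sleep(interval)
    after = sample(pid)
    print("RSS grew %d kB, about %.0f kB/hour"
          % (after - before, growth_kib_per_hour(before, after, interval)))
```

At the rate mentioned above (952K in 5 minutes), that works out to roughly 11 MB per hour if the growth is steady, which a single sample cannot establish.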
Re: [ovirt-users] Communication errors between engine and nodes?
Once upon a time, Roel de Rooy said: > We are observing the same thing with our oVirt environment. > At random moments (could be a couple of times a day , once a day or even once > every couple of days), we receive the "VDSNetworkException" message on one of > our nodes. > Haven't seen the "heartbeat exceeded" message, but could be that I overlooked > it within our logs. > At some rare occasions, we also do see "Host cannot access the Storage > Domain(s) attached to the Data Center", within the GUI. > > VM's will continue to run normally and most of the times the nodes will be in > "UP" state again within the same minute. > > Will still haven't found the root cause of this issue. > Our engine is CentOS 6.6 based and it's happing with both Centos 6 and Fedora > 20 nodes. > We are using a LCAP bond of 1Gbit ports for our management network. > > As we didn't see any reports about this before, we are currently looking if > something network related is causing this. I just opened a BZ on it (since it isn't just me): https://bugzilla.redhat.com/show_bug.cgi?id=1201779 My cluster went a couple of days without hitting this (as soon as I posted to the list of course), but then it happened several times overnight. Interestingly, one error logged was communicating with the node currently running my hosted engine. That should rule out external network (e.g. switch and such) issues, as those packets should not have left the physical box. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Communication errors between engine and nodes?
So, in my case, I'm wondering if maybe there is some kind of weird network issue happening. The node that seems to be showing up most for the last day or two is one of the two nodes running the hosted-engine HA, and is _not_ currently hosting the engine. It seems that, at the same time the engine has trouble communicating with that node, the hosted-engine HA running on that node has trouble seeing the engine. I still can't find any actual network problem. Using another physical system, I ran fping to all the nodes and the engine with a 0.2 second interval, and that didn't show any problem (I ran it until I also saw an instance of the engine->node communication error). I'm watching ARP traffic now to see if something is sending bad answers. I'm pretty stumped at this point of what to look at next. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] [QE][ACTION REQUIRED] oVirt 3.5.2 and 3.5.3 status
Once upon a time, Sandro Bonazzola said: > We have 3 open blockers for 3.5.2[1]: Any chance the vdsm memory leak fix (RHBZ 1158108) will make 3.5.2? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Problem with hosted engine setup on VLAN/bond combo
I am trying to install oVirt with the hosted engine. The physical system is CentOS 6.5 x86_64 (with all current updates). It is connected to a two-switch stack via bond0 (running LACP), which is a VLAN trunk, and the management interface is vlan51. This doesn't work with oVirt 3.4.2, but I see that both 3.4.3 and 3.5 have "support engine on bond" and "support engine on vlan" in the release notes, so I tried again with today's 3.4.3 RC. I got a different error from "hosted-engine --deploy": [ ERROR ] Failed to execute stage 'Misc configuration': Command '/usr/bin/vdsClient' failed to execute I see this in /var/log/vdsm/supervdsm.log: MainProcess|Thread-16::INFO::2014-07-10 10:20:23,003::configNetwork::275::root::(addNetwork) Adding network ovirtmgmt with vlan=51, bonding=None, nics=['bond0'], bondingOptions=None, mtu=None, bridged=True, defaultRoute=True,options={'bootproto': 'static', 'ONBOOT': 'yes'} MainProcess|Thread-16::ERROR::2014-07-10 10:20:23,003::supervdsmServer::100::SuperVdsm.ServerCallback::(wrapper) Error in addNetwork Traceback (most recent call last): File "/usr/share/vdsm/supervdsmServer", line 98, in wrapper res = func(*args, **kwargs) File "/usr/share/vdsm/supervdsmServer", line 190, in addNetwork return configNetwork.addNetwork(bridge, **options) File "/usr/share/vdsm/configNetwork.py", line 186, in wrapped return func(*args, **kwargs) File "/usr/share/vdsm/configNetwork.py", line 287, in addNetwork blockingdhcp=blockingdhcp, **options) File "/usr/share/vdsm/configNetwork.py", line 121, in objectivizeNetwork topNetDev = Nic(nic, configurator, mtu=mtu, _netinfo=_netinfo) File "/usr/share/vdsm/netmodels.py", line 80, in __init__ raise ConfigNetworkError(ne.ERR_BAD_NIC, 'unknown nic: %s' % name) ConfigNetworkError: (23, 'unknown nic: bond0') Is there maybe still a problem combining a VLAN on a bond? A little background (if it helps): this is my first attempt with oVirt. 
I'm installing on a clean CentOS install on a spare box, with the intent to get oVirt up and running and then convert (one node at a time) an existing VM setup (running old stand-alone Xen installs) to oVirt. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Problem with hosted engine setup on VLAN/bond combo
Once upon a time, Sven Kieske said: > Am 10.07.2014 18:02, schrieb Chris Adams: > > Is there maybe still a problem combining a VLAN on a bond? > > Yes exactly, but just with hosted-engine. > See this BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1072027 > (Should be resolved with 3.5) Well, the same notes are in the 3.4.3 release notes as 3.5: oVirt Hosted Engine Setup BZ 162 - [RFE] [ovirt-hosted-engine-setup] add support for bonded interfaces BZ 1117634 - [RFE] Hosted Engine deploy should support VLAN-tagged interfaces Since I got a different error when I tried 3.4.3-RC vs. 3.4.2 (and I'm trying with a VLAN on top of a bond), I was concerned that the same fix as 3.5 was in 3.4.3-RC, and would not actually fix my combo setup. > There is also a workaround in the BZ, didn't try it myself. Yeah, I can't (easily anyway) disable the VLAN trunk and bond and then re-enable them (since that breaks network access, and I'm not sitting at the same location as the system). I will try Robert Story's suggestion of manually configuring the bridge before running the deploy. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] VDSM respawning too quickly
Once upon a time, Kyle Gordon said: > Following an upgrade from 3.3 to 3.4, I've been greeted with this > message in /var/log/messages, on my CentOS 6.5 server. I'm hitting the same thing with an up-to-date CentOS 6.5 trying to install hosted-engine. It appears the problem is an updated python-pthreading package in EPEL, version 0.1.3-2. There's already a 0.1.3-3 in koji that rolls back the patch in 0.1.3-2. http://koji.fedoraproject.org/koji/buildinfo?buildID=543650 -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Odd messages on new node/hosted engine
mstore_engine/74cb6a07-5745-4b21-ba4b-d9012acb5cae Thread-40::DEBUG::2014-07-16 14:34:19,856::persistentDict::192::Storage.PersistentDict::(__init__) Created a persistent dict with FileMetadataRW backend Thread-40::DEBUG::2014-07-16 14:34:19,860::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=hosted_storage', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=ON', 'LOCKRENEWALINTERVALSEC=5', 'MASTER_VERSION=1', 'POOL_DESCRIPTION=c1', 'POOL_DOMAINS=74cb6a07-5745-4b21-ba4b-d9012acb5cae:Active', 'POOL_SPM_ID=-1', 'POOL_SPM_LVER=0', 'POOL_UUID=b15478ff-1ae1-4065-8e52-19c808d39597', 'REMOTE_PATH=nfs.c1.api-digital.com:/vmstore/engine', 'ROLE=Master', 'SDUUID=74cb6a07-5745-4b21-ba4b-d9012acb5cae', 'TYPE=NFS', 'VERSION=3', '_SHA_CKSUM=4f007c871da3177ba5546459bcebc8be8aff689e'] Thread-40::DEBUG::2014-07-16 14:34:19,863::fileSD::609::Storage.StorageDomain::(imageGarbageCollector) Removing remnants of deleted images [] Thread-40::INFO::2014-07-16 14:34:19,863::sd::383::Storage.StorageDomain::(_registerResourceNamespaces) Resource namespace 74cb6a07-5745-4b21-ba4b-d9012acb5cae_imageNS already registered Thread-40::INFO::2014-07-16 14:34:19,863::sd::391::Storage.StorageDomain::(_registerResourceNamespaces) Resource namespace 74cb6a07-5745-4b21-ba4b-d9012acb5cae_volumeNS already registered Thread-40::DEBUG::2014-07-16 14:34:19,868::fileSD::259::Storage.Misc.excCmd::(getReadDelay) '/bin/dd iflag=direct if=/rhev/data-center/mnt/nfs.c1.api-digital.com:_vmstore_engine/74cb6a07-5745-4b21-ba4b-d9012acb5cae/dom_md/metadata bs=4096 count=1' (cwd None) Thread-40::DEBUG::2014-07-16 14:34:19,885::fileSD::259::Storage.Misc.excCmd::(getReadDelay) SUCCESS: = '0+1 records in\n0+1 records out\n476 bytes (476 B) copied, 0.000548138 s, 868 kB/s\n'; = 0 -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
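The getReadDelay entries above show vdsm timing a 4096-byte direct read of the domain metadata with dd and logging dd's summary line. A small sketch, assuming that summary format, for pulling the elapsed time out of such output (the function name is made up, and this is not vdsm's own parser):

```python
import re


def read_delay_seconds(dd_output):
    """Return the elapsed seconds reported in dd's 'copied, N s,' summary."""
    match = re.search(r"copied, ([0-9.eE+-]+) s,", dd_output)
    if match is None:
        raise ValueError("no dd timing summary found")
    return float(match.group(1))


if __name__ == "__main__":
    # Summary text copied from the log entry above.
    logged = ("0+1 records in\n0+1 records out\n"
              "476 bytes (476 B) copied, 0.000548138 s, 868 kB/s\n")
    print(read_delay_seconds(logged))
```

In this log the read took about half a millisecond, so these particular entries do not point at slow storage.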
Re: [ovirt-users] starting VDSM gives "libvir: XML-RPC error : authentication failed: authentication failed" error
Once upon a time, ybronhei said: > On 07/21/2014 01:45 AM, Jorick Astrego wrote: > >Hi, > > > >Some more info, I think this is the problem and I also have it on the > >node image: > > > >Jul 20 22:41:49 localhost libvirtd: unable to open Berkeley db > >/etc/libvirt/passwd.db: No such file or directory > > > Hey > > Did you check if libvirtd services is up? > can you share libvirtd.conf file? > I'm not sure what is exactly the issue, but you can try "vdsm-tool > configure --module libvirt" command to see if you set the vdsm > configuration for libvirt as required. I see the same thing with a new 3.5-beta install. libvirtd is running; I ran the above vdsm-tool command, but that made no difference. The libvirtd.conf has the following config (set by vdsm install): ## beginning of configuration section by vdsm-4.13.0 keepalive_interval=-1 log_outputs="1:file:/var/log/libvirt/libvirtd.log" unix_sock_rw_perms="0770" auth_unix_rw="sasl" log_filters="3:virobject 3:virfile 2:virnetlink 3:cgroup 3:event 3:json 1:libvirt 1:util 1:qemu" cert_file="/etc/pki/vdsm/certs/vdsmcert.pem" unix_sock_group="qemu" listen_addr="0.0.0.0" ca_file="/etc/pki/vdsm/certs/cacert.pem" key_file="/etc/pki/vdsm/keys/vdsmkey.pem" host_uuid="74e1d154-d83f-4852-9c35-3c931f8b45cf" ## end of configuration section by vdsm-4.13.0 If I comment out the auth_unix_rw line and restart libvirtd, vdsmd will start successfully. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
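The workaround in the last sentence can be sketched as a text edit. This is only an illustration with a made-up helper name, `comment_out`; it disables SASL authentication on the local libvirt socket rather than fixing whatever left the password database missing, and libvirtd still has to be restarted afterwards.

```python
def comment_out(conf_text, key):
    """Return conf_text with any active 'key=...' line prefixed by '#'."""
    out = []
    for line in conf_text.splitlines():
        name = line.split("=", 1)[0].strip()
        if name == key and not line.lstrip().startswith("#"):
            out.append("#" + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Two lines taken from the vdsm-managed section quoted above.
    sample = 'keepalive_interval=-1\nauth_unix_rw="sasl"\n'
    print(comment_out(sample, "auth_unix_rw"))
```

vdsm may rewrite its managed section the next time "vdsm-tool configure" runs, so the change might not persist.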
Re: [ovirt-users] starting VDSM gives "libvir: XML-RPC error : authentication failed: authentication failed" error
Once upon a time, Chris Adams said: > Once upon a time, ybronhei said: > > On 07/21/2014 01:45 AM, Jorick Astrego wrote: > > >Hi, > > > > > >Some more info, I think this is the problem and I also have it on the > > >node image: > > > > > >Jul 20 22:41:49 localhost libvirtd: unable to open Berkeley db > > >/etc/libvirt/passwd.db: No such file or directory > > > > > Hey > > > > Did you check if libvirtd services is up? > > can you share libvirtd.conf file? > > I'm not sure what is exactly the issue, but you can try "vdsm-tool > > configure --module libvirt" command to see if you set the vdsm > > configuration for libvirt as required. > > I see the same thing with a new 3.5-beta install. libvirtd is running; > I ran the above vdsm-tool command, but that made no difference. The problem is that /usr/lib64/python2.6/site-packages/vdsm/constants.py (from vdsm-python-4.16.0-3.git601f786.el6.x86_64) sets EXT_SASLPASSWD2 to /sbin/saslpasswd2, but the binary is actually in /usr/sbin. I fixed that and manually set the password (and fixed the ovirt-ha* init scripts), but still can't deploy a 3.5-beta hosted engine. 
When configuring the management bridge, it leaves the interface down (both the bridge and the underlying interface); I brought them back up, but the setup then hits this: [ INFO ] Verifying sanlock lockspace initialization [ ERROR ] Failed to execute stage 'Misc configuration': [Errno 2] No such file or directory I see this in the setup log: 2014-07-22 11:27:39 DEBUG otopi.context context._executeMethod:152 method exception Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/otopi/context.py", line 142, in _executeMethod method['method']() File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/sanlock/lockspace.py", line 163, in _misc lockspace + '.metadata': md_size, File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 336, in create service_size=size) File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 268, in create_volume raise RuntimeError(response["status"]["message"]) RuntimeError: [Errno 2] No such file or directory 2014-07-22 11:27:39 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Misc configuration': [Errno 2] No such file or directory Not sure what is happening there. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
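On the saslpasswd2 path mismatch mentioned above: a quick way to check where a binary actually lives before editing constants.py. This is a sketch with a made-up helper name and an assumed list of directories, not how vdsm itself locates its external commands.

```python
import os


def find_binary(name, directories=("/usr/sbin", "/sbin", "/usr/bin", "/bin")):
    """Return the first executable path for name in directories, or None."""
    for directory in directories:
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


if __name__ == "__main__":
    # On the system described above this would be expected to print
    # /usr/sbin/saslpasswd2; elsewhere the result may differ.
    print(find_binary("saslpasswd2"))
```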
Re: [ovirt-users] Misc architecture questions
Once upon a time, Sandro Bonazzola said: > >> Unfortunately, it seems like Ovirt 3.4 does not support installing > >> self-hosted engine on bond + vlan. I tried 3.5 but there were to much > >> bug to be usable and the project is set to be deployed in 2 months. > > > > should be available via: > > http://gerrit.ovirt.org/#/c/29730/ > > ? > > Yes, it's available in 3.4.3[1]. However, looks like VDSM has some issue with > the network [2][3] As far as I can tell, "hosted-engine --deploy" still fails if you try to use a VLAN on top of a bond, in both 3.4 and 3.5 (I just tried again with a clean install of the latest 3.5 snapshot). The deploy script dies when it tries to configure the management bridge. vdsm.log ends with this (bond0 definitely already exists): Thread-16::ERROR::2014-07-23 08:40:59,436::API::1363::vds::(addNetwork) unknown nic: bond0 Traceback (most recent call last): File "/usr/share/vdsm/API.py", line 1361, in addNetwork supervdsm.getProxy().addNetwork(bridge, options) File "/usr/share/vdsm/supervdsm.py", line 50, in __call__ return callMethod() File "/usr/share/vdsm/supervdsm.py", line 48, in **kwargs) File "", line 2, in addNetwork File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod raise convert_to_error(kind, result) ConfigNetworkError: (23, 'unknown nic: bond0') If I manually create the ovirtmgmt bridge first, I can install a 3.5-snapshot hosted engine, although the install gets stuck at the end on "Still waiting for VDSM host to become operational". It results in an install that thinks the ovirtmgmt network is unsynchronized and can't be synchronized because the network is being used (of course, it is being used by the engine). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Misc architecture questions
Once upon a time, Chris Adams said: > If I manually create the ovirtmgmt bridge first, I can install a > 3.5-snapshot hosted engine, although the install gets stuck at the end > on "Still waiting for VDSM host to become operational". It results in > an install that thinks the ovirtmgmt network is unsynchronized and can't > be synchronized because the network is being used (of course, it is > being used by the engine). I found the source of the "unsynchronized" problem; the setup did not create the interface in the oVirt config as a VLAN. I changed the config to include the VLAN tag, and then it sees the configuration as synchronized. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Network question: mirrored port?
I have a couple of traffic-monitoring servers that get a copy of all traffic on a VLAN via a mirrored port on the switch, connected to a dedicated port on each server. Is there a good way to run that type of traffic into a VM? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Error migrating VM with direct LUN disk
I have a VM with a virtio-scsi disk that is a direct-mapped iSCSI LUN. I'm trying to migrate it from one node to another (in the process of updating my system from 3.5.3 to 3.5.5), and it fails migration with: Thread-1886273::ERROR::2016-01-06 10:31:17,657::migration::161::vm.Vm::(_recover) vmId=`606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43`::unsupported configuration: scsi-block 'lun' devices do not support the serial property Thread-1886273::ERROR::2016-01-06 10:31:17,693::migration::260::vm.Vm::(run) vmId=`606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43`::Failed to migrate File "/usr/share/vdsm/virt/migration.py", line 246, in run self._startUnderlyingMigration(time.time()) File "/usr/share/vdsm/virt/migration.py", line 335, in _startUnderlyingMigration File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1701, in migrateToURI2 if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self) Thread-1886279::DEBUG::2016-01-06 10:31:18,539::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.getMigrationStatus' in bridge with {u'vmID': u'606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43'} Is this a known bug (maybe fixed in a newer version), something unexpected, etc.? Is there a way around it (other than shutting down the VM)? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Error migrating VM with direct LUN disk
Once upon a time, Chris Adams said: > I have a VM with a virtio-scsi disk that is a direct-mapped iSCSI LUN. > I'm trying to migrate it from one node to another (in the process of > updating my system from 3.5.3 to 3.5.5), and it fails migration with: > > Thread-1886273::ERROR::2016-01-06 > 10:31:17,657::migration::161::vm.Vm::(_recover) > vmId=`606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43`::unsupported configuration: > scsi-block 'lun' devices do not support the serial property > Thread-1886273::ERROR::2016-01-06 10:31:17,693::migration::260::vm.Vm::(run) > vmId=`606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43`::Failed to migrate > File "/usr/share/vdsm/virt/migration.py", line 246, in run > self._startUnderlyingMigration(time.time()) > File "/usr/share/vdsm/virt/migration.py", line 335, in > _startUnderlyingMigration > File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1701, in > migrateToURI2 > if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', > dom=self) > Thread-1886279::DEBUG::2016-01-06 > 10:31:18,539::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) Calling > 'VM.getMigrationStatus' in bridge with {u'vmID': > u'606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43'} > > Is this a known bug (maybe fixed in a newer version), something > unexpected, etc.? Is there a way around it (other than shutting down > the VM)? So, this isn't just a bug with migration; I powered off the VM and then tried to start it on a host that had been updated to 3.5.5 and it would not start (same error). This is a pretty significant regression IMHO - I can't start this VM on any 3.5.5 host. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Error migrating VM with direct LUN disk
Once upon a time, Yedidyah Bar David said: > On Thu, Jan 7, 2016 at 10:16 AM, Chris Adams wrote: > > Once upon a time, Chris Adams said: > >> I have a VM with a virtio-scsi disk that is a direct-mapped iSCSI LUN. > >> I'm trying to migrate it from one node to another (in the process of > >> updating my system from 3.5.3 to 3.5.5), and it fails migration with: > >> > >> Thread-1886273::ERROR::2016-01-06 > >> 10:31:17,657::migration::161::vm.Vm::(_recover) > >> vmId=`606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43`::unsupported configuration: > >> scsi-block 'lun' devices do not support the serial property > >> Thread-1886273::ERROR::2016-01-06 > >> 10:31:17,693::migration::260::vm.Vm::(run) > >> vmId=`606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43`::Failed to migrate > >> File "/usr/share/vdsm/virt/migration.py", line 246, in run > >> self._startUnderlyingMigration(time.time()) > >> File "/usr/share/vdsm/virt/migration.py", line 335, in > >> _startUnderlyingMigration > >> File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1701, in > >> migrateToURI2 > >> if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', > >> dom=self) > >> Thread-1886279::DEBUG::2016-01-06 > >> 10:31:18,539::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) > >> Calling 'VM.getMigrationStatus' in bridge with {u'vmID': > >> u'606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43'} > >> > >> Is this a known bug (maybe fixed in a newer version), something > >> unexpected, etc.? Is there a way around it (other than shutting down > >> the VM)? > > > > So, this isn't just a bug with migration; I powered off the VM and then > > tried to start it on a host that had been updated to 3.5.5 and it would > > not start (same error). > > > > This is a pretty significant regression IMHO - I can't start this VM on > > any 3.5.5 host. > > Seems like a result of the fix for [1]. > > Please check/post vdsm/libvirt/qemu versions and logs. 
Old node (that is working) is CentOS 7.1, oVirt 3.5.3: vdsm-4.16.20-0.el7.centos.x86_64 libvirt-client-1.2.8-16.el7_1.3.x86_64 qemu-kvm-ev-2.1.2-23.el7_1.3.1.x86_64 New nodes (that won't work) are CentOS 7.2, oVirt 3.5.6 (I said 3.5.5 earlier but they are updated as of yesterday): vdsm-4.16.30-0.el7.centos.x86_64 libvirt-client-1.2.17-13.el7_2.2.x86_64 qemu-kvm-ev-2.3.0-29.1.el7.x86_64 Please let me know which logs - the above snip is from vdsm.log when trying to migrate the VM (got the same error when just trying to start it). All that is in the libvirt/qemu/.log when I tried to start the VM on an upgraded node is "shutting down" (no errors or other messages). > Do you have any vdsm hooks installed? No. I did wonder if I could work around this with a hook to strip out the in the XML (but haven't written a hook before so haven't tried that yet). > Was this system upgraded from a previous version? Please > state the upgrade history. It started with either 3.5 or 3.5.1 (can't remember for sure now), then upgraded to 3.5.2, 3.5.3, and 3.5.6. > Thanks. Thank you. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
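The hook idea mentioned above could look roughly like the sketch below. It assumes the element to strip is `<serial>`, which is what the libvirt error message points to, and that only disks with `device='lun'` should be touched. The sample XML and the function name `strip_lun_serial` are hypothetical, and the part where a real vdsm hook obtains and writes back the domain XML is deliberately not shown; that wiring would need to be checked against the installed vdsm version.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down domain XML for illustration only.
SAMPLE = """<domain type='kvm'>
  <devices>
    <disk type='block' device='lun'>
      <target dev='sda' bus='scsi'/>
      <serial>hypothetical-lun-serial</serial>
    </disk>
    <disk type='file' device='disk'>
      <target dev='vda' bus='virtio'/>
      <serial>hypothetical-image-serial</serial>
    </disk>
  </devices>
</domain>"""


def strip_lun_serial(domxml: str) -> str:
    """Return the domain XML with <serial> removed from device='lun' disks."""
    root = ET.fromstring(domxml)
    for disk in root.iter("disk"):
        if disk.get("device") != "lun":
            continue  # leave ordinary image-backed disks alone
        for serial in disk.findall("serial"):
            disk.remove(serial)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(strip_lun_serial(SAMPLE))
```

Limiting the change to `device='lun'` disks keeps the serial on regular disks, where the guest may rely on it for stable device naming.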
Re: [ovirt-users] HP ProLiant ML10 v2 or ML110 Gen9 with Windows Server Essentials 2012 R2 guest
Once upon a time, gregor said: > And will Windows Server 2012 run in oVirt, > because on my test machine (HP ProLiant ML350 G5) it didn't (maybe the > CPU is too old). I just went through trying to get Windows Server 2012 Essentials (both "original" and R2) running on a cluster with Nehalem CPUs, and it would blue-screen during install. I replicated the problem on my Fedora desktop with plain KVM set up to emulate a Nehalem CPU. When I switched to Westmere or newer, Windows worked. This appears to be some difference between Essentials and Standard edition (I am running Windows Server 2012 Standard VMs on my cluster just fine). A co-worker searching around on the Internet also found some VirtualBox users having similar issues with Essentials. So, with Westmere or newer CPU, I think Essentials should be okay, but don't try it with Nehalem. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Dumb question: exclamation mark next to VM?
I set up a new oVirt 3.6.2 cluster on CentOS 7.2 (everything up to date as of yesterday). I created a basic CentOS 7.2 VM with my local customizations, created a template from it, and then created a VM from that template. That new VM has an exclamation mark next to it in the web GUI (between the up arrow for "running" and the "server" icon). Usually I would expect that means something is wrong or needs attention, but I can't find anything to fix/address/etc. (no messages in the Alerts, nothing odd in the Events, etc.). What does the exclamation mark mean, and how do I clear it? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Expand hosted-engine disk?
I'm running oVirt 3.6.2 on CentOS 7.2, all up to date, with a hosted engine. The storage for the engine is on a dedicated iSCSI LUN. When I created the LUN, I made it 40G so I'd have a little more disk space for the engine (logs, ISOs, etc.), but then forgot to make the VM image larger than the default 25G. Is there an easy way to extend the image now? When I try to do that in the web GUI, I got the "Cannot edit Virtual Machine Disk. This VM is not managed by the engine." error. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Dumb question: exclamation mark next to VM?
Once upon a time, Joe DiTommasso said: > If you mouse over the exclamation mark, you should get a tooltip that tells > you what it's complaining about. I've got it on pretty much all my VMs, > it's an issue with the timezone for me. I get nothing for the exclamation mark. I go straight from the "Up" tip to the "Server" tip. The ! is in the first column with the status icon (if you widen the columns it stays next to the up arrow). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Dumb question: exclamation mark next to VM?
Once upon a time, Darrell Budic said: > After upgrading to 3.6.2, I’ve got a couple that are doing this too (No actual > tooltip for the exclamation point). One windows, two linux, funny thing is > they are all down at the moment and still have this warning… Weird, I think mine only have it when they are up. The VMs with the exclamation point are all CentOS 7.2+EPEL up-to-date (so ovirt-guest-agent-common-1.0.11-1.el7 from EPEL - is that current?). I installed a Windows Server 2012 Essentials today, using the ISO from ovirt-guest-tools-iso-3.6.0-0.2_master.fc22, and the Windows VM does not have an exclamation mark. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Stuck tasks in web UI
I'm running oVirt 3.6.2 on CentOS 7.2, all up to date, with a hosted engine. I tried to clone a VM and that failed (not sure why yet), but the first problem is that there are a couple of stuck tasks in the web UI from my attempts. The Events tab shows the tasks failed, and I ran "vdsClient -s 0 getAllTasksStatuses" on the SPM node, and it shows no tasks. How do I clear the tasks from the web UI? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Clone VM fails - could not create volume
On oVirt 3.6.2, I tried to clone a VM, but got an error that the volume couldn't be created. Checking the logs, I see (in vdsm.log on the SPM): jsonrpc.Executor/2::ERROR::2016-02-11 09:50:24,459::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Image is not a legal chain: (u'fa4d1802-6223-4800-8339-c194076cfb4b',)", 'code': 262}} The VM I am trying to clone is thin-provisioned from a template; is it "legal" to clone such a VM, or is this a bug? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
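When digging through vdsm.log for errors like the one above, it can help to split each line into fields and pull the status code out of the message. The sketch below infers the `::`-separated layout purely from the single log line quoted in this message; it is not a documented format, and the names `VdsmLogLine` and `parse_vdsm_line` are made up for illustration.

```python
import ast
from typing import NamedTuple


class VdsmLogLine(NamedTuple):
    thread: str
    level: str
    timestamp: str
    module: str
    lineno: int
    logger: str
    func: str
    message: str


def parse_vdsm_line(line: str) -> VdsmLogLine:
    """Split one vdsm.log line into fields (layout assumed from the sample)."""
    thread, level, timestamp, module, lineno, logger, rest = line.split("::", 6)
    func, _, message = rest.partition(" ")
    return VdsmLogLine(thread, level, timestamp, module, int(lineno),
                       logger, func.strip("()"), message)


# The error line from the message above, rejoined onto one line.
LINE = ("jsonrpc.Executor/2::ERROR::2016-02-11 09:50:24,459::dispatcher::76::"
        "Storage.Dispatcher::(wrapper) {'status': {'message': "
        "\"Image is not a legal chain: (u'fa4d1802-6223-4800-8339-c194076cfb4b',)\", "
        "'code': 262}}")

entry = parse_vdsm_line(LINE)
# The message here happens to be a Python dict literal, so it can be
# evaluated safely with ast.literal_eval; other messages may be free text.
status = ast.literal_eval(entry.message)["status"]

if __name__ == "__main__":
    print(entry.level, entry.logger, status["code"], status["message"])
```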
Re: [ovirt-users] Clone VM fails - could not create volume
Once upon a time, Chris Adams said: > On oVirt 3.6.2, I tried to clone a VM, but got an error that the volume > couldn't be created. Checking the logs, I see (in vdsm.log on the SPM): > > jsonrpc.Executor/2::ERROR::2016-02-11 > 09:50:24,459::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': > {'message': "Image is not a legal chain: > (u'fa4d1802-6223-4800-8339-c194076cfb4b',)", 'code': 262}} > > The VM I am trying to clone is thin-provisioned from a template; is it > "legal" to clone such a VM, or is this a bug? Hmm, weird; I also tried just copying the disk, and I could not. Fiddling around, I forced SPM over to another node, and copy worked, so I tried cloning again, and that worked too. Guess something needed a "reset". -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] [ANN] oVirt 3.6.5 Final Release is now available
Once upon a time, Sven Kieske said: > On 26.04.2016 16:22, Gianluca Cecchi wrote: > > as the > > reported mirror site missed that too (3.6.4 released on late March) and is > > not aligned since more than one month now... > maybe it's time to setup some automatic mirror health checking service? > > how do other repositories like centos or fedora handle such issues? Fedora uses mirrormanager: https://fedoraproject.org/wiki/Infrastructure/MirrorManager but somebody has to manage the server side of that (mirror admin web access and such). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Restart VMs after failure
One of my oVirt clusters, running 3.6.5, lost power last night (power failure plus bad UPS batteries - batteries on order!). When power came back, the storage and nodes came back, and then the hosted engine started, but nothing else happened (no other VMs started). I expected that VMs that were running when the power failed would have been restarted once the engine came back up. Is there a way to make that happen? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Restart VMs after failure
Once upon a time, Nir Soffer said: > On Fri, May 6, 2016 at 4:01 PM, Chris Adams wrote: > > One of my oVirt clusters, running 3.6.5, lost power last night (power > > failure plus bad UPS batteries - batteries on order!). When power came > > back, the storage and nodes came back, and then the hosted engine > > started, but nothing else happened (no other VMs started). > > > > I expected that VMs that were running when the power failed would have > > been restarted once the engine came back up. Is there a way to make > > that happen? > > Yes, I think you need to define them as HA vm. Adding Michal to add more > info about this. The domains are all marked as HA. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Multipath iSCSI with several IPs
Once upon a time, Dan Yasny said:
> Normally you
> 1. enter the IP
> 2. click discover
> 3. login to whatever was found
> 4. enter another IP instead of the first
> 5. goto 2

How do you give the oVirt server two IPs (in the same subnet) though?
-- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Multipath iSCSI with several IPs
Once upon a time, James Michels said: > Correct me if I'm wrong but I think Dan meant target's IPs. So if you have > a SAN backend with two IP addresses, you first discover LUNs from first IP > address, then discover LUNs from the second IP address, and so on... once > you have them all, you just check them and click on "OK" so the same target > is added with several IP addresses. You don't need to have one IP address > per oVirt server. Well, to do iSCSI multipath right, you should also have multiple interfaces on each client server, each with its own IP. I'm not sure how you do that with oVirt. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Multipath iSCSI with several IPs
Once upon a time, Yaniv Kaul said: > BTW, having two IPs on a single subnet is not a great idea - it usually > means you have a SPOF somewhere (the switch perhaps?). Two NICs on the server, two NICs on the iSCSI target, each with an IP per NIC, and connected to two switches in between (either stacked or trunked). No SPOF. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Multipath iSCSI with several IPs
Once upon a time, James Michels said: > I guess you mean the 'iSCSI multipath' sub-tab under the 'Datacenters' tab. > There you can assign one or more networks to a iSCSI backend. In my opinion > you cannot have more than one interface within the same network segment to > do multipath, as you would have connectivity issues (not sure if ovirt > restricts creating two overlapping networks) The way I did it on a test 3.6 cluster was to create two networks in oVirt, "storage1" and "storage2". I assigned both networks to the hosts, connected to different NICs, and gave each an IP (in the iSCSI subnet). Then I could set up the iSCSI multipath in the oVirt data center. This seems weird/wrong, and I'm not sure oVirt actually configured both NICs in multipath. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] LVM2 Thinprovisioned
Once upon a time, Fernando Frediani said: > Thanks for the answer anyway. Hopefully at least LVM2 > Thinprovisioning comes up anytime soon. This has nothing to do with oVirt; it is something the core Linux LVM code does not support. Last time I looked, nobody was working on it upstream. You can still thin-provision VMs in oVirt, there's just not a way to release space if a VM image shrinks significantly. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] LVM2 Thinprovisioned
Once upon a time, Fernando Frediani said: > I use LVM2 and Thinprovisioned LVs to put Filesystems and it works > with no issues. It's just a question of handling it correctly to > tell it how to create each storage chunk that way. The same way > those LVs can be used to run VMs as they are in traditional LVM. > > Not sure what you mean by core Linux not supporting it. To do that with multiple access, you have to be running in clustered LVM mode, and thin provisioning is not supported with CLVM. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Migrating self hosted engine from iSCSI to NFS ?
Once upon a time, Simone Tiraboschi said: > unfortunately moving an existing hosted-engine env from one storage kind to > another (without manually touching the engine DB) is currently not > supported. Please see: > http://lists.ovirt.org/pipermail/users/2016-July/041526.html I'm digging through this now, as I need to move my oVirt 3.5 setup from one storage array to another (both iSCSI), including the hosted engine. Reading this: https://bugzilla.redhat.com/show_bug.cgi?id=1240466#c21 it sounds like that's not currently possible (at least with 3.5). Is that correct? I was planning to follow this process: https://www.ovirt.org/documentation/admin-guide/hosted-engine-backup-and-restore/ which says "point to the new shared storage" - will that not work? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Migrating self hosted engine from iSCSI to NFS ?
Once upon a time, Simone Tiraboschi said: > The issue is that the engine DB backup you are going to restore already > contains a reference to the previous hosted-engine storage domain and to > the previous hosted-engine VM and so on and so the auto-import procedure to > have the engine VM looking up for its own infrastructure will not trigger. > You have to manually remove them from the DB you restored. Okay, that makes sense. I see this from you: https://gerrit.ovirt.org/#/c/64966/ Should that work okay with a 3.5 database? I'm familiar with SQL, so if it needs some tweaks, I can handle that (just looking really to see if that's the right general idea). If so, could I connect the new iSCSI storage to a host, shut down the engine, "dd" the engine over, start up the new location in single-user mode, and make the DB change? Basically, just wondering if I could skip the full install and jump right to an installed system. Thanks for your help. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Migrating self hosted engine from iSCSI to NFS ?
Once upon a time, Simone Tiraboschi said: > On Thu, Sep 29, 2016 at 6:35 PM, Chris Adams wrote: > > If so, could I connect the new iSCSI storage to a host, shutdown the > > engine, "dd" the engine over, start up the new location in single-user > > mode, and make the DB change? > > > > Basically, just wondering if I could skip the full install and jump > > right to an installed system. > > With 3.5 you probably can do just that. > Then you have to edit /etc/ovirt-hosted-engine/hosted-engine.conf on all of > your hosts to point to the new storage device. Okay, thanks. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Import from OVA with encrypted root
I'm trying to import an appliance image from a vendor. It is based on Debian. For some added level of "security" I guess, the vendor disk image has the root filesystem encrypted (and then the key is in the initrd - I know that's no real added security, but... whatever). Trying to import this VM into oVirt fails because it can't find/mount the root filesystem. Is there any way around this? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Import from OVA with encrypted root
Once upon a time, Tomáš Golembiovský said: > unfortunately virt-v2v cannot import VMs with encrypted root file > system. Moreover import of Debian/Ubuntu/Mint guests is not yet > supported by oVirt either. For that you would need development version > of virt-v2v. There are no packages for RHEL/CentOS yet. There should be > packages in Fedora rawhide if you feel brave enough to setup such host > in oVirt (Note: I'm not suggesting you or anyone should do that). So, I went the manual route. I made a new VM of appropriate size, with a non-thin-provisioned IDE disk, and booted it from a rescue CD. I extracted the vmdk from the ova file, used qemu-img to convert it to raw, and used netcat to dump it over the network into the VM and onto the disk. That of course doesn't do any of the things that should be done to "convert" a VM, but (at least in this case), it appears to have worked "good enough" (the VM boots and gets on the network). Still amused that somebody thinks distributing an image with encrypted filesystems, and the key for that encryption in the initrd, does anything to "secure" their image. Sigh... -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
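The first two steps of the manual route described above (pull the vmdk out of the OVA, then convert it to raw) can be sketched as follows. This relies on an OVA being a plain tar archive; the function names and file names are hypothetical, the `qemu-img` command is only built as an argument list rather than run, and the netcat transfer into the rescue-booted VM is not shown.

```python
import shutil
import tarfile
from pathlib import Path
from typing import List


def extract_vmdks(ova_path: str, dest_dir: str) -> List[Path]:
    """Copy every .vmdk member of an OVA (a tar archive) into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with tarfile.open(ova_path) as ova:
        for member in ova.getmembers():
            # Use only the base name so a member path cannot escape dest_dir.
            name = Path(member.name).name
            if member.isfile() and name.lower().endswith(".vmdk"):
                target = dest / name
                with ova.extractfile(member) as src, open(target, "wb") as out:
                    shutil.copyfileobj(src, out)
                extracted.append(target)
    return extracted


def qemu_img_convert_cmd(vmdk: Path, raw: Path) -> List[str]:
    """Argument list for converting a vmdk image to raw with qemu-img."""
    return ["qemu-img", "convert", "-f", "vmdk", "-O", "raw", str(vmdk), str(raw)]


if __name__ == "__main__":
    # Hypothetical file names; pass the command to subprocess.run() to execute.
    for disk in extract_vmdks("appliance.ova", "extracted"):
        print(" ".join(qemu_img_convert_cmd(disk, disk.with_suffix(".raw"))))
```

As the message says, this only moves the bytes; none of the guest adjustments a real conversion tool performs are attempted.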
Re: [ovirt-users] [HEADS UP] CentOS 7.3 is rolling out, need qemu-kvm-ev 2.6
Once upon a time, Sandro Bonazzola said: > In terms of ovirt repositories, qemu-kvm-ev 2.6 is available right now in > ovirt-master-snapshot-static, ovirt-4.0-snapshot-static, and ovirt-4.0-pre > (contains 4.0.6 RC4 rpms going to be announced in a few minutes.) Will qemu-kvm-ev 2.6 be added to any of the oVirt repos for prior versions (such as 3.5 or 3.6)? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.phx.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Optimizations for VoIP VM
Once upon a time, Jim Kusznir said: > Are there known issues with VoIP on Ovirt-managed clusters? (I know well > reputed companies that sell VoIP server virtual hosting and guarantee the > performance, so I know VoIP Virtualization is possible, just need to know > if its recommended with Ovirt, and if so what do I need to do to give it > the best chance of success?) I am running Asteria (an Asterisk-based PBX system targeted at small call-center type setups) in an oVirt VM with no problems. We typically have 30-50 calls at a time during the business day. I've also set up Digium's Switchvox in an oVirt VM without issue (small office setup, so not a lot of simultaneous calls). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Optimizations for VoIP VM
Once upon a time, Yaniv Dary said: > Can you please describe the application network requirements? > Does it rely on low latency? Pass-through or SR-IOV could help with > reducing that. For VoIP, latency can be an issue, but the amount of latency from adding VM networking overhead isn't a big deal (because other network latency will have a larger impact). 10ms isn't really a problem for VoIP for example. The bigger network concern for VoIP is jitter; for that, the only solution is to not over-provision hardware CPUs or total network bandwidth. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
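To make the latency-versus-jitter distinction concrete, here is a small sketch using the interarrival jitter estimator from RFC 3550 (the RTP specification). That estimator is brought in for illustration and is not something mentioned in this thread; the packet timings below are made-up numbers, not measurements.

```python
from typing import Iterable, Tuple


def interarrival_jitter(packets: Iterable[Tuple[float, float]]) -> float:
    """RFC 3550 style jitter estimate.

    packets: (send_time, receive_time) pairs in seconds, in arrival order.
    For each packet the change in transit time D is computed and the
    estimate is smoothed as J += (|D| - J) / 16.
    """
    jitter = 0.0
    prev_transit = None
    for sent, received in packets:
        transit = received - sent
        if prev_transit is not None:
            jitter += (abs(transit - prev_transit) - jitter) / 16.0
        prev_transit = transit
    return jitter


# One packet every 20 ms with a constant 10 ms of latency: jitter stays near zero.
steady = [(i * 0.020, i * 0.020 + 0.010) for i in range(50)]

# Same packet rate, but transit time alternates between 1 ms and 11 ms:
# lower average latency than the steady case, yet far more jitter.
bursty = [(i * 0.020, i * 0.020 + (0.001 if i % 2 else 0.011)) for i in range(50)]

if __name__ == "__main__":
    print("steady:", interarrival_jitter(steady))
    print("bursty:", interarrival_jitter(bursty))
```

The steady stream has the higher average latency of the two but is the one a VoIP endpoint would handle comfortably, which matches the point made above.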
[ovirt-users] Attaching ISO to hosted engine for OS upgrade
I'm working on upgrading an oVirt 3.5 setup. The physical hosts are running CentOS 7, but the hosted engine is CentOS 6. The upgrade notes are "back up the engine, upgrade/reinstall the OS, then restore", but I can't see how to actually install CentOS 7 on the engine. Am I supposed to re-run "hosted-engine --deploy"? Wouldn't that try to re-register the physical hosts, or can I interrupt it to restore the backup? Or, is there a way to just attach an install ISO to the engine VM and boot from that? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Attaching ISO to hosted engine for OS upgrade
Once upon a time, Simone Tiraboschi said: > Then we have a specific helper utility for 3.6/el6 -> 4.0/el7: > https://www.ovirt.org/develop/release-management/features/hosted-engine-migration-to-4-0/ Ahh, that looks better. I was looking at this: https://www.ovirt.org/documentation/migration-engine-36-to-40/ which just kind of glosses over how to upgrade the OS. :) I do usually use my custom CentOS install (rather than the appliance); is there a way to do that? Also, is it normally recommended to upgrade one major release at a time? In other words, aside from the engine CentOS6->7 step, would upgrading from 3.5 to 4.1 need to go through 3.6 and 4.0 along the way? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Attaching ISO to hosted engine for OS upgrade
Once upon a time, Simone Tiraboschi said: > On Wed, Feb 22, 2017 at 8:04 PM, Chris Adams wrote: > > Also, is it normally recommended to upgrade one major release at a time? > > For the engine it's not just recommended, it's mandatory! Ahh, I didn't realize that. I don't think I saw that in the documentation (but maybe I just missed it?). Thanks. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
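Since the engine has to pass through every release in between, the sequence of upgrades can be written down mechanically. A minimal sketch, assuming the release list is just the versions named in this thread (3.5, 3.6, 4.0, 4.1); the function name is hypothetical.

```python
from typing import List

# Only the releases mentioned in this thread, in order.
RELEASES = ["3.5", "3.6", "4.0", "4.1"]


def engine_upgrade_path(current: str, target: str) -> List[str]:
    """List each release the engine must be upgraded to, in order."""
    start, end = RELEASES.index(current), RELEASES.index(target)
    if end < start:
        raise ValueError("target release is older than the current one")
    return RELEASES[start + 1:end + 1]


if __name__ == "__main__":
    # Going from 3.5 to 4.1 means three separate engine upgrades.
    print(engine_upgrade_path("3.5", "4.1"))
```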
[ovirt-users] Recognize HE iSCSI volume size change
I'm testing upgrading an oVirt 3.5 setup, and I have run into a problem when going from 3.5 to 3.6 on a physical machine configured for the hosted engine. I upgraded the engine itself okay, but when I upgraded the first physical machine, it cannot be re-activated; it gets an error connecting to the storage domain. Checking the logs, it looks like it is looping trying to create a new LV in the HE VG. I assume this is for moving the HE config to the shared storage? It is failing because it is trying to create a 1G LV, but the VG only has 512M free space. I extended the iSCSI volume, but there doesn't appear to be any way to get the HE nodes to recognize this; they both still see the original size, no matter what I try. Is there a way to get them to see the larger PV, so the new LV(s) can be created? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Recognize HE iSCSI volume size change
I did see that page, but... I can't get there from here. I can't get upgraded from 3.5 until I get past this problem, and on 3.5, the hosted engine storage domain isn't included in the normal UI at all.

I think I did get this working though; both servers had kernel messages that they saw the LUN resize, but they didn't actually change the block device size to reflect that. After rebooting each server (separately), lsblk showed the new size on both, and a manual pvresize on both shows the increased VG size.

On to testing 3.6 upgrade again!

Once upon a time, Adam Litke said:
> Hi Chris. We added this feature to newer versions of oVirt (see the
> feature page[1]). The easiest way to work around this problem might be to
> add an additional LUN to this domain if you are able to do it. If not, it
> looks like you would need to manually reconnect the host to the domain, to
> a pvresize to the new size. I am not sure if any engine DB updates will
> also be required. Nir and Fred worked on this feature and might be able to
> assist you further.
>
> [1] https://www.ovirt.org/develop/release-management/features/storage/lun-resize/
>
> On Fri, Feb 24, 2017 at 9:00 AM, Chris Adams wrote:
> > I'm testing upgrading an oVirt 3.5 setup, and I have run into a problem
> > when going from 3.5 to 3.6 on a physical machine configured for the
> > hosted engine. I upgraded the engine itself okay, but when I upgraded
> > the first physical machine, it cannot be re-activated; it gets an error
> > connecting to the storage domain.
> >
> > Checking the logs, it looks like it is looping trying to create a new LV
> > in the HE VG. I assume this is for moving the HE config to the shared
> > storage? It is failing because it is trying to create a 1G LV, but the
> > VG only has 512M free space.
> >
> > I extended the iSCSI volume, but there doesn't appear to be anyway to
> > get the HE nodes to recognize this; they both still see the original
> > size, no matter what I try. Is there a way to get them to see the
> > larger PV, so the new LV(s) can be created?
> >
> > --
> > Chris Adams
>
> --
> Adam Litke

-- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Recognize HE iSCSI volume size change
Once upon a time, Nir Soffer said: > This is not enough, you need to resize the multipath mapping on all hosts, > resize the pv using the LUN (must be done by the SPM), and invalidate > vdsm lvm cache on all hosts, so they go to storage and see the new size of > the pv. I did all of this except invalidating the vdsm lvm cache - how would I do that? I will say just doing up to the pvresize worked in my test environment (but I might have just been lucky). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Recognize HE iSCSI volume size change
Once upon a time, Nir Soffer said:
> I think the complete flow on 3.5 can work like this:
>
> 1. stop ovirt-engine, so it will not try to restart vdsm on any host
> 2. stop vdsm on all hosts
> 3. rescan scsi bus, resizing luns on all hosts
> 4. pvresize the pv from one of the hosts
> 5. start vdsm on all hosts
> 6. start ovirt-engine
>
> This will allow resize while the storage is online and vms are running.

Thanks!
-- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
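The ordering in the flow quoted above matters: the engine stops first and starts last, and the pvresize runs on exactly one host. A minimal sketch that expands the six steps into a per-machine checklist; the host names are hypothetical and the actions are descriptions only, since the exact commands depend on the release and are not given in the thread.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (machine, action)


def resize_plan(engine: str, hosts: List[str]) -> List[Step]:
    """Expand the quoted six-step flow into an ordered per-machine checklist."""
    if not hosts:
        raise ValueError("at least one host is required")
    plan: List[Step] = [(engine, "stop ovirt-engine")]
    plan += [(h, "stop vdsm") for h in hosts]
    plan += [(h, "rescan SCSI bus and resize LUNs") for h in hosts]
    plan.append((hosts[0], "pvresize the PV"))  # one host only
    plan += [(h, "start vdsm") for h in hosts]
    plan.append((engine, "start ovirt-engine"))
    return plan


if __name__ == "__main__":
    # Hypothetical names for a two-host test setup.
    for number, (machine, action) in enumerate(resize_plan("engine", ["node1", "node2"]), 1):
        print(f"{number}. {machine}: {action}")
```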
[ovirt-users] 3.5->3.6 did not import hosted engine storage domain
So, on to my next upgrade issue (sorry for all the questions and thanks for everybody's help)... I upgraded my test cluster from 3.5 to 3.6 (latest version of each, all on CentOS 7 except the engine on CentOS 6). Now I'm working on the next step, upgrading to 4.0 and migrating the HE to the appliance. When I went from 3.5 to 3.6, I ended up with an fhanswers.conf in the shared storage that only contained "None"; I fixed that based on some mailing list messages (but just mentioning it in case it could be related). My problem is that the hosted engine storage domain did not get imported into the engine DB, so I can't proceed with "hosted-engine --upgrade-appliance". I didn't see any errors, so I'm not sure how that happened. I'm also not sure how to fix that. Suggestions? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] 3.5->3.6 did not import hosted engine storage domain
Once upon a time, Simone Tiraboschi said: > Can you please attach your engine.log ? Sorry, I was rolling back to 3.5 snapshots to test my 3.6 procedure (trying to make sure I didn't just screw up), made a mistake, and started over. Now however, I can't do anything, because jpackage.org has really screwed up their DNS - registered to 3 nameservers, two of which only exist as glue records (not in authoritative DNS), and all three point to the same IP (which is not responding). -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Cluster compatibility version and major upgrades
Hello again, still working on my upgrade from 3.5... I'm trying to understand the cluster compatibility version setting and how that applies to major upgrades. Do I have to always raise the compatibility version when I do a major upgrade? In other words, when I upgrade from 3.5 to 3.6, do I need to raise it to 3.6 before I upgrade to 4.0 (and then again raise it to 4.0 before upgrading to 4.1)? It looks like the 3.6->4.0 EL6->EL7 migration requires the cluster compatibility level to be at 3.6 (if I'm reading things right). It appears that when I upgrade to 3.6, I will have to stop all running VMs to raise the compatibility version (and I found an open bug about whether that's possible with the hosted engine). It sounds like with 4.0, the VMs can be flagged for compatibility and I can reboot them individually. I have over 80 VMs, many behind a load balancer (for HA and load sharing), but taking them all down will obviously still interrupt service for a while. Is there a safe way around that? I saw someone mention they partitioned their servers and made a new cluster (with the new version), and migrated VMs from cluster to cluster. Can I do live migrations in that case? How do I get the hosted engine from one cluster to another (especially with starting at 3.5)? -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] 3.5->3.6 did not import hosted engine storage domain
Once upon a time, Simone Tiraboschi said: > > I recreated a 3.5 setup and upgraded the engine to 3.6 - that should > > have been enough to import the hosted engine storage domain, right? > > Did you also raise the cluster compatibility level to 3.6 on the engine? No (I didn't realize this didn't happen until that was changed). However, now I'm back into the catch-22 of 3.6.7+hosted engine: the cluster compatibility level can't be raised while there's a running VM, and that includes the hosted engine. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] 3.5->3.6 did not import hosted engine storage domain
Once upon a time, Simone Tiraboschi said: > On Wed, Mar 1, 2017 at 3:19 PM, Chris Adams wrote: > > However, now I'm back into the catch-22 of 3.6.7+hosted engine: the > > cluster compatibility level can't be raised while there's a running VM, > > and that includes the hosted engine. > > Please see this one: > https://bugzilla.redhat.com/show_bug.cgi?id=1364557 > > Simply define 'InClusterUpgrade' scheduling policy on the HE VM cluster I first tried setting the policy, but got "Error while executing action: The set cluster compatibility version does not allow mixed major host OS versions. Can not start the cluster upgrade."; I guess this is because my hosts are CentOS 7 and the engine is CentOS 6? I tried changing the engine config to skip that check from comment 10 step 3, but got: - Can not start cluster upgrade mode, see below for details: - VM HostedEngine with id 4a035efd-a041-4e46-84db-01cf79400913 is configured to be not migratable. I did the SQL update from comment 1, and then I could set the policy. However, I still can't change the cluster compatibility version. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] 3.5->3.6 did not import hosted engine storage domain
Once upon a time, Simone Tiraboschi said: > On Wed, Mar 1, 2017 at 5:04 PM, Chris Adams wrote: > > I first tried setting the policy, but got "Error while executing action: > > The set cluster compatibility version does not allow mixed major host OS > > versions. Can not start the cluster upgrade."; I guess this is because > > my hosts are CentOS 7 and the engine is CentOS 6? > > This is not an issue, are you sure that all the hosts are el7 based? Yes, there are only two hosts (dev/test setup), both freshly installed with CentOS 7.3 plus all current updates. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] 3.5->3.6 did not import hosted engine storage domain
Once upon a time, Chris Adams said: > However, now I'm back into the catch-22 of 3.6.7+hosted engine: the > cluster compatibility level can't be raised while there's a running VM, > and that includes the hosted engine. I'm still stuck on this - anybody have any solution? Because of this, I can't upgrade my cluster. -- Chris Adams ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users