Re: [ovirt-users] Storage latency message

2017-04-18 Thread Chris Adams
Once upon a time, Nir Soffer  said:
> oVirt is reading 4k from the metadata special volume every 10 seconds. If
> the read takes more than 5 seconds, you will see this warning in engine
> event log.
> 
> Maybe your storage or the host was overloaded at that time (e.g. vm backup)?

I don't see any evidence that the storage was having any problem.  The
times the message gets logged are not at any high-load times either
(either scheduled backups or just high demand).

I wrote a perl script to replicate the check, and I ran it on a node in
maintenance mode (so no other traffic on the node).  My script opens a
block device with O_DIRECT, reads the first 4K, and closes it, reporting
the time.  I do see some latency jumps with that check, but not on the
raw block device, just the LV.

By that I mean I'm running it on two devices: the multipath device that
is the PV and the metadata LV.  The multipath device latency is pretty
stable, running around 0.3 to 0.5ms.  The LV latency is higher (just a
little normally) but has a higher variability and spikes to 50-125ms (at
the same time that reading the multipath device took under 0.5ms).

Seems like this might be a problem somewhere in the Linux logical volume
layer, not the block or network layer (or with the network/storage
itself).
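
For the archives, here is a rough Python sketch of the same kind of check (my
original script was Perl; this is an illustration, not that script).  It
assumes Linux (for O_DIRECT) and Python 3; the function name and the
`direct` switch are made up for this example, and the device paths are
whatever you pass on the command line.

```python
import mmap
import os
import sys
import time


def time_first_block(path, size=4096, direct=True):
    """Return the seconds taken to open `path`, read `size` bytes, and close it."""
    flags = os.O_RDONLY
    if direct:
        # O_DIRECT bypasses the page cache; it is Linux-specific and
        # requires an aligned buffer and an aligned read size.
        flags |= os.O_DIRECT
    # An anonymous mmap is page-aligned, which satisfies the O_DIRECT
    # alignment requirement for a 4K read.
    buf = mmap.mmap(-1, size)
    try:
        start = time.perf_counter()
        fd = os.open(path, flags)
        try:
            nread = os.readv(fd, [buf])
        finally:
            os.close(fd)
        elapsed = time.perf_counter() - start
    finally:
        buf.close()
    if nread != size:
        raise IOError("short read: %d of %d bytes" % (nread, size))
    return elapsed


if __name__ == "__main__":
    # Pass the devices to compare as arguments, e.g. the multipath
    # device and the metadata LV.
    for device in sys.argv[1:]:
        print("%s: %.3f ms" % (device, time_first_block(device) * 1000.0))
```

Running it in a loop against both the PV and the LV at the same time is what
makes the difference between the two layers visible.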
-- 
Chris Adams 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] LACP Bonding issue

2017-04-20 Thread Chris Adams
Once upon a time, Bryan Sockel  said:
> It seems there is some disconnect between my network bridge, the bond, and my
> interfaces.  I would like to somehow get my bond to use all 4 interfaces.
> On reboot, it always seems to reset consistently to EM1.

Are you sure the switch side is all the same LACP group?  Sounds like
one port may accidentally be in a separate group, and that happens to be
em1.

You might try swapping wires between em1 and another port and reboot and
see which ports come up - if all but the port with the wire formerly in
em1 come up, it points to the switch config.

-- 
Chris Adams 


Re: [ovirt-users] LACP Bonding issue

2017-04-20 Thread Chris Adams
Once upon a time, Bryan Sockel  said:
> We checked the port groups, and servers are cabled correctly.
> 
> After server is rebooted, em1 is the only interface passing traffic.
> Other 3 nics sitting idle.  We can down each port on the switch and
> confirm it is down on the server.
> 
> 
> I am pretty sure it is related to the bridge that was created to pass
> vm-host-altn traffic when the appliance was first installed.

-- 
Chris Adams 


Re: [ovirt-users] LACP Bonding issue

2017-04-20 Thread Chris Adams
Sorry about the message with nothing new...

Once upon a time, Bryan Sockel  said:
> We checked the port groups, and servers are cabled correctly.
> 
> After server is rebooted, em1 is the only interface passing traffic.
> Other 3 nics sitting idle.  We can down each port on the switch and
> confirm it is down on the server.
> 
> I am pretty sure it is related to the bridge that was created to pass
> vm-host-altn traffic when the appliance was first installed.

Well, I don't have any problem with that setup on multiple oVirt
clusters (including a bunch of R610 servers), so I don't think that's
it.

I configure oVirt for "custom" bonding options; I use:

  mode=802.3ad lacp_rate=1 xmit_hash_policy=layer2+3

Is it possible to move the wires around temporarily, so different server
ports are connected to different switch ports?  It would be interesting
to see if the "solo" behavior stayed with the port or the wire.
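
One way to see which ports actually joined the same LACP aggregator is to
compare the "Aggregator ID" reported for each slave in the kernel's bonding
status file.  A small Python sketch, assuming the usual
/proc/net/bonding/<bond> layout; the sample text below is a trimmed,
made-up illustration, not output from a real host.

```python
import re
from collections import defaultdict


def slaves_by_aggregator(bonding_text):
    """Group slave interface names by the 802.3ad aggregator ID each one joined."""
    groups = defaultdict(list)
    current = None
    for line in bonding_text.splitlines():
        line = line.strip()
        match = re.match(r"Slave Interface:\s*(\S+)", line)
        if match:
            current = match.group(1)
            continue
        match = re.match(r"Aggregator ID:\s*(\d+)", line)
        # An Aggregator ID seen before any slave section describes the
        # bond's active aggregator, so it is skipped here.
        if match and current is not None:
            groups[int(match.group(1))].append(current)
            current = None
    return dict(groups)


# Illustrative sample only: em1 sits alone in aggregator 1.
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Active Aggregator Info:
    Aggregator ID: 1
Slave Interface: em1
Aggregator ID: 1
Slave Interface: em2
Aggregator ID: 2
Slave Interface: em3
Aggregator ID: 2
Slave Interface: em4
Aggregator ID: 2
"""

print(slaves_by_aggregator(SAMPLE))
```

On a real host you would feed it the contents of the bond's status file
(the bond name depends on your setup).  If one slave ends up in its own
aggregator, the switch is most likely not offering it the same LACP group
as the others.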

-- 
Chris Adams 


Re: [ovirt-users] Seamless SAN HA failovers with oVirt?

2017-06-06 Thread Chris Adams
Once upon a time, Sven Achtelik  said:
> I was failing over by rebooting one of the TrueNas nodes and this took some 
> time for the other node to take over. I was thinking about asking the TN guys 
> if there is a command or procedure to speed up the failover.

That's the way TrueNAS failover works; there is no "graceful" failover,
you just reboot the active node.

-- 
Chris Adams 


Re: [ovirt-users] Seamless SAN HA failovers with oVirt?

2017-06-06 Thread Chris Adams
Once upon a time, Juan Pablo  said:
> I think it's not related to something on the TrueNAS side. If you are using
> iSCSI multipath you should be using round-robin

TrueNAS HA is active/standby, so multipath has nothing to do with
rebooting/upgrading a TrueNAS.

-- 
Chris Adams 


Re: [ovirt-users] Seamless SAN HA failovers with oVirt?

2017-06-06 Thread Chris Adams
Once upon a time, Juan Pablo  said:
> I'm saying you can do it with multipath and not rely on TrueNAS/FreeNAS,
> with an active/active configuration on the virt side...instead of
> active/passive on the storage side.

But there's still only one active system (the active TrueNAS node)
connected to the hard drives, and the only way to upgrade is to reboot
it.  Multipath doesn't bypass that.

-- 
Chris Adams 


Re: [ovirt-users] Seamless SAN HA failovers with oVirt?

2017-06-06 Thread Chris Adams
Once upon a time, Juan Pablo  said:
> Chris, if you have active-active with multipath: you upgrade one system,
> reboot it, check it came active again, then upgrade the other.

Yes, but that's still not how a TrueNAS (and most other low- to
mid-range SANs) works, so it is not relevant.  The TrueNAS only has a
single active node talking to the hard drives at a time, because having
two nodes talking to the same storage at the same time is a hard problem
to solve (typically requires custom hardware with active cache coherency
and such).

You can (and should) use multipath between servers and a TrueNAS, and
that protects against NIC, cable, and switch failures, but does not help
with a controller failure/reboot/upgrade.  Multipath is also used to
provide better bandwidth sharing between links than ethernet LAGs.

-- 
Chris Adams 


[ovirt-users] Add disk image from node command line?

2017-07-12 Thread Chris Adams
I have a qcow2 disk image sitting on the local filesystem of one node.
Is there a way to copy this image to oVirt (into an iSCSI storage
domain) without copying it to my desktop and uploading through the web
UI?

-- 
Chris Adams 


Re: [ovirt-users] Software RAID on oVirt Node

2017-08-04 Thread Chris Adams
Once upon a time, Vinícius Ferrão  said:
> In typical oVirt deployment scenarios, which RAID technologies are recommended
> for oVirt Node installation? Should I use controller-based RAID, or can mdadm
> be used instead? Is this recommended?
> 
> I'm asking this because other vendors require hardware RAID, even those 100%
> based on CentOS, like XenServer. There's not even a way to install it with
> mdadm (software RAID).

I use Linux software RAID under oVirt just fine.  I'm not using oVirt
Node though (I just installed CentOS and then installed oVirt).  Note
that I have an iSCSI SAN for VM storage - things might be different if
you are planning to use the local disks for VMs (local storage or
Gluster).

-- 
Chris Adams 


[ovirt-users] Replacing engine SSL cert

2017-09-09 Thread Chris Adams
I'm writing a script to install a new SSL key/cert pair (from Let's
Encrypt) for the engine web UI on oVirt 4.1.  I'm looking at this, but
it's a little confusing.

https://www.ovirt.org/documentation/admin-guide/appe-oVirt_and_SSL/

It sounds like steps 1 and 3 are referring to the CA-supplied
intermediate cert(s), not the actual issued cert for the server.  Is that
right?

Does anything actually use the PKCS12 format file referred to in step 4?
I don't normally see that format from regular CAs; they usually provide
cert+intermediate(s) in PEM format.

With Apache 2.4, it is normal to just put the cert+intermediate(s) chain
in one file and configure Apache with SSLCertificateFile.  You aren't
supposed to put the CA-supplied cert in the SSLCACertificateFile like
oVirt appears to do; that's intended to be used for validating client
certs, not the intermediate(s) for the server cert.

It really just looks like the cert+intermediate(s) should go in
/etc/pki/ovirt-engine/certs/apache.cer, the corresponding key put in
/etc/pki/ovirt-engine/keys/apache.key.nopass, and then Apache needs to
be restarted.  Since oVirt doesn't use the engine web UI cert for
anything internally (right?), do any of the other steps on the above
page matter?

-- 
Chris Adams 


[ovirt-users] Different link speeds in LACP LAG?

2017-09-13 Thread Chris Adams
I have a small oVirt setup for one customer, with two servers each
connected to a two-switch stack with 1G links.  Now the customer would
like to upgrade the server links to 10G.  My question is this: can I add
a 10G NIC and do this with minimal "fuss" by just adding the 10G links
to the same LAG, then removing the 1G links?  I would have the host in
maintenance mode no matter what.

I haven't checked the switch to see if it'll support that yet, figured
I'd start on the oVirt side.

-- 
Chris Adams 


[ovirt-users] Question about cold start

2017-10-04 Thread Chris Adams
I have an oVirt cluster that was hard shutdown last night (fire is bad,
and firemen killed the generators for their safety).  When it came back
up, it did not start any VMs other than the hosted engine.

Is that expected?  I know this is not a normal use case, but is there a
way to set VMs to start on cluster boot?

-- 
Chris Adams 


Re: [ovirt-users] Question about cold start

2017-10-04 Thread Chris Adams
Once upon a time, Charles Kozler  said:
> I believe you would accomplish this by setting a VM to be highly available
> (like the engine). Then engine makes sure this VM is up on at least one
> node through lease agreements (IIRC). In either case, I think this is what
> you want

That keeps VMs up as long as the cluster is up, but does not bring them
back if the whole cluster goes down (unless there's some other setting
I'm missing).
-- 
Chris Adams 


Re: [ovirt-users] Question about cold start

2017-10-04 Thread Chris Adams
Once upon a time, Martin Sivak  said:
> Can you please describe your use-case there to make sure we do not
> forget and to make it obvious there is a need for this feature?

Thanks, added.

-- 
Chris Adams 


[ovirt-users] Multiple NICs on hosted engine?

2014-11-14 Thread Chris Adams
I have installed the first node of a new oVirt 3.5 setup with a hosted
engine VM.  I have multiple networks: one public-accessible and one
private (with storage, iDRAC/IPMI, etc.).  I set the engine VM up on the
public LAN, but now realize that it can't access the power control.  I
tried to add a second NIC to the engine VM through the web interface,
but of course that doesn't work (because it isn't really managed there).

How can I add a second NIC to the hosted engine VM?

-- 
Chris Adams 


Re: [ovirt-users] Multiple NICs on hosted engine?

2014-11-14 Thread Chris Adams
Once upon a time, Simone Tiraboschi  said:
> Sorry, I forgot you cannot add that nic on the engine VM from the engine UI.
> Please try what I explained plus Darrel's trick.

It worked.  I added the network in the UI, added it to the host (so it
got the bridge set up on that interface) in the UI, and then edited the
vm.conf file on the host.  Migrated back and forth and all appears well.

Thanks.
-- 
Chris Adams 


Re: [ovirt-users] Multiple NICs on hosted engine?

2014-11-17 Thread Chris Adams
Once upon a time, Darrell Budic  said:
> Glad it worked. Make sure you add it to the vm.conf file on all your ha 
> hosts, otherwise you’ll drop it if ha-agent restarts it as opposed to a 
> migration. Wasn’t clear if you’d done that or not.

Based on some other notes I found via Google, here's what I did (for the
archives):

- Created the network in the UI
- hosted-engine --set-maintenance --mode=global
- edited /etc/ovirt-hosted-engine/vm.conf; duplicated the existing
  network line, changing the MAC, UUID, and network name (changed on all
  hosted-engine nodes)
- hosted-engine --vm-shutdown
- hosted-engine --vm-start
- hosted-engine --set-maintenance --mode=none

That appears to be working correctly.

I did then figure out that I probably didn't need it, at least for what
I thought: power management.  I didn't realize that the engine doesn't
talk to the IPMI devices directly, that it instead proxies through a
node.
-- 
Chris Adams 


[ovirt-users] iptables management

2014-11-17 Thread Chris Adams
During setup, I allowed the script to change iptables rules.  Is this
necessary?  Also, is it an "active" management (where oVirt will make
changes), or just a one-time thing?

I ask because I have some other iptables setup I want (such as limited
SSH access), and I don't want to make changes to iptables that oVirt
will override later or anything like that.
-- 
Chris Adams 


Re: [ovirt-users] iptables management

2014-11-17 Thread Chris Adams
Once upon a time, Alon Bar-Lev  said:
> I guess you mean engine setup, right?

Yes, that and hosted-engine --deploy.

> Each time you run engine-setup you will be prompted whether you want to
> override iptables settings.
> If you choose to override, the current settings will be backed up and you can
> diff and re-apply your own.
> If you choose to keep your settings, setup will write the iptables rules into
> its own location and you can diff and apply the changes manually.

Okay, so that's the only time iptables are changed?  That makes sense,
and I can work with that.  Thanks.
-- 
Chris Adams 


[ovirt-users] Gluster access tied to one node?

2014-11-18 Thread Chris Adams
So this may be a dumb question, but here goes...

I set up a replicated Gluster volume for VM disk image storage.  I
specified the path as node9:gluster1 (where node9 is one of the two
nodes with a brick).  However, when I shut down node9, the VM using that
storage automatically gets paused.

I thought the replicated storage was supposed to work even if one node
was down.  Is there some other way to specify the path (did I do that
wrong)?

-- 
Chris Adams 


Re: [ovirt-users] Gluster access tied to one node?

2014-11-19 Thread Chris Adams
Once upon a time, Gabi C  said:
> tried localhost:gluster1?

No, I haven't - is that expected to work?

Also, as it turns out, I can't try it.  I can edit the storage domain
and change the path, but the change is ignored.  I cannot remove the
storage domain either; that's greyed out.

-- 
Chris Adams 


[ovirt-users] Hosted engine: sending ioctl 5401 to a partition!

2014-11-21 Thread Chris Adams
I have set up oVirt with hosted engine, on an iSCSI volume.  On both
nodes, the kernel logs the following about every 10 seconds:

Nov 21 15:27:49 node8 kernel: ovirt-ha-broker: sending ioctl 5401 to a partition!

Is this a known bug, something that I need to address, etc.?
-- 
Chris Adams 


Re: [ovirt-users] Hosted engine: sending ioctl 5401 to a partition!

2014-11-28 Thread Chris Adams
Once upon a time, Federico Simoncelli  said:
> > I have set up oVirt with hosted engine, on an iSCSI volume.  On both
> > nodes, the kernel logs the following about every 10 seconds:
> > 
> > Nov 21 15:27:49 node8 kernel: ovirt-ha-broker: sending ioctl 5401 to a
> > partition!
> > 
> > Is this a known bug, something that I need to address, etc.?
> 
> Is this on centos or fedora?

Oops, sorry to leave that out.  CentOS 7 and oVirt 3.5 (all up-to-date).

-- 
Chris Adams 


Re: [ovirt-users] how to see new iSCSI lun added

2014-12-01 Thread Chris Adams
Once upon a time, Gianluca Cecchi  said:
> Am I missing anything simple?

Yep...

> On the server offering iSCSI target:
> # tgtadm --lld iscsi --mode target --op show
> Target 1: iqn.2014-07.local.localdomain:store1
> LUN: 1
> SCSI ID: p_iscsi_store1_l
> LUN: 2
> SCSI ID: p_iscsi_store1_l

Both LUNs have the same ID name, which confuses discovery (I've done the
same thing before).  This is a 16 character string, so make sure they
are distinct/unique in that 16 characters.

It is annoying that scsi-target-utils lets you do this.
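
A quick way to catch this before it bites is to check the IDs you plan to
assign for collisions within the first 16 characters.  A minimal Python
sketch; the function name and the example "lun1"/"lun2" IDs are hypothetical
(I'm guessing at what the untruncated names might have looked like).

```python
from collections import defaultdict


def colliding_scsi_ids(scsi_ids, width=16):
    """Return {truncated_id: [full ids]} for IDs that collide in the first `width` chars."""
    seen = defaultdict(list)
    for scsi_id in scsi_ids:
        seen[scsi_id[:width]].append(scsi_id)
    # Only prefixes shared by more than one ID are a problem.
    return {prefix: ids for prefix, ids in seen.items() if len(ids) > 1}


# Hypothetical IDs that differ only after the 16th character.
print(colliding_scsi_ids(["p_iscsi_store1_lun1", "p_iscsi_store1_lun2"]))
# Shorter hypothetical IDs that stay distinct within 16 characters.
print(colliding_scsi_ids(["store1_lun1", "store1_lun2"]))
```

Putting the distinguishing part (the LUN number) early in the name avoids
the problem entirely.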
-- 
Chris Adams 


[ovirt-users] vdsm losing connection to libvirt

2014-12-16 Thread Chris Adams
I have an oVirt setup that has three nodes, all running CentOS 7, with a
hosted engine running CentOS 6.  Two of the nodes (node8 and node9) are
configured for hosted engine, and the third (node2) is just a "regular"
node (as you might guess from the names, more nodes are coming as I
migrate VMs to oVirt).

On one node, node8, vdsm periodically loses its connection to libvirt,
which causes vdsm to restart.  There doesn't appear to be any trigger
that I can see (not time of day, load, etc. related).  The engine VM is
up and running on node8 (don't know if that has anything to do with it).

I get some entries in /var/log/messages repeated continuously; the
"ovirt-ha-broker: sending ioctl 5401 to a partition" I mentioned before,
and the following:

Dec 15 20:56:23 node8 journal: User record for user '107' was not found: No such file or directory
Dec 15 20:56:23 node8 journal: Group record for user '107' was not found: No such file or directory

I don't think those have any relevance (don't know where they come
from); filtering those out, I see:

Dec 15 20:56:33 node8 journal: End of file while reading data: Input/output error
Dec 15 20:56:33 node8 journal: Tried to close invalid fd 0
Dec 15 20:56:38 node8 journal: vdsm root WARNING connection to libvirt broken. ecode: 1 edom: 7
Dec 15 20:56:38 node8 journal: vdsm root CRITICAL taking calling process down.
Dec 15 20:56:38 node8 journal: vdsm vds ERROR libvirt error
Dec 15 20:56:38 node8 journal: ovirt-ha-broker mgmt_bridge.MgmtBridge ERROR Failed to getVdsCapabilities: Error 16 from getVdsCapabilities: Unexpected exception
Dec 15 20:56:45 node8 journal: End of file while reading data: Input/output error
Dec 15 20:56:45 node8 vdsmd_init_common.sh: vdsm: Running run_final_hooks
Dec 15 20:56:45 node8 systemd: Starting Virtual Desktop Server Manager...


It is happening about once a day, but not at any regular interval or
time (was 02:23 Sunday, then 20:56 Monday).

vdsm.log has this at that time:

Thread-601576::DEBUG::2014-12-15 20:56:38,715::BindingXMLRPC::1132::vds::(wrapper) client [127.0.0.1]::call getCapabilities with () {}
Thread-601576::DEBUG::2014-12-15 20:56:38,718::utils::738::root::(execCmd) /sbin/ip route show to 0.0.0.0/0 table all (cwd None)
Thread-601576::DEBUG::2014-12-15 20:56:38,746::utils::758::root::(execCmd) SUCCESS:  = '';  = 0
Thread-601576::WARNING::2014-12-15 20:56:38,754::libvirtconnection::135::root::(wrapper) connection to libvirt broken. ecode: 1 edom: 7
Thread-601576::CRITICAL::2014-12-15 20:56:38,754::libvirtconnection::137::root::(wrapper) taking calling process down.
MainThread::DEBUG::2014-12-15 20:56:38,754::vdsm::58::vds::(sigtermHandler) Received signal 15
Thread-601576::DEBUG::2014-12-15 20:56:38,755::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 1 edom: 7 level: 2 message: internal error: client socket is closed
MainThread::DEBUG::2014-12-15 20:56:38,755::protocoldetector::135::vds.MultiProtocolAcceptor::(stop) Stopping Acceptor
MainThread::INFO::2014-12-15 20:56:38,755::__init__::563::jsonrpc.JsonRpcServer::(stop) Stopping JsonRPC Server
Detector thread::DEBUG::2014-12-15 20:56:38,756::protocoldetector::106::vds.MultiProtocolAcceptor::(_cleanup) Cleaning Acceptor
MainThread::INFO::2014-12-15 20:56:38,757::vmchannels::188::vds::(stop) VM channels listener was stopped.
MainThread::INFO::2014-12-15 20:56:38,758::momIF::91::MOM::(stop) Shutting down MOM
MainThread::DEBUG::2014-12-15 20:56:38,759::task::595::Storage.TaskManager.Task::(_updateState) Task=`26c7680c-23e2-42bb-964c-272e778a168a`::moving from state init -> state preparing
MainThread::INFO::2014-12-15 20:56:38,759::logUtils::44::dispatcher::(wrapper) Run and protect: prepareForShutdown(options=None)
Thread-601576::ERROR::2014-12-15 20:56:38,755::BindingXMLRPC::1142::vds::(wrapper) libvirt error
Traceback (most recent call last):
  File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 1135, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 463, in getCapabilities
    ret = api.getCapabilities()
  File "/usr/share/vdsm/API.py", line 1245, in getCapabilities
    c = caps.get()
  File "/usr/share/vdsm/caps.py", line 615, in get
    caps.update(netinfo.get())
  File "/usr/lib/python2.7/site-packages/vdsm/netinfo.py", line 812, in get
    nets = networks()
  File "/usr/lib/python2.7/site-packages/vdsm/netinfo.py", line 119, in networks
    allNets = ((net, net.name()) for net in conn.listAllNetworks(0))
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 129, in wrapper
    __connections.get(id(target)).pingLibvirt()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3642, in getLibVersion
    if ret == -1: raise libvirtError ('virConnectGetLibVersion() failed', conn=self)
libvirtError: i

Re: [ovirt-users] 3. vdsm losing connection to libvirt (Chris Adams)

2014-12-16 Thread Chris Adams
Once upon a time, Nikolai Sednev  said:
> Can I get engine, libvirt, vdsm, mom, logs from host8 and connectivity log? 
> Have you tried installing clean OSs on hosts, especially on problematic host? 
> I'd also try to disable JSONRPC on hosts, by putting them to maintenance and 
> then removing JSONRPC from the check box on all hosts, just to compare if it 
> resolves the issue. 

Just to follow up... (tl;dr: issues may be just my own fault)

I tried to put node8 into maintenance mode, but then vdsm died while
migrating active VMs and the node rebooted.  At that point,
ovirt-ha-agent.service would exit and sanlock logged errors.  I finally
realized sanlock was logging "-13" (would be nice to strerror() here, as
-13 is not intuitive), which is EACCES, aka permission denied.
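
For anyone else staring at a bare negative number in a log, Python's
standard library can translate it.  A tiny sketch, assuming the logged value
is a negated errno (which is what it turned out to be here); the helper name
is made up.

```python
import errno
import os


def describe_errno(code):
    """Translate a (possibly negated) errno value into 'NAME: message'."""
    number = abs(code)
    # errno.errorcode maps numbers to symbolic names on this platform.
    name = errno.errorcode.get(number, "UNKNOWN")
    return "%s: %s" % (name, os.strerror(number))


print(describe_errno(-13))
```

On Linux, 13 maps to EACCES, which is what pointed me at SELinux.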

I realized I didn't have the latest SELinux policy, but had enabled
enforcing mode since the last reboot (from permissive, so no relabel
needed).  The latest CentOS 7 policy includes this in the changelog:

* Mon Nov 10 2014 Miroslav Grepl  3.12.1-153.el7_0.13
-  Add support for vdsm.
Resolves:#1172146
- ALlow sanlock to send a signal to virtd_t.
- ALlow sanlock_t to read sysfs.
Resolves:#1172147

* Tue Nov 04 2014 Miroslav Grepl  3.12.1-153.el7_0.12
- Allow logrotate to manage virt_cache_t type
Resolves:#1159834

So, this may have all just been self-inflicted.  I've switched back to
permissive mode until I next apply updates; hopefully that'll fix my
other issues as well.

-- 
Chris Adams 


[ovirt-users] NUMA and non-NUMA nodes and migration

2014-12-16 Thread Chris Adams
So, new problem (I'm good at breaking things I guess?).  Same setup,
CentOS 7 + oVirt 3.5.0.

Some of my nodes have 2 four-core CPUs, and some have 1 eight-core CPU
(same number of available cores); all Intel Xeons of Nehalem or newer
type.  The systems with 2 CPUs apparently have NUMA support, although I
haven't configured anything related to it.

The problem: I am unable to live migrate a VM from a node with NUMA to a
node without NUMA (haven't tried the other direction).  I get messages
like:

Dec 16 15:36:05 node8 journal: internal error: Process exited prior to exec: 
libvirt:  error : internal error: NUMA node 1 is out of range

I see this mentioned in RHBZ 1147644, but it doesn't have a clear
resolution to this issue there (multiple issues came up in the same
ticket).  Is this something that is supposed to be fixed already, will
be fixed in 3.5.1 (or later release), or has fallen through the cracks?

-- 
Chris Adams 


Re: [ovirt-users] NUMA and non-NUMA nodes and migration

2014-12-18 Thread Chris Adams
Once upon a time, Gilad Chaplik  said:
> Hi Chris, 
> 
> The fix didn't make it to 3.5, will be available in 3.5.1

Okay, thanks.
-- 
Chris Adams 


[ovirt-users] Changing the engine HA ping address?

2015-01-27 Thread Chris Adams
I have an up-to-date hosted-engine 3.5.1 setup (CentOS 7 for the nodes,
CentOS 6 for the engine), and the engine keeps jumping between the two
nodes running the hosted-engine HA (sometimes after just 10-20 minutes,
sometimes after a day or two).  I figured out that it is failing on
pinging the gateway sometimes.

The gateway IP is a layer-3 switch, and I think sometimes it just is not
responding to ICMP echo request in a timely fashion (traffic is routing
just fine though).  How is the HA ping implemented?  How many requests
does it send (and how many responses are required to be considered
"good")?

If I can't tweak the sensitivity of the ping, I'd like to ping a
different IP (on a HA load balancer setup).  The oVirt HA config refers
to it as "gateway" though; is it really used as a gateway in any case,
or is that just the recommended IP?

Can I just edit /etc/ovirt-hosted-engine/hosted-engine.conf on the two
nodes and restart the ovirt-ha-broker service?

-- 
Chris Adams 


Re: [ovirt-users] Changing the engine HA ping address?

2015-01-27 Thread Chris Adams
Once upon a time, Chris Adams  said:
> The gateway IP is a layer-3 switch, and I think sometimes it just is not
> responding to ICMP echo request in a timely fashion (traffic is routing
> just fine though).  How is the HA ping implemented?  How many requests
> does it send (and how many responses are required to be considered
> "good")?

I see in ovirt_hosted_engine_ha/broker/submonitors/ping.py that only one
packet is sent.  That's probably not a great way to do things; there are
a number of routers/firewalls/etc. that put ICMP echo requests to the
device (as opposed to through the device) at the very lowest priority,
and drop them under any load.

A better way would be to send multiple requests, with only one answer
required.  "ping -c 1 -i 0.2 -w <deadline> -W <timeout> <address>" should do
that.
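
To make the idea concrete, here is a rough Python sketch of a check that
keeps sending requests until one reply arrives or a deadline passes.  This is
not the actual submonitor code; the function names and default values are
made up, and it relies on Linux iputils ping behavior (with both -c and -w
given, ping exits successfully as soon as the requested number of replies
has arrived, or fails at the deadline).

```python
import subprocess


def build_ping_command(address, deadline=2, timeout=1, interval=0.2):
    """Build an iputils ping command that succeeds as soon as one reply arrives."""
    return [
        "ping",
        "-c", "1",             # one reply is enough to stop
        "-i", str(interval),   # send a new request every `interval` seconds
        "-w", str(deadline),   # give up after `deadline` seconds overall
        "-W", str(timeout),    # seconds to wait for a reply
        address,
    ]


def is_reachable(address, deadline=2, timeout=1, interval=0.2):
    """Return True if at least one echo reply came back before the deadline."""
    result = subprocess.run(
        build_ping_command(address, deadline, timeout, interval),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

With a 0.2 second interval and a 2 second deadline, a device that drops most
echo requests under load still gets about ten chances to answer before the
check fails.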

-- 
Chris Adams 


[ovirt-users] VDSM hook for setting DSCP bits?

2015-02-10 Thread Chris Adams
Is there a VDSM hook available that can set DSCP bits on a VM's network
interface?  I want to do some QoS for some traffic across my network,
and it would be easier if I could set DSCP bits outside the VM.

I see vdsm-hook-qos, but that appears to just set bandwidth control in
the Linux host, not DSCP on packets for the rest of the network.
-- 
Chris Adams 


Re: [ovirt-users] VDSM hook for setting DSCP bits?

2015-02-11 Thread Chris Adams
Once upon a time, Dan Yasny  said:
> shouldn't be hard to do. Can you provide the details of what you need to
> happen to the VM exactly?
> - domxml changes
> - other host level changes
> - whether the VM should be able to live migrate

It looks like libvirt supports setting up DSCP bits with nwfilter, per:

https://libvirt.org/formatnwfilter.html

I will play with this some to see exactly how to use it (haven't tried
it before).  If that's the case, there shouldn't be any host-level
changes required.  I would want the VM to be able to live migrate still
(with the DSCP still applied).

I'll test this out on a bare libvirt VM and see if that'll do the job,
and report back with what XML is needed.

Thanks.



-- 
Chris Adams 


Re: [ovirt-users] VDSM hook for setting DSCP bits?

2015-02-11 Thread Chris Adams
Once upon a time, Chris Adams  said:
> Once upon a time, Dan Yasny  said:
> > shouldn't be hard to do. Can you provide the details of what you need to
> > happen to the VM exactly?
> > - domxml changes
> > - other host level changes
> > - whether the VM should be able to live migrate
> 
> It looks like libvirt supports setting up DSCP bits with nwfilter, per:
> 
> https://libvirt.org/formatnwfilter.html

Oh, on reading this, nwfilter can only match, not set, so that won't
help.  It doesn't look like libvirt has a way to set something like
that.

Do VDSM hooks only act on the XML, or is there a way to configure things
outside of libvirt?
-- 
Chris Adams 


[ovirt-users] Port mirroring outside traffic into a VM?

2015-02-13 Thread Chris Adams
I have a network traffic monitor that is on a physical machine right
now.  It has two network interfaces: one with an IP on a regular switch
port, and one without an IP on a switch port that is the target of a
port mirror/monitor session for the desired VLAN.

I'd like to move this system to an oVirt VM (I'm running 3.5.1).  Is
this the right way to go about it (and still have the VM migratable)?

- I have several hosts with extra network interfaces; pick at least a
  couple, connect them to switch ports that are configured for
  mirror/monitor session.

- In oVirt admin console, choose the Networks tab, click New.  Give the
  network a name (like "monitor"), leave VLAN tagging de-selected and VM
  Network selected.  Under the Cluster section, de-select Required
  (because the mirror won't go to all hosts).  Click OK to create.

- Click on the network, select the vNIC Profiles tab, edit the default
  profile and select Port Mirroring.

- Go to the Hosts tab.  For each host with a port mirror, click on the
  host, then choose the Network Interfaces tab and Setup Host Networks.
  Drag the new network to its attached port, click the pencil, and set
  Boot Protocol to None.

- Go to the Virtual Machines tab.  Click on the VM, choose the Network
  Interfaces tab, and click New.  Choose the monitor network in the
  Profile.

-- 
Chris Adams 


Re: [ovirt-users] Port mirroring outside traffic into a VM?

2015-02-15 Thread Chris Adams
Once upon a time, Genadi Chereshnya  said:
> If I understand you correctly you are trying to replace the physical device 
> mirroring with VM?

Yes, that is correct.

> If this is the case I don't think it's possible to do it with port mirroring 
> oVIRT feature.
> The existing oVIRT port mirroing feature is for mirroring traffic between VM 
> devices for specific Network.
> So if you have 3 VMs with network  you can monitor on 1 VM that specific 
> network that is used between 2 other VMs.

Ah, I see.

Is there a way to get an external network interface (that happens to be
a target of an external switch's port mirror/monitor session) to pass
through to a VM?  A way that still allows for live migration would be
best of course, but even without that would be a start.

Thanks.
-- 
Chris Adams 


Re: [ovirt-users] HA of VMs

2015-02-24 Thread Chris Adams
Once upon a time, Matt Wells  said:
> I've been poking around for a better way to perform HA.  With VM's like IPA
> or even HA web sites behind an HAProxy; how do I ensure that they are never
> on the same host?

You just need to set up affinity groups.  Negative affinity means keep
VMs away from each other, and enforcing means _never_ do it (even if it
means shutting down a VM rather than moving it).
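
To illustrate what negative affinity asks for, here is a rough sketch that checks a VM-to-host placement for violations. It is not the oVirt scheduler and does not use the oVirt API; the function name, VM names, and host names are all hypothetical.

```python
from collections import defaultdict


def negative_affinity_violations(vm_to_host, group):
    """Return {host: [vms]} for hosts running more than one VM of the group."""
    by_host = defaultdict(list)
    for vm in group:
        host = vm_to_host.get(vm)
        if host is not None:  # a VM that is down is not placed on any host
            by_host[host].append(vm)
    return {h: sorted(vms) for h, vms in by_host.items() if len(vms) > 1}


# Hypothetical placement: the two IPA VMs are apart, the two web VMs are not.
placement = {"ipa1": "node1", "ipa2": "node2", "web1": "node3", "web2": "node3"}
print(negative_affinity_violations(placement, ["ipa1", "ipa2"]))  # {}
print(negative_affinity_violations(placement, ["web1", "web2"]))  # {'node3': ['web1', 'web2']}
```

With an enforcing group the scheduler should never allow the second case; with a non-enforcing group it only tries to avoid it.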
-- 
Chris Adams 


Re: [ovirt-users] VDSM memory consumption

2015-03-06 Thread Chris Adams
Once upon a time, Federico Alberto Sayd  said:
> I am experiencing troubles with VDSM memory consuption.
> 
> I am running
> 
> Engine: ovirt 3.5.1
> 
> Nodes:
> 
> Centos 6.6
> VDSM 4.16.10-8
> Libvirt: libvirt-0.10.2-46
> Kernel: 2.6.32
> 
> When the host boots, memory consuption is normal, but after 2 or 3
> days running, VDSM memory consuption grows and it consumes more
> memory that all vm's running in the host. If I restart the vdsm
> service, memory consuption normalizes, but then it start growing
> again.
> 
> I have seen some BZ about vdsm and supervdsm about memory leaks, but
> I don't know if VDSM 4.6.10.8 is still affected by a related bug.

Can't help, but I see the same thing with CentOS 7 nodes and the same
version of vdsm.
-- 
Chris Adams 


Re: [ovirt-users] oVirt / ROM images / PXE

2015-03-06 Thread Chris Adams
Once upon a time, Paul Heinlein  said:
> Summary: To get oVirt-managed VMs to boot using PXE, I had to
> replace the rhel6-*.rom files with their ipxe equivalents.

I'm PXE booting oVirt VMs with no trouble.  I have CentOS 7 nodes,
running oVirt 3.5.1 (hosted engine on CentOS 6).  Each node has a pair
of NICs in a LACP bond to a switch stack, running 802.1q on top of that,
with several VLANs (only one VLAN has a DHCP server and a local CentOS
repo, so I put VMs on that VLAN for install).

-- 
Chris Adams 


Re: [ovirt-users] oVirt / ROM images / PXE

2015-03-06 Thread Chris Adams
Once upon a time, Paul Heinlein  said:
> Good data point! Can you tell me the compatibility version of your
> data center and its cluster(s)? How about the cluster CPU type?

One DC, one cluster, version 3.5.  Intel Nehalem CPU.  I've PXE booted
CentOS 5, 6, and 7 VMs (64 bit for all and 32 bit for 5/6).

I'd suspect something in the network setup.  I have VLANs on an 802.1q
trunk on an LACP bond (with oVirt bridging the VLANs to VMs).  My DHCP
server (separate physical CentOS 6 box) is also running VLANs on 802.1q
on LACP bond, with dnsmasq listening on one VLAN.

I'd look at traffic coming out of the VM on the node, and coming into
the DHCP server, and see who sees what (are the requests coming out of
the VM, is the DHCP server seeing them, is it replying, does the VM get
the reply).

If the DHCP requests are making it to the server, the next thing to see
is if there is any difference in the DHCP options requested between the
different ROM images (maybe your DHCP config isn't matching up correctly
in some case that works on mine?).

-- 
Chris Adams 


Re: [ovirt-users] oVirt / ROM images / PXE

2015-03-06 Thread Chris Adams
Once upon a time, Paul Heinlein  said:
> So it might be helpful to look at the DHCP options, but the server
> is making OFFERs, so I'm not really sure what bits might be suspect.

Do you see a difference between the DHCP options with the "bad" and
"good" ROMs?
-- 
Chris Adams 


Re: [ovirt-users] VDSM memory consumption

2015-03-09 Thread Chris Adams
Once upon a time, Dan Kenigsberg  said:
> I'm afraid that we are yet to find a solution for this issue, which is
> completly different from the horrible leak of supervdsm < 4.16.7.
> 
> Could you corroborate the claim of
> Bug 1147148 - M2Crypto usage in vdsm leaks memory
> ? Does the leak disappear once you start using plaintext transport?

So, to confirm, it looks like to do that, the steps would be:

- In the [vars] section of /etc/vdsm/vdsm.conf, set "ssl = false".
- Restart the vdsmd service.

Is that all that is needed?  Is it safe to restart vdsmd on a node with
active VMs?
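
The edit itself could be sketched as below, assuming vdsm.conf is a plain INI file that Python's configparser can read. This only covers the file change asked about above; it does not restart anything, and it is not confirmed here that this change alone is sufficient. The function name is made up for illustration.

```python
import configparser
import io


def disable_ssl(conf_text):
    """Return conf_text with "ssl = false" set in the [vars] section."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section("vars"):
        parser.add_section("vars")
    parser.set("vars", "ssl", "false")
    out = io.StringIO()
    parser.write(out)  # note: configparser drops comments when rewriting
    return out.getvalue()


# Example on an in-memory sample rather than the real /etc/vdsm/vdsm.conf.
print(disable_ssl("[vars]\nssl = true\n"))
```

Because comments are lost on rewrite, keeping a backup copy of the original file before applying something like this would be prudent.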

-- 
Chris Adams 


[ovirt-users] Communication errors between engine and nodes?

2015-03-10 Thread Chris Adams
Setup: oVirt 3.5.1 w/hosted engine, nodes: CentOS 7, engine: CentOS 6

I am periodically seeing errors like this in my engine web UI:

2015-Mar-10, 04:42 Host node5 is not responding. It will stay in Connecting 
state for a grace period of 89 seconds and after that an attempt to fence the 
host will be issued.
2015-Mar-10, 04:42 Host node3 from cluster c1 was chosen as a proxy to execute 
Status command on Host node5.
2015-Mar-10, 04:42 Status of host node5 was set to Up.
2015-Mar-10, 04:42 Host node5 power management was verified successfully.

The engine.log file has this:

2015-03-10 04:42:23,310 ERROR 
[org.ovirt.engine.core.vdsbroker.vdsbroker.ListVDSCommand] 
(DefaultQuartzScheduler_Worker-40) [75b9e6d9] Command ListVDSCommand(HostName = 
node5, HostId = 8dfd0195-f386-4e16-9379-a5287221d5bd, 
vds=Host[node5,8dfd0195-f386-4e16-9379-a5287221d5bd]) execution failed.  
Exception: VDSNetworkException: VDSGenericException: VDSNetworkException: 
Heartbeat exeeded 

This seems to happen with a random node sometimes.  The VMs on the node
stay up and don't appear to experience any problem.  I can't find any
sign of a network problem on either the node, the engine, the node
hosting the engine, or the switches.  I don't see anything obvious in
the logs on any of the systems involved either.

The node network setup is VLANs on top of a bond of two NICs, each
connected to a different switch in a two-switch stack.

-- 
Chris Adams 


Re: [ovirt-users] Communication errors between engine and nodes?

2015-03-11 Thread Chris Adams
Once upon a time, Chris Adams  said:
> 2015-03-10 04:42:23,310 ERROR 
> [org.ovirt.engine.core.vdsbroker.vdsbroker.ListVDSCommand] 
> (DefaultQuartzScheduler_Worker-40) [75b9e6d9] Command ListVDSCommand(HostName 
> = node5, HostId = 8dfd0195-f386-4e16-9379-a5287221d5bd, 
> vds=Host[node5,8dfd0195-f386-4e16-9379-a5287221d5bd]) execution failed.  
> Exception: VDSNetworkException: VDSGenericException: VDSNetworkException: 
> Heartbeat exeeded 

I'm trying to dig into this some on my own (without knowing about
oVirt's internals); can somebody tell me the timeout for the dispatching
of commands to vdsm?  I get different things happening when the engine
thinks a node has "gone away", but they all start with the same
org.ovirt.engine.core.vdsbroker.vdsbroker bit (and have a network
timeout of some type).

I don't see anything in common in any of the logs at the time of the
error, so I'm trying to roll back to when the request was sent (but I
don't know how long it took for the engine to time out before the error
was logged).
-- 
Chris Adams 


Re: [ovirt-users] Communication errors between engine and nodes?

2015-03-12 Thread Chris Adams
Once upon a time, Lior Vernia  said:
> If I'm not mistaken, heartbeat intervals are configured to 10 seconds by
> default.

Okay, thanks.

> The command times out queries for the status of VMs on a host - any
> reason to suspect why that's taking long? Does it happen on specific hosts?

No idea.  It seemed to happen on node5 a bunch over a week, but then
there were errors on other nodes as well.  It isn't always "Heartbeat
exceeded"; sometimes it is "VDSNetworkException: Message timeout which
can be caused by communication issues".  I haven't been able to find any
network issues that could cause this (no errors logged anywhere).

There doesn't seem to be any pattern to when it happens either.  The log
entry I posted was from 04:42 local time, and a bunch of the VMs are
CentOS 5, which does log rotation at 04:00 by default (which can spike
the CPU and disk I/O), but they are all done long before 04:42.  It
happened in the middle of the afternoon a couple of days ago, while I
was logged in to the web UI, and I didn't notice any unusual behavior.

One other odd thing: I have also been experiencing an issue where I
randomly get logged out of the web UI.  Usually nothing else was going
on, but a couple of times it seemed to correspond with one of the node
errors (hard to tell).  It looked like the same error as BZ 1198493 (I'd
see a bunch of "Failed to log User null@N/A out" messages).  I don't
know if these issues are related or that was just coincidence.

To try to rule out any unseen network issues, I started an fping to all
seven nodes and the engine from another physical system on the same
VLAN.  It is sending one ping to each of the eight hosts every 0.2
seconds.  That has not shown a dropped packet since I started yesterday
afternoon.  However, during that time, I also have not seen any
engine/vdsm timeouts.  I was going to say I had not been logged out of
the web UI, but that just happened while I was typing the previous
sentence.

-- 
Chris Adams 


Re: [ovirt-users] VDSM memory consumption

2015-03-13 Thread Chris Adams
Once upon a time, Sven Kieske  said:
> On 13/03/15 12:29, Kapetanakis Giannis wrote:
> > We also face this problem since 3.5 in two different installations...
> > Hope it's fixed soon
> 
> Nothing will get fixed if no one bothers to
> open BZs and send relevants log files to help
> track down the problems.

There's already an open BZ:

https://bugzilla.redhat.com/show_bug.cgi?id=1158108

I'm not sure if that is exactly the same problem I'm seeing or not; my
vdsm process seems to be growing faster (RSS grew 952K in a 5 minute
period just now; VSZ didn't change).
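
To put that sample in perspective, here is a small sketch that extrapolates a growth sample to a daily rate and pulls VmRSS out of the text of /proc/<pid>/status. It assumes the growth stays linear, which one five-minute sample cannot prove; the function names are made up for illustration.

```python
def growth_per_day_mib(delta_kib, minutes):
    """Extrapolate a memory growth sample (KiB over minutes) to MiB per day."""
    minutes_per_day = 24 * 60
    return delta_kib * minutes_per_day / (minutes * 1024)


def rss_kib(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


# 952K in 5 minutes, as measured above.
print(growth_per_day_mib(952, 5))  # 267.75
```

Taking two rss_kib() readings a known interval apart and feeding the difference to growth_per_day_mib() gives the same kind of estimate for any process.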

-- 
Chris Adams 


Re: [ovirt-users] Communication errors between engine and nodes?

2015-03-13 Thread Chris Adams
Once upon a time, Roel de Rooy  said:
> We are observing the same thing with our oVirt environment.
> At random moments (could be a couple of times a day , once a day or even once 
> every couple of days), we receive the "VDSNetworkException" message on one of 
> our nodes.
> Haven't seen the "heartbeat exceeded" message, but could be that I overlooked 
> it within our logs.
> At some rare occasions, we also do see "Host cannot access the Storage 
> Domain(s)  attached to the Data Center", within the GUI.
> 
> VM's will continue to run normally and most of the times the nodes will be in 
> "UP" state again within the same minute.
> 
> Will still haven't found the root cause of this issue.
> Our engine is CentOS 6.6 based and it's happing with both Centos 6 and Fedora 
> 20 nodes.
> We are using a LCAP bond of 1Gbit ports for our management network.
> 
> As we didn't see any reports about this before, we are currently looking if 
> something network related is causing this.

I just opened a BZ on it (since it isn't just me):

https://bugzilla.redhat.com/show_bug.cgi?id=1201779

My cluster went a couple of days without hitting this (as soon as I
posted to the list of course), but then it happened several times
overnight.  Interestingly, one error logged was communicating with the
node currently running my hosted engine.  That should rule out external
network (e.g. switch and such) issues, as those packets should not have
left the physical box.

-- 
Chris Adams 


Re: [ovirt-users] Communication errors between engine and nodes?

2015-03-20 Thread Chris Adams
So, in my case, I'm wondering if maybe there is some kind of weird
network issue happening.

The node that seems to be showing up most for the last day or two is one
of the two nodes running the hosted-engine HA, and is _not_ currently
hosting the engine.  It seems that, at the same time the engine has
trouble communicating with that node, the hosted-engine HA running on
that node has trouble seeing the engine.

I still can't find any actual network problem.  Using another physical
system, I ran fping to all the nodes and the engine with a 0.2 second
interval, and that didn't show any problem (I ran it until I also saw an
instance of the engine->node communication error).  I'm watching ARP
traffic now to see if something is sending bad answers.  I'm pretty
stumped at this point about what to look at next.

-- 
Chris Adams 


Re: [ovirt-users] [QE][ACTION REQUIRED] oVirt 3.5.2 and 3.5.3 status

2015-04-08 Thread Chris Adams
Once upon a time, Sandro Bonazzola  said:
> We have 3 open blockers for 3.5.2[1]:

Any chance the vdsm memory leak fix (RHBZ 1158108) will make 3.5.2?

-- 
Chris Adams 


[ovirt-users] Problem with hosted engine setup on VLAN/bond combo

2014-07-10 Thread Chris Adams
I am trying to install oVirt with the hosted engine.  The physical
system is CentOS 6.5 x86_64 (with all current updates).  It is connected
to a two-switch stack via bond0 (running LACP), which is a VLAN trunk,
and the management interface is vlan51.

This doesn't work with oVirt 3.4.2, but I see that both 3.4.3 and 3.5
have "support engine on bond" and "support engine on vlan" in the
release notes, so I tried again with today's 3.4.3 RC.  I got a
different error from "hosted-engine --deploy":

[ ERROR ] Failed to execute stage 'Misc configuration': Command 
'/usr/bin/vdsClient' failed to execute

I see this in /var/log/vdsm/supervdsm.log:


MainProcess|Thread-16::INFO::2014-07-10 
10:20:23,003::configNetwork::275::root::(addNetwork) Adding network ovirtmgmt 
with vlan=51, bonding=None, nics=['bond0'], bondingOptions=None, mtu=None, 
bridged=True, defaultRoute=True,options={'bootproto': 'static', 'ONBOOT': 'yes'}
MainProcess|Thread-16::ERROR::2014-07-10 
10:20:23,003::supervdsmServer::100::SuperVdsm.ServerCallback::(wrapper) Error 
in addNetwork
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer", line 98, in wrapper
    res = func(*args, **kwargs)
  File "/usr/share/vdsm/supervdsmServer", line 190, in addNetwork
    return configNetwork.addNetwork(bridge, **options)
  File "/usr/share/vdsm/configNetwork.py", line 186, in wrapped
    return func(*args, **kwargs)
  File "/usr/share/vdsm/configNetwork.py", line 287, in addNetwork
    blockingdhcp=blockingdhcp, **options)
  File "/usr/share/vdsm/configNetwork.py", line 121, in objectivizeNetwork
    topNetDev = Nic(nic, configurator, mtu=mtu, _netinfo=_netinfo)
  File "/usr/share/vdsm/netmodels.py", line 80, in __init__
    raise ConfigNetworkError(ne.ERR_BAD_NIC, 'unknown nic: %s' % name)
ConfigNetworkError: (23, 'unknown nic: bond0')


Is there maybe still a problem combining a VLAN on a bond?


A little background (if it helps): this is my first attempt with oVirt.
I'm installing on a clean CentOS install on a spare box, with the intent
to get oVirt up and running and then convert (one node at a time) an
existing VM setup (running old stand-alone Xen installs) to oVirt.

-- 
Chris Adams 


Re: [ovirt-users] Problem with hosted engine setup on VLAN/bond combo

2014-07-11 Thread Chris Adams
Once upon a time, Sven Kieske  said:
> Am 10.07.2014 18:02, schrieb Chris Adams:
> > Is there maybe still a problem combining a VLAN on a bond?
> 
> Yes exactly, but just with hosted-engine.
> See this BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1072027
> (Should be resolved with 3.5)

Well, the same notes are in the 3.4.3 release notes as 3.5:

 oVirt Hosted Engine Setup

 BZ 162 - [RFE] [ovirt-hosted-engine-setup] add support for bonded 
interfaces
 BZ 1117634 - [RFE] Hosted Engine deploy should support VLAN-tagged interfaces

Since I got a different error when I tried 3.4.3-RC vs. 3.4.2 (and I'm
trying with a VLAN on top of a bond), I was concerned that the same fix
as 3.5 was in 3.4.3-RC, and would not actually fix my combo setup.

> There is also a workaround in the BZ, didn't try it myself.

Yeah, I can't (easily anyway) disable the VLAN trunk and bond and then
re-enable them (since that breaks network access, and I'm not sitting at
the same location as the system).

I will try Robert Story's suggestion of manually configuring the bridge
before running the deploy.
-- 
Chris Adams 


Re: [ovirt-users] VDSM respawning too quickly

2014-07-14 Thread Chris Adams
Once upon a time, Kyle Gordon  said:
> Following an upgrade from 3.3 to 3.4, I've been greeted with this
> message in /var/log/messages, on my CentOS 6.5 server.

I'm hitting the same thing with an up-to-date CentOS 6.5 trying to
install hosted-engine.  It appears the problem is an updated
python-pthreading package in EPEL, version 0.1.3-2.  There's already a
0.1.3-3 in koji that rolls back the patch in 0.1.3-2.

http://koji.fedoraproject.org/koji/buildinfo?buildID=543650

-- 
Chris Adams 


[ovirt-users] Odd messages on new node/hosted engine

2014-07-16 Thread Chris Adams
mstore_engine/74cb6a07-5745-4b21-ba4b-d9012acb5cae
Thread-40::DEBUG::2014-07-16 
14:34:19,856::persistentDict::192::Storage.PersistentDict::(__init__) Created a 
persistent dict with FileMetadataRW backend
Thread-40::DEBUG::2014-07-16 
14:34:19,860::persistentDict::234::Storage.PersistentDict::(refresh) read lines 
(FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=hosted_storage', 
'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=ON', 
'LOCKRENEWALINTERVALSEC=5', 'MASTER_VERSION=1', 'POOL_DESCRIPTION=c1', 
'POOL_DOMAINS=74cb6a07-5745-4b21-ba4b-d9012acb5cae:Active', 'POOL_SPM_ID=-1', 
'POOL_SPM_LVER=0', 'POOL_UUID=b15478ff-1ae1-4065-8e52-19c808d39597', 
'REMOTE_PATH=nfs.c1.api-digital.com:/vmstore/engine', 'ROLE=Master', 
'SDUUID=74cb6a07-5745-4b21-ba4b-d9012acb5cae', 'TYPE=NFS', 'VERSION=3', 
'_SHA_CKSUM=4f007c871da3177ba5546459bcebc8be8aff689e']
Thread-40::DEBUG::2014-07-16 
14:34:19,863::fileSD::609::Storage.StorageDomain::(imageGarbageCollector) 
Removing remnants of deleted images []
Thread-40::INFO::2014-07-16 
14:34:19,863::sd::383::Storage.StorageDomain::(_registerResourceNamespaces) 
Resource namespace 74cb6a07-5745-4b21-ba4b-d9012acb5cae_imageNS already 
registered
Thread-40::INFO::2014-07-16 
14:34:19,863::sd::391::Storage.StorageDomain::(_registerResourceNamespaces) 
Resource namespace 74cb6a07-5745-4b21-ba4b-d9012acb5cae_volumeNS already 
registered
Thread-40::DEBUG::2014-07-16 
14:34:19,868::fileSD::259::Storage.Misc.excCmd::(getReadDelay) '/bin/dd 
iflag=direct 
if=/rhev/data-center/mnt/nfs.c1.api-digital.com:_vmstore_engine/74cb6a07-5745-4b21-ba4b-d9012acb5cae/dom_md/metadata
 bs=4096 count=1' (cwd None)
Thread-40::DEBUG::2014-07-16 
14:34:19,885::fileSD::259::Storage.Misc.excCmd::(getReadDelay) SUCCESS:  = 
'0+1 records in\n0+1 records out\n476 bytes (476 B) copied, 0.000548138 s, 868 
kB/s\n';  = 0
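
As an aside, the read delay in a getReadDelay entry like the one above can be pulled out of the dd summary line. The following is only a sketch of that parsing, not VDSM's own code; the function name and regular expression are made up for illustration.

```python
import re

# Matches the elapsed time in a dd summary such as
# "476 bytes (476 B) copied, 0.000548138 s, 868 kB/s".
_DD_SECONDS = re.compile(r"copied, ([0-9.eE+-]+) s")


def read_delay_seconds(dd_stderr):
    """Return the elapsed seconds reported in dd's summary output."""
    match = _DD_SECONDS.search(dd_stderr)
    if match is None:
        raise ValueError("no dd summary line found")
    return float(match.group(1))


sample = ("0+1 records in\n0+1 records out\n"
          "476 bytes (476 B) copied, 0.000548138 s, 868 kB/s\n")
print(read_delay_seconds(sample))
```

Here the 476-byte direct read completed in about half a millisecond.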

-- 
Chris Adams 


Re: [ovirt-users] starting VDSM gives "libvir: XML-RPC error : authentication failed: authentication failed" error

2014-07-22 Thread Chris Adams
Once upon a time, ybronhei  said:
> On 07/21/2014 01:45 AM, Jorick Astrego wrote:
> >Hi,
> >
> >Some more info, I think this is the problem and I also have it on the
> >node image:
> >
> >Jul 20 22:41:49 localhost libvirtd: unable to open Berkeley db
> >/etc/libvirt/passwd.db: No such file or directory
> >
> Hey
> 
> Did you check if libvirtd services is up?
> can you share libvirtd.conf file?
> I'm not sure what is exactly the issue, but you can try "vdsm-tool
> configure --module libvirt" command to see if you set the vdsm
> configuration for libvirt as required.

I see the same thing with a new 3.5-beta install.  libvirtd is running;
I ran the above vdsm-tool command, but that made no difference.

The libvirtd.conf has the following config (set by vdsm install):

## beginning of configuration section by vdsm-4.13.0
keepalive_interval=-1
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
unix_sock_rw_perms="0770"
auth_unix_rw="sasl"
log_filters="3:virobject 3:virfile 2:virnetlink 3:cgroup 3:event 3:json 1:libvirt 1:util 1:qemu"
cert_file="/etc/pki/vdsm/certs/vdsmcert.pem"
unix_sock_group="qemu"
listen_addr="0.0.0.0"
ca_file="/etc/pki/vdsm/certs/cacert.pem"
key_file="/etc/pki/vdsm/keys/vdsmkey.pem"
host_uuid="74e1d154-d83f-4852-9c35-3c931f8b45cf"
## end of configuration section by vdsm-4.13.0

If I comment out the auth_unix_rw line and restart libvirtd, vdsmd will
start successfully.
-- 
Chris Adams 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] starting VDSM gives "libvir: XML-RPC error : authentication failed: authentication failed" error

2014-07-22 Thread Chris Adams
Once upon a time, Chris Adams  said:
> Once upon a time, ybronhei  said:
> > On 07/21/2014 01:45 AM, Jorick Astrego wrote:
> > >Hi,
> > >
> > >Some more info, I think this is the problem and I also have it on the
> > >node image:
> > >
> > >Jul 20 22:41:49 localhost libvirtd: unable to open Berkeley db
> > >/etc/libvirt/passwd.db: No such file or directory
> > >
> > Hey
> > 
> > Did you check if libvirtd services is up?
> > can you share libvirtd.conf file?
> > I'm not sure what is exactly the issue, but you can try "vdsm-tool
> > configure --module libvirt" command to see if you set the vdsm
> > configuration for libvirt as required.
> 
> I see the same thing with a new 3.5-beta install.  libvirtd is running;
> I ran the above vdsm-tool command, but that made no difference.

The problem is that /usr/lib64/python2.6/site-packages/vdsm/constants.py
(from vdsm-python-4.16.0-3.git601f786.el6.x86_64) sets EXT_SASLPASSWD2
to /sbin/saslpasswd2, but the binary is actually in /usr/sbin.
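
One way to avoid this kind of hard-coded path is to probe a few likely directories. This is only an illustration of the idea and not how vdsm's constants.py is written; the function name and the directory list are assumptions.

```python
import os


def find_executable(name, dirs=("/sbin", "/usr/sbin", "/bin", "/usr/bin")):
    """Return the first executable file called name found in dirs, else None."""
    for directory in dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# Prints the resolved path, or None if saslpasswd2 is not installed.
print(find_executable("saslpasswd2"))
```

On the system described above this would be expected to resolve to the /usr/sbin copy, since nothing exists at /sbin/saslpasswd2.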

I fixed that and manually set the password (and fixed the ovirt-ha* init
scripts), but still can't deploy a 3.5-beta hosted engine.  When
configuring the management bridge, it leaves the interface down (both
the bridge and the underlying interface); I brought them back up, but
the setup then hits this:

[ INFO  ] Verifying sanlock lockspace initialization
[ ERROR ] Failed to execute stage 'Misc configuration': [Errno 2] No such file 
or directory

I see this in the setup log:


2014-07-22 11:27:39 DEBUG otopi.context context._executeMethod:152 method 
exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/otopi/context.py", line 142, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/sanlock/lockspace.py", line 163, in _misc
    lockspace + '.metadata': md_size,
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 336, in create
    service_size=size)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 268, in create_volume
    raise RuntimeError(response["status"]["message"])
RuntimeError: [Errno 2] No such file or directory
2014-07-22 11:27:39 ERROR otopi.context context._executeMethod:161 Failed to 
execute stage 'Misc configuration': [Errno 2] No such file or directory


Not sure what is happening there.
-- 
Chris Adams 


Re: [ovirt-users] Misc architecture questions

2014-07-23 Thread Chris Adams
Once upon a time, Sandro Bonazzola  said:
> >> Unfortunately, it seems like Ovirt 3.4 does not support installing
> >> self-hosted engine on bond + vlan. I tried 3.5 but there were to much
> >> bug to be usable and the project is set to be deployed in 2 months.
> > 
> > should be available via:
> > http://gerrit.ovirt.org/#/c/29730/
> > ?
> 
> Yes, it's available in 3.4.3[1]. However, looks like VDSM has some issue with 
> the network [2][3]

As far as I can tell, "hosted-engine --deploy" still fails if you try to
use a VLAN on top of a bond, in both 3.4 and 3.5 (I just tried again
with a clean install of the latest 3.5 snapshot).  The deploy script
dies when it tries to configure the management bridge.  vdsm.log ends
with this (bond0 definitely already exists):

Thread-16::ERROR::2014-07-23 08:40:59,436::API::1363::vds::(addNetwork) unknown 
nic: bond0
Traceback (most recent call last):
  File "/usr/share/vdsm/API.py", line 1361, in addNetwork
    supervdsm.getProxy().addNetwork(bridge, options)
  File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 48, in 
    **kwargs)
  File "", line 2, in addNetwork
  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
    raise convert_to_error(kind, result)
ConfigNetworkError: (23, 'unknown nic: bond0')

If I manually create the ovirtmgmt bridge first, I can install a
3.5-snapshot hosted engine, although the install gets stuck at the end
on "Still waiting for VDSM host to become operational".  It results in
an install that thinks the ovirtmgmt network is unsynchronized and can't
be synchronized because the network is being used (of course, it is
being used by the engine).

-- 
Chris Adams 


Re: [ovirt-users] Misc architecture questions

2014-07-23 Thread Chris Adams
Once upon a time, Chris Adams  said:
> If I manually create the ovirtmgmt bridge first, I can install a
> 3.5-snapshot hosted engine, although the install gets stuck at the end
> on "Still waiting for VDSM host to become operational".  It results in
> an install that thinks the ovirtmgmt network is unsynchronized and can't
> be synchronized because the network is being used (of course, it is
> being used by the engine).

I found the source of the "unsynchronized" problem; the setup did not
create the interface in the oVirt config as a VLAN.  I changed the
config to include the VLAN tag, and then it sees the configuration as
synchronized.
-- 
Chris Adams 


[ovirt-users] Network question: mirrored port?

2014-08-25 Thread Chris Adams
I have a couple of traffic-monitoring servers that get a copy of all
traffic on a VLAN via a mirrored port on the switch, connected to a
dedicated port on each server.  Is there a good way to run that type of
traffic into a VM?

-- 
Chris Adams 


[ovirt-users] Error migrating VM with direct LUN disk

2016-01-06 Thread Chris Adams
I have a VM with a virtio-scsi disk that is a direct-mapped iSCSI LUN.
I'm trying to migrate it from one node to another (in the process of
updating my system from 3.5.3 to 3.5.5), and it fails migration with:

Thread-1886273::ERROR::2016-01-06 
10:31:17,657::migration::161::vm.Vm::(_recover) 
vmId=`606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43`::unsupported configuration: 
scsi-block 'lun' devices do not support the serial property
Thread-1886273::ERROR::2016-01-06 10:31:17,693::migration::260::vm.Vm::(run) 
vmId=`606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43`::Failed to migrate
  File "/usr/share/vdsm/virt/migration.py", line 246, in run
    self._startUnderlyingMigration(time.time())
  File "/usr/share/vdsm/virt/migration.py", line 335, in _startUnderlyingMigration
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1701, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
Thread-1886279::DEBUG::2016-01-06 
10:31:18,539::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) Calling 
'VM.getMigrationStatus' in bridge with {u'vmID': 
u'606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43'}

Is this a known bug (maybe fixed in a newer version), something
unexpected, etc.?  Is there a way around it (other than shutting down
the VM)?
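
For anyone trying to spot the affected disks, the error message suggests looking for device='lun' disks that carry a <serial> element in the libvirt domain XML. The sketch below does only that check on a made-up sample; it is not a confirmed fix or workaround, and the function name, device paths, and serial strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def lun_disks_with_serial(domain_xml):
    """Return target dev names of device='lun' disks that have a <serial>."""
    root = ET.fromstring(domain_xml)
    offenders = []
    for disk in root.findall("./devices/disk"):
        if disk.get("device") == "lun" and disk.find("serial") is not None:
            target = disk.find("target")
            offenders.append(target.get("dev") if target is not None else None)
    return offenders


# Made-up domain XML: one direct LUN with a serial, one ordinary disk.
sample = """
<domain type='kvm'>
  <devices>
    <disk type='block' device='lun'>
      <source dev='/dev/mapper/example-lun'/>
      <target dev='sda' bus='scsi'/>
      <serial>example-serial</serial>
    </disk>
    <disk type='file' device='disk'>
      <target dev='vda' bus='virtio'/>
      <serial>other-serial</serial>
    </disk>
  </devices>
</domain>
"""
print(lun_disks_with_serial(sample))  # ['sda']
```

The ordinary disk is not reported, since the error only concerns 'lun' devices.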
-- 
Chris Adams 


Re: [ovirt-users] Error migrating VM with direct LUN disk

2016-01-07 Thread Chris Adams
Once upon a time, Chris Adams  said:
> I have a VM with a virtio-scsi disk that is a direct-mapped iSCSI LUN.
> I'm trying to migrate it from one node to another (in the process of
> updating my system from 3.5.3 to 3.5.5), and it fails migration with:
> 
> Thread-1886273::ERROR::2016-01-06 
> 10:31:17,657::migration::161::vm.Vm::(_recover) 
> vmId=`606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43`::unsupported configuration: 
> scsi-block 'lun' devices do not support the serial property
> Thread-1886273::ERROR::2016-01-06 10:31:17,693::migration::260::vm.Vm::(run) 
> vmId=`606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43`::Failed to migrate
>   File "/usr/share/vdsm/virt/migration.py", line 246, in run
> self._startUnderlyingMigration(time.time())
>   File "/usr/share/vdsm/virt/migration.py", line 335, in 
> _startUnderlyingMigration
>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1701, in 
> migrateToURI2
> if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', 
> dom=self)
> Thread-1886279::DEBUG::2016-01-06 
> 10:31:18,539::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) Calling 
> 'VM.getMigrationStatus' in bridge with {u'vmID': 
> u'606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43'}
> 
> Is this a known bug (maybe fixed in a newer version), something
> unexpected, etc.?  Is there a way around it (other than shutting down
> the VM)?

So, this isn't just a bug with migration; I powered off the VM and then
tried to start it on a host that had been updated to 3.5.5 and it would
not start (same error).

This is a pretty significant regression IMHO - I can't start this VM on
any 3.5.5 host.
-- 
Chris Adams 


Re: [ovirt-users] Error migrating VM with direct LUN disk

2016-01-07 Thread Chris Adams
Once upon a time, Yedidyah Bar David  said:
> On Thu, Jan 7, 2016 at 10:16 AM, Chris Adams  wrote:
> > Once upon a time, Chris Adams  said:
> >> I have a VM with a virtio-scsi disk that is a direct-mapped iSCSI LUN.
> >> I'm trying to migrate it from one node to another (in the process of
> >> updating my system from 3.5.3 to 3.5.5), and it fails migration with:
> >>
> >> Thread-1886273::ERROR::2016-01-06 
> >> 10:31:17,657::migration::161::vm.Vm::(_recover) 
> >> vmId=`606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43`::unsupported configuration: 
> >> scsi-block 'lun' devices do not support the serial property
> >> Thread-1886273::ERROR::2016-01-06 
> >> 10:31:17,693::migration::260::vm.Vm::(run) 
> >> vmId=`606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43`::Failed to migrate
> >>   File "/usr/share/vdsm/virt/migration.py", line 246, in run
> >> self._startUnderlyingMigration(time.time())
> >>   File "/usr/share/vdsm/virt/migration.py", line 335, in 
> >> _startUnderlyingMigration
> >>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1701, in 
> >> migrateToURI2
> >> if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', 
> >> dom=self)
> >> Thread-1886279::DEBUG::2016-01-06 
> >> 10:31:18,539::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) 
> >> Calling 'VM.getMigrationStatus' in bridge with {u'vmID': 
> >> u'606ae10e-bcac-4bf8-8ad0-e9d76f0c6f43'}
> >>
> >> Is this a known bug (maybe fixed in a newer version), something
> >> unexpected, etc.?  Is there a way around it (other than shutting down
> >> the VM)?
> >
> > So, this isn't just a bug with migration; I powered off the VM and then
> > tried to start it on a host that had been updated to 3.5.5 and it would
> > not start (same error).
> >
> > This is a pretty significant regression IMHO - I can't start this VM on
> > any 3.5.5 host.
> 
> Seems like a result of the fix for [1].
> 
> Please check/post vdsm/libvirt/qemu versions and logs.

Old node (that is working) is CentOS 7.1, oVirt 3.5.3:
  vdsm-4.16.20-0.el7.centos.x86_64
  libvirt-client-1.2.8-16.el7_1.3.x86_64
  qemu-kvm-ev-2.1.2-23.el7_1.3.1.x86_64

New nodes (that won't work) are CentOS 7.2, oVirt 3.5.6 (I said 3.5.5
earlier but they are updated as of yesterday):
  vdsm-4.16.30-0.el7.centos.x86_64
  libvirt-client-1.2.17-13.el7_2.2.x86_64
  qemu-kvm-ev-2.3.0-29.1.el7.x86_64

Please let me know which logs - the above snip is from vdsm.log when
trying to migrate the VM (got the same error when just trying to start
it).  All that is in the libvirt/qemu/.log when I tried to start the
VM on an upgraded node is "shutting down" (no errors or other messages).

> Do you have any vdsm hooks installed?

No.  I did wonder if I could work around this with a hook to strip out
the  in the XML (but haven't written a hook before so
haven't tried that yet).
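
For reference, the XML rewrite such a hook would perform might look roughly like the sketch below. It assumes the offending element is `<serial>` inside a disk with device='lun' (inferred from the error message, not confirmed), and it only covers the string transformation; reading and writing the domain XML through VDSM's hook interface is not shown. The sample XML is made up, and the code targets Python 3.

```python
import xml.etree.ElementTree as ET


def strip_lun_serial(domxml):
    """Return the domain XML with <serial> removed from device='lun' disks."""
    root = ET.fromstring(domxml)
    # Take a snapshot of the disks so the tree can be modified safely.
    for disk in list(root.iter("disk")):
        if disk.get("device") != "lun":
            continue  # ordinary disks keep their serial
        for serial in disk.findall("serial"):
            disk.remove(serial)
    return ET.tostring(root, encoding="unicode")


# Made-up minimal domain XML, for illustration only.
SAMPLE = """<domain type='kvm'>
  <devices>
    <disk type='block' device='lun'>
      <serial>hypothetical-serial</serial>
      <target dev='sda' bus='scsi'/>
    </disk>
    <disk type='file' device='disk'>
      <serial>keep-me</serial>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>"""

cleaned = strip_lun_serial(SAMPLE)
```

Only the direct-LUN disk is touched, so image-backed disks would keep whatever serial the engine assigned.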

> Was this system upgraded from a previous version? Please
> state the upgrade history.

It started with either 3.5 or 3.5.1 (can't remember for sure now), then
upgraded to 3.5.2, 3.5.3, and 3.5.6.

> Thanks.

Thank you.
-- 
Chris Adams 


Re: [ovirt-users] HP ProLiant ML10 v2 or ML110 Gen9 with Windows Server Essentials 2012 R2 guest

2016-01-17 Thread Chris Adams
Once upon a time, gregor  said:
> And will Windows Server 2012 run in oVirt,
> because on my test machine (HP ProLiant ML350 G5) it didn't (maybe the
> CPU is too old).

I just went through trying to get Windows Server 2012 Essentials (both
"original" and R2) running on a cluster with Nehalem CPUs, and it would
blue-screen during install.  I replicated the problem on my Fedora
desktop with plain KVM set up to emulate a Nehalem CPU.  When I switched
to Westmere or newer, Windows worked.

This appears to be some difference between Essentials and Standard
edition (I am running Windows Server 2012 Standard VMs on my cluster
just fine).  A co-worker searching around on the Internet also found
some VirtualBox users having similar issues with Essentials.

So, with Westmere or newer CPU, I think Essentials should be okay, but
don't try it with Nehalem.
-- 
Chris Adams 


[ovirt-users] Dumb question: exclamation mark next to VM?

2016-02-04 Thread Chris Adams
I set up a new oVirt 3.6.2 cluster on CentOS 7.2 (everything up to date
as of yesterday).  I created a basic CentOS 7.2 VM with my local
customizations, created a template from it, and then created a VM from
that template.

That new VM has an exclamation mark next to it in the web GUI (between
the up arrow for "running" and the "server" icon).  Usually I would
expect that means something is wrong or needs attention, but I can't
find anything to fix/address/etc. (no messages in the Alerts, nothing
odd in the Events, etc.).  What does the exclamation mark mean, and how
do I clear it?

-- 
Chris Adams 


[ovirt-users] Expand hosted-engine disk?

2016-02-04 Thread Chris Adams
I'm running oVirt 3.6.2 on CentOS 7.2, all up to date, with a hosted
engine.  The storage for the engine is on a dedicated iSCSI LUN.  When I
created the LUN, I made it 40G so I'd have a little more disk space for
the engine (logs, ISOs, etc.), but then forgot to make the VM image
larger than the default 25G.

Is there an easy way to extend the image now?  When I try to do that in
the web GUI, I got the "Cannot edit Virtual Machine Disk. This VM is not
managed by the engine." error.

-- 
Chris Adams 


Re: [ovirt-users] Dumb question: exclamation mark next to VM?

2016-02-05 Thread Chris Adams
Once upon a time, Joe DiTommasso  said:
> If you mouse over the exclamation mark, you should get a tooltip that tells
> you what it's complaining about. I've got it on pretty much all my VMs,
> it's an issue with the timezone for me.

I get nothing for the exclamation mark.  I go straight from the "Up" tip
to the "Server" tip.  The ! is in the first column with the status icon
(if you widen the columns it stays next to the up arrow).

-- 
Chris Adams 


Re: [ovirt-users] Dumb question: exclamation mark next to VM?

2016-02-05 Thread Chris Adams
Once upon a time, Darrell Budic  said:
> After upgrading to 3.6.2, I’ve got a couple that are doing this too (No actual 
> tooltip for the exclamation point). One windows, two linux, funny thing is 
> they are all down at the moment and still have this warning…

Weird, I think mine only have it when they are up.  The VMs with the
exclamation point are all CentOS 7.2+EPEL up-to-date (so
ovirt-guest-agent-common-1.0.11-1.el7 from EPEL - is that current?).

I installed a Windows Server 2012 Essentials today, using the ISO from
ovirt-guest-tools-iso-3.6.0-0.2_master.fc22, and the Windows VM does not
have an exclamation mark.

-- 
Chris Adams 


[ovirt-users] Stuck tasks in web UI

2016-02-09 Thread Chris Adams
I'm running oVirt 3.6.2 on CentOS 7.2, all up to date, with a hosted
engine.  I tried to clone a VM and that failed (not sure why yet), but
the first problem is that there are a couple of stuck tasks in the web
UI from my attempts.

The Events tab shows the tasks failed, and I ran "vdsClient -s 0
getAllTasksStatuses" on the SPM node, and it shows no tasks.  How do I
clear the tasks from the web UI?

-- 
Chris Adams 


[ovirt-users] Clone VM fails - could not create volume

2016-02-11 Thread Chris Adams
On oVirt 3.6.2, I tried to clone a VM, but got an error that the volume
couldn't be created.  Checking the logs, I see (in vdsm.log on the SPM):

jsonrpc.Executor/2::ERROR::2016-02-11 09:50:24,459::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Image is not a legal chain: (u'fa4d1802-6223-4800-8339-c194076cfb4b',)", 'code': 262}}
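
When digging through vdsm.log for lines like this, a small parser can help pull out just the errors. The sketch below assumes the field layout seen in the lines quoted in this thread (thread, level, timestamp, module, line number, logger, function, message, separated by `::`); that layout is inferred from these samples, not taken from VDSM documentation, and the function names are made up.

```python
import re

# Field layout inferred from the log lines quoted in this thread.
VDSM_LINE = re.compile(
    r"^(?P<thread>.+?)::(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<lineno>\d+)::(?P<logger>[\w.]+)::"
    r"\((?P<func>\w+)\) (?P<message>.*)$"
)


def parse_vdsm_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = VDSM_LINE.match(line)
    return match.groupdict() if match else None


def errors(lines):
    """Keep only the parsed records logged at ERROR level."""
    records = (parse_vdsm_line(line) for line in lines)
    return [rec for rec in records if rec and rec["level"] == "ERROR"]


SAMPLE = (
    "jsonrpc.Executor/2::ERROR::2016-02-11 09:50:24,459::dispatcher::76::"
    "Storage.Dispatcher::(wrapper) {'status': {'message': \"Image is not a "
    "legal chain: (u'fa4d1802-6223-4800-8339-c194076cfb4b',)\", 'code': 262}}"
)
record = parse_vdsm_line(SAMPLE)
```

Continuation lines (tracebacks, for example) do not match the pattern and come back as None, so they would need to be attached to the preceding record separately.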

The VM I am trying to clone is thin-provisioned from a template; is it
"legal" to clone such a VM, or is this a bug?

-- 
Chris Adams 


Re: [ovirt-users] Clone VM fails - could not create volume

2016-02-11 Thread Chris Adams
Once upon a time, Chris Adams  said:
> On oVirt 3.6.2, I tried to clone a VM, but got an error that the volume
> couldn't be created.  Checking the logs, I see (in vdsm.log on the SPM):
> 
> jsonrpc.Executor/2::ERROR::2016-02-11 
> 09:50:24,459::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': 
> {'message': "Image is not a legal chain: 
> (u'fa4d1802-6223-4800-8339-c194076cfb4b',)", 'code': 262}}
> 
> The VM I am trying to clone is thin-provisioned from a template; is it
> "legal" to clone such a VM, or is this a bug?

Hmm, weird; I also tried just copying the disk, and I could not.
Fiddling around, I forced SPM over to another node, and copy worked, so
I tried cloning again, and that worked too.  Guess something needed a
"reset".
-- 
Chris Adams 


Re: [ovirt-users] [ANN] oVirt 3.6.5 Final Release is now available

2016-04-26 Thread Chris Adams
Once upon a time, Sven Kieske  said:
> On 26.04.2016 16:22, Gianluca Cecchi wrote:
> > as the
> > reported mirror site missed that too (3.6.4 released on late March) and is
> > not aligned since more than one month now...
> maybe it's time to set up some automatic mirror health checking service?
> 
> how do other repositories like centos or fedora handle such issues?

Fedora uses mirrormanager:

https://fedoraproject.org/wiki/Infrastructure/MirrorManager

but somebody has to manage the server side of that (mirror admin web
access and such).
-- 
Chris Adams 


[ovirt-users] Restart VMs after failure

2016-05-06 Thread Chris Adams
One of my oVirt clusters, running 3.6.5, lost power last night (power
failure plus bad UPS batteries - batteries on order!).  When power came
back, the storage and nodes came back, and then the hosted engine
started, but nothing else happened (no other VMs started).

I expected that VMs that were running when the power failed would have
been restarted once the engine came back up.  Is there a way to make
that happen?
-- 
Chris Adams 


Re: [ovirt-users] Restart VMs after failure

2016-05-06 Thread Chris Adams
Once upon a time, Nir Soffer  said:
> On Fri, May 6, 2016 at 4:01 PM, Chris Adams  wrote:
> > One of my oVirt clusters, running 3.6.5, lost power last night (power
> > failure plus bad UPS batteries - batteries on order!).  When power came
> > back, the storage and nodes came back, and then the hosted engine
> > started, but nothing else happened (no other VMs started).
> >
> > I expected that VMs that were running when the power failed would have
> > been restarted once the engine came back up.  Is there a way to make
> > that happen?
> 
> Yes, I think you need to define them as HA vm. Adding Michal to add more
> info about this.

The domains are all marked as HA.
-- 
Chris Adams 


Re: [ovirt-users] Multipath iSCSI with several IPs

2016-08-04 Thread Chris Adams
Once upon a time, Dan Yasny  said:
> Normally you
> 1. enter the IP
> 2. click discover
> 3. login to whatever was found
> 4. enter another IP instead of the first
> 5. goto 2

How do you give the oVirt server two IPs (in the same subnet) though?

-- 
Chris Adams 


Re: [ovirt-users] Multipath iSCSI with several IPs

2016-08-04 Thread Chris Adams
Once upon a time, James Michels  said:
> Correct me if I'm wrong but I think Dan meant target's IPs. So if you have
> a SAN backend with two IP addresses, you first discover LUNs from first IP
> address, then discover LUNs from the second IP address, and so on... once
> you have them all, you just check them and click on "OK" so the same target
> is added with several IP addresses. You don't need to have one IP address
> per oVirt server.

Well, to do iSCSI multipath right, you should also have multiple
interfaces on each client server, each with its own IP.  I'm not sure
how you do that with oVirt.

-- 
Chris Adams 


Re: [ovirt-users] Multipath iSCSI with several IPs

2016-08-04 Thread Chris Adams
Once upon a time, Yaniv Kaul  said:
> BTW, having two IPs on a single subnet is not a great idea - it usually
> means you have a SPOF somewhere (the switch perhaps?).

Two NICs on the server, two NICs on the iSCSI target, each with an IP
per NIC, and connected to two switches in between (either stacked or
trunked).  No SPOF.

-- 
Chris Adams 


Re: [ovirt-users] Multipath iSCSI with several IPs

2016-08-04 Thread Chris Adams
Once upon a time, James Michels  said:
> I guess you mean the 'iSCSI multipath' sub-tab under the 'Datacenters' tab.
> There you can assign one or more networks to an iSCSI backend. In my opinion
> you cannot have more than one interface within the same network segment to
> do multipath, as you would have connectivity issues (not sure if ovirt
> restricts creating two overlapping networks)

The way I did it on a test 3.6 cluster was to create two networks in
oVirt, "storage1" and "storage2".  I assigned both networks to the
hosts, connected to different NICs, and gave each an IP (in the iSCSI
subnet).  Then I could set up the iSCSI multipath in the oVirt data
center.  This seems weird/wrong, and I'm not sure oVirt actually
configured both NICs in multipath.

-- 
Chris Adams 


Re: [ovirt-users] LVM2 Thinprovisioned

2016-08-11 Thread Chris Adams
Once upon a time, Fernando Frediani  said:
> Thanks for the answer anyway. Hopefully at least LVM2
> Thinprovisioning comes up anytime soon.

This has nothing to do with oVirt; it is something the core Linux LVM
code does not support.  Last time I looked, nobody was working on it
upstream.

You can still thin-provision VMs in oVirt; there's just no way to
release space if a VM image shrinks significantly.
-- 
Chris Adams 


Re: [ovirt-users] LVM2 Thinprovisioned

2016-08-11 Thread Chris Adams
Once upon a time, Fernando Frediani  said:
> I use LVM2 and Thinprovisioned LVs to put Filesystems and it works
> with no issues. It's just a question of handling it correctly to
> tell it how to create each storage chunk that way. The same way
> those LVs can be used to run VMs as they are in traditional LVM.
> 
> Not sure what you mean by core Linux not supporting it.

To do that with multiple access, you have to be running in clustered LVM
mode, and thin provisioning is not supported with CLVM.
-- 
Chris Adams 


Re: [ovirt-users] Migrating self hosted engine from iSCSI to NFS ?

2016-09-29 Thread Chris Adams
Once upon a time, Simone Tiraboschi  said:
> unfortunately moving an existing hosted-engine env from one storage kind to
> another (without manually touching the engine DB) is currently not
> supported. Please see:
> http://lists.ovirt.org/pipermail/users/2016-July/041526.html

I'm digging through this now, as I need to move my oVirt 3.5 setup from
one storage array to another (both iSCSI), including the hosted engine.

Reading this:

https://bugzilla.redhat.com/show_bug.cgi?id=1240466#c21

it sounds like that's not currently possible (at least with 3.5).  Is
that correct?  I was planning to follow this process:

https://www.ovirt.org/documentation/admin-guide/hosted-engine-backup-and-restore/

which says "point to the new shared storage" - will that not work?

-- 
Chris Adams 


Re: [ovirt-users] Migrating self hosted engine from iSCSI to NFS ?

2016-09-29 Thread Chris Adams
Once upon a time, Simone Tiraboschi  said:
> The issue is that the engine DB backup you are going to restore already
> contains a reference to the previous hosted-engine storage domain and to
> the previous hosted-engine VM and so on and so the auto-import procedure to
> have the engine VM looking up for its own infrastructure will not trigger.
> You have to manually remove them from the DB you restored.

Okay, that makes sense.  I see this from you:

https://gerrit.ovirt.org/#/c/64966/

Should that work okay with a 3.5 database?  I'm familiar with SQL, so if
it needs some tweaks, I can handle that (just looking really to see if
that's the right general idea).

If so, could I connect the new iSCSI storage to a host, shut down the
engine, "dd" the engine over, start up the new location in single-user
mode, and make the DB change?

Basically, just wondering if I could skip the full install and jump
right to an installed system.

Thanks for your help.
-- 
Chris Adams 


Re: [ovirt-users] Migrating self hosted engine from iSCSI to NFS ?

2016-09-30 Thread Chris Adams
Once upon a time, Simone Tiraboschi  said:
> On Thu, Sep 29, 2016 at 6:35 PM, Chris Adams  wrote:
> > If so, could I connect the new iSCSI storage to a host, shut down the
> > engine, "dd" the engine over, start up the new location in single-user
> > mode, and make the DB change?
> >
> > Basically, just wondering if I could skip the full install and jump
> > right to an installed system.
> 
> With 3.5 you probably can do just that.
> Then you have to edit /etc/ovirt-hosted-engine/hosted-engine.conf on all of
> your hosts to point to the new storage device.

Okay, thanks.

-- 
Chris Adams 


[ovirt-users] Import from OVA with encrypted root

2016-11-08 Thread Chris Adams
I'm trying to import an appliance image from a vendor.  It is based on
Debian.  For some added level of "security" I guess, the vendor disk
image has the root filesystem encrypted (and then the key is in the
initrd - I know that's no real added security, but... whatever).

Trying to import this VM into oVirt fails because it can't find/mount
the root filesystem.

Is there any way around this?

-- 
Chris Adams 


Re: [ovirt-users] Import from OVA with encrypted root

2016-11-09 Thread Chris Adams
Once upon a time, Tomáš Golembiovský  said:
> unfortunately virt-v2v cannot import VMs with encrypted root file
> system. Moreover import of Debian/Ubuntu/Mint guests is not yet
> supported by oVirt either. For that you would need development version
> of virt-v2v. There are no packages for RHEL/CentOS yet. There should be
> packages in Fedora rawhide if you feel brave enough to setup such host
> in oVirt (Note: I'm not suggesting you or anyone should do that).

So, I went the manual route.  I made a new VM of appropriate size, with
a non-thin-provisioned IDE disk, and booted it from a rescue CD.  I
extracted the vmdk from the ova file, used qemu-img to convert it to
raw, and used netcat to dump it over the network into the VM and onto
the disk.
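
The extract-and-convert part of that can be sketched as below. An OVA is a tar archive, so Python's tarfile module can pull out the disk; the conversion is left as a `qemu-img convert` command line that is built but not run here. The helper names and the fake file names in the demo are made up, and the netcat transfer into the VM is not covered.

```python
import io
import os
import shutil
import tarfile
import tempfile


def extract_vmdks(ova_path, dest_dir):
    """Copy every .vmdk member of the OVA into dest_dir and return the paths."""
    paths = []
    with tarfile.open(ova_path) as ova:
        for member in ova.getmembers():
            if not (member.isfile() and member.name.lower().endswith(".vmdk")):
                continue
            # Use only the base name so a hostile archive cannot escape dest_dir.
            target = os.path.join(dest_dir, os.path.basename(member.name))
            with ova.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            paths.append(target)
    return paths


def qemu_img_convert_cmd(vmdk_path, raw_path):
    """Build (but do not run) the qemu-img command for vmdk -> raw."""
    return ["qemu-img", "convert", "-f", "vmdk", "-O", "raw", vmdk_path, raw_path]


# Demo against a fake OVA built on the fly; the contents are placeholders.
workdir = tempfile.mkdtemp()
ova_path = os.path.join(workdir, "appliance.ova")
with tarfile.open(ova_path, "w") as ova:
    for name, data in (("appliance.ovf", b"<Envelope/>"),
                       ("appliance-disk1.vmdk", b"fake")):
        info = tarfile.TarInfo(name)
        info.size = len(data)
        ova.addfile(info, io.BytesIO(data))

extracted = extract_vmdks(ova_path, workdir)
command = qemu_img_convert_cmd(extracted[0], os.path.join(workdir, "disk.raw"))
```

The resulting list could be handed to subprocess.run() on a host that has qemu-img installed.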

That of course doesn't do any of the things that should be done to
"convert" a VM, but (at least in this case), it appears to have worked
"good enough" (the VM boots and gets on the network).

Still amused that somebody thinks distributing an image with encrypted
filesystems, and the key for that encryption in the initrd, does
anything to "secure" their image.  Sigh...
-- 
Chris Adams 


Re: [ovirt-users] [HEADS UP] CentOS 7.3 is rolling out, need qemu-kvm-ev 2.6

2016-12-12 Thread Chris Adams
Once upon a time, Sandro Bonazzola  said:
> In terms of ovirt repositories, qemu-kvm-ev 2.6 is available right now in
> ovirt-master-snapshot-static, ovirt-4.0-snapshot-static, and ovirt-4.0-pre
> (contains 4.0.6 RC4 rpms going to be announced in a few minutes.)

Will qemu-kvm-ev 2.6 be added to any of the oVirt repos for prior
versions (such as 3.5 or 3.6)?
-- 
Chris Adams 


Re: [ovirt-users] Optimizations for VoIP VM

2017-01-03 Thread Chris Adams
Once upon a time, Jim Kusznir  said:
> Are there known issues with VoIP on Ovirt-managed clusters?  (I know well
> reputed companies that sell VoIP server virtual hosting and guarantee the
> performance, so I know VoIP Virtualization is possible, just need to know
> if its recommended with Ovirt, and if so what do I need to do to give it
> the best chance of success?)

I am running Asteria (an Asterisk-based PBX system targeted at small
call-center type setups) in an oVirt VM with no problems.  We typically
have 30-50 calls at a time during the business day.

I've also set up Digium's Switchvox in an oVirt VM without issue (small
office setup, so not a lot of simultaneous calls).

-- 
Chris Adams 


Re: [ovirt-users] Optimizations for VoIP VM

2017-01-04 Thread Chris Adams
Once upon a time, Yaniv Dary  said:
> Can you please describe the application network requirements?
> Does it rely on low latency? Pass-through or SR-IOV could help with
> reducing that.

For VoIP, latency can be an issue, but the amount of latency from adding
VM networking overhead isn't a big deal (because other network latency
will have a larger impact).  10ms isn't really a problem for VoIP for
example.

The bigger network concern for VoIP is jitter; for that, the only
solution is to not over-provision hardware CPUs or total network
bandwidth.
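
To illustrate the difference, here is a small sketch of the running interarrival jitter estimate from RFC 3550 (the RTP spec), which smooths the change in transit time between consecutive packets with a gain of 1/16. The transit times are made-up numbers: a constant 10ms of extra latency leaves the jitter at zero, while delay that varies from packet to packet does not.

```python
def interarrival_jitter(transit_times_ms):
    """Running jitter estimate in the style of RFC 3550, in milliseconds."""
    jitter = 0.0
    for prev, cur in zip(transit_times_ms, transit_times_ms[1:]):
        delta = abs(cur - prev)           # change in transit time
        jitter += (delta - jitter) / 16.0  # smoothed with gain 1/16
    return jitter


# Made-up transit times for 50 packets.
steady = [20.0] * 50                          # constant 20ms path
steady_plus_vm = [t + 10.0 for t in steady]   # same path plus a fixed 10ms
bursty = [20.0 if i % 2 == 0 else 45.0 for i in range(50)]  # contended host

jitter_steady = interarrival_jitter(steady)
jitter_steady_plus_vm = interarrival_jitter(steady_plus_vm)
jitter_bursty = interarrival_jitter(bursty)
```

The fixed offset cancels out in the packet-to-packet difference, which is why added but stable VM networking overhead matters less than contention.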

-- 
Chris Adams 


[ovirt-users] Attaching ISO to hosted engine for OS upgrade

2017-02-22 Thread Chris Adams
I'm working on upgrading an oVirt 3.5 setup.  The physical hosts are
running CentOS 7, but the hosted engine is CentOS 6.  The upgrade notes
are "back up the engine, upgrade/reinstall the OS, then restore", but I
can't see how to actually install CentOS 7 on the engine.

Am I supposed to re-run "hosted-engine --deploy"?  Wouldn't that try to
re-register the physical hosts, or can I interrupt it to restore the
backup?

Or, is there a way to just attach an install ISO to the engine VM and
boot from that?
-- 
Chris Adams 


Re: [ovirt-users] Attaching ISO to hosted engine for OS upgrade

2017-02-22 Thread Chris Adams
Once upon a time, Simone Tiraboschi  said:
> Then we have a specific helper utility for 3.6/el6 -> 4.0/el7:
> https://www.ovirt.org/develop/release-management/features/hosted-engine-migration-to-4-0/

Ahh, that looks better.  I was looking at this:

https://www.ovirt.org/documentation/migration-engine-36-to-40/

which just kind of glosses over how to upgrade the OS. :)

I do usually use my custom CentOS install (rather than the appliance);
is there a way to do that?

Also, is it normally recommended to upgrade one major release at a time?
In other words, aside from the engine CentOS6->7 step, would upgrading
from 3.5 to 4.1 need to go through 3.6 and 4.0 along the way?
-- 
Chris Adams 


Re: [ovirt-users] Attaching ISO to hosted engine for OS upgrade

2017-02-23 Thread Chris Adams
Once upon a time, Simone Tiraboschi  said:
> On Wed, Feb 22, 2017 at 8:04 PM, Chris Adams  wrote:
> > Also, is it normally recommended to upgrade one major release at a time?
> 
> For the engine it's not just recommended, it's mandatory!

Ahh, I didn't realize that.  I don't think I saw that in the
documentation (but maybe I just missed it?).
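
With one major release at a time being mandatory for the engine, the path from 3.5 to 4.1 works out as in the sketch below. The release list contains only the versions mentioned in this thread and is an assumption, not an official list.

```python
# Engine releases mentioned in this thread, in order (not an official list).
RELEASES = ["3.5", "3.6", "4.0", "4.1"]


def upgrade_path(current, target):
    """Return every release the engine must pass through, in order."""
    start = RELEASES.index(current)
    end = RELEASES.index(target)
    if end < start:
        raise ValueError("target release is older than the current one")
    return RELEASES[start + 1:end + 1]


path = upgrade_path("3.5", "4.1")
```

For 3.5 to 4.1 that gives three engine upgrades: 3.6, then 4.0, then 4.1.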

Thanks.
-- 
Chris Adams 


[ovirt-users] Recognize HE iSCSI volume size change

2017-02-24 Thread Chris Adams
I'm testing upgrading an oVirt 3.5 setup, and I have run into a problem
when going from 3.5 to 3.6 on a physical machine configured for the
hosted engine.  I upgraded the engine itself okay, but when I upgraded
the first physical machine, it cannot be re-activated; it gets an error
connecting to the storage domain.

Checking the logs, it looks like it is looping trying to create a new LV
in the HE VG.  I assume this is for moving the HE config to the shared
storage?  It is failing because it is trying to create a 1G LV, but the
VG only has 512M free space.

I extended the iSCSI volume, but there doesn't appear to be any way to
get the HE nodes to recognize this; they both still see the original
size, no matter what I try.  Is there a way to get them to see the
larger PV, so the new LV(s) can be created?

-- 
Chris Adams 


Re: [ovirt-users] Recognize HE iSCSI volume size change

2017-02-24 Thread Chris Adams
I did see that page, but... I can't get there from here.  I can't get
upgraded from 3.5 until I get past this problem, and on 3.5, the hosted
engine storage domain isn't included in the normal UI at all.

I think I did get this working though; both servers had kernel messages
that they saw the LUN resize, but they didn't actually change the block
device size to reflect that.  After rebooting each server (separately),
lsblk showed the new size on both, and a manual pvresize on both shows
the increased VG size.
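
A quick way to check whether the kernel has actually picked up a new size, without eyeballing lsblk, is to read the device's `size` file in sysfs, which is reported in 512-byte units. The sketch below takes the sysfs root as a parameter so it can be tried against a fake tree; the device name and sizes in the demo are made up.

```python
import os
import tempfile

SECTOR_BYTES = 512  # sysfs "size" is always in 512-byte units


def block_device_bytes(name, sysfs_root="/sys/block"):
    """Return the size the kernel currently reports for a block device."""
    with open(os.path.join(sysfs_root, name, "size")) as fh:
        return int(fh.read().strip()) * SECTOR_BYTES


def saw_resize(name, expected_bytes, sysfs_root="/sys/block"):
    """True once the kernel reports at least the expected size."""
    return block_device_bytes(name, sysfs_root) >= expected_bytes


# Demo against a fake sysfs tree; "sdb" and the sizes are placeholders.
fake_root = tempfile.mkdtemp()
os.makedirs(os.path.join(fake_root, "sdb"))
with open(os.path.join(fake_root, "sdb", "size"), "w") as fh:
    fh.write("83886080\n")  # 40 GiB expressed in 512-byte units

size = block_device_bytes("sdb", fake_root)
```

On a real host the same check would apply to each path device and to the multipath dm device on top of them.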

On to testing 3.6 upgrade again!


Once upon a time, Adam Litke  said:
> Hi Chris.  We added this feature to newer versions of oVirt (see the
> feature page[1]).  The easiest way to work around this problem might be to
> add an additional LUN to this domain if you are able to do it.  If not, it
> looks like you would need to manually reconnect the host to the domain, to
> do a pvresize to the new size.  I am not sure if any engine DB updates will
> also be required.  Nir and Fred worked on this feature and might be able to
> assist you further.
> 
> 
> [1]
> https://www.ovirt.org/develop/release-management/features/storage/lun-resize/
> 
> On Fri, Feb 24, 2017 at 9:00 AM, Chris Adams  wrote:
> 
> > I'm testing upgrading an oVirt 3.5 setup, and I have run into a problem
> > when going from 3.5 to 3.6 on a physical machine configured for the
> > hosted engine.  I upgraded the engine itself okay, but when I upgraded
> > the first physical machine, it cannot be re-activated; it gets an error
> > connecting to the storage domain.
> >
> > Checking the logs, it looks like it is looping trying to create a new LV
> > in the HE VG.  I assume this is for moving the HE config to the shared
> > storage?  It is failing because it is trying to create a 1G LV, but the
> > VG only has 512M free space.
> >
> > I extended the iSCSI volume, but there doesn't appear to be any way to
> > get the HE nodes to recognize this; they both still see the original
> > size, no matter what I try.  Is there a way to get them to see the
> > larger PV, so the new LV(s) can be created?
> >
> > --
> > Chris Adams 
> >
> 
> 
> 
> -- 
> Adam Litke



-- 
Chris Adams 


Re: [ovirt-users] Recognize HE iSCSI volume size change

2017-02-24 Thread Chris Adams
Once upon a time, Nir Soffer  said:
> This is not enough, you need to resize the multipath mapping on all hosts,
> resize the pv using the LUN (must be done by the SPM), and invalidate
> vdsm lvm cache on all hosts, so they go to storage and see the new size of
> the pv.

I did all of this except invalidating the vdsm lvm cache - how would I do
that?

I will say just doing up to the pvresize worked in my test environment
(but I might have just been lucky).
-- 
Chris Adams 


Re: [ovirt-users] Recognize HE iSCSI volume size change

2017-02-24 Thread Chris Adams
Once upon a time, Nir Soffer  said:
> I think the complete flow on 3.5 can work like this:
> 
> 1. stop ovirt-engine, so it will not try to restart vdsm on any host
> 2. stop vdsm on all hosts
> 3. rescan scsi bus, resizing luns on all hosts
> 4. pvresize the pv from one of the hosts
> 5. start vdsm on all hosts
> 6. start ovirt-engine
> 
> This will allow resize while the storage is online and vms are running.

Thanks!
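
For anyone following along, the six steps above can be written down as an ordered plan like the sketch below. It only builds a list of (where, command) pairs and runs nothing. The specific commands are assumptions, not something Nir spelled out: service names `ovirt-engine` and `vdsmd`, `iscsiadm -m session --rescan` for the rescan, and `multipathd resize map` for the multipath mapping; the host and device names are made up.

```python
def resize_plan(hosts, pv_host, multipath_name):
    """Return the ordered (where, command) steps; nothing is executed."""
    pv_device = "/dev/mapper/" + multipath_name
    plan = [("engine", "service ovirt-engine stop")]            # step 1
    plan += [(host, "systemctl stop vdsmd") for host in hosts]  # step 2
    for host in hosts:                                          # step 3
        plan.append((host, "iscsiadm -m session --rescan"))
        plan.append((host, "multipathd resize map " + multipath_name))
    plan.append((pv_host, "pvresize " + pv_device))             # step 4
    plan += [(host, "systemctl start vdsmd") for host in hosts]  # step 5
    plan.append(("engine", "service ovirt-engine start"))       # step 6
    return plan


# Hypothetical host and multipath device names.
plan = resize_plan(["node1", "node2"], "node1", "mpatha")
commands = [command for _, command in plan]
```

Printing the plan and running each line by hand keeps a person in the loop, which seems safer than automating this on a live cluster.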

-- 
Chris Adams 


[ovirt-users] 3.5->3.6 did not import hosted engine storage domain

2017-02-24 Thread Chris Adams
So, on to my next upgrade issue (sorry for all the questions and thanks
for everybody's help)...  I upgraded my test cluster from 3.5 to 3.6
(latest version of each, all on CentOS 7 except the engine on CentOS 6).
Now I'm working on the next step, upgrading to 4.0 and migrating the HE
to the appliance.

When I went from 3.5 to 3.6, I ended up with an fhanswers.conf in the
shared storage that only contained "None"; I fixed that based on some
mailing list messages (but just mentioning it in case it could be
related).

My problem is that the hosted engine storage domain did not get imported
into the engine DB, so I can't proceed with "hosted-engine
--upgrade-appliance".  I didn't see any errors, so I'm not sure how that
happened.  I'm also not sure how to fix that.

Suggestions?
-- 
Chris Adams 


Re: [ovirt-users] 3.5->3.6 did not import hosted engine storage domain

2017-02-27 Thread Chris Adams
Once upon a time, Simone Tiraboschi  said:
> Can you please attach your engine.log ?

Sorry, I was rolling back to 3.5 snapshots to test my 3.6 procedure
(trying to make sure I didn't just screw up), made a mistake, and
started over.

Now however, I can't do anything, because jpackage.org has really
screwed up their DNS - registered to 3 nameservers, two of which only
exist as glue records (not in authoritative DNS), and all three point to
the same IP (which is not responding).
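
A rough sketch of how the "all three point to the same IP" part could be
checked, assuming the nameserver host names are already known. The names
in the usage example are placeholders, the helper names are hypothetical,
and this goes through the system resolver, so it says nothing about the
glue-versus-authoritative mismatch:

```python
import socket
from typing import Dict, Set


def resolve(name: str) -> Set[str]:
    """Resolve a host name to its set of addresses; empty set on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return {info[4][0] for info in infos}


def redundancy_report(ns_addrs: Dict[str, Set[str]]) -> Dict[str, object]:
    """Summarize whether several nameservers really sit on distinct IPs."""
    unresolved = sorted(ns for ns, addrs in ns_addrs.items() if not addrs)
    distinct = set().union(*ns_addrs.values())
    return {
        "nameservers": len(ns_addrs),
        "distinct_addresses": len(distinct),
        "unresolved": unresolved,
        # More than one nameserver but at most one address behind them.
        "single_point_of_failure": len(ns_addrs) > 1 and len(distinct) <= 1,
    }


if __name__ == "__main__":
    # Placeholder names; substitute the nameservers listed at the registry.
    names = ["ns1.example.org", "ns2.example.org", "ns3.example.org"]
    print(redundancy_report({n: resolve(n) for n in names}))
```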

-- 
Chris Adams 


[ovirt-users] Cluster compatibility version and major upgrades

2017-02-28 Thread Chris Adams
Hello again, still working on my upgrade from 3.5...

I'm trying to understand the cluster compatibility version setting and
how that applies to major upgrades.  Do I have to always raise the
compatibility version when I do a major upgrade?  In other words, when I
upgrade from 3.5 to 3.6, do I need to raise it to 3.6 before I upgrade
to 4.0 (and then again raise it to 4.0 before upgrading to 4.1)?

It looks like the 3.6->4.0 EL6->EL7 migration requires the cluster
compatibility level to be at 3.6 (if I'm reading things right).

It appears that when I upgrade to 3.6, I will have to stop all running
VMs to raise the compatibility version (and I found an open bug about
whether that's possible with the hosted engine).  It sounds like with
4.0, the VMs can be flagged for compatibility and I can reboot them
individually.  I have over 80 VMs, many behind a load balancer (for HA
and load sharing), but taking them all down will obviously still
interrupt service for a while.

Is there a safe way around that?
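
For the 4.0 case, where VMs can be rebooted individually, the ordering I
have in mind could be sketched like this. It is only an illustration: the
pool and VM names are made up, reboot_waves is a hypothetical helper, and
it assumes each load-balanced pool can tolerate one member being down at
a time:

```python
from typing import Dict, List


def reboot_waves(pools: Dict[str, List[str]]) -> List[List[str]]:
    """Group VMs into waves with at most one VM per pool in each wave."""
    waves: List[List[str]] = []
    # The largest pool determines how many waves are needed.
    depth = max((len(vms) for vms in pools.values()), default=0)
    for i in range(depth):
        # Take the i-th VM of every pool that still has one left.
        waves.append([vms[i] for vms in pools.values() if i < len(vms)])
    return waves


if __name__ == "__main__":
    pools = {"web": ["web1", "web2", "web3"], "mail": ["mail1", "mail2"]}
    for number, wave in enumerate(reboot_waves(pools), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

A pool with a single VM still sees an outage when its wave comes up; the
sketch only avoids taking down a whole pool at once.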

I saw someone mention they partitioned their servers and made a new
cluster (with the new version), and migrated VMs from cluster to
cluster.  Can I do live migrations in that case?  How do I get the
hosted engine from one cluster to another (especially with starting at
3.5)?

-- 
Chris Adams 


Re: [ovirt-users] 3.5->3.6 did not import hosted engine storage domain

2017-03-01 Thread Chris Adams
Once upon a time, Simone Tiraboschi  said:
> > I recreated a 3.5 setup and upgraded the engine to 3.6 - that should
> > have been enough to import the hosted engine storage domain, right?
> 
> Did you also raise the cluster compatibility level to 3.6 on the engine?

No (I didn't realize the import doesn't happen until that is changed).

However, now I'm back into the catch-22 of 3.6.7+hosted engine: the
cluster compatibility level can't be raised while there's a running VM,
and that includes the hosted engine.
-- 
Chris Adams 


Re: [ovirt-users] 3.5->3.6 did not import hosted engine storage domain

2017-03-01 Thread Chris Adams
Once upon a time, Simone Tiraboschi  said:
> On Wed, Mar 1, 2017 at 3:19 PM, Chris Adams  wrote:
> > However, now I'm back into the catch-22 of 3.6.7+hosted engine: the
> > cluster compatibility level can't be raised while there's a running VM,
> > and that includes the hosted engine.
> 
> Please see this one:
> https://bugzilla.redhat.com/show_bug.cgi?id=1364557
> 
> Simply define 'InClusterUpgrade' scheduling policy on the HE VM cluster

I first tried setting the policy, but got "Error while executing action:
The set cluster compatibility version does not allow mixed major host OS
versions. Can not start the cluster upgrade."; I guess this is because
my hosts are CentOS 7 and the engine is CentOS 6?

I tried changing the engine config to skip that check (comment 10,
step 3), but got:
- Can not start cluster upgrade mode, see below for details:
- VM HostedEngine with id 4a035efd-a041-4e46-84db-01cf79400913 is
  configured to be not migratable.

I did the SQL update from comment 1, and then I could set the policy.

However, I still can't change the cluster compatibility version.
-- 
Chris Adams 


Re: [ovirt-users] 3.5->3.6 did not import hosted engine storage domain

2017-03-01 Thread Chris Adams
Once upon a time, Simone Tiraboschi  said:
> On Wed, Mar 1, 2017 at 5:04 PM, Chris Adams  wrote:
> > I first tried setting the policy, but got "Error while executing action:
> > The set cluster compatibility version does not allow mixed major host OS
> > versions. Can not start the cluster upgrade."; I guess this is because
> > my hosts are CentOS 7 and the engine is CentOS 6?
> 
> This is not an issue, are you sure that all the hosts are el7 based?

Yes, there are only two hosts (dev/test setup), both freshly installed
with CentOS 7.3 plus all current updates.
-- 
Chris Adams 


Re: [ovirt-users] 3.5->3.6 did not import hosted engine storage domain

2017-03-07 Thread Chris Adams
Once upon a time, Chris Adams  said:
> However, now I'm back into the catch-22 of 3.6.7+hosted engine: the
> cluster compatibility level can't be raised while there's a running VM,
> and that includes the hosted engine.

I'm still stuck on this - anybody have any solution?  Because of this, I
can't upgrade my cluster.
-- 
Chris Adams 

