Re: [ovirt-users] ovirt and gateway behavior

2018-02-05 Thread Edward Haas
Hi Alex,

Please provide Engine logs from when this is occurring and mention the
date/time we should focus on.

Thanks,
Edy.


On Mon, Feb 5, 2018 at 2:19 PM, Alex K  wrote:

> Hi all,
>
> I have a 3-node oVirt 4.1 cluster, self-hosted on top of GlusterFS. The
> cluster is used to host several VMs.
> I have observed that when the gateway is lost (say the gateway device is
> down), the oVirt cluster goes down.
>
> This seems like rather extreme behavior, especially when one does not care
> whether the hosted VMs have Internet connectivity or not.
>
> Can this behavior be disabled?
>
> Thanx,
> Alex
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] NetworkManager with oVirt version 4.2.0

2018-02-05 Thread Edward Haas
On Sun, Feb 4, 2018 at 10:01 PM, Vincent Royer 
wrote:

> I had these types of issues as well my first time around, and after a
> failed engine install I haven't been able to get things cleaned up, so I
> will have to start over. I created a bonded interface on the host before
> the engine setup, but once I created my first VM and assigned bond0 to it,
> the engine became inaccessible the moment the VM got an IP from the
> router.
>

Please clarify what it means to "assign bond0 to it".  A vNIC can be
defined on a network (using vNIC profiles).
If your Engine is inaccessible, try to understand what changed in the
network; perhaps something collided (duplicate IPs, routes, MACs, etc.).
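
A minimal sketch of such checks from a host; the bridge name ovirtmgmt and
the address are illustrative assumptions, not taken from this thread:

  ip addr show ovirtmgmt              # addresses currently on the management bridge
  ip route                            # look for unexpected route changes
  arping -D -I ovirtmgmt 192.0.2.10   # duplicate address detection for the Engine IP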


> What is the preferred way to set up bonded interfaces?  In Cockpit or nmcli
> before hosted engine setup?  Or proceed with only one interface then add
> the other in engine?
>

All should work.


>
> Is it possible, for example, to setup bonded interfaces with a static
> management IP on vlan 50 to access the engine, and let the other VMs grab
> DHCP IPs on vlan 10?
>

Sure it is: one is the management network (VLAN 50) and the other a VM
network (VLAN 10).


>
>
> On Feb 3, 2018 11:31 PM, "Edward Haas"  wrote:
>
>
>
> On Sat, Feb 3, 2018 at 9:06 AM, maoz zadok  wrote:
>
>> Hello All,
>> I'm new to oVirt, I'm trying with no success to set up the networking on
>> an oVirt 4.2.0 node, and I think I'm missing something.
>>
>> background:
>> interfaces em1-4 are bonded to bond0,
>> a VLAN is configured on bond0.1,
>> and it is bridged to ovirtmgmt for the management interface.
>>
>> I'm not sure it's updated for version 4.2.0, but I followed this post:
>> https://www.ovirt.org/documentation/how-to/networking/bondin
>> g-vlan-bridge/
>>
>
> It looks like an old how-to; we will need to update or remove it.
>
>
>>
>> With this setting, NetworkManager keeps starting up on reboot,
>> and the interfaces are not managed by oVirt (and the nice traffic graphs
>> are not shown).
>>
>
> For the interfaces to be owned by oVirt, you will need to add the host to
> Engine.
> So I would just configure everything up to the VLAN (slaves, bond, VLAN)
> with NetworkManager prior to adding it to Engine. The bridge should be
> created when you add the host.
> (assuming the VLAN you mentioned is your management interface and its IP
> is the one used by Engine)
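>
> A minimal nmcli sketch of that pre-Engine setup (device names, VLAN id
> and address are illustrative, not taken from this thread):
>
>   # bond the slaves in LACP mode
>   nmcli con add type bond ifname bond0 con-name bond0 bond.options "mode=802.3ad"
>   nmcli con add type ethernet ifname em1 master bond0
>   nmcli con add type ethernet ifname em2 master bond0
>   # management VLAN on top of the bond, with a static IP
>   nmcli con add type vlan ifname bond0.1 dev bond0 id 1 \
>         ipv4.method manual ipv4.addresses 192.0.2.10/24 ipv4.gateway 192.0.2.1
>   nmcli con up bond0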
>
>
>>
>>
>>
>>
>> my question:
>> Does NetworkManager need to be disabled as in the above post?
>>
>
> No (for 4.1 and 4.2)
>
> Do I need to manage the networking using (nmtui) NetworkManager?
>>
>
> It's better to use Cockpit or nmcli to configure the node before you add it
> to Engine.
>
>
>>
>> Thanks!
>> Maoz
>>
>>
>>
>>
>>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Slow conversion from VMware in 4.1

2018-02-05 Thread Richard W.M. Jones
On Mon, Feb 05, 2018 at 10:57:58PM +0100, Luca 'remix_tj' Lorenzetto wrote:
> On Fri, Feb 2, 2018 at 12:52 PM, Richard W.M. Jones  wrote:
> > There is a section about this in the virt-v2v man page.  I'm on
> > a train at the moment but you should be able to find it.  Try to
> > run many conversions, at least 4 or 8 would be good places to start.
> 
> Hello Richard,
> 
> I read the man page but found nothing explicit about resource usage. Anyway,
> digging into our setup I found out that vCenter, even when under low load,
> is at 95% CPU usage.
> I think our Windows admins should take care of this.

http://libguestfs.org/virt-v2v.1.html#vmware-vcenter-resources

You should be able to run multiple conversions in parallel
to improve throughput.

The only long-term solution is to use a different method such as VMX
over SSH.  vCenter is just fundamentally bad.
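
A minimal sketch of running several conversions in parallel from the shell
(VM names, vCenter path, password file and export storage are illustrative):

  # each virt-v2v instance converts one guest; start several in the background
  for vm in vm1 vm2 vm3 vm4; do
      virt-v2v -ic vpx://vcenter.example.com/Datacenter/esxi \
               --password-file /tmp/vcenter-pass "$vm" \
               -o rhv -os nfs.example.com:/export_domain &
  done
  wait   # block until all conversions have finished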

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Slow conversion from VMware in 4.1

2018-02-05 Thread Luca 'remix_tj' Lorenzetto
On Fri, Feb 2, 2018 at 12:52 PM, Richard W.M. Jones  wrote:
> There is a section about this in the virt-v2v man page.  I'm on
> a train at the moment but you should be able to find it.  Try to
> run many conversions, at least 4 or 8 would be good places to start.

Hello Richard,

I read the man page but found nothing explicit about resource usage. Anyway,
digging into our setup I found out that vCenter, even when under low load,
is at 95% CPU usage.
I think our Windows admins should take care of this.

Luca

-- 
"E' assurdo impiegare gli uomini di intelligenza eccellente per fare
calcoli che potrebbero essere affidati a chiunque se si usassero delle
macchine"
Gottfried Wilhelm von Leibnitz, Filosofo e Matematico (1646-1716)

"Internet è la più grande biblioteca del mondo.
Ma il problema è che i libri sono tutti sparsi sul pavimento"
John Allen Paulos, Matematico (1945-vivente)

Luca 'remix_tj' Lorenzetto, http://www.remixtj.net , 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] guest ip address not shown on the engine panel - version 4.2.0

2018-02-05 Thread Dan Yasny
Do you have the guest agent installed?
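
If not, a minimal sketch for an EL7 guest (the guest agent is what reports
IP addresses back to the engine; package and service names assume a
CentOS/RHEL 7 guest):

  yum install -y ovirt-guest-agent
  systemctl enable ovirt-guest-agent
  systemctl start ovirt-guest-agent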

On Mon, Feb 5, 2018 at 4:18 PM, maoz zadok  wrote:

> Hi All,
> Is it possible that the "IP addresses" of the guest virtual machine will
> be shown? It is currently empty.
>
> [image: Inline image 1]
>
>
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] guest ip address not shown on the engine panel - version 4.2.0

2018-02-05 Thread maoz zadok
Hi All,
Is it possible that the "IP addresses" of the guest virtual machine will be
shown? It is currently empty.

[image: Inline image 1]
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] ovirt 3.6, we had the ovirt manager go down in a bad way and all VMs for one node marked Unknown and Not Responding while up

2018-02-05 Thread Christopher Cox
Answering my own post... a restart of vdsmd on the affected blade has
fixed everything.  Thanks to everyone who helped.
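
For the archives, that fix is a single command on the affected host
(assuming a systemd-based host; restarting vdsmd leaves running VMs
untouched, as this thread shows):

  systemctl restart vdsmd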



On 02/05/2018 10:02 AM, Christopher Cox wrote:
Forgive the top post.  I guess what I need to know now is whether there
is a recovery path that doesn't lead to total loss of the VMs that are
currently in the "Unknown" "Not responding" state.

We are planning a total oVirt shutdown.  I just would like to know if
we've effectively lost those VMs or not.  Again, the VMs are currently
"up".  And we use a file backup process, so in theory they can be
restored, just somewhat painfully, from scratch.

But if we shut down all the bad VMs and the blade, is there some way
oVirt can know the VMs are "ok" to start up?  Will changing their state
directly to "down" in the db stick if the blade is down?  That is, will
we get to a known state where the VMs can actually be started and
brought back into a known state?

Right now, we're feeling there's a good chance we will not be able to
recover these VMs, even though they are "up" right now.  I really need
some way to force oVirt into an integral state, even if it means we take
the whole thing down.

Possible?


On 01/25/2018 06:57 PM, Christopher Cox wrote:



On 01/25/2018 04:57 PM, Douglas Landgraf wrote:
On Thu, Jan 25, 2018 at 5:12 PM, Christopher Cox 
 wrote:

On 01/25/2018 02:25 PM, Douglas Landgraf wrote:


On Wed, Jan 24, 2018 at 10:18 AM, Christopher Cox 


wrote:


Would restarting vdsm on the node in question help fix this?  Again, all
the VMs are up on the node.  Prior attempts to fix this problem have left
the node in a state where I can't issue the "has been rebooted" command
to it; it's confused.

So... node is up.  All VMs are up.  Can't issue "has been rebooted" to
the node; all VMs show Unknown and not responding, but they are up.

Changing the status in the ovirt db to 0 works for a second and then it
goes immediately back to 8 (which is why I'm wondering if I should
restart vdsm on the node).



It's not recommended to change the db manually.



Oddly enough, we're running all of this in production.  So, watching it
all go down isn't the best option for us.

Any advice is welcome.




We would need to see the node/engine logs. Have you found any error in
the vdsm.log (from the nodes) or engine.log? Could you please share the
error?




In short, the error is that our ovirt manager lost network (our problem)
and crashed hard (hardware issue on the server).  On bring-up, we had
some network changes (that caused the lost-network problem), so our LACP
bond was down for a bit while we were trying to bring it up (noting the
ovirt manager is up while we're reestablishing the network on the switch
side).

In other words, that's the "error", so to speak, that got us to where we
are.


Full DEBUG enabled on the logs... The error messages seem obvious to me.
It starts like this (noting the ISO DOMAIN was coming off an NFS mount
off the ovirt management server... yes... we know... we do have plans to
move that).


So on the hypervisor node itself, from the vdsm.log (vdsm.log.33.xz):

(hopefully no surprise here)

Thread-2426633::WARNING::2018-01-23
13:50:56,672::fileSD::749::Storage.scanDomains::(collectMetaFiles) Could not
collect metadata file for domain path
/rhev/data-center/mnt/d0lppc129.skopos.me:_var_lib_exports_iso-20160408002844
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/fileSD.py", line 735, in collectMetaFiles
    sd.DOMAIN_META_DATA))
  File "/usr/share/vdsm/storage/outOfProcess.py", line 121, in glob
    return self._iop.glob(pattern)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 536, in glob
    return self._sendCommand("glob", {"pattern": pattern}, self.timeout)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 421, in _sendCommand
    raise Timeout(os.strerror(errno.ETIMEDOUT))
Timeout: Connection timed out
Thread-27::ERROR::2018-01-23
13:50:56,672::sdc::145::Storage.StorageDomainCache::(_findDomain) domain
e5ecae2f-5a06-4743-9a43-e74d83992c35 not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 143, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/nfsSD.py", line 122, in findDomain
    return NfsStorageDomain(NfsStorageDomain.findDomainPath(sdUUID))
  File "/usr/share/vdsm/storage/nfsSD.py", line 112, in findDomainPath
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist:
(u'e5ecae2f-5a06-4743-9a43-e74d83992c35',)
Thread-27::ERROR::2018-01-23
13:50:56,673::monitor::276::Storage.Monitor::(_monitorDomain) Error
monitoring domain e5ecae2f-5a06-4743-9a43-e74d83992c35
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 272, in _monitorDomain
    self._performDomainSelftest()
  File 

Re: [ovirt-users] oVirt DR: ansible with 4.1, only a subset of storage domain replicated

2018-02-05 Thread Maor Lipchuk
Hi Luca,

Thank you for your interest in the Disaster Recovery ansible solution; it is
great to see users getting familiar with it.
Please see my comments inline.

Regards,
Maor

On Mon, Feb 5, 2018 at 7:54 PM, Yaniv Kaul  wrote:

>
>
> On Feb 5, 2018 5:00 PM, "Luca 'remix_tj' Lorenzetto" <
> lorenzetto.l...@gmail.com> wrote:
>
> Hello,
>
> I'm starting the implementation of our disaster recovery site with RHV
> 4.1.latest for our production environment.
>
> Our production setup is very simple, with a self-hosted engine on DC
> KVMPDCA, and virtual machines in both the KVMPDCA and KVMPD DCs. Our
> whole setup has an FC storage backend, which is EMC VPLEX/VMAX in
> KVMPDCA and EMC VNX8000. Both storage arrays support replication via
> their own replication protocols (SRDF, MirrorView), so we'd like to
> delegate to them the replication of data to the remote site, which is
> located in another remote datacenter.
>
> In the KVMPD DC we have some storage domains that contain non-critical
> VMs, which we don't want to replicate to the remote site (in case of
> failure they have a low priority and will be restored from a backup).
> In our setup we won't replicate them, so they will not be available for
> attachment on the remote site. Can this be an issue? Do we need to
> replicate everything?
>
>
No, it is not required to replicate everything.
If there are no disks on those storage domains that are attached to your
critical VMs/Templates, you don't have to include them in your mapping
var file.


> What about the master domain? Do I require that the master storage domain
> stays on a replicated volume, or can it be any of the available ones?
>
>

You can choose which storage domains you want to recover.
Basically, if a storage domain is indicated as "master" in the mapping var
file, then it should be attached first to the Data Center.
If your secondary setup already contains a master storage domain which you
don't care to replicate and recover, then you can configure your mapping var
file to only attach regular storage domains: simply indicate
"dr_master_domain: False" in dr_import_storages for all the storage
domains. (You can contact me on IRC if you need some guidance with it.)
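
A minimal sketch of what such an entry could look like, written from the
shell for convenience; dr_import_storages and dr_master_domain come from
the explanation above, while the file name and the remaining keys/values
are purely illustrative:

cat >> dr_mapping_vars.yml <<'EOF'
dr_import_storages:
  - dr_domain_type: fcp          # illustrative key/value
    dr_primary_name: prod_data   # illustrative key/value
    dr_master_domain: False      # attach as a regular, non-master domain
EOF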


>
> I've seen that since 4.1 there's an API for updating OVF_STORE disks.
> Do we need to invoke it with a frequency that is compatible with the
> replication frequency on the storage side?
>
>

No, you don't have to use the OVF_STORE update for replication.
The OVF_STORE disk is updated every 60 minutes (the default
configuration value).
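
A hedged sketch of inspecting and changing that interval on the engine
machine; OvfUpdateIntervalInMinutes is the config key as far as I recall,
so verify it with "engine-config -l" first:

  engine-config -g OvfUpdateIntervalInMinutes      # show the current value
  engine-config -s OvfUpdateIntervalInMinutes=30   # e.g. update twice as often
  systemctl restart ovirt-engine                   # needed for the change to apply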


> At the moment we set the RPO to 1hr (even if the planned RPO requires
> 2hrs). Does OVF_STORE get updated with the required frequency?
>
>

The OVF_STORE disk is updated every 60 minutes, but keep in mind that the
OVF_STORE is updated internally in the engine, so it might not be synced
with the RPO which you configured.
If I understood correctly, then you are right in indicating that the data
of the storage domain will be synced after approximately 2 hours = RPO of
1hr + OVF_STORE update of 1hr.


>
> I've seen a recent presentation by Maor Lipchuk that is showing the
> "automagic" ansible role for disaster recovery:
>
> https://www.slideshare.net/maorlipchuk/ovirt-dr-site-tosite-using-ansible
>
> It's also related to some YouTube presentations demonstrating a real
> DR plan execution.
>
> But what I've seen is that Maor is explicitly talking about the 4.2
> release. Does that role work only with >4.2 releases, or can it be used
> also on earlier (4.1) versions?
>
>
> Releases before 4.2 do not store complete information on the OVF store to
> perform such comprehensive failover. I warmly suggest 4.2!
> Y.
>

Indeed.
We also introduced several functionalities, like detach of the master
storage domain and attach of a "dirty" master storage domain, which are
dependent on the failover process, so unfortunately, to support a full
recovery process, you will need an oVirt 4.2 env.


>
> I've tested a manual flow of replication + recovery through Import SD
> followed by Import VM, and it worked like a charm. Using a prebuilt
> ansible role will reduce my effort in creating new automation for
> doing this.
>
> Does anyone have experiences like mine?
>
> Thank you for the help you may provide; I'd like to contribute back to
> you with all my findings and with a usable tool (also integrated with
> storage arrays if possible).
>
>
Please feel free to share your comments and questions; I would very much
appreciate hearing about your user experience.


>
> Luca
>
> (Sorry for duplicate email, ctrl-enter happened before mail completion)
>
>
> --
> "E' assurdo impiegare gli uomini di intelligenza eccellente per fare
> calcoli che potrebbero essere affidati a chiunque se si usassero delle
> macchine"
> Gottfried Wilhelm von Leibnitz, Filosofo e Matematico (1646-1716)
>
> "Internet è la più grande biblioteca del mondo.
> Ma il problema è che i libri sono tutti sparsi sul pavimento"
> John Allen Paulos, Matematico (1945-vivente)
>
> Luca 'remix_tj' 

Re: [ovirt-users] oVirt DR: ansible with 4.1, only a subset of storage domain replicated

2018-02-05 Thread Yaniv Kaul
On Feb 5, 2018 5:00 PM, "Luca 'remix_tj' Lorenzetto" <
lorenzetto.l...@gmail.com> wrote:

Hello,

I'm starting the implementation of our disaster recovery site with RHV
4.1.latest for our production environment.

Our production setup is very simple, with a self-hosted engine on DC
KVMPDCA, and virtual machines in both the KVMPDCA and KVMPD DCs. Our
whole setup has an FC storage backend, which is EMC VPLEX/VMAX in
KVMPDCA and EMC VNX8000. Both storage arrays support replication via
their own replication protocols (SRDF, MirrorView), so we'd like to
delegate to them the replication of data to the remote site, which is
located in another remote datacenter.

In the KVMPD DC we have some storage domains that contain non-critical
VMs, which we don't want to replicate to the remote site (in case of
failure they have a low priority and will be restored from a backup).
In our setup we won't replicate them, so they will not be available for
attachment on the remote site. Can this be an issue? Do we need to
replicate everything?
What about the master domain? Do I require that the master storage
domain stays on a replicated volume, or can it be any of the available
ones?

I've seen that since 4.1 there's an API for updating OVF_STORE disks.
Do we need to invoke it with a frequency that is compatible with the
replication frequency on the storage side? At the moment we set the RPO
to 1hr (even if the planned RPO requires 2hrs). Does OVF_STORE get
updated with the required frequency?

I've seen a recent presentation by Maor Lipchuk that is showing the
"automagic" ansible role for disaster recovery:

https://www.slideshare.net/maorlipchuk/ovirt-dr-site-tosite-using-ansible

It's also related to some YouTube presentations demonstrating a real
DR plan execution.

But what I've seen is that Maor is explicitly talking about the 4.2
release. Does that role work only with >4.2 releases, or can it be used
also on earlier (4.1) versions?


Releases before 4.2 do not store complete information on the OVF store to
perform such comprehensive failover. I warmly suggest 4.2!
Y.


I've tested a manual flow of replication + recovery through Import SD
followed by Import VM, and it worked like a charm. Using a prebuilt
ansible role will reduce my effort in creating new automation for
doing this.

Does anyone have experiences like mine?

Thank you for the help you may provide; I'd like to contribute back to
you with all my findings and with a usable tool (also integrated with
storage arrays if possible).

Luca

(Sorry for duplicate email, ctrl-enter happened before mail completion)


--
"E' assurdo impiegare gli uomini di intelligenza eccellente per fare
calcoli che potrebbero essere affidati a chiunque se si usassero delle
macchine"
Gottfried Wilhelm von Leibnitz, Filosofo e Matematico (1646-1716)

"Internet è la più grande biblioteca del mondo.
Ma il problema è che i libri sono tutti sparsi sul pavimento"
John Allen Paulos, Matematico (1945-vivente)

Luca 'remix_tj' Lorenzetto, http://www.remixtj.net , <
lorenzetto.l...@gmail.com>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hosted-engine 4.2.1-pre setup on a clean node..

2018-02-05 Thread Simone Tiraboschi
On Fri, Feb 2, 2018 at 9:10 PM, Thomas Davis  wrote:

> Is this supported?
>
> I have a node with CentOS 7.4 minimal installed on it, and an interface
> set up with an IP address.
>
> I've yum installed nothing else except the ovirt-4.2.1-pre rpm, run
> screen, and then run the 'hosted-engine --deploy' command.
>

Fine, nothing else is required.


>
> It hangs on:
>
> [ INFO  ] changed: [localhost]
> [ INFO  ] TASK [Get ovirtmgmt route table id]
> [ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed": true,
> "cmd": "ip rule list | grep ovirtmgmt | sed s/[.*]\\ //g | awk '{
> print $9 }'", "delta": "0:00:00.004845", "end": "2018-02-02
> 12:03:30.794860", "rc": 0, "start": "2018-02-02 12:03:30.790015", "stderr":
> "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
> [ ERROR ] Failed to execute stage 'Closing up': Failed executing
> ansible-playbook
> [ INFO  ] Stage: Clean up
> [ INFO  ] Cleaning temporary resources
> [ INFO  ] TASK [Gathering Facts]
> [ INFO  ] ok: [localhost]
> [ INFO  ] TASK [Remove local vm dir]
> [ INFO  ] ok: [localhost]
> [ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-
> setup/answers/answers-20180202120333.conf'
> [ INFO  ] Stage: Pre-termination
> [ INFO  ] Stage: Termination
> [ ERROR ] Hosted Engine deployment failed: please check the logs for the
> issue, fix accordingly or re-deploy from scratch.
>   Log file is located at /var/log/ovirt-hosted-engine-
> setup/ovirt-hosted-engine-setup-20180202115038-r11nh1.log
>
> but the VM is up and running, just attached to the 192.168.122.0/24 subnet
>
> [root@d8-r13-c2-n1 ~]# ssh root@192.168.122.37
> root@192.168.122.37's password:
> Last login: Fri Feb  2 11:54:47 2018 from 192.168.122.1
> [root@ovirt ~]# systemctl status ovirt-engine
> ● ovirt-engine.service - oVirt Engine
>Loaded: loaded (/usr/lib/systemd/system/ovirt-engine.service; enabled;
> vendor preset: disabled)
>Active: active (running) since Fri 2018-02-02 11:54:42 PST; 11min ago
>  Main PID: 24724 (ovirt-engine.py)
>CGroup: /system.slice/ovirt-engine.service
>├─24724 /usr/bin/python /usr/share/ovirt-engine/
> services/ovirt-engine/ovirt-engine.py --redirect-output --systemd=notify
> start
>└─24856 ovirt-engine -server -XX:+TieredCompilation -Xms3971M
> -Xmx3971M -Djava.awt.headless=true -Dsun.rmi.dgc.client.gcInterval=360
> -Dsun.rmi.dgc.server.gcInterval=360 -Djsse...
>
> Feb 02 11:54:41 ovirt.crt.nersc.gov systemd[1]: Starting oVirt Engine...
> Feb 02 11:54:41 ovirt.crt.nersc.gov ovirt-engine.py[24724]: 2018-02-02
> 11:54:41,767-0800 ovirt-engine: INFO _detectJBossVersion:187 Detecting
> JBoss version. Running: /usr/lib/jvm/jre/...60', '-
> Feb 02 11:54:42 ovirt.crt.nersc.gov ovirt-engine.py[24724]: 2018-02-02
> 11:54:42,394-0800 ovirt-engine: INFO _detectJBossVersion:207 Return code:
> 0,  | stdout: '[u'WildFly Full 11.0.0tderr: '[]'
> Feb 02 11:54:42 ovirt.crt.nersc.gov systemd[1]: Started oVirt Engine.
> Feb 02 11:55:25 ovirt.crt.nersc.gov python2[25640]: ansible-stat Invoked
> with checksum_algorithm=sha1 get_checksum=True follow=False
> path=/usr/share/ovirt-engine/playbooks/roles/ovir...ributes=True
> Feb 02 11:55:29 ovirt.crt.nersc.gov python2[25698]: ansible-stat Invoked
> with checksum_algorithm=sha1 get_checksum=True follow=False
> path=/usr/share/ovirt-engine/playbooks/roles/ovir...ributes=True
> Feb 02 11:55:30 ovirt.crt.nersc.gov python2[25741]: ansible-stat Invoked
> with checksum_algorithm=sha1 get_checksum=True follow=False
> path=/usr/share/ovirt-engine/playbooks/roles/ovir...ributes=True
> Feb 02 11:55:30 ovirt.crt.nersc.gov python2[25767]: ansible-stat Invoked
> with checksum_algorithm=sha1 get_checksum=True follow=False
> path=/usr/share/ovirt-engine/playbooks/roles/ovir...ributes=True
> Feb 02 11:55:31 ovirt.crt.nersc.gov python2[25795]: ansible-stat Invoked
> with checksum_algorithm=sha1 get_checksum=True follow=False
> path=/etc/ovirt-engine-metrics/config.yml get_md5...ributes=True
>
> The 'ip rule list' never has an ovirtmgmt rule/table in it, which means
> the ansible script loops and then dies; vdsmd has never configured the
> network on the node.
>

Right.
Can you please attach engine.log and host-deploy from the engine VM?


>
> [root@d8-r13-c2-n1 ~]# systemctl status vdsmd -l
> ● vdsmd.service - Virtual Desktop Server Manager
>Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor
> preset: enabled)
>Active: active (running) since Fri 2018-02-02 11:55:11 PST; 14min ago
>  Main PID: 7654 (vdsmd)
>CGroup: /system.slice/vdsmd.service
>└─7654 /usr/bin/python2 /usr/share/vdsm/vdsmd
>
> Feb 02 11:55:11 d8-r13-c2-n1 vdsmd_init_common.sh[7551]: vdsm: Running
> dummybr
> Feb 02 11:55:11 d8-r13-c2-n1 vdsmd_init_common.sh[7551]: vdsm: Running
> tune_system
> Feb 02 11:55:11 d8-r13-c2-n1 vdsmd_init_common.sh[7551]: vdsm: Running
> test_space
> Feb 02 11:55:11 d8-r13-c2-n1 vdsmd_init_common.sh[7551]: 

Re: [ovirt-users] ovirt 3.6, we had the ovirt manager go down in a bad way and all VMs for one node marked Unknown and Not Responding while up

2018-02-05 Thread Christopher Cox
Forgive the top post.  I guess what I need to know now is whether there
is a recovery path that doesn't lead to total loss of the VMs that are
currently in the "Unknown" "Not responding" state.

We are planning a total oVirt shutdown.  I just would like to know if
we've effectively lost those VMs or not.  Again, the VMs are currently
"up".  And we use a file backup process, so in theory they can be
restored, just somewhat painfully, from scratch.

But if we shut down all the bad VMs and the blade, is there some way
oVirt can know the VMs are "ok" to start up?  Will changing their state
directly to "down" in the db stick if the blade is down?  That is, will
we get to a known state where the VMs can actually be started and
brought back into a known state?

Right now, we're feeling there's a good chance we will not be able to
recover these VMs, even though they are "up" right now.  I really need
some way to force oVirt into an integral state, even if it means we take
the whole thing down.

Possible?


On 01/25/2018 06:57 PM, Christopher Cox wrote:



On 01/25/2018 04:57 PM, Douglas Landgraf wrote:
On Thu, Jan 25, 2018 at 5:12 PM, Christopher Cox  
wrote:

On 01/25/2018 02:25 PM, Douglas Landgraf wrote:


On Wed, Jan 24, 2018 at 10:18 AM, Christopher Cox 
wrote:


Would restarting vdsm on the node in question help fix this?  Again, all
the VMs are up on the node.  Prior attempts to fix this problem have left
the node in a state where I can't issue the "has been rebooted" command
to it; it's confused.

So... node is up.  All VMs are up.  Can't issue "has been rebooted" to
the node; all VMs show Unknown and not responding, but they are up.

Changing the status in the ovirt db to 0 works for a second and then it
goes immediately back to 8 (which is why I'm wondering if I should
restart vdsm on the node).



It's not recommended to change the db manually.



Oddly enough, we're running all of this in production.  So, watching it
all go down isn't the best option for us.

Any advice is welcome.




We would need to see the node/engine logs. Have you found any error in
the vdsm.log (from the nodes) or engine.log? Could you please share the
error?




In short, the error is that our ovirt manager lost network (our problem)
and crashed hard (hardware issue on the server).  On bring-up, we had
some network changes (that caused the lost-network problem), so our LACP
bond was down for a bit while we were trying to bring it up (noting the
ovirt manager is up while we're reestablishing the network on the switch
side).

In other words, that's the "error", so to speak, that got us to where we
are.


Full DEBUG enabled on the logs... The error messages seem obvious to me.
It starts like this (noting the ISO DOMAIN was coming off an NFS mount
off the ovirt management server... yes... we know... we do have plans to
move that).


So on the hypervisor node itself, from the vdsm.log (vdsm.log.33.xz):

(hopefully no surprise here)

Thread-2426633::WARNING::2018-01-23
13:50:56,672::fileSD::749::Storage.scanDomains::(collectMetaFiles) Could not
collect metadata file for domain path
/rhev/data-center/mnt/d0lppc129.skopos.me:_var_lib_exports_iso-20160408002844
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/fileSD.py", line 735, in collectMetaFiles
    sd.DOMAIN_META_DATA))
  File "/usr/share/vdsm/storage/outOfProcess.py", line 121, in glob
    return self._iop.glob(pattern)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 536, in glob
    return self._sendCommand("glob", {"pattern": pattern}, self.timeout)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 421, in _sendCommand
    raise Timeout(os.strerror(errno.ETIMEDOUT))
Timeout: Connection timed out
Thread-27::ERROR::2018-01-23
13:50:56,672::sdc::145::Storage.StorageDomainCache::(_findDomain) domain
e5ecae2f-5a06-4743-9a43-e74d83992c35 not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 143, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/nfsSD.py", line 122, in findDomain
    return NfsStorageDomain(NfsStorageDomain.findDomainPath(sdUUID))
  File "/usr/share/vdsm/storage/nfsSD.py", line 112, in findDomainPath
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist:
(u'e5ecae2f-5a06-4743-9a43-e74d83992c35',)
Thread-27::ERROR::2018-01-23
13:50:56,673::monitor::276::Storage.Monitor::(_monitorDomain) Error
monitoring domain e5ecae2f-5a06-4743-9a43-e74d83992c35
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 272, in _monitorDomain
    self._performDomainSelftest()
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 769, in wrapper
    value = meth(self, *a, **kw)
  File "/usr/share/vdsm/storage/monitor.py", line 339, in _performDomainSelftest

Re: [ovirt-users] oVirt Upgrade 4.1 -> 4.2 fails with YUM dependency problems (CentOS)

2018-02-05 Thread Frank Thommen
Following the minor release upgrade instructions on
https://www.ovirt.org/documentation/upgrade-guide/chap-Updates_between_Minor_Releases/
solved this issue.  Now we are bumping into another issue, for which
I'll probably open another thread.


frank


On 02/02/2018 05:33 PM, Chas Hockenbarger wrote:
I haven't tried this yet, but looking at the detailed error, the 
implication is that your current install is less than 4.1.7, which is 
where the conflict is. Have you tried updating to > 4.1.7 before upgrading?





___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt DR: ansible with 4.1, only a subset of storage domain replicated

2018-02-05 Thread Luca 'remix_tj' Lorenzetto
Hello,

I'm starting the implementation of our disaster recovery site with RHV
4.1.latest for our production environment.

Our production setup is very simple, with a self-hosted engine on DC
KVMPDCA, and virtual machines in both the KVMPDCA and KVMPD DCs. Our
whole setup has an FC storage backend, which is EMC VPLEX/VMAX in
KVMPDCA and EMC VNX8000. Both storage arrays support replication via
their own replication protocols (SRDF, MirrorView), so we'd like to
delegate to them the replication of data to the remote site, which is
located in another remote datacenter.

In the KVMPD DC we have some storage domains that contain non-critical
VMs, which we don't want to replicate to the remote site (in case of
failure they have a low priority and will be restored from a backup).
In our setup we won't replicate them, so they will not be available for
attachment on the remote site. Can this be an issue? Do we need to
replicate everything?
What about the master domain? Do I require that the master storage
domain stays on a replicated volume, or can it be any of the available
ones?

I've seen that since 4.1 there's an API for updating OVF_STORE disks.
Do we need to invoke it with a frequency that is compatible with the
replication frequency on the storage side? At the moment we set the RPO
to 1hr (even if the planned RPO requires 2hrs). Does OVF_STORE get
updated with the required frequency?

I've seen a recent presentation by Maor Lipchuk that is showing the
"automagic" ansible role for disaster recovery:

https://www.slideshare.net/maorlipchuk/ovirt-dr-site-tosite-using-ansible

It's also related to some YouTube presentations demonstrating a real
DR plan execution.

But what I've seen is that Maor is explicitly talking about the 4.2
release. Does that role work only with >4.2 releases, or can it be used
also on earlier (4.1) versions?

I've tested a manual flow of replication + recovery through Import SD
followed by Import VM, and it worked like a charm. Using a prebuilt
ansible role will reduce my effort in creating new automation for
doing this.

Does anyone have experiences like mine?

Thank you for the help you may provide; I'd like to contribute back to
you with all my findings and with a usable tool (also integrated with
storage arrays if possible).

Luca

(Sorry for duplicate email, ctrl-enter happened before mail completion)


-- 
"E' assurdo impiegare gli uomini di intelligenza eccellente per fare
calcoli che potrebbero essere affidati a chiunque se si usassero delle
macchine"
Gottfried Wilhelm von Leibnitz, Filosofo e Matematico (1646-1716)

"Internet è la più grande biblioteca del mondo.
Ma il problema è che i libri sono tutti sparsi sul pavimento"
John Allen Paulos, Matematico (1945-vivente)

Luca 'remix_tj' Lorenzetto, http://www.remixtj.net , 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Failed upgrade from 4.1.9 to 4.2.x

2018-02-05 Thread Martin Perina
On Mon, Feb 5, 2018 at 3:08 PM,  wrote:

> El 2018-02-05 14:03, Simone Tiraboschi escribió:
>
>> On Mon, Feb 5, 2018 at 2:46 PM,  wrote:
>>
>> Hi,
>>>
>>> We're trying to upgrade from 4.1.9 to 4.2.x and we're bumping into
>>> an error we don't know how to solve. As per [1] we run the
>>> 'engine-setup' command and it fails with:
>>>
>>> [ INFO  ] Rolling back to the previous PostgreSQL instance
>>> (postgresql).
>>> [ ERROR ] Failed to execute stage 'Misc configuration': Command
>>> '/opt/rh/rh-postgresql95/root/usr/bin/postgresql-setup' failed to
>>> execute
>>> [ INFO  ] Yum Performing yum transaction rollback
>>> [ INFO  ] Stage: Clean up
>>>   Log file is located at
>>>
>>> /var/log/ovirt-engine/setup/ovirt-engine-setup-20180205133116-sm2xd1.log
>>
>>> [ INFO  ] Generating answer file
>>> '/var/lib/ovirt-engine/setup/answers/20180205133354-setup.conf'
>>> [ INFO  ] Stage: Pre-termination
>>> [ INFO  ] Stage: Termination
>>> [ ERROR ] Execution of setup failed
>>>
>>> As of the
>>>
>>> /var/log/ovirt-engine/setup/ovirt-engine-setup-20180205133116-sm2xd1.log
>>
>>> file I could see this:
>>>
>>>  * upgrading from 'postgresql.service' to
>>> 'rh-postgresql95-postgresql.service'
>>>  * Upgrading database.
>>> ERROR: pg_upgrade tool failed
>>> ERROR: Upgrade failed.
>>>  * See /var/lib/pgsql/upgrade_rh-postgresql95-postgresql.log for
>>> details.
>>>
>>> And this file contains this information:
>>>
>>>   Performing Consistency Checks
>>>   -----------------------------
>>>   Checking cluster versions                                   ok
>>>   Checking database user is the install user                  ok
>>>   Checking database connection settings                       ok
>>>   Checking for prepared transactions                          ok
>>>   Checking for reg* system OID user data types                ok
>>>   Checking for contrib/isn with bigint-passing mismatch       ok
>>>   Checking for invalid "line" user columns                    ok
>>>   Creating dump of global objects                             ok
>>>   Creating dump of database schemas
>>>     django
>>>     engine
>>>     ovirt_engine_history
>>>     postgres
>>>     template1
>>>                                                               ok
>>>   Checking for presence of required libraries                 fatal
>>>
>>>   Your installation references loadable libraries that are missing from the
>>>   new installation.  You can add these libraries to the new installation,
>>>   or remove the functions using them from the old installation.  A list of
>>>   problem libraries is in the file:
>>>   loadable_libraries.txt
>>>
>>>   Failure, exiting
>>>
>>> I'm attaching full logs FWIW. Also, I'd like to mention that we
>>> created two custom triggers on the engine's 'users' table, but as I
>>> understand from the error this is not the issue (We upgraded several
>>> times within the same minor and we had no issues with that).
>>>
>>> Could someone shed some light on this error and how to debug it?
>>>
>>
>> Hi,
>> can you please attach also loadable_libraries.txt ?
>>
>>
>
> Could not load library "$libdir/plpython2"
> ERROR:  could not access file "$libdir/plpython2": No such file or
> directory
>

Hmm, you probably need to install the rh-postgresql95-postgresql-plpython
package. This is not installed by default with oVirt, as we don't use it.
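
A minimal sketch of that fix on the engine machine before re-running the
upgrade (package name as suggested above; assuming the SCL PostgreSQL 9.5
stream that oVirt 4.2 uses):

  yum install -y rh-postgresql95-postgresql-plpython
  engine-setup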


>
> Well, definitely it has to do with the triggers... The trigger uses
> plpython2u to replicate some entries in a different database. Is there a
> way I can get rid of this error other than disabling plpython2 before
> upgrading and re-enabling it after the upgrade?
>
> Thanks.
>
>
>> Thanks.
>>>
>>>   [1]: https://www.ovirt.org/release/4.2.0/
>>
>>
>>



-- 
Martin Perina
Associate Manager, Software Engineering
Red Hat Czech s.r.o.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] oVirt DR: ansible with 4.1, only a subset of storage domain replicated

2018-02-05 Thread Luca 'remix_tj' Lorenzetto
Hello,

I'm starting the implementation of our disaster recovery site with RHV
4.1.latest for our production environment.

Our production setup is very simple, with a self-hosted engine on DC
KVMPDCA, and virtual machines in both the KVMPDCA and KVMPD DCs. Our
whole setup has an FC storage backend, which is EMC VPLEX/VMAX in
KVMPDCA and EMC VNX8000. Both storage arrays support replication via
their own replication protocols (SRDF, MirrorView), so we'd like to
delegate to them the replication of data to the remote site, which is
located in another remote datacenter.

In the KVMPD DC we have some storage domains that contain non-critical
VMs, which we don't want to replicate to the remote site (in case of
failure they have a low priority and will be restored from a backup).
In our setup we won't replicate them, so they will not be available for
attachment on the remote site. Can this be an issue? Do we need to
replicate everything?
What about the master domain? Do I require that the master storage
domain stays on a replicated volume, or can it be any of the available
ones?

I've seen that since 4.1 there's an API for updating OVF_STORE disks.
Do we need to invoke it with a frequency that is compatible with the
replication frequency on the storage side? At the moment we set the RPO
to 1hr (even if the planned RPO requires 2hrs). Does OVF_STORE get
updated with the required frequency?

I've seen a recent presentation by Maor Lipchuk that is showing the
automagic ansible role for disaster recovery:

-- 
"E' assurdo impiegare gli uomini di intelligenza eccellente per fare
calcoli che potrebbero essere affidati a chiunque se si usassero delle
macchine"
Gottfried Wilhelm von Leibnitz, Filosofo e Matematico (1646-1716)

"Internet è la più grande biblioteca del mondo.
Ma il problema è che i libri sono tutti sparsi sul pavimento"
John Allen Paulos, Matematico (1945-vivente)

Luca 'remix_tj' Lorenzetto, http://www.remixtj.net , 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] vdsmd fails after upgrade 4.1 -> 4.2

2018-02-05 Thread Petr Kotas
Hi Frank,

can you please send the vdsm logs? The 4.2 release added a slightly
different deployment from the engine: now ansible is also called,
although I am not sure if this is your case.

I would go for entirely removing vdsm and installing it from scratch, if
that is possible for you.
This could solve your issue.
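
Since the journal below says "libvirt is not configured for vdsm yet", a
minimal sketch worth trying before a full reinstall (module name taken
from the log; the reinstall line is an assumption about your setup):

  vdsm-tool configure --module libvirt --force   # reconfigure only libvirt
  vdsm-tool is-configured                        # verify all modules
  systemctl restart vdsmd
  # if that still fails, reinstall vdsm from scratch:
  yum reinstall -y vdsm && vdsm-tool configure --force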

Looking forward to hearing from you.

Petr

On Mon, Feb 5, 2018 at 2:49 PM, Frank Rothenstein <
f.rothenst...@bodden-kliniken.de> wrote:

> Hi,
>
> I'm currently stuck - after upgrading 4.1 to 4.2 I cannot start the
> host processes.
> systemctl start vdsmd fails with the following lines in journalctl:
>
> 
>
> Feb 05 14:40:15 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: vdsm: Running wait_for_network
> Feb 05 14:40:15 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: vdsm: Running run_init_hooks
> Feb 05 14:40:15 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: vdsm: Running check_is_configured
> Feb 05 14:40:15 glusternode1.bodden-kliniken.net
> sasldblistusers2[10440]: DIGEST-MD5 common mech free
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: Error:
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: One of the modules is not configured to
> work with VDSM.
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: To configure the module use the following:
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: 'vdsm-tool configure [--module module-
> name]'.
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: If all modules are not configured try to
> use:
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: 'vdsm-tool configure --force'
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: (The force flag will stop the module's
> service and start it
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: afterwards automatically to load the new
> configuration.)
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: abrt is already configured for vdsm
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: lvm is configured for vdsm
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: libvirt is not configured for vdsm yet
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: Current revision of multipath.conf
> detected, preserving
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: Modules libvirt are not configured
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net
> vdsmd_init_common.sh[10414]: vdsm: stopped during execute
> check_is_configured task (task returned with error code 1).
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net systemd[1]:
> vdsmd.service: control process exited, code=exited status=1
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net systemd[1]: Failed to
> start Virtual Desktop Server Manager.
> -- Subject: Unit vdsmd.service has failed
> -- Defined-By: systemd
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> --
> -- Unit vdsmd.service has failed.
> --
> -- The result is failed.
> Feb 05 14:40:16 glusternode1.bodden-kliniken.net systemd[1]: Dependency
> failed for MOM instance configured for VDSM purposes.
> -- Subject: Unit mom-vdsm.service has failed
>
> 
>
> The suggested "vdsm-tool configure --force" runs w/o errors, the
> following restart of vdsmd shows the same error.
>
> Any hints on that topic?
>
> Frank
>
>
>
> Frank Rothenstein
>
> Systemadministrator
> Fon: +49 3821 700 125 <+49%203821%20700125>
> Fax: +49 3821 700 190 <+49%203821%20700190>
> Internet: www.bodden-kliniken.de
> E-Mail: f.rothenst...@bodden-kliniken.de
>
>
> _
> BODDEN-KLINIKEN Ribnitz-Damgarten GmbH
> Sandhufe 2
> 18311 Ribnitz-Damgarten
>
> Telefon: 03821-700-0
> Telefax: 03821-700-240
>
> E-Mail: i...@bodden-kliniken.de
> Internet: http://www.bodden-kliniken.de
>
> Registered office: Ribnitz-Damgarten, District Court: Stralsund, HRB 2919,
> Tax no.: 079/133/40188
> Chair of the Supervisory Board: Carmen Schröter, Managing Director: Dr.
> Falko Milski, MBA
>
> The content of this e-mail is intended exclusively for the named
> addressee. If you are not the intended addressee of this e-mail or their
> representative, please note that any form of publication, reproduction or
> forwarding of the content of this e-mail is not permitted.
> We ask you to inform the sender immediately and to delete the e-mail.
>
>   © BODDEN-KLINIKEN Ribnitz-Damgarten GmbH 2017
> *** Virus-free thanks to Kerio Mail Server and SOPHOS Antivirus ***
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>

Re: [ovirt-users] Failed upgrade from 4.1.9 to 4.2.x

2018-02-05 Thread nicolas

El 2018-02-05 14:03, Simone Tiraboschi escribió:

On Mon, Feb 5, 2018 at 2:46 PM,  wrote:


Hi,

We're trying to upgrade from 4.1.9 to 4.2.x and we're bumping into
an error we don't know how to solve. As per [1] we run the
'engine-setup' command and it fails with:

[ INFO  ] Rolling back to the previous PostgreSQL instance
(postgresql).
[ ERROR ] Failed to execute stage 'Misc configuration': Command
'/opt/rh/rh-postgresql95/root/usr/bin/postgresql-setup' failed to
execute
[ INFO  ] Yum Performing yum transaction rollback
[ INFO  ] Stage: Clean up
          Log file is located at


/var/log/ovirt-engine/setup/ovirt-engine-setup-20180205133116-sm2xd1.log

[ INFO  ] Generating answer file
'/var/lib/ovirt-engine/setup/answers/20180205133354-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Execution of setup failed

As of the


/var/log/ovirt-engine/setup/ovirt-engine-setup-20180205133116-sm2xd1.log

file I could see this:

 * upgrading from 'postgresql.service' to
'rh-postgresql95-postgresql.service'
 * Upgrading database.
ERROR: pg_upgrade tool failed
ERROR: Upgrade failed.
 * See /var/lib/pgsql/upgrade_rh-postgresql95-postgresql.log for
details.

And this file contains this information:

  Performing Consistency Checks
  -----------------------------
  Checking cluster versions                                   ok
  Checking database user is the install user                  ok
  Checking database connection settings                       ok
  Checking for prepared transactions                          ok
  Checking for reg* system OID user data types                ok
  Checking for contrib/isn with bigint-passing mismatch       ok
  Checking for invalid "line" user columns                    ok
  Creating dump of global objects                             ok
  Creating dump of database schemas
    django
    engine
    ovirt_engine_history
    postgres
    template1
                                                              ok
  Checking for presence of required libraries                 fatal

  Your installation references loadable libraries that are missing from the
  new installation.  You can add these libraries to the new installation,
  or remove the functions using them from the old installation.  A list of
  problem libraries is in the file:
  loadable_libraries.txt

  Failure, exiting

I'm attaching full logs FWIW. Also, I'd like to mention that we
created two custom triggers on the engine's 'users' table, but as I
understand from the error this is not the issue (We upgraded several
times within the same minor and we had no issues with that).

Could someone shed some light on this error and how to debug it?


Hi,
can you please attach also loadable_libraries.txt ?
 


Could not load library "$libdir/plpython2"
ERROR:  could not access file "$libdir/plpython2": No such file or 
directory


Well, definitely it has to do with the triggers... The trigger uses 
plpython2u to replicate some entries in a different database. Is there a 
way I can get rid of this error other than disabling plpython2 before 
upgrading and re-enabling it after the upgrade?


Thanks.




Thanks.

  [1]: https://www.ovirt.org/release/4.2.0/





___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Failed upgrade from 4.1.9 to 4.2.x

2018-02-05 Thread Simone Tiraboschi
On Mon, Feb 5, 2018 at 2:46 PM,  wrote:

> Hi,
>
> We're trying to upgrade from 4.1.9 to 4.2.x and we're bumping into an
> error we don't know how to solve. As per [1] we run the 'engine-setup'
> command and it fails with:
>
> [ INFO  ] Rolling back to the previous PostgreSQL instance (postgresql).
> [ ERROR ] Failed to execute stage 'Misc configuration': Command
> '/opt/rh/rh-postgresql95/root/usr/bin/postgresql-setup' failed to execute
> [ INFO  ] Yum Performing yum transaction rollback
> [ INFO  ] Stage: Clean up
>   Log file is located at
> /var/log/ovirt-engine/setup/ovirt-engine-setup-20180205133116-sm2xd1.log
> [ INFO  ] Generating answer file
> '/var/lib/ovirt-engine/setup/answers/20180205133354-setup.conf'
> [ INFO  ] Stage: Pre-termination
> [ INFO  ] Stage: Termination
> [ ERROR ] Execution of setup failed
>
> As of the 
> /var/log/ovirt-engine/setup/ovirt-engine-setup-20180205133116-sm2xd1.log
> file I could see this:
>
>  * upgrading from 'postgresql.service' to
> 'rh-postgresql95-postgresql.service'
>  * Upgrading database.
> ERROR: pg_upgrade tool failed
> ERROR: Upgrade failed.
>  * See /var/lib/pgsql/upgrade_rh-postgresql95-postgresql.log for details.
>
> And this file contains this information:
>
>   Performing Consistency Checks
>   -----------------------------
>   Checking cluster versions   ok
>   Checking database user is the install user  ok
>   Checking database connection settings   ok
>   Checking for prepared transactions  ok
>   Checking for reg* system OID user data typesok
>   Checking for contrib/isn with bigint-passing mismatch   ok
>   Checking for invalid "line" user columnsok
>   Creating dump of global objects ok
>   Creating dump of database schemas
> django
> engine
> ovirt_engine_history
> postgres
> template1
> ok
>   Checking for presence of required libraries fatal
>
>   Your installation references loadable libraries that are missing from the
>   new installation.  You can add these libraries to the new installation,
>   or remove the functions using them from the old installation.  A list of
>   problem libraries is in the file:
>   loadable_libraries.txt
>
>   Failure, exiting
>
> I'm attaching full logs FWIW. Also, I'd like to mention that we created
> two custom triggers on the engine's 'users' table, but as I understand from
> the error this is not the issue (We upgraded several times within the same
> minor and we had no issues with that).
>
> Could someone shed some light on this error and how to debug it?
>

Hi,
can you please attach also loadable_libraries.txt ?


>
> Thanks.
>
>   [1]: https://www.ovirt.org/release/4.2.0/
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] vdsmd fails after upgrade 4.1 -> 4.2

2018-02-05 Thread Frank Rothenstein
Hi,

I'm currently stuck - after upgrading 4.1 to 4.2 I cannot start the
host processes.
systemctl start vdsmd fails with the following lines in journalctl:



Feb 05 14:40:15 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: vdsm: Running wait_for_network
Feb 05 14:40:15 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: vdsm: Running run_init_hooks
Feb 05 14:40:15 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: vdsm: Running check_is_configured
Feb 05 14:40:15 glusternode1.bodden-kliniken.net
sasldblistusers2[10440]: DIGEST-MD5 common mech free
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: Error:
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: One of the modules is not configured to
work with VDSM.
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: To configure the module use the following:
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: 'vdsm-tool configure [--module module-
name]'.
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: If all modules are not configured try to
use:
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: 'vdsm-tool configure --force'
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: (The force flag will stop the module's
service and start it
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: afterwards automatically to load the new
configuration.)
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: abrt is already configured for vdsm
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: lvm is configured for vdsm
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: libvirt is not configured for vdsm yet
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: Current revision of multipath.conf
detected, preserving
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: Modules libvirt are not configured
Feb 05 14:40:16 glusternode1.bodden-kliniken.net
vdsmd_init_common.sh[10414]: vdsm: stopped during execute
check_is_configured task (task returned with error code 1).
Feb 05 14:40:16 glusternode1.bodden-kliniken.net systemd[1]:
vdsmd.service: control process exited, code=exited status=1
Feb 05 14:40:16 glusternode1.bodden-kliniken.net systemd[1]: Failed to
start Virtual Desktop Server Manager.
-- Subject: Unit vdsmd.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit vdsmd.service has failed.
-- 
-- The result is failed.
Feb 05 14:40:16 glusternode1.bodden-kliniken.net systemd[1]: Dependency
failed for MOM instance configured for VDSM purposes.
-- Subject: Unit mom-vdsm.service has failed



The suggested "vdsm-tool configure --force" runs w/o errors, the
following restart of vdsmd shows the same error.

Any hints on that topic?

Frank



Frank Rothenstein 

Systemadministrator
Fon: +49 3821 700 125
Fax: +49 3821 700 190Internet: www.bodden-kliniken.de
E-Mail: f.rothenst...@bodden-kliniken.de


_
BODDEN-KLINIKEN Ribnitz-Damgarten GmbH
Sandhufe 2
18311 Ribnitz-Damgarten

Telefon: 03821-700-0
Telefax: 03821-700-240

E-Mail: i...@bodden-kliniken.de 
Internet: http://www.bodden-kliniken.de

Registered office: Ribnitz-Damgarten, District Court: Stralsund, HRB 2919,
Tax no.: 079/133/40188
Chair of the Supervisory Board: Carmen Schröter, Managing Director: Dr. Falko
Milski, MBA

The content of this e-mail is intended exclusively for the named addressee.
If you are not the intended addressee of this e-mail or their representative,
please note that any form of publication, reproduction or forwarding of the
content of this e-mail is not permitted.
We ask you to inform the sender immediately and to delete the e-mail.

      © BODDEN-KLINIKEN Ribnitz-Damgarten GmbH 2017
*** Virus-free thanks to Kerio Mail Server and SOPHOS Antivirus ***
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Failed upgrade from 4.1.9 to 4.2.x

2018-02-05 Thread nicolas

Hi,

We're trying to upgrade from 4.1.9 to 4.2.x and we're bumping into an 
error we don't know how to solve. As per [1] we run the 'engine-setup' 
command and it fails with:


[ INFO  ] Rolling back to the previous PostgreSQL instance (postgresql).
[ ERROR ] Failed to execute stage 'Misc configuration': Command 
'/opt/rh/rh-postgresql95/root/usr/bin/postgresql-setup' failed to 
execute

[ INFO  ] Yum Performing yum transaction rollback
[ INFO  ] Stage: Clean up
  Log file is located at 
/var/log/ovirt-engine/setup/ovirt-engine-setup-20180205133116-sm2xd1.log
[ INFO  ] Generating answer file 
'/var/lib/ovirt-engine/setup/answers/20180205133354-setup.conf'

[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Execution of setup failed

In the
/var/log/ovirt-engine/setup/ovirt-engine-setup-20180205133116-sm2xd1.log 
file I could see this:


 * upgrading from 'postgresql.service' to 
'rh-postgresql95-postgresql.service'

 * Upgrading database.
ERROR: pg_upgrade tool failed
ERROR: Upgrade failed.
 * See /var/lib/pgsql/upgrade_rh-postgresql95-postgresql.log for 
details.


And this file contains this information:

  Performing Consistency Checks
  -
  Checking cluster versions   ok
  Checking database user is the install user  ok
  Checking database connection settings   ok
  Checking for prepared transactions  ok
  Checking for reg* system OID user data typesok
  Checking for contrib/isn with bigint-passing mismatch   ok
  Checking for invalid "line" user columnsok
  Creating dump of global objects ok
  Creating dump of database schemas
django
engine
ovirt_engine_history
postgres
template1
ok
  Checking for presence of required libraries fatal

  Your installation references loadable libraries that are missing from the
  new installation.  You can add these libraries to the new installation,
  or remove the functions using them from the old installation.  A list of
  problem libraries is in the file:
  loadable_libraries.txt

  Failure, exiting

I'm attaching full logs FWIW. Also, I'd like to mention that we created 
two custom triggers on the engine's 'users' table, but as I understand 
from the error this is not the issue (we upgraded several times within 
the same minor version and had no issues with that).


Could someone shed some light on this error and how to debug it?

Thanks.

  [1]: https://www.ovirt.org/release/4.2.0/
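
A first diagnostic step, based on the pg_upgrade output above. This sketch
assumes the file sits in the directory pg_upgrade ran from (typically
/var/lib/pgsql); the contrib package name is illustrative, so check what
your loadable_libraries.txt actually lists before installing anything:

  # See which loadable libraries the new cluster is missing
  cat /var/lib/pgsql/loadable_libraries.txt
  # If it lists a contrib module (e.g. uuid-ossp), install the matching
  # contrib package for the new PostgreSQL and re-run engine-setup
  yum install rh-postgresql95-postgresql-contrib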

upgrade.tar.gz
Description: GNU Zip compressed data
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Documentation about vGPU in oVirt 4.2

2018-02-05 Thread Gianluca Cecchi
On Fri, Feb 2, 2018 at 12:13 PM, Jordan, Marcel 
wrote:

> Hi,
>
> I have some NVIDIA Tesla P100 and V100 GPUs in our oVirt 4.2 cluster and
> am searching for documentation on how to use the new vGPU feature. Is
> there any documentation out there on how to configure it correctly?
>
> --
> Marcel Jordan
>
>
>
Possibly check what will become the official documentation for RHEV 4.2,
even if it may not map one-to-one to oVirt.

Admin guide here:
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.2-beta/html/administration_guide/sect-host_tasks#Preparing_GPU_Passthrough

Planning and prerequisites guide here:
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.2-Beta/html/planning_and_prerequisites_guide/requirements#pci_device_requirements

In the oVirt 4.2 release notes I see these Bugzilla entries that can help too...
https://bugzilla.redhat.com/show_bug.cgi?id=1481007
https://bugzilla.redhat.com/show_bug.cgi?id=1482033

HIH,
Gianluca
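
For the host-side check, a quick way to see which vGPU (mdev) types a card
exposes might look like this. The sketch assumes the NVIDIA vGPU manager
driver is already installed on the host; the sysfs layout is the standard
mediated-device one:

  # Mediated device types offered by each GPU
  ls /sys/class/mdev_bus/*/mdev_supported_types
  # Human-readable name and remaining capacity per type
  cat /sys/class/mdev_bus/*/mdev_supported_types/*/name
  cat /sys/class/mdev_bus/*/mdev_supported_types/*/available_instances

The chosen type is then set on the VM (in 4.2 via the mdev_type custom
property, as described in the guides above).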
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] After realizing HA migration, the virtual machine can still get the virtual machine information by using the "vdsm-client host getVMList" instruction on the host before the migration

2018-02-05 Thread Petr Kotas
Hi,

I have experimented with the issue and figured out the reason for the
original behavior.

You are right that vm1 is not properly stopped. This is due to a known
issue in the graceful shutdown introduced in oVirt 4.2.
The VMs on a host being shut down are killed, but are not marked as
stopped. This results in the behavior you have observed.

Luckily, the patch is already done and present in the latest oVirt.
However, be aware that gracefully shutting down the host will result in a
graceful shutdown of the VMs. This results in the engine not migrating
them, since they have been terminated gracefully.

Hope this helps.

Best,
Petr
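
For anyone hitting this, a quick way to spot such stale entries is to
compare vdsm's view with libvirt's on the same host (both commands are
read-only and safe on a live host):

  # VMs as vdsm reports them
  vdsm-client Host getVMList
  # VMs as libvirt actually knows them
  virsh -r list --all

A VM present in the first list but absent from the second is the leftover
state discussed in the quoted thread below.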


On Fri, Feb 2, 2018 at 6:00 PM, Simone Tiraboschi 
wrote:

>
>
> On Thu, Feb 1, 2018 at 1:06 PM, Pym  wrote:
>
>> The environment on my side may be different from the link. My VM1 can be
>> used normally after it is started on host2, but there is still information
>> left on host1 that is not cleaned up.
>>
>> Only the interface and background can still get the information of vm1 on
>> host1, but the vm2 has been successfully started on host2, with the HA
>> function.
>>
>> I would like to ask a question: is the UUID of the virtual machine
>> stored in the database, or where is it maintained? Was it not
>> successfully deleted after the HA migration?
>>
>>
> I just encounter a similar behavior:
> after a reboot of the host 'vdsm-client Host getVMFullList' is still
> reporting an old VM that is not visible with 'virsh -r list --all'.
>
> I filed a bug to track it:
> https://bugzilla.redhat.com/show_bug.cgi?id=1541479
>
>
>
>>
>>
>>
>>
>>  2018-02-01 16:12:16,"Simone Tiraboschi"  :
>>
>>
>>
>> On Thu, Feb 1, 2018 at 2:21 AM, Pym  wrote:
>>
>>>
>>> I checked vm1: it keeps the "up" state and can be used, but after the
>>> shutdown host1 still shows a suspended vm1, which cannot be used. This
>>> is the problem now.
>>>
>>> In host1, you can get the information of vm1 using the "vdsm-client Host
>>> getVMList", but you can't get the vm1 information using the "virsh list".
>>>
>>>
>> Maybe a side effect of https://bugzilla.redhat.com
>> /show_bug.cgi?id=1505399
>>
>> Arik?
>>
>>
>>
>>>
>>>
>>>
>>>  2018-02-01 07:16:37,"Simone Tiraboschi"  :
>>>
>>>
>>>
>>> On Wed, Jan 31, 2018 at 12:46 PM, Pym  wrote:
>>>
 Hi:

 The current environment is as follows:

 ovirt-engine version 4.2.0 was compiled and installed from source. Two
 hosts, host1 and host2, were added. A virtual machine vm1 was created on
 host1, and vm2 was created on host2 with HA configured.

 Operation steps:

 Use the shutdown -r command on host1. Vm1 successfully migrated to
 host2.
 When host1 is restarted, the following situation occurs:

 The state of vm2 is shown switching back and forth between up and
 pause.

 When I perform the "vdsm-client Host getVMList" in host1, I will get
 the information of vm1. When I execute the "vdsm-client Host getVMList" in
 host2, I will get the information of vm1 and vm2.
 When I do "virsh list" in host1, there is no virtual machine
 information. When I execute "virsh list" at host2, I will get information
 of vm1 and vm2.

 How to solve this problem?

 Is it the case that vm1's information was not removed from host1 during
 the migration, or is there another reason?

>>>
>>> Did you also check if your vms always remained up?
>>> In 4.2 we have libvirt-guests service on the hosts which tries to
>>> properly shutdown the running VMs on host shutdown.
>>>
>>>

 Thank you.




 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users


>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] ovirt and gateway behavior

2018-02-05 Thread Alex K
Hi all,

I have a 3-node oVirt 4.1 cluster, self-hosted on top of GlusterFS. The
cluster is used to host several VMs.
I have observed that when the gateway is lost (say the gateway device is
down) the oVirt cluster goes down.

This seems a bit extreme, especially when one does not care whether the
hosted VMs have connectivity to the Internet or not.

Can this behavior be disabled?

Thanx,
Alex
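
For context: on a hosted-engine host the ovirt-ha-agent monitors the
gateway and lowers the host score when it is unreachable, which can
produce exactly this behavior. A sketch for checking it on one node (paths
as in a typical hosted-engine deployment; verify on your hosts):

  # The address the HA agent treats as the gateway
  grep -i '^gateway' /etc/ovirt-hosted-engine/hosted-engine.conf
  # Per-host score and gateway status as the HA agents see them
  hosted-engine --vm-status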
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Documentation about vGPU in oVirt 4.2

2018-02-05 Thread Jordan, Marcel
Hi,

I have some NVIDIA Tesla P100 and V100 GPUs in our oVirt 4.2 cluster and
am searching for documentation on how to use the new vGPU feature. Is
there any documentation out there on how to configure it correctly?

-- 
Marcel Jordan



signature.asc
Description: OpenPGP digital signature
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Power management - oVirt 4,2

2018-02-05 Thread Terry hey
Dear Martin,

Um.. since I am going to use HPE ProLiant DL360 Gen10 servers to set up
oVirt Nodes (hypervisors), and the HP Gen10 uses iLO5 rather than iLO4, I
would like to ask whether oVirt power management supports iLO5 or not.

If not, do you have any idea how to set up power management with the HP
Gen10?

Regards,
Terry
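
As a data point for Martin's advice quoted below: since iLO3 and later
speak IPMI, an iLO5 BMC can be tested from a host's shell with the generic
IPMI fence agent before wiring it into Engine. A sketch (address and
credentials are placeholders; the options mirror the ilo4 defaults Martin
mentions):

  # Query power status over IPMI-on-LAN, as the ilo4 agent would
  fence_ipmilan --ip=ilo5-host.example.com --username=admin \
      --password=secret --lanplus --power-wait=4 --action=status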

2018-02-01 16:21 GMT+08:00 Martin Perina :

>
>
> On Wed, Jan 31, 2018 at 11:19 PM, Luca 'remix_tj' Lorenzetto <
> lorenzetto.l...@gmail.com> wrote:
>
>> Hi,
>>
>> From ilo3 and up, ilo fencing agents are an alias for fence_ipmi. Try
>> using the standard ipmi.
>>
>
> It's not just an alias, ilo3/ilo4 also have different defaults than
> ipmilan. For example, if you use ilo4, then by default the following is
> used:
>
>  lanplus=1
>  power_wait=4
>
> So I recommend starting with ilo4 and adding any necessary custom options
> into the Options field. If you need some custom
> options, could you please share them with us? It would be very helpful
> for us; if needed we could introduce ilo5 with
> different defaults than ilo4.
>
> Thanks
>
> Martin
>
>
>> Luca
>>
>>
>>
>> On 31 Jan 2018, 11:14 PM, "Terry hey"  wrote:
>>
>>> Dear all,
>>> Does oVirt 4.2 power management support iLO5? I could not see an iLO5
>>> option in Power Management.
>>>
>>> Regards
>>> Terry
>>>
>>> ___
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
>
> --
> Martin Perina
> Associate Manager, Software Engineering
> Red Hat Czech s.r.o.
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Host engine on virtual machine

2018-02-05 Thread Gianluca Cecchi
On Mon, Feb 5, 2018 at 9:09 AM, maoz zadok  wrote:

> Hi All,
> What do you think about installing the host engine on a virtual machine
> hosted on the same cluster it manages?
> Does it make sense?
> I don't like the alternative of installing it on physical hardware; on
> the other hand, if the host hosting the engine falls, there will be no
> access to management.
> Is there a best practice for it? Please share your implementation with
> me/us.
>
>
>
Yes, it is supported and it is called Self Hosted Engine.
See here:
https://www.ovirt.org/documentation/self-hosted/Self-Hosted_Engine_Guide/

Gianluca
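
For completeness, a minimal sketch of the usual entry point on the first
host (package and command names as in the guide linked above; run on a
clean host with the oVirt repositories enabled):

  yum install ovirt-hosted-engine-setup
  hosted-engine --deploy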
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Host engine on virtual machine

2018-02-05 Thread maoz zadok
Hi All,
What do you think about installing the host engine on a virtual machine
hosted on the same cluster it manages?
Does it make sense?
I don't like the alternative of installing it on physical hardware; on the
other hand, if the host hosting the engine falls, there will be no access
to management.
Is there a best practice for it? Please share your implementation with
me/us.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users