Re: [ovirt-users] How to automate the ovirt host deployment?

2016-05-30 Thread Karli Sjöberg

On 30 May 2016 at 22:22, Arman Khalatyan wrote:
>
> Sorry for the previous empty email.
>
> I was testing the Foreman plugins for oVirt deployment. They are somehow broken. The 
> foreman-install --enable-ovirt-provisioning-plugin breaks the Foreman 
> installation. I need to dig deeper :(

Don't know what distribution you're using, but setting it all up manually showed me 
that Foreman needs to be at least version 11 for the plugin to work. Otherwise it 
behaved the same way for me: all fine and well until the provisioning plugin 
was installed, and then *sadface* :)

Get Foreman up to version 11 and you'll be fine, is my guess.

/K
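
For the script-only route mentioned in the quoted thread below (add, install, set up networks, activate), a minimal Python sketch might look like the following. It only builds the HTTP requests and does not send them. The engine URL, host names, credentials, cluster name and XML element names are assumptions based on the oVirt 3.x REST API and are not confirmed anywhere in this thread, so check them against your engine's API documentation first. Network setup is left out.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical engine URL; replace with your own.
API = "https://engine.example.com/ovirt-engine/api"


def host_payload(name, address, root_password, cluster="Default"):
    """Build the XML body for adding one host (element names are assumed)."""
    host = ET.Element("host")
    ET.SubElement(host, "name").text = name
    ET.SubElement(host, "address").text = address
    ET.SubElement(host, "root_password").text = root_password
    cluster_el = ET.SubElement(host, "cluster")
    ET.SubElement(cluster_el, "name").text = cluster
    return ET.tostring(host)


def api_post(path, body, user, password):
    """Prepare (but do not send) an authenticated POST request."""
    req = urllib.request.Request(API + path, data=body, method="POST")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    req.add_header("Content-Type", "application/xml")
    return req


def add_host_requests(count, user, password, root_password):
    """One 'add host' request per node; names node01..nodeNN are made up."""
    reqs = []
    for i in range(1, count + 1):
        name = f"node{i:02d}"
        body = host_payload(name, f"{name}.example.com", root_password)
        reqs.append(api_post("/hosts", body, user, password))
    return reqs


def activate_request(host_id, user, password):
    """Request that would activate a host once its installation has finished."""
    return api_post(f"/hosts/{host_id}/activate", b"<action/>", user, password)
```

Each prepared request could then be sent with urllib.request.urlopen, using an SSL context that trusts the engine CA. Under the same assumptions, adding a host is what starts the installation, so a real script would poll the host status before activating it.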
>
> On 28.05.2016 at 4:07 PM, "Yaniv Kaul" wrote:
> >
> > On Sat, May 28, 2016 at 12:50 PM, Arman Khalatyan wrote:
> >>
> >> Thank you for the hint. I will try next week.
> >> Foreman looks quite complex:)
> >
> > I think this is an excellent suggestion - Foreman, while it may take a while
> > to set up, will also be extremely useful to provision and manage not only
> > hosts, but VMs later on!
> >
> >> I would prefer a simple Python script with 4 lines: add, install, set up
> >> networks and activate.
> >
> > You can look at ovirt-system-tests, the testing suite for oVirt, for Python
> > code for the above.
> > Y.
> >
> >> On 27.05.2016 at 6:51 PM, "Karli Sjöberg" wrote:
> >>>
> >>> On 27 May 2016 at 18:41, Arman Khalatyan wrote:
> >>> >
> >>> > Hi, I am looking for some method to automate host deployments in a
> >>> > cluster environment.
> >>> > Assuming we have 20 nodes with CentOS 7 and eth0/eth1 configured, is it
> >>> > possible to automate installation with ovirt-sdk?
> >>> > Are there some examples?
> >>>
> >>> You could do that, or look into full life cycle management with The
> >>> Foreman.
> >>>
> >>> /K
> >>>
> >>> >
> >>> > Thanks,
> >>> > Arman.
> >>
> >>
> >> ___
> >> Users mailing list
> >> Users@ovirt.org
> >> http://lists.ovirt.org/mailman/listinfo/users
> >>
> >
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Can't perform search after setting up an Active Directory

2016-05-30 Thread Ondra Machacek



On 05/30/2016 06:17 PM, Alexis HAUSER wrote:
>> Default password is 'changeit' (without quotes).
>> Hmm, can you please try to use the .jks file generated by the aaa-ldap-setup
>> tool? Just to be sure.
>
> I still have the same error with the default jks
>
>> Anyway, the strange thing is that the aaa-ldap-setup tool passes, but the
>> extension doesn't work later.
>> My guess is that it could be an unsupported TLS version.
>> Can you please try running:
>>  LDAPTLS_CACERT=/somewhere/myca.pem ldapsearch -Z -H
>> ldap://myserver.com -x -D 'CN=Something,DC=myserver,DC=come' -w
>> 'mypaswd' -b 'CN=users,DC=something,DC=com'
>> and
>>  LDAPTLS_PROTOCOL_MIN=3.2 LDAPTLS_CACERT=/somewhere/myca.pem ldapsearch -Z -H
>> ldap://myserver.com -x -D 'CN=Something,DC=myserver,DC=come' -w
>> 'mypaswd' -b 'CN=users,DC=something,DC=com'
>>
>> Do both commands succeed?
>
> Yes, they both succeed.
>
>> If the latter one doesn't work, then your AD probably doesn't accept TLSv1.
>> You can change it with this configuration option:
>>  pool.default.ssl.startTLSProtocol=TLSv1
>> to the secure:
>>  pool.default.ssl.startTLSProtocol=TLSv1.2
>> or:
>>  pool.default.ssl.startTLSProtocol=SSLv3
>> But you should use TLSv1.2.
>> If none of this helps, then I would try to enable an insecure connection:
>>  pool.default.ssl.insecure = true
>
> I still get the same SSL error with all these options (even insecure)
>
>> If it works, then the problem is most probably with the certificate.
>> If it doesn't work, then the problem is most probably with the startTLS
>> configuration on the AD side.
>
> So, do you think it's startTLS on the AD side?



Oh, I see it, we were blind all the time. The problem is in AD2 and AD3; 
AD1 and AD4 are fine.
So yes, the problem is on the AD side, but only for AD2 and AD3. That's why it 
worked for aaa-ldap-setup :)

So actually this command shouldn't work for you:

 LDAPTLS_CACERT=/somewhere/myca.pem ldapsearch -Z -H 
ldap://AD2.mydomain.com -x -D 'CN=Something,DC=myserver,DC=come' -w 
'mypaswd' -b 'CN=users,DC=something,DC=com'


but this should:

 LDAPTLS_CACERT=/somewhere/myca.pem ldapsearch -Z -H 
ldap://AD4.mydomain.com -x -D 'CN=Something,DC=myserver,DC=come' -w 
'mypaswd' -b 'CN=users,DC=something,DC=com'
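
To run the same check against every domain controller in one go, a small Python wrapper around ldapsearch could look like the sketch below. The host list, bind DN and base DN are placeholders following the pattern used in this thread. One deliberate difference from the commands above: the sketch uses -ZZ, which makes ldapsearch fail if StartTLS cannot be negotiated, whereas a single -Z only attempts StartTLS and may carry on without it.

```python
import os
import subprocess


def starttls_cmd(host, bind_dn, password, base_dn):
    """ldapsearch command line that requires StartTLS to succeed (-ZZ)."""
    return [
        "ldapsearch", "-ZZ", "-H", f"ldap://{host}",
        "-x", "-D", bind_dn, "-w", password, "-b", base_dn,
    ]


def check_controllers(hosts, cacert, bind_dn, password, base_dn,
                      run=subprocess.run):
    """Return {host: True/False} depending on whether ldapsearch exited 0."""
    env = dict(os.environ, LDAPTLS_CACERT=cacert)
    results = {}
    for host in hosts:
        proc = run(
            starttls_cmd(host, bind_dn, password, base_dn),
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        results[host] = proc.returncode == 0
    return results


if __name__ == "__main__":
    # Placeholder values; substitute your own controllers and credentials.
    controllers = [f"AD{i}.mydomain.com" for i in range(1, 5)]
    print(check_controllers(
        controllers, "/somewhere/myca.pem",
        "CN=Something,DC=myserver,DC=com", "mypaswd",
        "CN=users,DC=something,DC=com",
    ))
```

A controller that reports False here is one the Java extension is likely to have trouble with as well, although the two TLS stacks are not identical.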



Re: [ovirt-users] How to automate the ovirt host deployment?

2016-05-30 Thread Arman Khalatyan
Sorry for the previous empty email.

I was testing the Foreman plugins for oVirt deployment. They are somehow broken.
The foreman-install --enable-ovirt-provisioning-plugin breaks the Foreman
installation. I need to dig deeper :(

On 28.05.2016 at 4:07 PM, "Yaniv Kaul" wrote:
>
> On Sat, May 28, 2016 at 12:50 PM, Arman Khalatyan wrote:
>>
>> Thank you for the hint. I will try next week.
>> Foreman looks quite complex:)
>
> I think this is an excellent suggestion - Foreman, while it may take a while
> to set up, will also be extremely useful to provision and manage not only
> hosts, but VMs later on!
>
>> I would prefer a simple Python script with 4 lines: add, install, set up
>> networks and activate.
>
> You can look at ovirt-system-tests, the testing suite for oVirt, for
> Python code for the above.
> Y.
>
>> On 27.05.2016 at 6:51 PM, "Karli Sjöberg" wrote:
>>>
>>> On 27 May 2016 at 18:41, Arman Khalatyan wrote:
>>> >
>>> > Hi, I am looking for some method to automate host deployments in a
>>> > cluster environment.
>>> > Assuming we have 20 nodes with CentOS 7 and eth0/eth1 configured, is it
>>> > possible to automate installation with ovirt-sdk?
>>> > Are there some examples?
>>>
>>> You could do that, or look into full life cycle management with The
>>> Foreman.
>>>
>>> /K
>>>
>>> >
>>> > Thanks,
>>> > Arman.
>>
>>
>> ___
>> Users mailing list
>> Users@ovirt.org 
>> http://lists.ovirt.org/mailman/listinfo/users

>>
>


Re: [ovirt-users] What recovers a VM from pause?

2016-05-30 Thread Nicolas Ecarnot

On 30/05/2016 21:09, Nir Soffer wrote... SOME VERY VALUABLE ANSWERS!

Thank you very much Nir, as your answers will give me food for thought 
for the weeks to come.


It's late here, I'll begin checking all this tomorrow, but just a note :


> This enforces failing of io request on devices that by default will queue such
> requests for long or unlimited time. Queuing requests is very bad for vdsm, and
> cause various commands to block for minutes during storage outage, failing various
> flows in vdsm and the ui.
> See https://bugzilla.redhat.com/880738


Though we have an active Red Hat customer subscription and I logged in, 
I still cannot access the BZ above.

I'm sure you can help :)

--
Nicolas ECARNOT


Re: [ovirt-users] What recovers a VM from pause?

2016-05-30 Thread Nir Soffer
On Mon, May 30, 2016 at 4:07 PM, Nicolas Ecarnot  wrote:
> Hello,
>
> We're planning a move from our old building towards a new one a few meters
> away.
>
>
>
> Similarly to Martijn
> (https://www.mail-archive.com/users@ovirt.org/msg33182.html), I have
> maintenance planned on our storage side.
>
> Say an oVirt DC is using a SAN's LUN via iSCSI (Equallogic).
> This SAN allows me to setup block replication between two SANs, seen by
> oVirt as one (Dell is naming it SyncRep).
> Then switch all the iSCSI accesses to the replicated LUN.
>
> When doing this, the iSCSI stack of each oVirt host notices the
> disconnection, tries to reconnect, and succeeds.
> Amongst our hosts, this happens between 4 and 15 seconds.
>
> When this happens fast enough, oVirt engine and the VMs don't even notice,
> and they keep running happily.
>
> When this takes more than 4 seconds, there are 2 cases :
>
> 1 - The hosts and/or oVirt and/or the SPM (I actually don't know) notices
> that there is a storage failure, and pauses the VMs.
> When the iSCSI stack reconnects, the VMs are automatically recovered from
> pause, and this all takes less than 30 seconds. That is very acceptable for
> us, as this action is extremely rare.
>
> 2 - Same storage failure, VMs paused, and some VMs stay in pause mode
> forever.
> Manual "run" action is mandatory.
> When done, everything recovers correctly.
> This is also quite acceptable, but here come my questions :
>
> My questions : (!)
> - *WHAT* process or piece of code or what oVirt parts is responsible for
> deciding when to UN-pause a VM, and at what conditions?

VMs get paused by qemu when it gets ENOSPC or some other IO error.
This probably happens when a VM is writing to storage and all paths to storage
are faulty. With the current configuration, the SCSI layer will fail after 5 seconds,
and if no path is available, the write will fail.

If the vdsm storage monitoring system detects the issue, the storage domain
becomes invalid. When the storage domain becomes valid again, we
try to resume all VMs paused because of IO errors.

Storage monitoring is done every 10 seconds in normal conditions, but in the
current release there can be delays of up to a couple of minutes in extreme
conditions, for example with 50 storage domains and a lot of IO. So basically,
the storage domain monitor may miss an error on storage. The domain then never
becomes invalid, so it never becomes valid again, and the VM will have to be
resumed manually.
See https://bugzilla.redhat.com/1081962

In oVirt 4.0 monitoring should be improved and will always monitor storage every
10 seconds, but even this cannot guarantee that we will detect all storage errors,
for example if the storage outage is shorter than 10 seconds. But I guess the chance
that a storage outage is shorter than 10 seconds yet long enough to cause a VM
to pause is very low.
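
The timing argument above can be illustrated with a toy model in Python. This is a simplification and not vdsm code: it assumes an outage is noticed only if a monitoring check happens to run while the storage is down, and the times used are made up.

```python
def outage_detected(outage_start, outage_end, check_times):
    """True if any monitoring check falls inside the outage window."""
    return any(outage_start <= t < outage_end for t in check_times)


# Regular checks every 10 seconds.
regular = range(0, 120, 10)
print(outage_detected(12, 27, regular))   # the check at t=20 sees the outage
print(outage_detected(12, 18, regular))   # a 6-second outage between checks is missed

# Checks delayed under load: nothing runs between t=10 and t=70.
delayed = [0, 10, 70]
print(outage_detected(12, 60, delayed))   # a 48-second outage is missed
```

In the missed cases the storage domain would never be marked invalid, which matches the situation where paused VMs have to be resumed by hand.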

> That would help me to understand why some cases are working even more
> smoothly than others.
> - Are there related timeouts I could play with in engine-config options?

Nothing on the engine side...

> - [a bit off-topic] Is it safe to increase some iSCSI timeouts of
> buffer-sizes in the hope this kind of disconnection would get un-noticed?

But you may modify multipath configuration on the host.

We now use this multipath configuration (/etc/multipath.conf):

# VDSM REVISION 1.3

defaults {
    polling_interval            5
    no_path_retry               fail
    user_friendly_names         no
    flush_on_last_del           yes
    fast_io_fail_tmo            5
    dev_loss_tmo                30
    max_fds                     4096
    deferred_remove             yes
}

devices {
    device {
        all_devs                yes
        no_path_retry           fail
    }
}

This forces IO requests to fail on devices that by default would queue such
requests for a long or unlimited time. Queuing requests is very bad for vdsm and
causes various commands to block for minutes during a storage outage, failing
various flows in vdsm and the UI.
See https://bugzilla.redhat.com/880738

However, in your case, using queuing may be the best way to do the switch
from one storage to another in the smoothest way.

You may try this setting:

devices {
    device {
        all_devs                yes
        no_path_retry           30
    }
}

This will queue IO requests for 30 seconds before failing.
Normally this would be a bad idea with vdsm, since during a storage outage
vdsm may block for 30 seconds when no path is available, and it is not designed
for this behavior, but blocking for a short time now and then should be OK.
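
A caveat that may be worth checking: according to the multipath.conf man page, a numeric no_path_retry counts path-checker retries rather than seconds, so the effective queueing time would also depend on polling_interval. The helper below is only that arithmetic, written under this assumption, and should be verified against the multipath-tools version on the hosts.

```python
def queue_seconds(no_path_retry, polling_interval):
    """Approximate time multipathd keeps queueing IO once all paths are down.

    Assumes one retry per polling interval; this is an assumption, not
    something confirmed in this thread.
    """
    return no_path_retry * polling_interval


# With polling_interval 5 from the defaults section above:
print(queue_seconds(30, 5))  # 150
print(queue_seconds(6, 5))   # 30
```

If that reading is right, a value of 6 rather than 30 would be the one that gives roughly 30 seconds of queueing with the defaults shown.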

I think that modifying the configuration and reloading the multipathd service should
be enough to use the new settings, but I'm not sure whether this changes existing
sessions or open devices.

Adding Ben to add more info about this.

Nir

Re: [ovirt-users] How to automate the ovirt host deployment?

2016-05-30 Thread Arman Khalatyan


Re: [ovirt-users] Ovirt 3.6.6 on Centos 7.2 not using native gluster (gfapi)

2016-05-30 Thread Ralf Schenk
Hello,

thanks for the hint, but wasn't it already there? Many documents and
screenshots show the radio button to enable Gluster on the cluster tab
of the Engine web interface.

Bye


On 30.05.2016 at 20:28, Yaniv Kaul wrote:
> In the short term roadmap (covered
> by https://bugzilla.redhat.com/show_bug.cgi?id=1022961 ).
> Y.
>
> On Mon, May 30, 2016 at 3:30 PM, Ralf Schenk wrote:
>
> Hello,
>
> I set up 8 Hosts and self-hosted-engine running HA on 3 of them
> from gluster replica 3 Volume. HA is working, I can set one host
> of the 3 configured for hosted-engine to maintenance and engine
> migrates to other host. I did the hosted-engine --deploy with type
> gluster and my gluster hosted storage is accessed as
> glusterfs.mydomain.de:/engine
>
> I set up another gluster volume (distributed replicated 4x2=8) as
> Data storage for my virtual machines which is accessible as
> glusterfs.mydomain.de:/gv0.  ISO and Export Volume are defined
> from NFS Server.
>
> When I set up a VM on the gluster storage I expected it to run
> with native gluster support. However if I dumpxml the libvirt
> machine definition I've something like that in it's config:
>
> [...]
>
> 
>error_policy='stop' io='threads'/>
>
> file='/rhev/data-center/0001-0001-0001-0001-00b9/5d99af76-33b5-47d8-99da-1f32413c7bb0/images/011ab08e-71af-4d5b-a6a8-9b843a10329e/3f71d6c7-9b6d-4872-abc6-01a2b3329656'/>
>   
>   
>   011ab08e-71af-4d5b-a6a8-9b843a10329e
>   
>   
>function='0x0'/>
> 
>
> I expected to have something like this:
> 
>  error_policy='stop' io='threads'/>
>
> name='gv0/5d99af76-33b5-47d8-99da-1f32413c7bb0/images/011ab08e-71af-4d5b-a6a8-9b843a10329e/3f71d6c7-9b6d-4872-abc6-01a2b3329656'>
>   
>   
>   [...]
>
> All hosts have vdsm-gluster gluster installed:
> [root@microcloud21 libvirt]# yum list installed | grep vdsm-*
> vdsm.noarch                   4.17.28-0.el7.centos   @ovirt-3.6
> vdsm-cli.noarch               4.17.28-0.el7.centos   @ovirt-3.6
> vdsm-gluster.noarch           4.17.28-0.el7.centos   @ovirt-3.6
> vdsm-hook-hugepages.noarch    4.17.28-0.el7.centos   @ovirt-3.6
> vdsm-hook-vmfex-dev.noarch    4.17.28-0.el7.centos   @ovirt-3.6
> vdsm-infra.noarch             4.17.28-0.el7.centos   @ovirt-3.6
> vdsm-jsonrpc.noarch           4.17.28-0.el7.centos   @ovirt-3.6
> vdsm-python.noarch            4.17.28-0.el7.centos   @ovirt-3.6
> vdsm-xmlrpc.noarch            4.17.28-0.el7.centos   @ovirt-3.6
> vdsm-yajsonrpc.noarch         4.17.28-0.el7.centos   @ovirt-3.6
>
> How do I get my most wanted feature native gluster support running ?
>
> -- 
>
>
> *Ralf Schenk*
> fon +49 (0) 24 05 / 40 83 70
> 
> fax +49 (0) 24 05 / 40 83 759
> 
> mail *r...@databay.de* 
>   
> *Databay AG*
> Jens-Otto-Krag-Straße 11
> D-52146 Würselen
> *www.databay.de* 
>
> Sitz/Amtsgericht Aachen • HRB:8437 • USt-IdNr.: DE 210844202
> Vorstand: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari,
> Dipl.-Kfm. Philipp Hermanns
> Aufsichtsratsvorsitzender: Klaus Scholzen (RA)
>
> 
>
> ___
> Users mailing list
> Users@ovirt.org 
> http://lists.ovirt.org/mailman/listinfo/users
>
>
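
To tell at a glance whether a running VM uses native Gluster access, the output of `virsh dumpxml` can be inspected with a few lines of Python. The sketch below relies on the libvirt convention that a gfapi disk appears as type='network' with a source protocol of 'gluster', while a FUSE-mounted image appears as type='file'. The sample XML is a trimmed, made-up illustration and not the definition from the quoted message.

```python
import xml.etree.ElementTree as ET


def disk_access_modes(domain_xml):
    """Classify every <disk> of a libvirt domain definition."""
    modes = []
    for disk in ET.fromstring(domain_xml).iter("disk"):
        source = disk.find("source")
        protocol = source.get("protocol") if source is not None else None
        if disk.get("type") == "network" and protocol == "gluster":
            modes.append("gfapi")
        elif disk.get("type") == "file":
            modes.append("file")  # e.g. an image under a FUSE mount
        else:
            modes.append(disk.get("type") or "unknown")
    return modes


# Made-up sample definitions for illustration only.
FUSE_SAMPLE = """
<domain><devices>
  <disk type='file' device='disk'>
    <source file='/rhev/data-center/example/images/disk0'/>
  </disk>
</devices></domain>
"""

GFAPI_SAMPLE = """
<domain><devices>
  <disk type='network' device='disk'>
    <source protocol='gluster' name='gv0/example/images/disk0'/>
  </disk>
</devices></domain>
"""

print(disk_access_modes(FUSE_SAMPLE))   # ['file']
print(disk_access_modes(GFAPI_SAMPLE))  # ['gfapi']
```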

-- 


*Ralf Schenk*
fon +49 (0) 24 05 / 40 83 70
fax +49 (0) 24 05 / 40 83 759
mail *r...@databay.de* 

*Databay AG*
Jens-Otto-Krag-Straße 11
D-52146 Würselen
*www.databay.de* 

Sitz/Amtsgericht Aachen • HRB:8437 • USt-IdNr.: DE 210844202
Vorstand: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.-Kfm.
Philipp Hermanns
Aufsichtsratsvorsitzender: Klaus Scholzen (RA)




Re: [ovirt-users] Problem using xenial cloud images with ovirt

2016-05-30 Thread Claude Durocher

I will reply to myself here.

Ubuntu 16.04 uses different naming rules for its network cards (no more 
ethXX). In the cloud image, there is a NIC named ens3 which defaults to DHCP 
(this is coded in the file /etc/cloud/cloud.cfg.d/50-cloud-init.cfg).

The long delay in booting was in part due to the fact that I had no DHCP server and 
I was trying to push a fixed IP address to eth0. So I ended up adding a DHCP 
server and using ens3 instead of eth0. Hope this helps someone.
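
For scripts that run inside the guest and previously hardcoded eth0, one way to cope with both naming schemes is to pick the interface at run time. This is a minimal sketch; the preference order is an assumption, and /sys/class/net is the standard Linux location for interface names.

```python
import os


def pick_nic(names, preferred=("ens3", "eth0")):
    """Return a preferred NIC name if present, else the first non-loopback one."""
    candidates = [n for n in sorted(names) if n != "lo"]
    for name in preferred:
        if name in candidates:
            return name
    return candidates[0] if candidates else None


def system_nics(sysfs="/sys/class/net"):
    """List interface names known to the running kernel."""
    return os.listdir(sysfs) if os.path.isdir(sysfs) else []


if __name__ == "__main__":
    print(pick_nic(system_nics()))
```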

On Friday, May 27, 2016 12:13 EDT, "Claude Durocher" wrote:
> Hi.
> I'm trying to run an Ubuntu cloud image on oVirt 3.6. It works fine with the 
> 14.04 image or the CentOS 7 image, but I have trouble with the 16.04 image: I have 
> no network and cloud-init doesn't receive any data from oVirt. Also, the VM takes 
> about 5 minutes to boot (then I can't log in, as cloud-init doesn't seem to work).
> I've validated the 16.04 image: I can boot it fine on my workstation running 
> kvm.

 


Re: [ovirt-users] Can't perform search after setting up an Active Directory

2016-05-30 Thread Alexis HAUSER
>Default password is 'changeit' (without quotes).
>Hmm, can you please try use the .jks file generated by aaa-ldap-setup 
>tool? Just to be sure.


I still have the same error with the default jks


>Anyway, the strange thing is that aaa-ldap-setup tool passes, but 
>extension don't work later.
>My guess is that it could be unsupported TLS version.
>Can you please try running:
>  LDAPTLS_CACERT=/somewhere/myca.pem ldapsearch -Z -H 
>ldap://myserver.com -x -D 'CN=Something,DC=myserver,DC=come' -w 
>'mypaswd' -b 'CN=users,DC=something,DC=com'
>and
>   LDAPTLS_PROTOCOL_MIN=3.2 LDAPTLS_CACERT=/somewhere/myca.pem ldapsearch -Z -H 
>ldap://myserver.com -x -D 'CN=Something,DC=myserver,DC=come' -w 
>'mypaswd' -b 'CN=users,DC=something,DC=com'

>Does both commands succed?


Yes, they both succeed.


>If the later one don't work then probably your AD don't accept TLSv1.
>You can change it byt this configuration options:
> pool.default.ssl.startTLSProtocol=TLSv1
>to secure:
> pool.default.ssl.startTLSProtocol=TLSv1.2
>or:
>  pool.default.ssl.startTLSProtocol=SSLv3
>But, you should use TLSv1.2.
>If none of this is true, then I would try to enable insecure connection:
>  pool.default.ssl.insecure = true


I still get the same SSL error with all these options (even insecure)


>If it will work, then the problem is most probably with certificate.
>If it won't work, then the problem is most probably with startTLS 
>configuration on AD side.



So, do you think it's startTLS on AD side ?


Re: [ovirt-users] Using ceph volumes with ovirt

2016-05-30 Thread Nir Soffer
On Mon, May 30, 2016 at 1:44 PM, Alessandro De Salvo
 wrote:
> Hi,
> just to answer myself, I found these instructions that solved my problem:
>
> http://7xqb88.com1.z0.glb.clouddn.com/Features_Cinder%20Integration.pdf
>
> Basically, I was missing the step to add the ceph key to the Authentication
> Keys tab of the cinder External Provider.
> It's all working now.

Cool.

You can also look in the wiki:
http://www.ovirt.org/develop/release-management/features/storage/cinder-integration/
(Looks dead now, but usually it is up).

Please share your experience with this feature, we'd love to get feedback.

Cheers,
Nir

> Cheers,
>
> Alessandro
>
> On 30/05/16 10:55, Alessandro De Salvo wrote:
>
> Hi,
> I'm happily using our research cluster in Italy via gluster, and now I'm
> trying to hotplug a ceph disk on a VM of my cluster, without success.
> The ceph cluster is managed via openstack cinder and I can create correctly
> the disk via ovirt (3.6.6.2-1 on CentOS 7.2).
> The problem comes when trying to hotplug, or start a machine with the given
> disk attached.
> In the vdsm log of the host where the VM is running or starting I see the
> following error:
>
>
> jsonrpc.Executor/5::INFO::2016-05-30
> 10:35:29,197::vm::2729::virt.vm::(hotplugDisk)
> vmId=`c189472e-25d2-4df1-b089-590009856dd3`::Hotplug disk xml:  address="" device="disk" snapshot="no" type="network">
>  protocol="rbd">
> 
> 
> 
> 
> 
> 
> 
>  type="raw"/>
> 
>
> jsonrpc.Executor/5::ERROR::2016-05-30
> 10:35:29,198::vm::2737::virt.vm::(hotplugDisk)
> vmId=`c189472e-25d2-4df1-b089-590009856dd3`::Hotplug failed
> Traceback (most recent call last):
>   File "/usr/share/vdsm/virt/vm.py", line 2735, in hotplugDisk
> self._dom.attachDevice(driveXml)
>   File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
> ret = attr(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line
> 124, in wrapper
> ret = f(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1313, in
> wrapper
> return func(inst, *args, **kwargs)
>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 530, in
> attachDevice
> if ret == -1: raise libvirtError ('virDomainAttachDevice() failed',
> dom=self)
> libvirtError: XML error: invalid auth secret uuid
>
>
>
> In fact the uuid of the secret used by ovirt to hotplug seems to be the ceph
> secret (masked here as ), while libvirt expects the
> uuid of the libvirt secret, by looking at the instructions
> http://docs.ceph.com/docs/jewel/rbd/libvirt/.
> Anyone got it working?
> Thanks,
>
> Alessandro
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
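
The "invalid auth secret uuid" error quoted above suggests that libvirt was handed something that is not a UUID. A quick sanity check in Python can tell the two kinds of value apart before a disk is attached. This is only a format check under that assumption; the key string below is a made-up placeholder and not a real Ceph key.

```python
import uuid


def looks_like_secret_uuid(value):
    """True if value parses as a UUID, as libvirt expects for <secret uuid=...>."""
    try:
        uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    return True


generated = str(uuid.uuid4())
placeholder_key = "AQAfakekeyfakekeyfakekeyfakekeyfakekey=="  # not a real key

print(looks_like_secret_uuid(generated))        # True
print(looks_like_secret_uuid(placeholder_key))  # False
```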


Re: [ovirt-users] Can't perform search after setting up an Active Directory

2016-05-30 Thread Ondra Machacek

On 05/30/2016 03:11 PM, Alexis HAUSER wrote:
>> This is output of the installation script
>> 'ovirt-engine-extension-aaa-ldap-setup', which is written in Python, but
>> the aaa-ldap extension is written in Java. So the strange thing is that you
>> can connect via startTLS in the Python script, but later you can't connect
>> with the aaa-ldap Java extension.
>> Can you please also share the output of this command:
>>  $ ovirt-engine-extensions-tool --log-level=FINEST --log-file=login.log
>> aaa login-user --profile=AD2 --user-name=mysearchuser
>> --password=pass:password
>> Hopefully it will tell us more. Thanks.
>
> Yes, here it is:
>
> https://bpaste.net/show/4530b8075e1d
>
> I don't see much more than these SSL errors. What about you?
>
> By the way, I've never found out what password should be used for the
> .jks files automatically generated by
> ovirt-engine-extension-aaa-ldap-setup.
> That's why I use a .jks file generated with the keytool command. Anyway, I don't
> think there could be any problem with that, as I can use this cert for
> ldapsearch. I was just wondering what the default password of that
> automatically generated file could be...



Default password is 'changeit' (without quotes).

Hmm, can you please try to use the .jks file generated by the aaa-ldap-setup 
tool? Just to be sure.


Anyway, the strange thing is that the aaa-ldap-setup tool passes, but the 
extension doesn't work later.

My guess is that it could be an unsupported TLS version.

Can you please try running:

 LDAPTLS_CACERT=/somewhere/myca.pem ldapsearch -Z -H \
   ldap://myserver.com -x -D 'CN=Something,DC=myserver,DC=come' -w \
   'mypaswd' -b 'CN=users,DC=something,DC=com'


and

 LDAPTLS_PROTOCOL_MIN=3.2 LDAPTLS_CACERT=/somewhere/myca.pem ldapsearch -Z -H \
   ldap://myserver.com -x -D 'CN=Something,DC=myserver,DC=come' -w \
   'mypaswd' -b 'CN=users,DC=something,DC=com'


Do both commands succeed?

If the latter one doesn't work, then your AD probably doesn't accept TLSv1.
You can change it with this configuration option:

 pool.default.ssl.startTLSProtocol=TLSv1

to the secure:

 pool.default.ssl.startTLSProtocol=TLSv1.2

or:

 pool.default.ssl.startTLSProtocol=SSLv3

But you should use TLSv1.2.

If none of this helps, then I would try to enable an insecure connection:

 pool.default.ssl.insecure = true

If it works, then the problem is most probably with the certificate.
If it doesn't work, then the problem is most probably with the startTLS 
configuration on the AD side.
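
The message above mixes two notations for protocol versions: the Java-style names used by startTLSProtocol and the numeric value used by LDAPTLS_PROTOCOL_MIN. Assuming OpenLDAP's convention from the ldap.conf man page (TLS_PROTOCOL_MIN, where 3.(x+1) stands for TLS 1.x), a small lookup table in Python makes the correspondence explicit. Under that assumption, 3.2 in the command above would mean TLS 1.1 or higher, not TLS 1.2.

```python
# Mapping from Java-style protocol names to OpenLDAP TLS_PROTOCOL_MIN values.
# Assumes the OpenLDAP convention that 3.(x+1) means TLS 1.x.
PROTOCOL_MIN = {
    "SSLv3": "3.0",
    "TLSv1": "3.1",
    "TLSv1.1": "3.2",
    "TLSv1.2": "3.3",
}


def ldaptls_protocol_min(protocol):
    """Return the LDAPTLS_PROTOCOL_MIN value matching a protocol name."""
    return PROTOCOL_MIN[protocol]


def protocol_for_min(value):
    """Reverse lookup: which protocol name a numeric minimum stands for."""
    for name, minimum in PROTOCOL_MIN.items():
        if minimum == value:
            return name
    raise KeyError(value)


print(ldaptls_protocol_min("TLSv1.2"))  # 3.3
print(protocol_for_min("3.2"))          # TLSv1.1
```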



Re: [ovirt-users] What recovers a VM from pause?

2016-05-30 Thread InterNetX - Juergen Gotteswinter
On 5/30/2016 at 3:59 PM, Nicolas Ecarnot wrote:
> On 30/05/2016 15:30, InterNetX - Juergen Gotteswinter wrote:
>> Hi,
>>
>> you are aware of the fact that eql sync replication is just about
>> replication, no single piece of high availability? i am not even sure if
>> it does ip failover itself. so better think about minutes of
>> interruptions than seconds.
> 
> Hi Juergen,
> 
> I'm absolutely aware that there is no HA discussed here, at least in my
> mind.
> It does ip fail-over, but I'm not even blindly trusting it enough,
> that's why I'm doing numerous tests and measures.
> I'm gladly surprised by how the iSCSI stack is reacting, and its log
> files are readable enough for me to decide.
> 
> Actually, I was more worrying about the iSCSI reconnection storm, but
> googling about it does not seem to get any warnings.

This works pretty well with the EQL boxes, unless you use the EQL
without the HIT Kit. With the HIT Kit installed on each client, I don't think
this will cause problems.


> 
>> anyway, don't count on oVirt's pause/unpause. there's a real chance that it
>> will go horribly wrong. a scheduled maintenance window where everything gets
>> shut down would be best practice
> 
> Indeed, this would be the best choice, if I had it.
> 



Re: [ovirt-users] What recovers a VM from pause?

2016-05-30 Thread Nicolas Ecarnot

On 30/05/2016 15:30, InterNetX - Juergen Gotteswinter wrote:
> Hi,
>
> you are aware of the fact that eql sync replication is just about
> replication, no single piece of high availability? i am not even sure if
> it does ip failover itself. so better think about minutes of
> interruptions than seconds.


Hi Juergen,

I'm absolutely aware that there is no HA discussed here, at least in my 
mind.
It does IP failover, but I don't trust it blindly, 
which is why I'm running numerous tests and measurements.
I'm pleasantly surprised by how the iSCSI stack reacts, and its log 
files are readable enough for me to decide.


Actually, I was more worried about an iSCSI reconnection storm, but 
googling it does not turn up any warnings.



> anyway, don't count on oVirt's pause/unpause. there's a real chance that it
> will go horribly wrong. a scheduled maintenance window where everything gets
> shut down would be best practice

Indeed, this would be the best choice, if I had it.

--
Nicolas ECARNOT


Re: [ovirt-users] failing update ovirt-engine on centos 7

2016-05-30 Thread Yedidyah Bar David
On Mon, May 30, 2016 at 4:49 PM, Michal Skrivanek
 wrote:
>
>> On 30 May 2016, at 15:35, Pavel Gashev  wrote:
>>
>> In my case oVirt is running in an OpenVZ container. Since selinux doesn't 
>> support namespaces, it's disabled.
>>
>> I don't want to fuel the holy war stopdisablingselinux.com vs 
>> selinuxsucks.com. Just please allow us to choose. Thanks.
>
> yep, I guess it’s fair in experimental cases like yours. And you can skip 
> over the ovirt-vmconsole deployment in engine-setup completely, so even when 
> the bug is still here it shouldn’t affect you at all.
> It’s not about a choice, it’s about supportability and reasonable 
> verification.

I'll just note that generally speaking, we do fix such bugs, see e.g. [1].
So please open one and eventually it will be handled. Thanks.

That said, we do work hard to make everything work with selinux enabled.
If something in ovirt fails for you when it's enabled, and works if you
disable selinux, that's a much higher priority bug.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=980042

>
> Thanks,
> michal
>
>>
>> On 30/05/16 16:01, "users-boun...@ovirt.org on behalf of Michal Skrivanek" 
>>  wrote:
>>
>>>
>>>> On 30 May 2016, at 14:57, Fabrice Bacchella
>>>> wrote:
>>>>
>>>>>
>>>>> Running with selinux disabled is not recommended nor supported.
>>>>> It should be easy to skip over that problem, but in general this is not
>>>>> something you should hit in normal environment
>>>>
>>>> That's very theorical recommandation. selinux is very very often disabled,
>>>> because nobody really understand it.
>>>
>>> It is not theoretical, it’s mandatory. there is an assumption it is 
>>> enabled, after bare OS installation it is enabled, so when you disable it 
>>> it is an explicit decision done by the admin for some reason. What did you 
>>> find not working? Did you really encounter anything not being solved by 
>>> setting Permissive mode instead disabling completely?
>>>
>>> Thanks,
>>> michal
>>>
>>>
>>> ___
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users



-- 
Didi


Re: [ovirt-users] failing update ovirt-engine on centos 7

2016-05-30 Thread Michal Skrivanek

> On 30 May 2016, at 15:35, Pavel Gashev  wrote:
> 
> In my case oVirt is running in an OpenVZ container. Since selinux doesn't 
> support namespaces, it's disabled.
> 
> I don't want to fuel the holy war stopdisablingselinux.com vs 
> selinuxsucks.com. Just please allow us to choose. Thanks.

yep, I guess it’s fair in experimental cases like yours. And you can skip over 
the ovirt-vmconsole deployment in engine-setup completely, so even when the bug 
is still here it shouldn’t affect you at all.
It’s not about a choice, it’s about supportability and reasonable verification.

Thanks,
michal

> 
> On 30/05/16 16:01, "users-boun...@ovirt.org on behalf of Michal Skrivanek" 
>  wrote:
> 
>> 
>>> On 30 May 2016, at 14:57, Fabrice Bacchella  
>>> wrote:
>>> 
 
>>>> Running with selinux disabled is not recommended nor supported.
>>>> It should be easy to skip over that problem, but in general this is not 
>>>> something you should hit in normal environment
>>> 
>>> That's very theorical recommandation. selinux is very very often disabled, 
>>> because nobody really understand it.
>> 
>> It is not theoretical, it’s mandatory. there is an assumption it is enabled, 
>> after bare OS installation it is enabled, so when you disable it it is an 
>> explicit decision done by the admin for some reason. What did you find not 
>> working? Did you really encounter anything not being solved by setting 
>> Permissive mode instead disabling completely?
>> 
>> Thanks,
>> michal
>> 
>> 
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
> 



Re: [ovirt-users] failing update ovirt-engine on centos 7

2016-05-30 Thread Michal Skrivanek

> On 30 May 2016, at 15:22, Fabrice Bacchella  
> wrote:
> 
>> 
>> On 30 May 2016 at 15:01, Michal Skrivanek wrote:
>> 
>> 
>>> On 30 May 2016, at 14:57, Fabrice Bacchella  
>>> wrote:
>>> 
 
>>>> Running with selinux disabled is not recommended nor supported.
>>>> It should be easy to skip over that problem, but in general this is not 
>>>> something you should hit in normal environment
>>> 
>>> That's very theorical recommandation. selinux is very very often disabled, 
>>> because nobody really understand it.
>> 
>> It is not theoretical, it’s mandatory. there is an assumption it is enabled, 
>> after bare OS installation it is enabled, so when you disable it it is an 
>> explicit decision done by the admin for some reason. What did you find not 
>> working? Did you really encounter anything not being solved by setting 
>> Permissive mode instead disabling completely?
>> 
> 
> What's the purpose of permissive ? if everything is allowed, what selinux is 
> good for ? Instead of having something that run doing nothing, I shutdown it, 
> and selinux is part of that generic policy.

There is a difference between “no support for selinux” and “allowing 
everything”. Functionally they differ: for example, labelling is not done 
when selinux is disabled. That is why, typically, when you disable selinux and 
then install or change something, those files do not get their context set up 
properly, and when you enable selinux again things break completely (this bug 
is a different case).
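
The distinction between permissive and disabled can be checked from the configured mode. The following is a minimal sketch, assuming the standard /etc/selinux/config layout with a SELINUX= line; the helper names are hypothetical and not part of oVirt.

```python
def selinux_mode(config_text):
    """Return the SELINUX= value from /etc/selinux/config-style text, or None."""
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        # Match the SELINUX key only, not SELINUXTYPE.
        if sep and key.strip() == "SELINUX":
            return value.strip().lower()
    return None


def labels_maintained(mode):
    """Per the explanation above: only 'disabled' stops file labelling."""
    return mode in ("enforcing", "permissive")


if __name__ == "__main__":
    sample = "# sample config\nSELINUX=permissive\nSELINUXTYPE=targeted\n"
    mode = selinux_mode(sample)
    print(mode, labels_maintained(mode))
```

On a real host the text would come from reading /etc/selinux/config; the sample string here only stands in for that file.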

> 
> What is a bad practice is switching selinux on and off. So my installation 
> setup is done with selinux down and stay so for the whole server life of the 
> server.
> 
> I never met a product that requisite selinux.

I’m not going to start a flamewar on selinux, there are plenty of those out 
there:) But oVirt is built with security in mind on a RHEL-based distro, so it 
uses SELinux.
All I can say is that disabling SELinux is discouraged for security as well as 
functionality reasons.

> 
> And more, I just have a look at your administration guide 
> (http://www.ovirt.org/documentation/admin-guide/administration-guide/) and 
> quickstart guide 
> (http://www.ovirt.org/documentation/quickstart/quickstart-guide/). selinux is 
> never declared as mandatory. There is just a few tips about the problem that 
> one can have with selinux. 

yes, most things tend to work…until they don’t. You’ve just encountered the 
situation when it doesn’t work. It shall be fixed, but it is not at the moment.

Thanks,
michal



Re: [ovirt-users] failing update ovirt-engine on centos 7

2016-05-30 Thread Pavel Gashev
In my case oVirt is running in an OpenVZ container. Since selinux doesn't 
support namespaces, it's disabled.

I don't want to fuel the holy war stopdisablingselinux.com vs selinuxsucks.com. 
Just please allow us to choose. Thanks.

On 30/05/16 16:01, "users-boun...@ovirt.org on behalf of Michal Skrivanek" 
 wrote:

>
>> On 30 May 2016, at 14:57, Fabrice Bacchella  
>> wrote:
>> 
>>> 
>>> Running with selinux disabled is not recommended nor supported.
>>> It should be easy to skip over that problem, but in general this is not 
>>> something you should hit in normal environment
>> 
>> That's very theorical recommandation. selinux is very very often disabled, 
>> because nobody really understand it.
>
>It is not theoretical, it’s mandatory. there is an assumption it is enabled, 
>after bare OS installation it is enabled, so when you disable it it is an 
>explicit decision done by the admin for some reason. What did you find not 
>working? Did you really encounter anything not being solved by setting 
>Permissive mode instead disabling completely?
>
>Thanks,
>michal
>
>
>___
>Users mailing list
>Users@ovirt.org
>http://lists.ovirt.org/mailman/listinfo/users



Re: [ovirt-users] What recovers a VM from pause?

2016-05-30 Thread InterNetX - Juergen Gotteswinter
Hi,

You are aware that Equallogic sync replication is only about replication,
with no high availability at all? I am not even sure whether it does IP
failover itself, so better plan for minutes of interruption rather than
seconds.

Anyway, don't count on oVirt's pause/unpause. There's a real chance that it
will go horribly wrong. A scheduled maintenance window where everything gets
shut down would be best practice.

Juergen

Am 5/30/2016 um 3:07 PM schrieb Nicolas Ecarnot:
> Hello,
> 
> We're planning a move from our old building towards a new one a few
> meters away.
> 
> 
> 
> In a similar way of Martijn
> (https://www.mail-archive.com/users@ovirt.org/msg33182.html), I have
> maintenance planed on our storage side.
> 
> Say an oVirt DC is using a SAN's LUN via iSCSI (Equallogic).
> This SAN allows me to setup block replication between two SANs, seen by
> oVirt as one (Dell is naming it SyncRep).
> Then switch all the iSCSI accesses to the replicated LUN.
> 
> When doing this, the iSCSI stack of each oVirt host notices the
> de-connection, tries to reconnect, and succeeds.
> Amongst our hosts, this happens between 4 and 15 seconds.
> 
> When this happens fast enough, oVirt engine and the VMs don't even
> notice, and they keep running happily.
> 
> When this takes more than 4 seconds, there are 2 cases :
> 
> 1 - The hosts and/or oVirt and/or the SPM (I actually don't know)
> notices that there is a storage failure, and pauses the VMs.
> When the iSCSI stack reconnects, the VMs are automatically recovered
> from pause, and this all takes less than 30 seconds. That is very
> acceptable for us, as this action is extremely rare.
> 
> 2 - Same storage failure, VMs paused, and some VMs stay in pause mode
> forever.
> Manual "run" action is mandatory.
> When done, everything recovers correctly.
> This is also quite acceptable, but here come my questions :
> 
> My questions : (!)
> - *WHAT* process or piece of code or what oVirt parts is responsible for
> deciding when to UN-pause a VM, and at what conditions?
> That would help me to understand why some cases are working even more
> smoothly than others.
> - Are there related timeouts I could play with in engine-config options?
> - [a bit off-topic] Is it safe to increase some iSCSI timeouts of
> buffer-sizes in the hope this kind of disconnection would get un-noticed?
> 


Re: [ovirt-users] failing update ovirt-engine on centos 7

2016-05-30 Thread Fabrice Bacchella

> Le 30 mai 2016 à 15:01, Michal Skrivanek  a 
> écrit :
> 
> 
>> On 30 May 2016, at 14:57, Fabrice Bacchella  
>> wrote:
>> 
>>> 
>>> Running with selinux disabled is not recommended nor supported.
>>> It should be easy to skip over that problem, but in general this is not 
>>> something you should hit in normal environment
>> 
>> That's very theorical recommandation. selinux is very very often disabled, 
>> because nobody really understand it.
> 
> It is not theoretical, it’s mandatory. there is an assumption it is enabled, 
> after bare OS installation it is enabled, so when you disable it it is an 
> explicit decision done by the admin for some reason. What did you find not 
> working? Did you really encounter anything not being solved by setting 
> Permissive mode instead disabling completely?
> 

What's the purpose of permissive? If everything is allowed, what is selinux 
good for? Rather than have something running that does nothing, I shut it 
down, and selinux falls under that general policy.

What is bad practice is switching selinux on and off. So my installation 
is done with selinux off, and it stays off for the whole life of the 
server.

I have never met a product that requires selinux.

Moreover, I just had a look at your administration guide 
(http://www.ovirt.org/documentation/admin-guide/administration-guide/) and 
quickstart guide 
(http://www.ovirt.org/documentation/quickstart/quickstart-guide/). selinux is 
never declared as mandatory. There are just a few tips about the problems that 
one can have with selinux. 


Re: [ovirt-users] Can't perform search after setting up an Active Directory

2016-05-30 Thread Alexis HAUSER
>This is output of installation script 
>'ovirt-engine-extension-aaa-ldap-setup', which is written in python, but 
>aaa-ldap extension in Java. So the strange thing is that you can connect 
>via
>startTLS in python script, but later you can't connect with aaa-ldap 
>Java extension.
>Can you please also share output of this command:
>  $ ovirt-engine-extensions-tool --log-level=FINEST --log-file=login.log 
>aaa login-user --profile=AD2 --user-name=mysearchuser 
>--password=pass:password
>Hopefully it tell more. Thanks.


Yes, Here it is :

https://bpaste.net/show/4530b8075e1d

I don't see much more than these SSL errors. What about you ?


By the way, I've never found out what password should be used for the 
.jks files automatically generated by 
ovirt-engine-extension-aaa-ldap-setup.
That's why I use a .jks file I generated myself (with the keytool command). 
Anyway, I don't think there could be any problem with that, as I can use this 
cert for ldapsearch. I was just wondering what the default password of that 
automatically generated file could...


Re: [ovirt-users] failing update ovirt-engine on centos 7

2016-05-30 Thread Michal Skrivanek

> On 30 May 2016, at 14:57, Fabrice Bacchella  
> wrote:
> 
>> 
>> Running with selinux disabled is not recommended nor supported.
>> It should be easy to skip over that problem, but in general this is not 
>> something you should hit in normal environment
> 
> That's very theorical recommandation. selinux is very very often disabled, 
> because nobody really understand it.

It is not theoretical, it’s mandatory. There is an assumption that it is 
enabled; after a bare OS installation it is enabled, so disabling it is an 
explicit decision made by the admin for some reason. What did you find not 
working? Did you really encounter anything that was not solved by setting 
Permissive mode instead of disabling it completely?

Thanks,
michal




Re: [ovirt-users] failing update ovirt-engine on centos 7

2016-05-30 Thread Fabrice Bacchella
> 
> Running with selinux disabled is not recommended nor supported.
> It should be easy to skip over that problem, but in general this is not 
> something you should hit in normal environment

That's a very theoretical recommendation. selinux is very, very often disabled, 
because nobody really understands it.


Re: [ovirt-users] failing update ovirt-engine on centos 7

2016-05-30 Thread Michal Skrivanek

> On 26 May 2016, at 18:17, Sandro Bonazzola  wrote:
> 
> Il 26/Mag/2016 12:50, "Yedidyah Bar David"  ha scritto:
>> 
>> On Thu, May 26, 2016 at 1:21 PM, Pavel Gashev  wrote:
>>> I had an issue with updating to 3.6.6. There were errors during
> engine-setup:
>>> 
>>> [ ERROR ] Yum Non-fatal POSTUN scriptlet failure in rpm package
> ovirt-vmconsole-1.0.0-1.el7.centos.noarch
>>> 
>>> [ ERROR ] Yum Transaction close failed: Traceback (most recent call
> last):   File "/usr/lib/python2.7/site-packages/otopi/miniyum.py", line
> 778, in endTransaction self.processTransaction()   File
> "/usr/lib/python2.7/site-packages/otopi/miniyum.py", line 1064, in
> processTransaction _('One or more elements within Yum transaction
> failed') RuntimeError: One or more elements within Yum transaction failed
>>> 
>>> ovirt-vmconsole has the following uninstall script:
>>> postuninstall scriptlet (using /bin/sh):
>>> if [ "$1" -ge "1" ]; then
>>>semodule -i
> "/usr/share/selinux/packages/ovirt-vmconsole/ovirt_vmconsole.pp"
>>> fi
>>> 
>>> In other words you can't update if you have SELINUX disabled.
>>> 
>>> The workaround is the following:
>>> ln -fs /bin/true /usr/sbin/semodule
>> 
>> Thanks for the report. Adding Francesco.
>> 
> 
> Please open a bz on ovirt-vmconsole.

Running with selinux disabled is neither recommended nor supported.
It should be easy to skip over that problem, but in general this is not 
something you should hit in a normal environment.
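
For readers wondering why the quoted scriptlet only bites on updates: RPM passes the number of package instances remaining after the operation as $1 to the postuninstall scriptlet, which is 1 during an upgrade and 0 during a plain erase. A minimal sketch of that condition follows; the function names are hypothetical, and the claim that semodule fails with SELinux disabled is taken from the report above rather than verified here.

```python
def postun_calls_semodule(remaining_instances):
    """Mirror the scriptlet test: if [ "$1" -ge "1" ]; then semodule -i ...; fi"""
    return remaining_instances >= 1


def update_expected_to_fail(remaining_instances, selinux_disabled):
    """Assumption from the report: 'semodule -i' fails when SELinux is disabled."""
    return postun_calls_semodule(remaining_instances) and selinux_disabled


if __name__ == "__main__":
    # Upgrade: one instance (the new package) remains, so semodule is invoked.
    print(update_expected_to_fail(1, selinux_disabled=True))
    # Erase: nothing remains, so the scriptlet skips semodule entirely.
    print(update_expected_to_fail(0, selinux_disabled=True))
```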

Thanks,
michal

> 
>>> 
>>> 
>>> On 26/05/16 08:43, "users-boun...@ovirt.org on behalf of Yedidyah Bar
> David"  wrote:
>>> 
 On Wed, May 25, 2016 at 9:11 PM, Fabrice Bacchella
  wrote:
> 
> Le 25 mai 2016 à 17:25, Kapetanakis Giannis 
> a
> écrit :
> 
> On 25/05/16 17:59, Fabrice Bacchella wrote:
> 
> I have an dedicated machin to run ovirt-engine (not hosted). It's an
> up to
> date centos 7.2.1511
> 
> I installed ovirt 3.6.6 a few weeks ago (May 10 17:56:44 tells me
> yum.log)
> 
> Now, I'm trying a full yum update and getting :
> # yum update
> 
> 
> Error: Package: ovirt-engine-tools-3.6.5.3-1.el7.centos.noarch
> (@ovirt-3.6)
>   Requires: ovirt-engine-tools-backup = 3.6.5.3-1.el7.centos
>   Removing:
> ovirt-engine-tools-backup-3.6.5.3-1.el7.centos.noarch
> (@ovirt-3.6)
>   ovirt-engine-tools-backup = 3.6.5.3-1.el7.centos
>   Updated By:
> ovirt-engine-tools-backup-3.6.6.2-1.el7.centos.noarch
> (ovirt-3.6)
>   ovirt-engine-tools-backup = 3.6.6.2-1.el7.centos
> 
> 
> 
> Follow 3.6.6 release notes to update:
> https://www.ovirt.org/release/3.6.6/
> 
> 
> yum install
> http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
> yum update ovirt\*setup\*
> and then run
> engine-setup to update the rest of the packages.
> 
> 
> I have seen this doc.
> 
> It updates a few components and what about the others ? The readme
> talk
> about running engine-setup, but not that it will updates other
> packages. I
> thought that ovirt-engine is for engine setup, not upgrading.
 
 Right.
 
 After engine-setup finishes, you should 'yum update' to update the rest
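
The update order described in this exchange can be written down as a short dry-run sketch. The commands are the ones quoted above (release RPM, setup packages, engine-setup, then a full yum update); the function name and the dry-run approach are illustrative assumptions, not an official oVirt tool.

```python
RELEASE_RPM = "http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm"


def engine_upgrade_commands(release_rpm=RELEASE_RPM):
    """Return the upgrade steps from the thread, in order, as argv lists."""
    return [
        ["yum", "install", release_rpm],
        # Passed as one argument so yum, not the shell, expands the glob.
        ["yum", "update", "ovirt*setup*"],
        ["engine-setup"],
        # Final pass picks up the remaining packages.
        ["yum", "update"],
    ]


if __name__ == "__main__":
    # Dry run only: print each step instead of executing it.
    for step in engine_upgrade_commands():
        print(" ".join(step))
```

Printing the steps rather than running them keeps the sketch safe to try; each argv list could be handed to subprocess.run once the order has been reviewed.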



[ovirt-users] Ovirt 3.6.6 on Centos 7.2 not using native gluster (gfapi)

2016-05-30 Thread Ralf Schenk
Hello,

I set up 8 Hosts and self-hosted-engine running HA on 3 of them from
gluster replica 3 Volume. HA is working, I can set one host of the 3
configured for hosted-engine to maintenance and engine migrates to other
host. I did the hosted-engine --deploy with type gluster and my gluster
hosted storage is accessed as glusterfs.mydomain.de:/engine

I set up another gluster volume (distributed replicated 4x2=8) as Data
storage for my virtual machines which is accessible as
glusterfs.mydomain.de:/gv0.  ISO and Export Volume are defined from NFS
Server.

When I set up a VM on the gluster storage I expected it to run with
native gluster support. However, when I dumpxml the libvirt machine
definition, I see something like this in its config:

[...]
[disk XML lost in the archive; only the serial survives]
  011ab08e-71af-4d5b-a6a8-9b843a10329e


I expected to have something like this:

[network disk XML lost in the archive]
  [...]
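
One way to check which access path a VM actually uses is to inspect the disk elements of the dumpxml output. The sketch below assumes the usual libvirt conventions: a FUSE-mounted image appears as a disk of type "file", while native gluster access appears as type "network" with a source protocol of "gluster". The sample XML, its paths and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET


def disk_access_modes(domain_xml):
    """Map each disk target to 'gfapi' for native gluster, else its disk type."""
    root = ET.fromstring(domain_xml)
    modes = {}
    for index, disk in enumerate(root.iter("disk")):
        source = disk.find("source")
        target = disk.find("target")
        name = target.get("dev") if target is not None else "disk%d" % index
        disk_type = disk.get("type")
        is_gluster = source is not None and source.get("protocol") == "gluster"
        modes[name] = "gfapi" if disk_type == "network" and is_gluster else disk_type
    return modes


# Hypothetical domain XML: one file-backed disk and one native gluster disk.
SAMPLE = """
<domain>
  <devices>
    <disk type='file' device='disk'>
      <source file='/rhev/data-center/mnt/glusterSD/example/image'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='network' device='disk'>
      <source protocol='gluster' name='gv0/example/image'>
        <host name='glusterfs.mydomain.de' port='24007'/>
      </source>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(disk_access_modes(SAMPLE))
```

Feeding it the real output of dumpxml for a running VM would show whether any disk is reported as "gfapi" or whether all of them are plain "file" disks.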

All hosts have vdsm-gluster installed:
[root@microcloud21 libvirt]# yum list installed | grep vdsm-*
vdsm.noarch                   4.17.28-0.el7.centos   @ovirt-3.6
vdsm-cli.noarch               4.17.28-0.el7.centos   @ovirt-3.6
vdsm-gluster.noarch           4.17.28-0.el7.centos   @ovirt-3.6
vdsm-hook-hugepages.noarch    4.17.28-0.el7.centos   @ovirt-3.6
vdsm-hook-vmfex-dev.noarch    4.17.28-0.el7.centos   @ovirt-3.6
vdsm-infra.noarch             4.17.28-0.el7.centos   @ovirt-3.6
vdsm-jsonrpc.noarch           4.17.28-0.el7.centos   @ovirt-3.6
vdsm-python.noarch            4.17.28-0.el7.centos   @ovirt-3.6
vdsm-xmlrpc.noarch            4.17.28-0.el7.centos   @ovirt-3.6
vdsm-yajsonrpc.noarch         4.17.28-0.el7.centos   @ovirt-3.6

How do I get my most-wanted feature, native gluster support, running?

-- 


*Ralf Schenk*
fon +49 (0) 24 05 / 40 83 70
fax +49 (0) 24 05 / 40 83 759
mail *r...@databay.de* 

*Databay AG*
Jens-Otto-Krag-Straße 11
D-52146 Würselen
*www.databay.de* 

Sitz/Amtsgericht Aachen • HRB:8437 • USt-IdNr.: DE 210844202
Vorstand: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.-Kfm.
Philipp Hermanns
Aufsichtsratsvorsitzender: Klaus Scholzen (RA)




Re: [ovirt-users] Using ceph volumes with ovirt

2016-05-30 Thread Alessandro De Salvo

Hi,
just to answer myself, I found these instructions that solved my problem:

http://7xqb88.com1.z0.glb.clouddn.com/Features_Cinder%20Integration.pdf

Basically, I was missing the step to add the ceph key to the 
Authentication Keys tab of the cinder External Provider.

It's all working now.
Cheers,

Alessandro

Il 30/05/16 10:55, Alessandro De Salvo ha scritto:


Hi,
I'm happily using our research cluster in Italy via gluster, and now 
I'm trying to hotplug a ceph disk on a VM of my cluster, without success.
The ceph cluster is managed via openstack cinder and I can create 
correctly the disk via ovirt (3.6.6.2-1 on CentOS 7.2).
The problem comes when trying to hotplug, or start a machine with the 
given disk attached.
In the vdsm log of the host where the VM is running or starting I see 
the following error:



jsonrpc.Executor/5::INFO::2016-05-30 
10:35:29,197::vm::2729::virt.vm::(hotplugDisk) 
vmId=`c189472e-25d2-4df1-b089-590009856dd3`::Hotplug disk xml: address="" device="disk" snapshot="no" type="network">
name="images/volume-9134b639-c23c-4ff1-91ca-0462c80026d2" protocol="rbd">








name="qemu" type="raw"/>



jsonrpc.Executor/5::ERROR::2016-05-30 
10:35:29,198::vm::2737::virt.vm::(hotplugDisk) 
vmId=`c189472e-25d2-4df1-b089-590009856dd3`::Hotplug failed

Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 2735, in hotplugDisk
self._dom.attachDevice(driveXml)
  File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", 
line 124, in wrapper

ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1313, in 
wrapper

return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 530, in 
attachDevice
if ret == -1: raise libvirtError ('virDomainAttachDevice() 
failed', dom=self)

libvirtError: XML error: invalid auth secret uuid



In fact the uuid of the secret used by ovirt to hotplug seems to be 
the ceph secret (masked here as ), while libvirt 
expects the uuid of the libvirt secret, by looking at the instructions 
http://docs.ceph.com/docs/jewel/rbd/libvirt/.

Anyone got it working?
Thanks,

Alessandro




Re: [ovirt-users] Can't perform search after setting up an Active Directory

2016-05-30 Thread Ondra Machacek

On 05/30/2016 12:03 PM, Alexis HAUSER wrote:

'ovirt-engine-extensions-tool' logs would be more helpfull.


Here it is :
https://bpaste.net/show/a166df875909

I can't see anything else than this SSL error and what seems to be a missing python 
module : "ImportError: No module named dnf"

Can you see something else or do you have any idea of what I could do to solve 
this StartTLS problem ?



This is the output of the installation script 
'ovirt-engine-extension-aaa-ldap-setup', which is written in Python, whereas 
the aaa-ldap extension is written in Java. So the strange thing is that you 
can connect via startTLS in the Python script, but later you can't connect 
with the aaa-ldap Java extension.


Can you please also share output of this command:
 $ ovirt-engine-extensions-tool --log-level=FINEST --log-file=login.log 
aaa login-user --profile=AD2 --user-name=mysearchuser 
--password=pass:password


Hopefully it will tell us more. Thanks.


Re: [ovirt-users] Can't perform search after setting up an Active Directory

2016-05-30 Thread Alexis HAUSER
>'ovirt-engine-extensions-tool' logs would be more helpfull.

Here it is :
https://bpaste.net/show/a166df875909

I can't see anything else than this SSL error and what seems to be a missing 
python module : "ImportError: No module named dnf"

Can you see something else or do you have any idea of what I could do to solve 
this StartTLS problem ?


[ovirt-users] Using ceph volumes with ovirt

2016-05-30 Thread Alessandro De Salvo

Hi,
I'm happily using our research cluster in Italy via gluster, and now I'm 
trying to hotplug a ceph disk on a VM of my cluster, without success.
The ceph cluster is managed via openstack cinder and I can create 
correctly the disk via ovirt (3.6.6.2-1 on CentOS 7.2).
The problem comes when trying to hotplug, or start a machine with the 
given disk attached.
In the vdsm log of the host where the VM is running or starting I see 
the following error:



jsonrpc.Executor/5::INFO::2016-05-30 
10:35:29,197::vm::2729::virt.vm::(hotplugDisk) 
vmId=`c189472e-25d2-4df1-b089-590009856dd3`::Hotplug disk xml: address="" device="disk" snapshot="no" type="network">
name="images/volume-9134b639-c23c-4ff1-91ca-0462c80026d2" protocol="rbd">








name="qemu" type="raw"/>



jsonrpc.Executor/5::ERROR::2016-05-30 
10:35:29,198::vm::2737::virt.vm::(hotplugDisk) 
vmId=`c189472e-25d2-4df1-b089-590009856dd3`::Hotplug failed

Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 2735, in hotplugDisk
self._dom.attachDevice(driveXml)
  File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", 
line 124, in wrapper

ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1313, in 
wrapper

return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 530, in 
attachDevice
if ret == -1: raise libvirtError ('virDomainAttachDevice() failed', 
dom=self)

libvirtError: XML error: invalid auth secret uuid



In fact the uuid of the secret used by ovirt to hotplug seems to be the 
ceph secret (masked here as ), while libvirt 
expects the uuid of the libvirt secret, by looking at the instructions 
http://docs.ceph.com/docs/jewel/rbd/libvirt/.
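
The "invalid auth secret uuid" message is consistent with a value that is not shaped like a UUID being placed in the secret's uuid attribute. A minimal sketch of that check, assuming only that libvirt wants a canonical UUID string there; the function name is hypothetical and the key below is a made-up placeholder, not a real ceph key.

```python
import uuid


def is_canonical_uuid(value):
    """Return True if value is a canonical UUID string (8-4-4-4-12 hex digits)."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    # A UUID-shaped value, as a libvirt secret reference would look.
    print(is_canonical_uuid("c189472e-25d2-4df1-b089-590009856dd3"))
    # A base64-looking string standing in for a raw ceph key (placeholder).
    print(is_canonical_uuid("AQAexamplekeyexamplekeyexamplekeyexam=="))
```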

Anyone got it working?
Thanks,

Alessandro


Re: [ovirt-users] One RHEV Virtual Machine does not Automatically Resume following Compellent SAN Controller Failover

2016-05-30 Thread Yaniv Dary
Can you reply to my question?

Yaniv Dary
Technical Product Manager
Red Hat Israel Ltd.
34 Jerusalem Road
Building A, 4th floor
Ra'anana, Israel 4350109

Tel : +972 (9) 7692306
8272306
Email: yd...@redhat.com
IRC : ydary


On Thu, May 26, 2016 at 9:14 AM, Yaniv Dary  wrote:

> What DR solution are you using?
>
> Yaniv Dary
> Technical Product Manager
> Red Hat Israel Ltd.
> 34 Jerusalem Road
> Building A, 4th floor
> Ra'anana, Israel 4350109
>
> Tel : +972 (9) 7692306
> 8272306
> Email: yd...@redhat.com
> IRC : ydary
>
>
> On Wed, Nov 25, 2015 at 1:15 PM, Simone Tiraboschi 
> wrote:
>
>> Adding Nir who knows it far better than me.
>>
>>
>> On Mon, Nov 23, 2015 at 8:37 PM, Duckworth, Douglas C 
>> wrote:
>>
>>> Hello --
>>>
>>> Not sure if y'all can help with this issue we've been seeing with RHEV...
>>>
>>> On 11/13/2015, during Code Upgrade of Compellent SAN at our Disaster
>>> Recovery Site, we Failed Over to Secondary SAN Controller.  Most Virtual
>>> Machines in our DR Cluster Resumed automatically after Pausing except VM
>>> "BADVM" on Host "BADHOST."
>>>
>>> In Engine.log you can see that BADVM was sent into "VM_PAUSED_EIO" state
>>> at 10:47:57:
>>>
>>> "VM BADVM has paused due to storage I/O problem."
>>>
>>> On this Red Hat Enterprise Virtualization Hypervisor 6.6
>>> (20150512.0.el6ev) Host, two other VMs paused but then automatically
>>> resumed without System Administrator intervention...
>>>
>>> In our DR Cluster, 22 VMs also resumed automatically...
>>>
>>> None of these Guest VMs are engaged in high I/O as these are DR site VMs
>>> not currently doing anything.
>>>
>>> We sent this information to Dell.  Their response:
>>>
>>> "The root cause may reside within your virtualization solution, not the
>>> parent OS (RHEV-Hypervisor disc) or Storage (Dell Compellent.)"
>>>
>>> We are doing this Failover again on Sunday November 29th so we would
>>> like to know how to mitigate this issue, given we have to manually
>>> resume paused VMs that don't resume automatically.
>>>
>>> Before we initiated SAN Controller Failover, all iSCSI paths to Targets
>>> were present on Host tulhv2p03.
>>>
>>> VM logs on Host show in /var/log/libvirt/qemu/badhost.log that Storage
>>> error was reported:
>>>
>>> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
>>> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
>>> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
>>> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
>>>
>>> All disks used by this Guest VM are provided by single Storage Domain
>>> COM_3TB4_DR with serial "270."  In syslog we do see that all paths for
>>> that Storage Domain Failed:
>>>
>>> Nov 13 16:47:40 multipathd: 36000d310005caf000270: remaining
>>> active paths: 0
>>>
>>> Though these recovered later:
>>>
>>> Nov 13 16:59:17 multipathd: 36000d310005caf000270: sdbg -
>>> tur checker reports path is up
>>> Nov 13 16:59:17 multipathd: 36000d310005caf000270: remaining
>>> active paths: 8
>>>
>>> Does anyone have an idea of why the VM would fail to automatically
>>> resume if the iSCSI paths used by its Storage Domain recovered?
>>>
>>> Thanks
>>> Doug
>>>
>>> --
>>> Thanks
>>>
>>> Douglas Charles Duckworth
>>> Unix Administrator
>>> Tulane University
>>> Technology Services
>>> 1555 Poydras Ave
>>> NOLA -- 70112
>>>
>>> E: du...@tulane.edu
>>> O: 504-988-9341
>>> F: 504-988-8505
>>> ___
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>


Re: [ovirt-users] Could not associate gluster brick with correct network warning

2016-05-30 Thread Roderick Mooi
Hi

Yes, I created the volume using "gluster volume create ..." prior to
installing ovirt. Something I noticed is that there is no "gluster" bridge
on top of the interface I selected for the "Gluster Management" network -
could this be the problem?

Thanks,

Roderick

Roderick Mooi

Senior Engineer: South African National Research Network (SANReN)
Meraka Institute, CSIR

roder...@sanren.ac.za | +27 12 841 4111 | www.sanren.ac.za

On Fri, May 27, 2016 at 11:35 AM, Ramesh Nachimuthu 
wrote:

> How did you create the volume?. Looks like the volume was created using
> FQDN in Gluster CLI.
>
>
> Regards,
> Ramesh
>
> - Original Message -
> > From: "Roderick Mooi" 
> > To: "users" 
> > Sent: Friday, May 27, 2016 2:34:51 PM
> > Subject: [ovirt-users] Could not associate gluster brick with correct
> network warning
> >
> > Good day
> >
> > I've setup a "Gluster Management" network in DC, cluster and all hosts.
> It is
> > appearing as "operational" in the cluster and all host networks look
> > correct. But I'm seeing this warning continually in the engine.log:
> >
> > 2016-05-27 08:56:58,988 WARN
> >
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
> > (DefaultQuartzScheduler_Worker-80) [] Could not associate brick
> > 'glustermount.host1:/gluster/data/brick' of volume
> > '7a25d2fb-1048-48d8-a26d-f288ff0e28cb' with correct network as no gluster
> > network found in cluster '0002-0002-0002-0002-02b8'
> >
> > This is on ovirt 3.6.5.
> >
> > Can anyone assist?
> >
> > Thanks,
> >
> > Roderick Mooi
> >
> > Senior Engineer: South African National Research Network (SANReN)
> > Meraka Institute, CSIR
> >
> > roder...@sanren.ac.za | +27 12 841 4111 | www.sanren.ac.za
> >
> > ___
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
> >
>


[ovirt-users] Sanlock add Lockspace Errors

2016-05-30 Thread InterNetX - Juergen Gotteswinter
Hi,

For some time now we have been getting error messages from Sanlock, and so far
I have not been able to figure out what exactly they are trying to tell us and,
more importantly, whether this is something that can be ignored or needs to be
fixed (and how).

Here are the Versions we are using currently:

Engine

ovirt-engine-3.5.6.2-1.el6.noarch

Nodes

vdsm-4.16.34-0.el7.centos.x86_64
sanlock-3.2.4-1.el7.x86_64
libvirt-lock-sanlock-1.2.17-13.el7_2.3.x86_64
libvirt-daemon-1.2.17-13.el7_2.3.x86_64
libvirt-lock-sanlock-1.2.17-13.el7_2.3.x86_64
libvirt-1.2.17-13.el7_2.3.x86_64

-- snip --
May 30 09:55:27 vm2 sanlock[1094]: 2016-05-30 09:55:27+0200 294109
[60137]: verify_leader 2 wrong space name
4643f652-8014-4951-8a1a-02af41e67d08
f757b127-a951-4fa9-bf90-81180c0702e6
/dev/f757b127-a951-4fa9-bf90-81180c0702e6/ids
May 30 09:55:27 vm2 sanlock[1094]: 2016-05-30 09:55:27+0200 294109
[60137]: leader1 delta_acquire_begin error -226 lockspace
f757b127-a951-4fa9-bf90-81180c0702e6 host_id 2
May 30 09:55:27 vm2 sanlock[1094]: 2016-05-30 09:55:27+0200 294109
[60137]: leader2 path /dev/f757b127-a951-4fa9-bf90-81180c0702e6/ids offset 0
May 30 09:55:27 vm2 sanlock[1094]: 2016-05-30 09:55:27+0200 294109
[60137]: leader3 m 12212010 v 30003 ss 512 nh 0 mh 1 oi 2 og 8 lv 0
May 30 09:55:27 vm2 sanlock[1094]: 2016-05-30 09:55:27+0200 294109
[60137]: leader4 sn 4643f652-8014-4951-8a1a-02af41e67d08 rn
1eed8aa9-8fb5-4d27-8d1c-03ebce2c36d4.vm2.intern ts 3786679 cs 1474f033
May 30 09:55:28 vm2 sanlock[1094]: 2016-05-30 09:55:28+0200 294110
[1099]: s9703 add_lockspace fail result -226
May 30 09:55:58 vm2 sanlock[1094]: 2016-05-30 09:55:58+0200 294140
[60331]: verify_leader 2 wrong space name
4643f652-8014-4951-8a1a-02af41e67d08
f757b127-a951-4fa9-bf90-81180c0702e6
/dev/f757b127-a951-4fa9-bf90-81180c0702e6/ids
May 30 09:55:58 vm2 sanlock[1094]: 2016-05-30 09:55:58+0200 294140
[60331]: leader1 delta_acquire_begin error -226 lockspace
f757b127-a951-4fa9-bf90-81180c0702e6 host_id 2
May 30 09:55:58 vm2 sanlock[1094]: 2016-05-30 09:55:58+0200 294140
[60331]: leader2 path /dev/f757b127-a951-4fa9-bf90-81180c0702e6/ids offset 0
May 30 09:55:58 vm2 sanlock[1094]: 2016-05-30 09:55:58+0200 294140
[60331]: leader3 m 12212010 v 30003 ss 512 nh 0 mh 1 oi 2 og 8 lv 0
May 30 09:55:58 vm2 sanlock[1094]: 2016-05-30 09:55:58+0200 294140
[60331]: leader4 sn 4643f652-8014-4951-8a1a-02af41e67d08 rn
1eed8aa9-8fb5-4d27-8d1c-03ebce2c36d4.vm2.intern ts 3786679 cs 1474f033
May 30 09:55:59 vm2 sanlock[1094]: 2016-05-30 09:55:59+0200 294141
[1098]: s9704 add_lockspace fail result -226
May 30 09:56:05 vm2 sanlock[1094]: 2016-05-30 09:56:05+0200 294148
[1094]: s1527 check_other_lease invalid for host 0 0 ts 7566376 name  in
4643f652-8014-4951-8a1a-02af41e67d08
May 30 09:56:05 vm2 sanlock[1094]: 2016-05-30 09:56:05+0200 294148
[1094]: s1527 check_other_lease leader 12212010 owner 1 11 ts 7566376 sn
f757b127-a951-4fa9-bf90-81180c0702e6 rn
f888524b-27aa-4724-8bae-051f9e950a21.vm1.intern
May 30 09:56:28 vm2 sanlock[1094]: 2016-05-30 09:56:28+0200 294170
[60496]: verify_leader 2 wrong space name
4643f652-8014-4951-8a1a-02af41e67d08
f757b127-a951-4fa9-bf90-81180c0702e6
/dev/f757b127-a951-4fa9-bf90-81180c0702e6/ids
May 30 09:56:28 vm2 sanlock[1094]: 2016-05-30 09:56:28+0200 294170
[60496]: leader1 delta_acquire_begin error -226 lockspace
f757b127-a951-4fa9-bf90-81180c0702e6 host_id 2
May 30 09:56:28 vm2 sanlock[1094]: 2016-05-30 09:56:28+0200 294170
[60496]: leader2 path /dev/f757b127-a951-4fa9-bf90-81180c0702e6/ids offset 0
May 30 09:56:28 vm2 sanlock[1094]: 2016-05-30 09:56:28+0200 294170
[60496]: leader3 m 12212010 v 30003 ss 512 nh 0 mh 1 oi 2 og 8 lv 0
May 30 09:56:28 vm2 sanlock[1094]: 2016-05-30 09:56:28+0200 294170
[60496]: leader4 sn 4643f652-8014-4951-8a1a-02af41e67d08 rn
1eed8aa9-8fb5-4d27-8d1c-03ebce2c36d4.vm2.intern ts 3786679 cs 1474f033
May 30 09:56:29 vm2 sanlock[1094]: 2016-05-30 09:56:29+0200 294171
[6415]: s9705 add_lockspace fail result -226
May 30 09:56:58 vm2 sanlock[1094]: 2016-05-30 09:56:58+0200 294200
[60645]: verify_leader 2 wrong space name
4643f652-8014-4951-8a1a-02af41e67d08
f757b127-a951-4fa9-bf90-81180c0702e6
/dev/f757b127-a951-4fa9-bf90-81180c0702e6/ids
May 30 09:56:58 vm2 sanlock[1094]: 2016-05-30 09:56:58+0200 294200
[60645]: leader1 delta_acquire_begin error -226 lockspace
f757b127-a951-4fa9-bf90-81180c0702e6 host_id 2
May 30 09:56:58 vm2 sanlock[1094]: 2016-05-30 09:56:58+0200 294200
[60645]: leader2 path /dev/f757b127-a951-4fa9-bf90-81180c0702e6/ids offset 0
May 30 09:56:58 vm2 sanlock[1094]: 2016-05-30 09:56:58+0200 294200
[60645]: leader3 m 12212010 v 30003 ss 512 nh 0 mh 1 oi 2 og 8 lv 0
May 30 09:56:58 vm2 sanlock[1094]: 2016-05-30 09:56:58+0200 294200
[60645]: leader4 sn 4643f652-8014-4951-8a1a-02af41e67d08 rn
1eed8aa9-8fb5-4d27-8d1c-03ebce2c36d4.vm2.intern ts 3786679 cs 1474f033
May 30 09:56:59 vm2 sanlock[1094]: 2016-05-30 09:56:59+0200 294201
[6373]: s9706 add_lockspace fail result -226
May 30 09:57:28 vm2 sanlock[1094]: 2016-05-30 09:57:28+0200 294230
[60806]: 

Re: [ovirt-users] Unable to start vdsmd. Dependency vdsm-network keeps failing

2016-05-30 Thread Dan Kenigsberg
On Mon, May 30, 2016 at 01:25:09AM +, Christopher Lord wrote:
> I have a host that has dropped out of my cluster because it can't start 
> vdsmd. Initially the logs were reporting a duplicate gateway, so I removed 
> the duplicate. But I still can't start vdsmd. /var/log/vdsm/supervdsm.log is 
> showing the following.
> 
> Traceback (most recent call last):
>   File "/usr/share/vdsm/vdsm-restore-net-config", line 439, in restore
>     unified_restoration()
>   File "/usr/share/vdsm/vdsm-restore-net-config", line 131, in unified_restoration
>     changed_config = _filter_changed_nets_bonds(available_config)
>   File "/usr/share/vdsm/vdsm-restore-net-config", line 258, in _filter_changed_nets_bonds
>     kernel_config = KernelConfig(netinfo.NetInfo())
>   File "/usr/lib/python2.7/site-packages/vdsm/netconfpersistence.py", line 204, in __init__
>     for net, net_attr in self._analyze_netinfo_nets(netinfo):
>   File "/usr/lib/python2.7/site-packages/vdsm/netconfpersistence.py", line 216, in _analyze_netinfo_nets
>     yield net, self._translate_netinfo_net(net, net_attr)
>   File "/usr/lib/python2.7/site-packages/vdsm/netconfpersistence.py", line 232, in _translate_netinfo_net
>     self._translate_nics(attributes, nics)
>   File "/usr/lib/python2.7/site-packages/vdsm/netconfpersistence.py", line 269, in _translate_nics
>     nic, = nics
> ValueError: too many values to unpack
> 
>  I've downloaded the source code and have tried to follow along and see 
> what's happening, but it's going a little (a lot) over my head at the moment. 
> Could anyone help me out?

Which version of vdsm is installed on your host?

Could you share the content of your /var/lib/vdsm/persistence/netconf
directory?

Your supervdsm.log may hold more hints - could you share more of it?
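
For readers following the traceback: the failing statement, `nic, = nics`,
is Python's single-element unpacking, which only succeeds when the sequence
holds exactly one item. The sketch below reproduces that behaviour in
isolation. It is not vdsm code; the helper names and the interface names
(eth0, eth1) are made up for illustration.

```python
def single_nic(nics):
    """Return the only entry in `nics`, using the same idiom as the traceback."""
    nic, = nics  # raises ValueError unless len(nics) == 1
    return nic


def describe(nics):
    """Report either the single NIC or the unpacking error as a string."""
    try:
        return "ok: %s" % single_nic(nics)
    except ValueError as exc:
        return "error: %s" % exc


print(describe(["eth0"]))          # one NIC: unpacking succeeds
print(describe(["eth0", "eth1"]))  # two NICs: "too many values to unpack"
```

So the error means that the `nics` value handed to `_translate_nics` held
more than one entry at that point. Why that happens on this host cannot be
told from the traceback alone, hence the questions above.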


Re: [ovirt-users] One RHEV Virtual Machine does not Automatically Resume following Compellent SAN Controller Failover

2016-05-30 Thread InterNetX - Juergen Gotteswinter
We see exactly the same, and it does not seem to be vendor-dependent.

- Equallogic Controller Failover -> VMs get paused and may be unpaused,
but most aren't
- Nexenta ZFS iSCSI with RSF1 HA -> same
- FreeBSD ctld iscsi-target + Heartbeat -> same
- CentOS + iscsi-target + Heartbeat -> same

Multipath settings are, where available, modified to match the best
practice supplied by the vendor. On open source solutions we started
with known-working multipath/iSCSI settings, and by now nearly every
possible setting has been tested, without much success.

To me it looks like oVirt/RHEV is way too sensitive to iSCSI
interruptions, and it feels like a gamble what the engine will do to
your VM (or not).

Am 11/23/2015 um 8:37 PM schrieb Duckworth, Douglas C:
> Hello --
> 
> Not sure if y'all can help with this issue we've been seeing with RHEV...
> 
> On 11/13/2015, during Code Upgrade of Compellent SAN at our Disaster
> Recovery Site, we Failed Over to Secondary SAN Controller.  Most Virtual
> Machines in our DR Cluster Resumed automatically after Pausing except VM
> "BADVM" on Host "BADHOST."
> 
> In Engine.log you can see that BADVM was sent into "VM_PAUSED_EIO" state
> at 10:47:57:
> 
> "VM BADVM has paused due to storage I/O problem."
> 
> On this Red Hat Enterprise Virtualization Hypervisor 6.6
> (20150512.0.el6ev) Host, two other VMs paused but then automatically
> resumed without System Administrator intervention...
> 
> In our DR Cluster, 22 VMs also resumed automatically...
> 
> None of these Guest VMs are engaged in high I/O as these are DR site VMs
> not currently doing anything.
> 
> We sent this information to Dell.  Their response:
> 
> "The root cause may reside within your virtualization solution, not the
> parent OS (RHEV-Hypervisor disc) or Storage (Dell Compellent.)"
> 
> We are doing this Failover again on Sunday November 29th so we would
> like to know how to mitigate this issue, given we have to manually
> resume paused VMs that don't resume automatically.
> 
> Before we initiated SAN Controller Failover, all iSCSI paths to Targets
> were present on Host tulhv2p03.
> 
> VM logs on Host show in /var/log/libvirt/qemu/badhost.log that Storage
> error was reported:
> 
> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
> 
> All disks used by this Guest VM are provided by single Storage Domain
> COM_3TB4_DR with serial "270."  In syslog we do see that all paths for
> that Storage Domain Failed:
> 
> Nov 13 16:47:40 multipathd: 36000d310005caf000270: remaining
> active paths: 0
> 
> Though these recovered later:
> 
> Nov 13 16:59:17 multipathd: 36000d310005caf000270: sdbg -
> tur checker reports path is up
> Nov 13 16:59:17 multipathd: 36000d310005caf000270: remaining
> active paths: 8
> 
> Does anyone have an idea of why the VM would fail to automatically
> resume if the iSCSI paths used by its Storage Domain recovered?
> 
> Thanks
> Doug
> 
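
To compare how long a storage domain was without paths against the time
the VMs stayed paused, the multipathd lines quoted above can be turned into
outage windows. The following is a minimal sketch, assuming the syslog lines
look exactly like the quoted ones once the mail wrapping is undone (no
hostname field) and that the year is supplied by the caller, since syslog
timestamps do not carry one. The function name is hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Nov 13 16:47:40 multipathd: <wwid>: remaining active paths: 0
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) multipathd: (?P<wwid>\S+): "
    r"remaining active paths: (?P<paths>\d+)$"
)


def outage_windows(lines, year=2015):
    """Yield (wwid, start, end) for each span in which a map had 0 active paths."""
    down_since = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore checker messages and unrelated lines
        ts = datetime.strptime("%d %s" % (year, m.group("ts")),
                               "%Y %b %d %H:%M:%S")
        wwid, paths = m.group("wwid"), int(m.group("paths"))
        if paths == 0:
            down_since.setdefault(wwid, ts)  # keep the first time paths hit 0
        elif wwid in down_since:
            yield wwid, down_since.pop(wwid), ts


sample = [
    "Nov 13 16:47:40 multipathd: 36000d310005caf000270: remaining active paths: 0",
    "Nov 13 16:59:17 multipathd: 36000d310005caf000270: sdbg - tur checker reports path is up",
    "Nov 13 16:59:17 multipathd: 36000d310005caf000270: remaining active paths: 8",
]

for wwid, start, end in outage_windows(sample):
    print(wwid, start, end, (end - start).total_seconds())
```

For the three quoted lines this gives a single window of 697 seconds (about
11.5 minutes) during which the map had no active paths.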