Re: [ovirt-users] Import Domain and snapshot issue ... please help !!!

2018-02-14 Thread Enrico Becchetti

  Hi,
you can also download them through these
links:

https://owncloud.pg.infn.it/index.php/s/QpsTyGxtRTPYRTD
https://owncloud.pg.infn.it/index.php/s/ph8pLcABe0nadeb

Thanks again 
Best Regards
Enrico


On 13/02/2018 14:52, Maor Lipchuk wrote:



On Tue, Feb 13, 2018 at 3:51 PM, Maor Lipchuk wrote:



On Tue, Feb 13, 2018 at 3:42 PM, Enrico Becchetti
<enrico.becche...@pg.infn.it> wrote:

see the attach files please ... thanks for your attention !!!



Seems like the engine logs do not contain the entire process;
can you please share older logs, going back to the import operation?


And VDSM logs as well from your host

Best Regards
Enrico


On 13/02/2018 14:09, Maor Lipchuk wrote:



On Tue, Feb 13, 2018 at 1:48 PM, Enrico Becchetti
<enrico.becche...@pg.infn.it> wrote:

 Dear All,
I have been using oVirt for a long time with three
hypervisors and an external engine running in a CentOS VM.

These three hypervisors have HBAs and access to Fibre
Channel storage. Until recently I used version 3.5, then
I reinstalled everything from scratch and now I have 4.2.

Before formatting everything, I detached the storage data
domain (FC) with the virtual machines and reimported it
into the new 4.2, and all went well. In
this domain there were virtual machines with and without
snapshots.

Now I have two problems. The first is that if I try to
delete a snapshot, the process does not end successfully and
remains hanging; the second problem is that
in one case I lost the virtual machine!



Not sure that I fully understand the scenario.
How was the virtual machine lost if you only tried to
delete a snapshot?


So I need your help to kill the three running zombie
tasks, because with taskcleaner.sh I can't do anything,
and then I need to know how I can delete the old snapshots
made with 3.5 without losing other data and without
ending up with new processes that do not terminate correctly.

If you want some log files please let me know.



Hi Enrico,

Can you please attach the engine and VDSM logs


Thank you so much.
Best Regards
Enrico



___
Users mailing list
Users@ovirt.org 
http://lists.ovirt.org/mailman/listinfo/users





-- 
___


Enrico Becchetti    Servizio di Calcolo e Reti

Istituto Nazionale di Fisica Nucleare - Sezione di Perugia
Via Pascoli,c/o Dipartimento di Fisica  06123 Perugia (ITALY)
Phone:+39 075 5852777   Mail: 
Enrico.Becchettipg.infn.it 
__








smime.p7s
Description: S/MIME cryptographic signature
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Unable to connect to the graphic server

2018-02-14 Thread Yedidyah Bar David
On Wed, Feb 14, 2018 at 5:20 AM, Alex Bartonek  wrote:
> I've built and rebuilt about 4 oVirt servers.  Consider myself pretty good
> at this.  LOL.
> So I am setting up a oVirt server for a friend on his r710.  CentOS 7, ovirt
> 4.2.   /etc/hosts has the correct IP and FQDN setup.
>
> When I build a VM and try to open a console session via  SPICE I am unable
> to connect to the graphic server.  I'm connecting from a Windows 10 box.
> Using virt-manager to connect.

What happens when you try?

>
> I've googled and I just cant seem to find any resolution to this.  Now, I
> did build the server on my home network but the subnet its on is the same..
> internal 192.168.1.xxx.   The web interface is accessible also.
>
> Any hints as to what else I can check?

If virt-viewer does open but fails to connect, check (e.g. with netstat)
where it is trying to connect to. Check that you have network access there (no
filtering/routing/NAT/etc. issues) and that qemu on the host is listening on the
port it tries.

If it does not open, try to tell your browser (if it does not already) not to
open it automatically, but to ask you what to do. Then save the file you
get and check it.
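
For example, assuming the console file points at port 5900 on the host (the
real host address and port come from the downloaded .vv file), a quick manual
check could look like this:

  # On the oVirt host: is qemu listening on the SPICE/VNC port from the .vv file?
  ss -tlnp | grep 5900

  # From the client machine: can that port be reached at all? (placeholder address)
  telnet <host-address> 5900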

Best regards,

>
> Thanks!
>
>
> Sent with ProtonMail Secure Email.
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>



-- 
Didi
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] CentOS 7 Hyperconverged oVirt 4.2 with Self-Hosted-Engine with glusterfs with 2 Hypervisors and 1 glusterfs-Arbiter-only

2018-02-14 Thread Simone Tiraboschi
On Tue, Feb 13, 2018 at 3:57 PM, Philipp Richter <
philipp.rich...@linforge.com> wrote:

> Hi,
>
> > The recommended way to install this would be by using one of the
> > "full" nodes and deploying hosted engine via cockpit there. The
> > gdeploy plugin in cockpit should allow you to configure the arbiter
> > node.
> >
> > The documentation for deploying RHHI (hyper converged RH product) is
> > here:
> > https://access.redhat.com/documentation/en-us/red_hat_
> hyperconverged_infrastructure/1.1/html-single/deploying_red_
> hat_hyperconverged_infrastructure/index#deploy
>
> Thanks for the documentation pointer about RHHI.
> I was able to successfully setup all three Nodes. I had to edit the final
> gdeploy File, as the Installer reserves 20GB per arbiter volume and I don't
> have that much space available for this POC.
>
> The problem now is that I don't see the third node i.e. in the Storage /
> Volumes / Bricks view, and I get warning messages every few seconds into
> the /var/log/ovirt-engine/engine.log like:
>
> 2018-02-13 15:40:26,188+01 WARN  
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn]
> (DefaultQuartzScheduler3) [5a8c68e2] Could not add brick
> 'ovirtpoc03-storage:/gluster_bricks/engine/engine' to volume
> '2e7a0ac3-3a74-40ba-81ff-d45b2b35aace' - server uuid
> '0a100f2f-a9ee-4711-b997-b674ee61f539' not found in cluster
> 'cab4ba5c-10ba-11e8-aed5-00163e6a7af9'
> 2018-02-13 15:40:26,193+01 WARN  
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn]
> (DefaultQuartzScheduler3) [5a8c68e2] Could not add brick
> 'ovirtpoc03-storage:/gluster_bricks/vmstore/vmstore' to volume
> '5a356223-8774-4944-9a95-3962a3c657e4' - server uuid
> '0a100f2f-a9ee-4711-b997-b674ee61f539' not found in cluster
> 'cab4ba5c-10ba-11e8-aed5-00163e6a7af9'
>
> Of course I cannot add the third node as normal oVirt Host as it is slow,
> has only minimal amount of RAM and the CPU (AMD) is different than that one
> of the two "real" Hypervisors (Intel).
>
> Is there a way to add the third Node only for gluster management, not as
> Hypervisor? Or is there any other method to at least quieten the log?
>

Adding Sahina here.


>
> thanks,
> --
>
> : Philipp Richter
> : LINFORGE | Peace of mind for your IT
> :
> : T: +43 1 890 79 99
> : E: philipp.rich...@linforge.com
> : https://www.xing.com/profile/Philipp_Richter15
> : https://www.linkedin.com/in/philipp-richter
> :
> : LINFORGE Technologies GmbH
> : Brehmstraße 10
> : 1110 Wien
> : Österreich
> :
> : Firmenbuchnummer: FN 216034y
> : USt.- Nummer : ATU53054901
> : Gerichtsstand: Wien
> :
> : LINFORGE® is a registered trademark of LINFORGE, Austria.
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] VMs with multiple vdisks don't migrate

2018-02-14 Thread fsoyer

Hi all,
Yesterday I discovered a problem when migrating VMs with more than one vdisk.
On our test servers (oVirt 4.1, shared storage with Gluster), I created 2 VMs 
needed for a test from a template with a 20G vdisk. To these VMs I added a 100G 
vdisk (for these tests I didn't want to waste time extending the existing 
vdisks... but I lost time in the end...). The VMs with the 2 vdisks work well.
Then I saw some updates waiting on the host. I tried to put it into maintenance... 
but it got stuck on the two VMs. They were marked "migrating" but were no longer 
accessible. Other (small) VMs with only 1 vdisk were migrated without problem at 
the same time.
I saw that a kvm process for the (big) VMs was launched on the source AND 
destination host, but after tens of minutes the migration and the VMs were 
still frozen. I tried to cancel the migration for the VMs: it failed. The only 
way to stop them was to power off the VMs: the kvm process died on the 2 hosts 
and the GUI reported a failed migration.
Just in case, I tried to delete the second vdisk on one of these VMs: it then 
migrated without error! And no access problem.
I tried to extend the first vdisk of the second VM, then delete the second vdisk: 
it now migrates without problem!

So after another test with a VM with 2 vdisks, I can say that this is what 
blocked the migration process :(

In engine.log, for a VM with 1 vdisk migrating well, we see:

2018-02-12 
16:46:29,705+01 INFO  [org.ovirt.engine.core.bll.MigrateVmToServerCommand] 
(default task-28) [2f712024-5982-46a8-82c8-fd8293da5725] Lock Acquired to 
object 'EngineLock:{exclusiveLocks='[3f57e669-5e4c-4d10-85cc-d573004a099d=VM]', 
sharedLocks=''}'
2018-02-12 16:46:29,955+01 INFO  
[org.ovirt.engine.core.bll.MigrateVmToServerCommand] 
(org.ovirt.thread.pool-6-thread-32) [2f712024-5982-46a8-82c8-fd8293da5725] 
Running command: MigrateVmToServerCommand internal: false. Entities affected :  
ID: 3f57e669-5e4c-4d10-85cc-d573004a099d Type: VMAction group MIGRATE_VM with 
role type USER
2018-02-12 16:46:30,261+01 INFO  
[org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] 
(org.ovirt.thread.pool-6-thread-32) [2f712024-5982-46a8-82c8-fd8293da5725] 
START, MigrateVDSCommand( MigrateVDSCommandParameters:{runAsync='true', 
hostId='ce3938b1-b23f-4d22-840a-f17d7cd87bb1', 
vmId='3f57e669-5e4c-4d10-85cc-d573004a099d', srcHost='192.168.0.6', 
dstVdsId='d569c2dd-8f30-4878-8aea-858db285cf69', dstHost='192.168.0.5:54321', 
migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', 
autoConverge='true', migrateCompressed='false', consoleAddress='null', 
maxBandwidth='500', enableGuestEvents='true', maxIncomingMigrations='2', 
maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, 
params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, 
{limit=2, action={name=setDowntime, params=[200]}}, {limit=3, 
action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, 
params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, 
action={name=abort, params=[]}}]]'}), log id: 14f61ee0
2018-02-12 16:46:30,262+01 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] 
(org.ovirt.thread.pool-6-thread-32) [2f712024-5982-46a8-82c8-fd8293da5725] 
START, MigrateBrokerVDSCommand(HostName = victor.local.systea.fr, 
MigrateVDSCommandParameters:{runAsync='true', 
hostId='ce3938b1-b23f-4d22-840a-f17d7cd87bb1', 
vmId='3f57e669-5e4c-4d10-85cc-d573004a099d', srcHost='192.168.0.6', 
dstVdsId='d569c2dd-8f30-4878-8aea-858db285cf69', dstHost='192.168.0.5:54321', 
migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', 
autoConverge='true', migrateCompressed='false', consoleAddress='null', 
maxBandwidth='500', enableGuestEvents='true', maxIncomingMigrations='2', 
maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, 
params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, 
{limit=2, action={name=setDowntime, params=[200]}}, {limit=3, 
action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, 
params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, 
action={name=abort, params=[]}}]]'}), log id: 775cd381
2018-02-12 16:46:30,277+01 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] 
(org.ovirt.thread.pool-6-thread-32) [2f712024-5982-46a8-82c8-fd8293da5725] 
FINISH, MigrateBrokerVDSCommand, log id: 775cd381
2018-02-12 16:46:30,285+01 INFO  
[org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] 
(org.ovirt.thread.pool-6-thread-32) [2f712024-5982-46a8-82c8-fd8293da5725] 
FINISH, MigrateVDSCommand, return: MigratingFrom, log id: 14f61ee0
2018-02-12 16:46:30,301+01 INFO  
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
(org.ovirt.thread.pool-6-thread-32) [2f712024-5982-46a8-82c8-fd8293da5725] 
EVENT_ID: VM_MIGRATION_START(62), Correlation ID: 
2f712024-5982-46a8-82c8-fd8293da5725, Job ID: 
4bd19aa9-cc99-4d02-884e-5a1e8

Re: [ovirt-users] hosted engine install fails on useless DHCP lookup

2018-02-14 Thread Simone Tiraboschi
On Wed, Feb 14, 2018 at 2:11 AM, Jamie Lawrence 
wrote:

> Hello,
>
> I'm seeing the hosted engine install fail on an Ansible playbook step. Log
> below. I tried looking at the file specified for retry, below
> (/usr/share/ovirt-hosted-engine-setup/ansible/bootstrap_local_vm.retry);
> it contains the word, 'localhost'.
>
> The log below didn't contain anything I could see that was actionable;
> given that it was an ansible error, I hunted down the config and enabled
> logging. On this run the error was different - the installer log was the
> same, but the reported error (from the installer changed).
>
> The first time, the installer said:
>
> [ INFO  ] TASK [Wait for the host to become non operational]
> [ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts":
> []}, "attempts": 150, "changed": false}
> [ ERROR ] Failed to execute stage 'Closing up': Failed executing
> ansible-playbook
> [ INFO  ] Stage: Clean up
>

'localhost' here is not an issue by itself: the playbook is executed on the
host against the same host over a local connection so localhost is
absolutely fine there.

Maybe you hit this one:
https://bugzilla.redhat.com/show_bug.cgi?id=1540451

It seems NetworkManager related but it's still not that clear.
Stopping NetworkManager and starting network before the deployment seems to
help.
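
A minimal sketch of that workaround, assuming a plain EL7 host that already has
working network-scripts (ifcfg-*) configuration in place:

  systemctl stop NetworkManager
  systemctl disable NetworkManager
  systemctl start network
  systemctl enable network
  # then re-run: hosted-engine --deploy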


>
>
> Second:
>
> [ INFO  ] TASK [Get local vm ip]
> [ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed": true,
> "cmd": "virsh -r net-dhcp-leases default | grep -i 00:16:3e:11:e7:bd | awk
> '{ print $5 }' | cut -f1 -d'/'", "delta": "0:00:00.093840", "end":
> "2018-02-13 16:53:08.658556", "rc": 0, "start": "2018-02-13
> 16:53:08.564716", "stderr": "", "stderr_lines": [], "stdout": "",
> "stdout_lines": []}
> [ ERROR ] Failed to execute stage 'Closing up': Failed executing
> ansible-playbook
> [ INFO  ] Stage: Clean up
>
>
>
>  Ansible log below; as with that second snippet, it appears that it was
> trying to parse out a host name from virsh's list of DHCP leases, couldn't,
> and died.
>
> Which makes sense: I gave it a static IP, and unless I'm missing
> something, setup should not have been doing that. I verified that the
> answer file has the IP:
>
> OVEHOSTED_VM/cloudinitVMStaticCIDR=str:10.181.26.150/24
>
> Anyone see what is wrong here?
>

This is absolutely fine.
The new ansible-based flow (also called node zero) uses an engine running
on a local virtual machine to bootstrap the system.
The bootstrap local VM runs on libvirt's default natted network with its
own DHCP instance; that's why we are consuming it.
The locally running engine will create a target virtual machine on the
shared storage, and that one will instead be configured as you specified.
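
As a sketch, while the deployment is running you can watch the lease the
bootstrap VM gets on that natted network; this is the same read-only query the
installer itself runs, as seen in the log above:

  virsh -r net-dhcp-leases default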



>
> -j
>
>
> hosted-engine --deploy log:
>
> 2018-02-13 16:20:32,138-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 TASK [Force host-deploy in offline mode]
> 2018-02-13 16:20:33,041-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 changed: [localhost]
> 2018-02-13 16:20:33,342-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 TASK [include_tasks]
> 2018-02-13 16:20:33,443-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 ok: [localhost]
> 2018-02-13 16:20:33,744-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 TASK [Obtain SSO token using
> username/password credentials]
> 2018-02-13 16:20:35,248-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 ok: [localhost]
> 2018-02-13 16:20:35,550-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 TASK [Add host]
> 2018-02-13 16:20:37,053-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 changed: [localhost]
> 2018-02-13 16:20:37,355-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 TASK [Wait for the host to become non
> operational]
> 2018-02-13 16:27:48,895-0800 DEBUG 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:94 {u'_ansible_parsed': True,
> u'_ansible_no_log': False, u'changed': False, u'attempts': 150,
> u'invocation': {u'module_args': {u'pattern': u'name=
> ovirt-1.squaretrade.com', u'fetch_nested': False, u'nested_attributes':
> []}}, u'ansible_facts': {u'ovirt_hosts': []}}
> 2018-02-13 16:27:48,995-0800 ERROR 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:98 fatal: [localhost]: FAILED! =>
> {"ansible_facts": {"ovirt_hosts": []}, "attempts": 150, "changed": false}
> 2018-02-13 16:27:49,297-0800 DEBUG 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:94 PLAY RECAP [localhost] : ok: 42 changed:
> 17 unreachable: 0 skipped: 2 failed: 1
> 2018-02-13 16:27:49,397-0800 DEBUG 
> otopi.o

Re: [ovirt-users] VM with multiple vdisks can't migrate

2018-02-14 Thread Maor Lipchuk
Hi Frank,

Can you please attach the VDSM logs from the time of the migration failure
for both hosts:
  ginger.local.systea.fr and victor.local.systea.fr

Thanks,
Maor

On Tue, Feb 13, 2018 at 12:07 PM, fsoyer  wrote:

> Hi all,
> I discovered yesterday a problem when migrating VM with more than one
> vdisk.
> On our test servers (oVirt4.1, shared storage with Gluster), I created 2
> VMs needed for a test, from a template with a 20G vdisk. On this VMs I
> added a 100G vdisk (for this tests I didn't want to waste time to extend
> the existing vdisks... But I lost time finally...). The VMs with the 2
> vdisks works well.
> Now I saw some updates waiting on the host. I tried to put it in
> maintenance... But it stopped on the two VM. They were marked "migrating",
> but no more accessible. Other (small) VMs with only 1 vdisk was migrated
> without problem at the same time.
> I saw that a kvm process for the (big) VMs was launched on the source AND
> destination host, but after tens of minutes, the migration and the VMs was
> always freezed. I tried to cancel the migration for the VMs : failed. The
> only way to stop it was to poweroff the VMs : the kvm process died on the 2
> hosts and the GUI alerted on a failed migration.
> In doubt, I tried to delete the second vdisk on one of this VMs : it
> migrates then without error ! And no access problem.
> I tried to extend the first vdisk of the second VM, the delete the second
> vdisk : it migrates now without problem !
>
> So after another test with a VM with 2 vdisks, I can say that this blocked
> the migration process :(
>
> In engine.log, for a VMs with 1 vdisk migrating well, we see :
>
> 2018-02-12 16:46:29,705+01 INFO  
> [org.ovirt.engine.core.bll.MigrateVmToServerCommand]
> (default task-28) [2f712024-5982-46a8-82c8-fd8293da5725] Lock Acquired to
> object 
> 'EngineLock:{exclusiveLocks='[3f57e669-5e4c-4d10-85cc-d573004a099d=VM]',
> sharedLocks=''}'
> 2018-02-12 16:46:29,955+01 INFO  
> [org.ovirt.engine.core.bll.MigrateVmToServerCommand]
> (org.ovirt.thread.pool-6-thread-32) [2f712024-5982-46a8-82c8-fd8293da5725]
> Running command: MigrateVmToServerCommand internal: false. Entities
> affected :  ID: 3f57e669-5e4c-4d10-85cc-d573004a099d Type: VMAction group
> MIGRATE_VM with role type USER
> 2018-02-12 16:46:30,261+01 INFO  
> [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand]
> (org.ovirt.thread.pool-6-thread-32) [2f712024-5982-46a8-82c8-fd8293da5725]
> START, MigrateVDSCommand( MigrateVDSCommandParameters:{runAsync='true',
> hostId='ce3938b1-b23f-4d22-840a-f17d7cd87bb1',
> vmId='3f57e669-5e4c-4d10-85cc-d573004a099d', srcHost='192.168.0.6',
> dstVdsId='d569c2dd-8f30-4878-8aea-858db285cf69', dstHost='
> 192.168.0.5:54321', migrationMethod='ONLINE', tunnelMigration='false',
> migrationDowntime='0', autoConverge='true', migrateCompressed='false',
> consoleAddress='null', maxBandwidth='500', enableGuestEvents='true',
> maxIncomingMigrations='2', maxOutgoingMigrations='2',
> convergenceSchedule='[init=[{name=setDowntime, params=[100]}],
> stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2,
> action={name=setDowntime, params=[200]}}, {limit=3,
> action={name=setDowntime, params=[300]}}, {limit=4,
> action={name=setDowntime, params=[400]}}, {limit=6,
> action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort,
> params=[]}}]]'}), log id: 14f61ee0
> 2018-02-12 16:46:30,262+01 INFO  [org.ovirt.engine.core.
> vdsbroker.vdsbroker.MigrateBrokerVDSCommand] 
> (org.ovirt.thread.pool-6-thread-32)
> [2f712024-5982-46a8-82c8-fd8293da5725] START, MigrateBrokerVDSCommand(HostName
> = victor.local.systea.fr, MigrateVDSCommandParameters:{runAsync='true',
> hostId='ce3938b1-b23f-4d22-840a-f17d7cd87bb1',
> vmId='3f57e669-5e4c-4d10-85cc-d573004a099d', srcHost='192.168.0.6',
> dstVdsId='d569c2dd-8f30-4878-8aea-858db285cf69', dstHost='
> 192.168.0.5:54321', migrationMethod='ONLINE', tunnelMigration='false',
> migrationDowntime='0', autoConverge='true', migrateCompressed='false',
> consoleAddress='null', maxBandwidth='500', enableGuestEvents='true',
> maxIncomingMigrations='2', maxOutgoingMigrations='2',
> convergenceSchedule='[init=[{name=setDowntime, params=[100]}],
> stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2,
> action={name=setDowntime, params=[200]}}, {limit=3,
> action={name=setDowntime, params=[300]}}, {limit=4,
> action={name=setDowntime, params=[400]}}, {limit=6,
> action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort,
> params=[]}}]]'}), log id: 775cd381
> 2018-02-12 16:46:30,277+01 INFO  [org.ovirt.engine.core.
> vdsbroker.vdsbroker.MigrateBrokerVDSCommand] 
> (org.ovirt.thread.pool-6-thread-32)
> [2f712024-5982-46a8-82c8-fd8293da5725] FINISH, MigrateBrokerVDSCommand,
> log id: 775cd381
> 2018-02-12 16:46:30,285+01 INFO  
> [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand]
> (org.ovirt.thread.pool-6-thread-32) [2f712024-5982-46a8-82c8-fd8293da5725]
> FINISH

Re: [ovirt-users] VMs with multiple vdisks don't migrate

2018-02-14 Thread Maor Lipchuk
Hi Frank,

I already replied on your last email.
Can you provide the VDSM logs from the time of the migration failure for
both hosts:
  ginger.local.systea.fr and victor.local.systea.fr

Thanks,
Maor

On Wed, Feb 14, 2018 at 11:23 AM, fsoyer  wrote:

> Hi all,
> I discovered yesterday a problem when migrating VM with more than one
> vdisk.
> On our test servers (oVirt4.1, shared storage with Gluster), I created 2
> VMs needed for a test, from a template with a 20G vdisk. On this VMs I
> added a 100G vdisk (for this tests I didn't want to waste time to extend
> the existing vdisks... But I lost time finally...). The VMs with the 2
> vdisks works well.
> Now I saw some updates waiting on the host. I tried to put it in
> maintenance... But it stopped on the two VM. They were marked "migrating",
> but no more accessible. Other (small) VMs with only 1 vdisk was migrated
> without problem at the same time.
> I saw that a kvm process for the (big) VMs was launched on the source AND
> destination host, but after tens of minutes, the migration and the VMs was
> always freezed. I tried to cancel the migration for the VMs : failed. The
> only way to stop it was to poweroff the VMs : the kvm process died on the 2
> hosts and the GUI alerted on a failed migration.
> In doubt, I tried to delete the second vdisk on one of this VMs : it
> migrates then without error ! And no access problem.
> I tried to extend the first vdisk of the second VM, the delete the second
> vdisk : it migrates now without problem !
>
> So after another test with a VM with 2 vdisks, I can say that this blocked
> the migration process :(
>
> In engine.log, for a VMs with 1 vdisk migrating well, we see :
>
> 2018-02-12 16:46:29,705+01 INFO  
> [org.ovirt.engine.core.bll.MigrateVmToServerCommand]
> (default task-28) [2f712024-5982-46a8-82c8-fd8293da5725] Lock Acquired to
> object 
> 'EngineLock:{exclusiveLocks='[3f57e669-5e4c-4d10-85cc-d573004a099d=VM]',
> sharedLocks=''}'
> 2018-02-12 16:46:29,955+01 INFO  
> [org.ovirt.engine.core.bll.MigrateVmToServerCommand]
> (org.ovirt.thread.pool-6-thread-32) [2f712024-5982-46a8-82c8-fd8293da5725]
> Running command: MigrateVmToServerCommand internal: false. Entities
> affected :  ID: 3f57e669-5e4c-4d10-85cc-d573004a099d Type: VMAction group
> MIGRATE_VM with role type USER
> 2018-02-12 16:46:30,261+01 INFO  
> [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand]
> (org.ovirt.thread.pool-6-thread-32) [2f712024-5982-46a8-82c8-fd8293da5725]
> START, MigrateVDSCommand( MigrateVDSCommandParameters:{runAsync='true',
> hostId='ce3938b1-b23f-4d22-840a-f17d7cd87bb1',
> vmId='3f57e669-5e4c-4d10-85cc-d573004a099d', srcHost='192.168.0.6',
> dstVdsId='d569c2dd-8f30-4878-8aea-858db285cf69', dstHost='
> 192.168.0.5:54321', migrationMethod='ONLINE', tunnelMigration='false',
> migrationDowntime='0', autoConverge='true', migrateCompressed='false',
> consoleAddress='null', maxBandwidth='500', enableGuestEvents='true',
> maxIncomingMigrations='2', maxOutgoingMigrations='2',
> convergenceSchedule='[init=[{name=setDowntime, params=[100]}],
> stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2,
> action={name=setDowntime, params=[200]}}, {limit=3,
> action={name=setDowntime, params=[300]}}, {limit=4,
> action={name=setDowntime, params=[400]}}, {limit=6,
> action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort,
> params=[]}}]]'}), log id: 14f61ee0
> 2018-02-12 16:46:30,262+01 INFO  [org.ovirt.engine.core.
> vdsbroker.vdsbroker.MigrateBrokerVDSCommand] 
> (org.ovirt.thread.pool-6-thread-32)
> [2f712024-5982-46a8-82c8-fd8293da5725] START, MigrateBrokerVDSCommand(HostName
> = victor.local.systea.fr, MigrateVDSCommandParameters:{runAsync='true',
> hostId='ce3938b1-b23f-4d22-840a-f17d7cd87bb1',
> vmId='3f57e669-5e4c-4d10-85cc-d573004a099d', srcHost='192.168.0.6',
> dstVdsId='d569c2dd-8f30-4878-8aea-858db285cf69', dstHost='
> 192.168.0.5:54321', migrationMethod='ONLINE', tunnelMigration='false',
> migrationDowntime='0', autoConverge='true', migrateCompressed='false',
> consoleAddress='null', maxBandwidth='500', enableGuestEvents='true',
> maxIncomingMigrations='2', maxOutgoingMigrations='2',
> convergenceSchedule='[init=[{name=setDowntime, params=[100]}],
> stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2,
> action={name=setDowntime, params=[200]}}, {limit=3,
> action={name=setDowntime, params=[300]}}, {limit=4,
> action={name=setDowntime, params=[400]}}, {limit=6,
> action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort,
> params=[]}}]]'}), log id: 775cd381
> 2018-02-12 16:46:30,277+01 INFO  [org.ovirt.engine.core.
> vdsbroker.vdsbroker.MigrateBrokerVDSCommand] 
> (org.ovirt.thread.pool-6-thread-32)
> [2f712024-5982-46a8-82c8-fd8293da5725] FINISH, MigrateBrokerVDSCommand,
> log id: 775cd381
> 2018-02-12 16:46:30,285+01 INFO  
> [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand]
> (org.ovirt.thread.pool-6-thread-32) [2f712024-5982-

Re: [ovirt-users] Slow conversion from VMware in 4.1

2018-02-14 Thread Luca 'remix_tj' Lorenzetto
On Tue, Feb 6, 2018 at 11:19 AM, Richard W.M. Jones  wrote:
> On Tue, Feb 06, 2018 at 11:11:37AM +0100, Luca 'remix_tj' Lorenzetto wrote:
>> Il 6 feb 2018 10:52 AM, "Yaniv Kaul"  ha scritto:
>>
>>
>> I assume its network interfaces are also a bottleneck as well. Certainly if
>> they are 1g.
>> Y.
>>
>>
>> That's not the case, vcenter uses 10g and also all the involved hosts.
>>
>> We first supposed the culprit was network, but investigations has cleared
>> its position. Network usage is under 40% with 4 ongoing migrations.
>
> The problem is two-fold and is common to all vCenter transformations:
>
> (1) A single https connection is used and each block of data that is
> requested is processed serially.
>
> (2) vCenter has to forward each request to the ESXi hypervisor.
>
> (1) + (2) => most time is spent waiting on the lengthy round trips for
> each requested block of data.
>
> This is why overlapping multiple parallel conversions works and
> (although each conversion is just as slow) improves throughput,
> because you're filling in the long idle gaps by serving other
> conversions.
>
[cut]

FYI, it was a CPU utilization issue. Now that vCenter has a lower
average CPU usage, migration times have halved and returned to the
original estimates.

Thanks Richard for the info about virt-v2v, we improved our knowledge
of this tool :-)

Luca

-- 
"E' assurdo impiegare gli uomini di intelligenza eccellente per fare
calcoli che potrebbero essere affidati a chiunque se si usassero delle
macchine"
Gottfried Wilhelm von Leibnitz, Filosofo e Matematico (1646-1716)

"Internet è la più grande biblioteca del mondo.
Ma il problema è che i libri sono tutti sparsi sul pavimento"
John Allen Paulos, Matematico (1945-vivente)

Luca 'remix_tj' Lorenzetto, http://www.remixtj.net , 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] ovirt 4.1 unable deploy HostedEngine on next host Configuration value not found: file=/etc/.../hosted-engine.conf

2018-02-14 Thread Reznikov Alexei

On 13.02.2018 13:42, Simone Tiraboschi wrote:


Yes, unfortunately you are absolutely right on that: there is a bug there.
As a side effect, hosted-engine --set-shared-config and hosted-engine 
--get-shared-config always refresh the local copy of the hosted-engine 
configuration files with the copy on the shared storage, and so you 
will always end up with host_id=1 in 
/etc/ovirt-hosted-engine/hosted-engine.conf, which can lead to SPM 
conflicts.
I'd suggest manually fixing the host_id parameter in 
/etc/ovirt-hosted-engine/hosted-engine.conf to its original value 
(double check against the engine DB with 'sudo -u postgres psql engine -c 
"SELECT vds_spm_id, vds.vds_name FROM vds"' on the engine VM) to avoid 
that.

https://bugzilla.redhat.com/1543988
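
As a sketch of that cross-check (paths and the query are the ones given above):

  # On the engine VM: which SPM id does the engine record for each host?
  sudo -u postgres psql engine -c "SELECT vds_spm_id, vds.vds_name FROM vds"

  # On each hosted-engine host: what is currently written locally?
  grep host_id /etc/ovirt-hosted-engine/hosted-engine.conf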

Simone, I'm trying to set the right values... but unfortunately I'm failing.

[root@h3 ovirt-hosted-engine]# cat hosted-engine.conf | grep conf_
conf_volume_UUID=a20d9700-1b9a-41d8-bb4b-f2b7c168104f
conf_image_UUID=b5f353f5-9357-4aad-b1a3-751d411e6278


[root@h3 ~]# hosted-engine --set-shared-config conf_image_UUID 
b5f353f5-9357-4aad-b1a3-751d411e6278 --type he_conf

Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
  .
  File 
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", 
line 226, in get

    key
KeyError: 'Configuration value not found: 
file=/etc/ovirt-hosted-engine/hosted-engine.conf, key=conf_volume_UUID'


How can I fix this? Or is there another way to edit hosted-engine.conf on 
the shared storage?



Regards,

Alex.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Ovirt 3.6 to 4.2 upgrade

2018-02-14 Thread Gary Lloyd
Hi Yaniv

We attempted to share the code a few years back, but I don't think it got
accepted.

In vdsm.conf we have two bridged interfaces, each connected to a SAN uplink:

[irs]
iscsi_default_ifaces = san1,san2

And here is a diff of the file
/usr/lib/python2.7/site-packages/vdsm/storage/ vs the original for
vdsm-4.20.17-1:

463,498c463,464
<
< # Original Code ##
<
< #iscsi.addIscsiNode(self._iface, self._target, self._cred)
< #timeout = config.getint("irs", "udev_settle_timeout")
< #udevadm.settle(timeout)
<
< ### Altered Code for EqualLogic Direct LUNs for Keele University : G.Lloyd ###
<
< ifaceNames = config.get('irs', 'iscsi_default_ifaces').split(',')
< if not ifaceNames:
<     iscsi.addIscsiNode(self._iface, self._target, self._cred)
< else:
<     self.log.debug("Connecting on interfaces: {}".format(ifaceNames))
<     #for ifaceName in ifaceNames:
<     success = False
<     while ifaceNames:
<         self.log.debug("Remaining interfaces to try: {}".format(ifaceNames))
<         ifaceName = ifaceNames.pop()
<         try:
<             self.log.debug("Connecting on {}".format(ifaceName))
<             iscsi.addIscsiNode(iscsi.IscsiInterface(ifaceName), self._target, self._cred)
<             self.log.debug("Success connecting on {}".format(ifaceName))
<             success = True
<         except:
<             self.log.debug("Failure connecting on interface {}".format(ifaceName))
<             if ifaceNames:
<                 self.log.debug("More iscsi interfaces to try, continuing")
<                 pass
<             elif success:
<                 self.log.debug("Already succeded on an interface, continuing")
<                 pass
<             else:
<                 self.log.debug("Could not connect to iscsi target on any interface, raising exception")
<                 raise
<     timeout = config.getint("irs", "scsi_settle_timeout")
---
> iscsi.addIscsiNode(self._iface, self._target, self._cred)
> timeout = config.getint("irs", "udev_settle_timeout")
501,502d466
< ### End of Custom Alterations ###
<

Regards

*Gary Lloyd*

I.T. Systems:Keele University
Finance & IT Directorate
Keele:Staffs:IC1 Building:ST5 5NB:UK
+44 1782 733063


On 11 February 2018 at 08:38, Yaniv Kaul  wrote:

>
>
> On Fri, Feb 9, 2018 at 4:06 PM, Gary Lloyd  wrote:
>
>> Hi
>>
>> Is it possible/supported to upgrade from Ovirt 3.6 straight to Ovirt 4.2 ?
>>
>
> No, you go through 4.0, 4.1.
>
>
>> Does live migration still function between the older vdsm nodes and vdsm
>> nodes with software built against Ovirt 4.2 ?
>>
>
> Yes, keep the cluster level at 3.6.
>
>
>>
>> We changed a couple of the vdsm python files to enable iscsi multipath on
>> direct luns.
>> (It's a fairly simple change to a couple of the python files).
>>
>
> Nice!
> Can you please contribute those patches to oVirt?
> Y.
>
>
>>
>> We've been running it this way since 2012 (Ovirt 3.2).
>>
>> Many Thanks
>>
>> *Gary Lloyd*
>> 
>> I.T. Systems:Keele University
>> Finance & IT Directorate
>> Keele:Staffs:IC1 Building:ST5 5NB:UK
>> +44 1782 733063 <%2B44%201782%20733073>
>> 
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Import Domain and snapshot issue ... please help !!!

2018-02-14 Thread Maor Lipchuk
Seems like all the engine logs are full of the same error.
From vdsm.log.16.xz I can see an error which might explain this failure:

2018-02-12 07:51:16,161+0100 INFO  (ioprocess communication (40573))
[IOProcess] Starting ioprocess (__init__:447)
2018-02-12 07:51:16,201+0100 INFO  (jsonrpc/3) [vdsm.api] FINISH
mergeSnapshots return=None from=:::10.0.0.46,57032,
flow_id=fd4041b3-2301-44b0-aa65-02bd089f6568,
task_id=1be430dc-eeb0-4dc9-92df-3f5b7943c6e0 (api:52)
2018-02-12 07:51:16,275+0100 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC
call Image.mergeSnapshots succeeded in 0.13 seconds (__init__:573)
2018-02-12 07:51:16,276+0100 INFO  (tasks/1)
[storage.ThreadPool.WorkerThread] START task
1be430dc-eeb0-4dc9-92df-3f5b7943c6e0 (cmd=>, args=None)
(threadPool:208)
2018-02-12 07:51:16,543+0100 INFO  (tasks/1) [storage.Image]
sdUUID=47b7c9aa-ef53-48bc-bb55-4a1a0ba5c8d5 vmUUID=
imgUUID=ee9ab34c-47a8-4306-95d7-dd4318c69ef5
ancestor=9cdc96de-65b7-4187-8ec3-8190b78c1825
successor=8f595e80-1013-4c14-a2f5-252bce9526fd postZero=False discard=False
(image:1240)
2018-02-12 07:51:16,669+0100 ERROR (tasks/1) [storage.TaskManager.Task]
(Task='1be430dc-eeb0-4dc9-92df-3f5b7943c6e0') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882,
in _run
return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336,
in run
return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line
79, in wrapper
return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1853, in
mergeSnapshots
discard)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line 1251,
in merge
srcVol = vols[successor]
KeyError: u'8f595e80-1013-4c14-a2f5-252bce9526fd'

Ala, maybe you know if there is any known issue with mergeSnapshots?
The use case here is VMs from oVirt 3.5 which got registered to oVirt 4.2.

Regards,
Maor


On Wed, Feb 14, 2018 at 10:11 AM, Enrico Becchetti <
enrico.becche...@pg.infn.it> wrote:

>   Hi,
> also you can download them throught these
> links:
>
> https://owncloud.pg.infn.it/index.php/s/QpsTyGxtRTPYRTD
> https://owncloud.pg.infn.it/index.php/s/ph8pLcABe0nadeb
>
> Thanks again 
>
> Best Regards
> Enrico
>
> Il 13/02/2018 14:52, Maor Lipchuk ha scritto:
>
>
>
> On Tue, Feb 13, 2018 at 3:51 PM, Maor Lipchuk  wrote:
>
>>
>> On Tue, Feb 13, 2018 at 3:42 PM, Enrico Becchetti <
>> enrico.becche...@pg.infn.it> wrote:
>>
>>> see the attach files please ... thanks for your attention !!!
>>>
>>
>>
>> Seems like the engine logs does not contain the entire process, can you
>> please share older logs since the import operation?
>>
>
> And VDSM logs as well from your host
>
>
>>
>>
>>> Best Regards
>>> Enrico
>>>
>>>
>>> Il 13/02/2018 14:09, Maor Lipchuk ha scritto:
>>>
>>>
>>>
>>> On Tue, Feb 13, 2018 at 1:48 PM, Enrico Becchetti <
>>> enrico.becche...@pg.infn.it> wrote:
>>>
  Dear All,
 I have been using ovirt for a long time with three hypervisors and an
 external engine running in a centos vm .

 This three hypervisors have HBAs and access to fiber channel storage.
 Until recently I used version 3.5, then I reinstalled everything from
 scratch and now I have 4.2.

 Before formatting everything, I detach the storage data domani (FC)
 with the virtual machines and reimported it to the new 4.2 and all went
 well. In
 this domain there were virtual machines with and without snapshots.

 Now I have two problems. The first is that if I try to delete a
 snapshot the process is not end successful and remains hanging and the
 second problem is that
 in one case I lost the virtual machine !!!

>>>
>>>
>>> Not sure that I fully understand the scneario.'
>>> How was the virtual machine got lost if you only tried to delete a
>>> snapshot?
>>>
>>>

 So I need your help to kill the three running zombie tasks because with
 taskcleaner.sh I can't do anything and then I need to know how I can delete
 the old snapshots
 made with the 3.5 without losing other data or without having new
 processes that terminate correctly.

 If you want some log files please let me know.

>>>
>>>
>>> Hi Enrico,
>>>
>>> Can you please attach the engine and VDSM logs
>>>
>>>

 Thank you so much.
 Best Regards
 Enrico



 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users


>>>
>>> --
>>> ___
>>>
>>> Enrico BecchettiServizio di Calcolo e Reti
>>>
>>> Istituto Nazionale di Fisica Nucleare - Sezione di Perugia
>>> Via Pascoli,c/o Dipartimento di Fisica  06123 Perugia (ITALY)
>>> Phone:+39 075 5852777 <+39%20

Re: [ovirt-users] [Qemu-block] qcow2 images corruption

2018-02-14 Thread Nicolas Ecarnot



https://framadrop.org/r/Lvvr392QZo#/wOeYUUlHQAtkUw1E+x2YdqTqq21Pbic6OPBIH0TjZE=

On 14/02/2018 00:01, John Snow wrote:



On 02/13/2018 04:41 AM, Kevin Wolf wrote:

On 07.02.2018 18:06, Nicolas Ecarnot wrote:

TL; DR : qcow2 images keep getting corrupted. Any workaround?


Not without knowing the cause.

The first thing to make sure is that the image isn't touched by a second
process while QEMU is running a VM. The classic one is using 'qemu-img
snapshot' on the image of a running VM, which is instant corruption (and
newer QEMU versions have locking in place to prevent this), but we have
seen more absurd cases of things outside QEMU tampering with the image
when we were investigating previous corruption reports.

This covers the majority of all reports, we haven't had a real
corruption caused by a QEMU bug in ages.


After having found (https://access.redhat.com/solutions/1173623) the right
logical volume hosting the qcow2 image, I can run qemu-img check on it.
- On 80% of my VMs, I find no errors.
- On 15% of them, I find Leaked cluster errors that I can correct using
"qemu-img check -r all"
- On 5% of them, I find leaked cluster errors and further fatal errors,
which cannot be corrected with qemu-img.
In rare cases, qemu-img can correct them, but it destroys large parts of the
image (it becomes unusable), and in other cases it cannot correct them at all.
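
As a sketch, the per-image check described above looks roughly like this (the
volume group and LV names are placeholders, and the LV must not be attached to
a running VM while it is checked):

  lvchange -ay VG/LV                  # activate the LV holding the qcow2 image
  qemu-img check /dev/VG/LV           # report leaked clusters / errors
  qemu-img check -r all /dev/VG/LV    # repairs leaked clusters; fatal corruption may remain
  lvchange -an VG/LV                  # deactivate again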


It would be good if you could make the 'qemu-img check' output available
somewhere.

It would be even better if we could have a look at the respective image.
I seem to remember that John (CCed) had a few scripts to analyse
corrupted qcow2 images, maybe we would be able to see something there.



Hi! I did write a pretty simplistic tool for trying to tell the shape of
a corruption at a glance. It seems to work pretty similarly to the other
tool you already found, but it won't hurt anything to run it:

https://github.com/jnsnow/qcheck

(Actually, that other tool looks like it has an awful lot of options.
I'll have to check it out.)

It can print a really upsetting amount of data (especially for very
corrupt images), but in the default case, the simple setting should do
the trick just fine.

You could always put the output from this tool in a pastebin too; it
might help me visualize the problem a bit more -- I find seeing the
exact offsets and locations of where all the various tables and things
to be pretty helpful.

You can also always use the "deluge" option and compress it if you want,
just don't let it print to your terminal:

jsnow@probe (dev) ~/s/qcheck> ./qcheck -xd
/home/bos/jsnow/src/qemu/bin/git/install_test_f26.qcow2 > deluge.log;
and ls -sh deluge.log
4.3M deluge.log

but it compresses down very well:

jsnow@probe (dev) ~/s/qcheck> 7z a -t7z -m0=ppmd deluge.ppmd.7z deluge.log
jsnow@probe (dev) ~/s/qcheck> ls -s deluge.ppmd.7z
316 deluge.ppmd.7z

So I suppose if you want to send along:
(1) The basic output without any flags, in a pastebin
(2) The zipped deluge output, just in case

and I will try my hand at guessing what went wrong.


(Also, maybe my tool will totally choke for your image, who knows. It
hasn't received an overwhelming amount of testing apart from when I go
to use it personally and inevitably wind up displeased with how it
handles certain situations, so ...)


What I read similar to my case is :
- usage of qcow2
- heavy disk I/O
- using the virtio-blk driver

In the proxmox thread, they tend to say that using virtio-scsi is the
solution. Having asked this question to oVirt experts
(https://lists.ovirt.org/pipermail/users/2018-February/086753.html) but it's
not clear the driver is to blame.


This seems very unlikely. The corruption you're seeing is in the qcow2
metadata, not only in the guest data. If anything, virtio-scsi exercises
more qcow2 code paths than virtio-blk, so any potential bug that affects
virtio-blk should also affect virtio-scsi, but not the other way around.


I agree with the answer Yaniv Kaul gave to me, saying I have to properly
report the issue, so I'm longing to know which peculiar information I can
give you now.


To be honest, debugging corruption after the fact is pretty hard. We'd
need the 'qemu-img check' output and ideally the image to do anything,
but I can't promise that anything would come out of this.

Best would be a reproducer, or at least some operation that you can link
to the appearance of the corruption. Then we could take a more targeted
look at the respective code.


As you can imagine, all this setup is in production, and for most of the
VMs, I can not "play" with them. Moreover, we launched a campaign of nightly
stopping every VM, qemu-img check them one by one, then boot.
So it might take some time before I find another corrupted image.
(which I'll preciously store for debug)

Other informations : We very rarely do snapshots, but I'm close to imagine
that automated migrations of VMs could trigger similar behaviors on qcow2
images.


To my knowledge, oVirt only uses ex

Re: [ovirt-users] Import Domain and snapshot issue ... please help !!!

2018-02-14 Thread Enrico Becchetti

Dear All,
old snapshots seem to be the problem. In fact, domain DATA_FC, running in 3.5, 
had some LVM snapshot volumes. Before deactivating DATA_FC I didn't remove these 
snapshots, so when I attached this volume to the new oVirt 4.2 and imported all 
the VMs, I also imported all the snapshots at the same time. But now how can I 
remove them? Through the oVirt web interface the remove tasks are still hanging. 
Are there any other methods?
Thanks for following this case.
Best Regards
Enrico

On 14/02/2018 14:34, Maor Lipchuk wrote:

Seems like all the engine logs are full with the same error.
From vdsm.log.16.xz I can see an error which might explain this failure:

2018-02-12 07:51:16,161+0100 INFO  (ioprocess communication (40573)) 
[IOProcess] Starting ioprocess (__init__:447)
2018-02-12 07:51:16,201+0100 INFO  (jsonrpc/3) [vdsm.api] FINISH 
mergeSnapshots return=None from=:::10.0.0.46,57032, 
flow_id=fd4041b3-2301-44b0-aa65-02bd089f6568, 
task_id=1be430dc-eeb0-4dc9-92df-3f5b7943c6e0 (api:52)
2018-02-12 07:51:16,275+0100 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] 
RPC call Image.mergeSnapshots succeeded in 0.13 seconds (__init__:573)
2018-02-12 07:51:16,276+0100 INFO  (tasks/1) 
[storage.ThreadPool.WorkerThread] START task 
1be430dc-eeb0-4dc9-92df-3f5b7943c6e0 (cmd=>, args=None) 
(threadPool:208)
2018-02-12 07:51:16,543+0100 INFO  (tasks/1) [storage.Image] 
sdUUID=47b7c9aa-ef53-48bc-bb55-4a1a0ba5c8d5 vmUUID= 
imgUUID=ee9ab34c-47a8-4306-95d7-dd4318c69ef5 
ancestor=9cdc96de-65b7-4187-8ec3-8190b78c1825 
successor=8f595e80-1013-4c14-a2f5-252bce9526fdpostZero=False 
discard=False (image:1240)
2018-02-12 07:51:16,669+0100 ERROR (tasks/1) 
[storage.TaskManager.Task] 
(Task='1be430dc-eeb0-4dc9-92df-3f5b7943c6e0') Unexpected error (task:875)

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 
882, in _run

    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 
336, in run

    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", 
line 79, in wrapper

    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 
1853, in mergeSnapshots

    discard)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line 
1251, in merge

    srcVol = vols[successor]
KeyError: u'8f595e80-1013-4c14-a2f5-252bce9526fd'

Ala, maybe you know if there is any known issue with mergeSnapshots?
The usecase here are VMs from oVirt 3.5 which got registered to oVirt 4.2.

Regards,
Maor


On Wed, Feb 14, 2018 at 10:11 AM, Enrico Becchetti 
<enrico.becche...@pg.infn.it> wrote:


  Hi,
also you can download them throught these
links:

https://owncloud.pg.infn.it/index.php/s/QpsTyGxtRTPYRTD

https://owncloud.pg.infn.it/index.php/s/ph8pLcABe0nadeb


Thanks again 

Best Regards
Enrico


On 13/02/2018 14:52, Maor Lipchuk wrote:



On Tue, Feb 13, 2018 at 3:51 PM, Maor Lipchuk
<mlipc...@redhat.com> wrote:


On Tue, Feb 13, 2018 at 3:42 PM, Enrico Becchetti
<enrico.becche...@pg.infn.it> wrote:

see the attach files please ... thanks for your
attention !!!



Seems like the engine logs does not contain the entire
process, can you please share older logs since the import
operation?


And VDSM logs as well from your host

Best Regards
Enrico


On 13/02/2018 14:09, Maor Lipchuk wrote:



On Tue, Feb 13, 2018 at 1:48 PM, Enrico Becchetti
<enrico.becche...@pg.infn.it> wrote:

 Dear All,
I have been using ovirt for a long time with three
hypervisors and an external engine running in a
centos vm .

This three hypervisors have HBAs and access to
fiber channel storage. Until recently I used
version 3.5, then I reinstalled everything from
scratch and now I have 4.2.

Before formatting everything, I detach the storage
data domani (FC) with the virtual machines and
reimported it to the new 4.2 and all went well. In
this domain there were virtual machines with and
without snapshots.

Now I have two problems. The first is that if I try
to delete a snapshot the process is not end
successful and remains hanging and the second
problem is that
in one case I lost the virtual machine !!!



Not sure that I fully understand the scneario.'
How was the virtual machine got lost if you only tried
to delete a snapshot?


  

[ovirt-users] Q: Upgrade 4.2 -> 4.2.1 Dependency Problem

2018-02-14 Thread Andrei V
Hi !

I ran into an unexpected problem upgrading an oVirt node (installed manually on 
CentOS):
this problem has to be fixed manually, otherwise the upgrade command from the 
engine also fails.

-> glusterfs-rdma = 3.12.5-2.el7
was installed manually to resolve a dependency of 
ovirt-host-4.2.1-1.el7.centos.x86_64

Q: How to get around this problem? Thanks in advance.


Error: Package: ovirt-host-4.2.1-1.el7.centos.x86_64 (ovirt-4.2)
   Requires: glusterfs-rdma
   Removing: glusterfs-rdma-3.12.5-2.el7.x86_64 
(@ovirt-4.2-centos-gluster312)
   glusterfs-rdma = 3.12.5-2.el7
   Obsoleted By: 
mlnx-ofa_kernel-3.4-OFED.3.4.2.1.5.1.ged26eb5.1.rhel7u3.x86_64 (HP-spp)
   Not found
   Available: glusterfs-rdma-3.8.4-18.4.el7.centos.x86_64 (base)
   glusterfs-rdma = 3.8.4-18.4.el7.centos
   Available: glusterfs-rdma-3.12.0-1.el7.x86_64 
(ovirt-4.2-centos-gluster312)
   glusterfs-rdma = 3.12.0-1.el7
   Available: glusterfs-rdma-3.12.1-1.el7.x86_64 
(ovirt-4.2-centos-gluster312)
   glusterfs-rdma = 3.12.1-1.el7
   Available: glusterfs-rdma-3.12.1-2.el7.x86_64 
(ovirt-4.2-centos-gluster312)
   glusterfs-rdma = 3.12.1-2.el7
   Available: glusterfs-rdma-3.12.3-1.el7.x86_64 
(ovirt-4.2-centos-gluster312)
   glusterfs-rdma = 3.12.3-1.el7
   Available: glusterfs-rdma-3.12.4-1.el7.x86_64 
(ovirt-4.2-centos-gluster312)
   glusterfs-rdma = 3.12.4-1.el7



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Q: Upgrade 4.2 -> 4.2.1 Dependency Problem

2018-02-14 Thread Luca 'remix_tj' Lorenzetto
On Wed, Feb 14, 2018 at 4:26 PM, Andrei V  wrote:
> Hi !
>
> I run into unexpected problem upgrading oVirt node (installed manually on 
> CentOS):
> This problem have to be fixed manually otherwise upgrade command from host 
> engine also fail.
>
> -> glusterfs-rdma = 3.12.5-2.el7
> was installed manually as a dependency resolution for 
> ovirt-host-4.2.1-1.el7.centos.x86_64
>
> Q: How to get around this problem? Thanks in advance.
>
>
> Error: Package: ovirt-host-4.2.1-1.el7.centos.x86_64 (ovirt-4.2)
>Requires: glusterfs-rdma
>Removing: glusterfs-rdma-3.12.5-2.el7.x86_64 
> (@ovirt-4.2-centos-gluster312)
>glusterfs-rdma = 3.12.5-2.el7
>Obsoleted By: 
> mlnx-ofa_kernel-3.4-OFED.3.4.2.1.5.1.ged26eb5.1.rhel7u3.x86_64 (HP-spp)
>Not found
[cut]

Try with yum clean all and then upgrade. And disable the HP-spp repo if you
don't need it right now.
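
Something along these lines (the repo id HP-spp is taken from the error output
above; adjust it to whatever id your .repo file actually uses):

  yum clean all
  yum --disablerepo=HP-spp update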


Luca
-- 
"E' assurdo impiegare gli uomini di intelligenza eccellente per fare
calcoli che potrebbero essere affidati a chiunque se si usassero delle
macchine"
Gottfried Wilhelm von Leibnitz, Filosofo e Matematico (1646-1716)

"Internet è la più grande biblioteca del mondo.
Ma il problema è che i libri sono tutti sparsi sul pavimento"
John Allen Paulos, Matematico (1945-vivente)

Luca 'remix_tj' Lorenzetto, http://www.remixtj.net , 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Q: Upgrade 4.2 -> 4.2.1 Dependency Problem

2018-02-14 Thread andreil1


> On 14 Feb 2018, at 17:31, Luca 'remix_tj' Lorenzetto 
>  wrote:
> 
> On Wed, Feb 14, 2018 at 4:26 PM, Andrei V  wrote:
>> Hi !
>> 
>> I run into unexpected problem upgrading oVirt node (installed manually on 
>> CentOS):
>> This problem have to be fixed manually otherwise upgrade command from host 
>> engine also fail.
>> 
>> -> glusterfs-rdma = 3.12.5-2.el7
>> was installed manually as a dependency resolution for 
>> ovirt-host-4.2.1-1.el7.centos.x86_64
>> 
>> Q: How to get around this problem? Thanks in advance.
>> 
>> 
>> Error: Package: ovirt-host-4.2.1-1.el7.centos.x86_64 (ovirt-4.2)
>>   Requires: glusterfs-rdma
>>   Removing: glusterfs-rdma-3.12.5-2.el7.x86_64 
>> (@ovirt-4.2-centos-gluster312)
>>   glusterfs-rdma = 3.12.5-2.el7
>>   Obsoleted By: 
>> mlnx-ofa_kernel-3.4-OFED.3.4.2.1.5.1.ged26eb5.1.rhel7u3.x86_64 (HP-spp)
>>   Not found
> [cut]
> 
> Try with yum clean all and then upgrade. And disable HP-spp if you
> don't them need now.
> 
> 
> Luca



"yum clean all" did before posting on oVirt list, same problem.

However, disabling HP-spp repository did the trick, thanks a lot !

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] ovirt 4.2 gluster configuration

2018-02-14 Thread Edoardo Mazza
Hi all,
Scenario:
3 nodes each with 3 interfaces: 1 for management, 1 for gluster, 1 for VMs
The management interface has its own name and its own IP (e.g. name = ov1, ip =
192.168.1.1/24); the same goes for the gluster interface, which has its own name
and its own IP (e.g. name = gluster1, ip = 192.168.2.1/24).

When configuring bricks from the oVirt management tools I get the error: "no
uuid for the name ov1".

Network for gluster communication has been defined on network/interface
gluster1.

What's wrong with this configuration?
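
A diagnostic sketch, not from the thread: the error suggests the engine is
looking up the host by a name that gluster has no peer UUID for, so it can help
to check which names the gluster peers are actually registered under:

  gluster peer status   # on any node: peer hostnames and UUIDs
  gluster pool list     # UUID / hostname / state for the whole pool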

Thanks in advance.

Edoardo
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Unable to connect to the graphic server

2018-02-14 Thread Alex Bartonek

 Original Message 
 On February 14, 2018 2:23 AM, Yedidyah Bar David  wrote:

>On Wed, Feb 14, 2018 at 5:20 AM, Alex Bartonek a...@unix1337.com wrote:
>>I've built and rebuilt about 4 oVirt servers.  Consider myself pretty good
>> at this.  LOL.
>> So I am setting up a oVirt server for a friend on his r710.  CentOS 7, ovirt
>> 4.2.   /etc/hosts has the correct IP and FQDN setup.
>>When I build a VM and try to open a console session via  SPICE I am unable
>> to connect to the graphic server.  I'm connecting from a Windows 10 box.
>> Using virt-manager to connect.
>>
> What happens when you try?
>

"Unable to connect to the graphic console" is what the error says. Here is the 
.vv file, minus the cert stuff:

[virt-viewer]
type=spice
host=192.168.1.83
port=-1
password=
# Password is valid for 120 seconds.
delete-this-file=1
fullscreen=0
title=Win_7_32bit:%d
toggle-fullscreen=shift+f11
release-cursor=shift+f12
tls-port=5900
enable-smartcard=0
enable-usb-autoshare=1
usb-filter=-1,-1,-1,-1,0
tls-ciphers=DEFAULT
host-subject=O=williams.com,CN=randb.williams.com



Port 5900 is listening on that IP on the server, so that looks correct. I shut the 
firewall off just in case it was the issue... no go.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Virtual networks in oVirt 4.2 and MTU 1500

2018-02-14 Thread Dmitry Semenov
I have a fairly small cluster on oVirt 4.2.
Each node has a bond, which in turn carries several VLANs.
I use OVN virtual networks (External Provider -> ovirt-provider-ovn).

While testing I noticed that inside a virtual network the MTU must be less than 1500, 
so my question is: can I change something on the network or on the bond so that 
everything in the virtual network works correctly with an MTU of 1500?

Below link with my settings:
https://pastebin.com/F7ssCVFa
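
One possible approach, sketched here as an assumption rather than something
verified on this setup: ovirt-provider-ovn networks are tunnelled, so the
encapsulation overhead has to fit inside the MTU of the underlying interfaces.
Raising the MTU on the bond/VLAN that carries the tunnel traffic (interface
names below are placeholders) leaves room for guests to keep MTU 1500:

  ip link set dev bond0 mtu 1600        # bond carrying the tunnel endpoint
  ip link set dev bond0.100 mtu 1600    # VLAN used for the tunnel traffic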

-- 
Best regards
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hosted engine install fails on useless DHCP lookup

2018-02-14 Thread Jamie Lawrence
> On Feb 14, 2018, at 1:27 AM, Simone Tiraboschi  wrote:
> On Wed, Feb 14, 2018 at 2:11 AM, Jamie Lawrence  
> wrote:
> Hello,
> 
> I'm seeing the hosted engine install fail on an Ansible playbook step. Log 
> below. I tried looking at the file specified for retry, below 
> (/usr/share/ovirt-hosted-engine-setup/ansible/bootstrap_local_vm.retry); it 
> contains the word, 'localhost'.
> 
> The log below didn't contain anything I could see that was actionable; given 
> that it was an ansible error, I hunted down the config and enabled logging. 
> On this run the error was different - the installer log was the same, but the 
> reported error (from the installer changed).
> 
> The first time, the installer said:
> 
> [ INFO  ] TASK [Wait for the host to become non operational]
> [ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": 
> []}, "attempts": 150, "changed": false}
> [ ERROR ] Failed to execute stage 'Closing up': Failed executing 
> ansible-playbook
> [ INFO  ] Stage: Clean up
> 
> 'localhost' here is not an issue by itself: the playbook is executed on the 
> host against the same host over a local connection so localhost is absolutely 
> fine there.
> 
> Maybe you hit this one:
> https://bugzilla.redhat.com/show_bug.cgi?id=1540451

That seems likely. 


> It seams NetworkManager related but still not that clear.
> Stopping NetworkManager and starting network before the deployment seams to 
> help.

Tried this, got the same results.

[snip]
> Anyone see what is wrong here?
> 
> This is absolutely fine.
> The new ansible based flow (also called node zero) uses an engine running on 
> a local virtual machine to bootstrap the system.
> The bootstrap local VM runs over libvirt default natted network with its own 
> dhcp instance, that's why we are consuming it.
> The locally running engine will create a target virtual machine on the shared 
> storage and that one will be instead configured as you specified.

Thanks for the context - that's useful, and presumably explains why 192.168 
addresses (which we don't use) are appearing in the logs.

Not being entirely sure where to go from here, I guess I'll spend the evening 
figuring out ansible-ese in order to try to figure out why it is blowing chunks.

Thanks for the note. 

-j
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Unable to connect to the graphic server

2018-02-14 Thread Yedidyah Bar David
On Wed, Feb 14, 2018 at 9:20 PM, Alex Bartonek  wrote:
>
>  Original Message 
>  On February 14, 2018 2:23 AM, Yedidyah Bar David  wrote:
>
>>On Wed, Feb 14, 2018 at 5:20 AM, Alex Bartonek a...@unix1337.com wrote:
>>>I've built and rebuilt about 4 oVirt servers.  Consider myself pretty good
>>> at this.  LOL.
>>> So I am setting up a oVirt server for a friend on his r710.  CentOS 7, ovirt
>>> 4.2.   /etc/hosts has the correct IP and FQDN setup.
>>>When I build a VM and try to open a console session via  SPICE I am unable
>>> to connect to the graphic server.  I'm connecting from a Windows 10 box.
>>> Using virt-manager to connect.
>>>
>> What happens when you try?
>>
>
> Unable to connect to the graphic console is what the error says.  Here is the 
> .vv file other than the cert stuff in it:
>
> [virt-viewer]
> type=spice
> host=192.168.1.83
> port=-1
> password=
> # Password is valid for 120 seconds.
> delete-this-file=1
> fullscreen=0
> title=Win_7_32bit:%d
> toggle-fullscreen=shift+f11
> release-cursor=shift+f12
> tls-port=5900
> enable-smartcard=0
> enable-usb-autoshare=1
> usb-filter=-1,-1,-1,-1,0
> tls-ciphers=DEFAULT
> host-subject=O=williams.com,CN=randb.williams.com
>
>
>
> Port 5900 is listening by IP on the server, so that looks correct.  I shut 
> the firewall off just in case it was the issue..no go.

Did you verify that you can connect there manually (e.g. with telnet)?
Can you run a sniffer on both sides to make sure traffic passes correctly?
Can you check vdsm/libvirt logs on the host side?

Thanks,
-- 
Didi
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Moving Combined Engine & Node to new network.

2018-02-14 Thread Rulas Mur
Hi,

I set up the host+engine on CentOS 7 on my home network and everything
worked perfectly. However, when I connected it to my work network, networking
failed completely.

hostname -I would be blank.
lspci does list the hardware
nmcli d is empty
nmcli con show is empty
nmcli device status is empty

there is a device in /sys/class/net/

Is there a way to fix this? or do I have to reinstall?

On another note, ovirt is amazing!

Thanks for the quality product,
Rulasmur
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users