[ovirt-users] Re: Storage domain 'Inactive' but still functional

2019-07-24 Thread Martijn Grendelman
On 24-7-2019 at 10:07, Benny Zlotnik wrote:
We have seen something similar in the past and patches were posted to deal with 
this issue, but it's still in progress[1]

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1553133

That's some interesting reading, and it sure looks like the problem I had. 
Thanks!

Best regards,
Martijn.



On Mon, Jul 22, 2019 at 8:07 PM Strahil <hunter86...@yahoo.com> wrote:

I have a theory... but without any proof it will remain a theory.

The storage volumes are just VGs on shared storage. The SPM host is supposed 
to be the only one that works with the LVM metadata, but I have observed that 
when someone runs a simple LVM command (for example lvs, vgs or pvs) on one 
host while another LVM operation is in progress on another host, the metadata 
can get corrupted, due to the lack of clvmd.

As a protection, you could try the following:
1. Create a new iSCSI LUN.
2. Share it to all nodes and create the storage domain. Set it to maintenance.
3. Start the dlm and clvmd services on all hosts.
4. Set the 'clustered' flag on the VG of your shared storage domain:
vgchange -c y mynewVG
5. Check the LVs of that VG.
6. Activate the storage domain.

Of course, test it on a test cluster before implementing it in production.
This is one of the approaches used in Linux HA clusters to avoid LVM 
metadata corruption.
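
Steps 3 to 5 above could be scripted roughly as follows. This is only a sketch: starting the services through systemctl and the helper name clustered_vg_commands are assumptions, not part of the procedure as described. The function only builds the command lines, so they can be reviewed before anything is run on a host.

```python
import shlex


def clustered_vg_commands(vg_name):
    """Build the command lines for steps 3-5 (hypothetical helper).

    Nothing is executed here; the caller decides when and where to run them.
    """
    return [
        # Step 3: start the lock manager and the clustered LVM daemon.
        # Using systemctl is an assumption about the hosts' init system.
        ["systemctl", "start", "dlm", "clvmd"],
        # Step 4: mark the VG as clustered.
        ["vgchange", "-c", "y", vg_name],
        # Step 5: list the LVs of that VG as a sanity check.
        ["lvs", vg_name],
    ]


if __name__ == "__main__":
    for cmd in clustered_vg_commands("mynewVG"):
        print(shlex.join(cmd))
```

Printing the commands first, instead of running them directly, fits the advice to try this on a test cluster before production.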

Best Regards,
Strahil Nikolov

On Jul 22, 2019 15:46, Martijn Grendelman <martijn.grendel...@isaac.nl> wrote:
Hi,

On 22-7-2019 at 14:30, Strahil wrote:

If you can give directions (some kind of history), the devs might try to 
reproduce this type of issue.

If it is reproducible, a fix can be provided.

Based on my experience, if something as widely used as Linux LVM gets broken, 
the case is very hard to reproduce.

Yes, I'd think so too, especially since this activity (online moving of disk 
images) is done all the time, mostly without problems. In this case, there was 
a lot of activity on all storage domains, because I'm moving all my storage 
(> 10TB in 185 disk images) to a new storage platform. During the online move 
of one of the images, the metadata checksum became corrupted and the storage 
domain went offline.

Of course, I could dig up the engine logs and vdsm logs of when it happened, 
but that would be some work and I'm not very confident that the actual cause 
would be in there.

If any oVirt devs are interested in the logs, I'll provide them, but otherwise 
I think I'll just see it as an incident and move on.

Best regards,
Martijn.




___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/37UDAWDXON3URKVGSR3YGIZML2ZVPZOG/

--
Kind regards,

Martijn Grendelman, Infrastructure Architect
T: +31 (0)40 264 94 44

ISAAC, Marconilaan 16, 5621 AA Eindhoven, The Netherlands
T: +31 (0)40 290 89 79   www.isaac.nl


[ovirt-users] Re: Storage domain 'Inactive' but still functional

2019-07-22 Thread Martijn Grendelman
Hi,

Thanks for the tips! I didn't know about 'pvmove', thanks.

In the meantime, I managed to get it fixed by restoring the VG metadata on 
the iSCSI server, so on the underlying Zvol directly, rather than via the iSCSI 
session on the oVirt host. That allowed me to perform the restore without 
bringing all VMs down, which was important to me, because if I had to shut down 
VMs, I was sure I wouldn't be able to restart them before the storage domain 
was back online.

Of course this is more a Linux problem than an oVirt problem, but oVirt did 
cause it ;-)

Thanks,
Martijn.



On 19-7-2019 at 19:06, Strahil Nikolov wrote:
Hi Martin,

First check what went wrong with the VG, as it could be something simple.
vgcfgbackup -f <file> VGname will create a file which you can use to compare 
the current metadata with a previous version.

If your VMs run Linux, you can add disks from another storage domain and then 
pvmove the data inside the VM. Of course, you will need to reinstall grub on 
the new OS disk, or you won't be able to boot afterwards.
If possible, try with a test VM before proceeding with important ones.

Backing up the VMs is very important, because working on LVM metadata is quite 
risky.
Last time I had such an issue, I was working on clustered LVs whose PVs were 
reported as "Missing". For me, restoring from a VG backup fixed the issue, but 
that might not always be the case.

Just take the vgcfgbackup output and compare it with an older version using 
diff or vimdiff to see what is different.
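
If diff or vimdiff is not at hand, the same comparison can be sketched with Python's standard difflib module. The function name metadata_diff and the two sample fragments are made up for illustration; real input would be the text of a vgcfgbackup dump and of an older copy.

```python
import difflib


def metadata_diff(old_text, new_text, old_name="archive", new_name="current"):
    """Return unified-diff lines between two LVM metadata dumps (plain text)."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Made-up fragments, only to show the shape of the output.
    old = "seqno = 41\nextent_size = 262144\n"
    new = "seqno = 42\nextent_size = 262144\n"
    print("\n".join(metadata_diff(old, new)))
```

An empty result means the two dumps are textually identical.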

Sadly, I think that this is more a Linux problem than an oVirt problem.

Best Regards,
Strahil Nikolov

[ovirt-users] Re: Storage domain 'Inactive' but still functional

2019-07-18 Thread Martijn Grendelman
Hi!

Thanks. Like I wrote, I have metadata backups from /etc/lvm/backup and 
/etc/lvm/archive, and I also have the current metadata as it exists on disk. 
What I'm most concerned about is the proposed procedure.

I would create a backup of the VG, but I'm not sure what would be the most 
sensible way to do it. I could make a new iSCSI target and simply 'dd' the 
whole disk over, but that would take quite some time (it's 2.5 TB) and there 
are VMs that can't really be down for that long. And I'm not even sure that 
dd'ing the disk like that is a sensible strategy.
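
To put a rough number on "quite some time", here is a back-of-the-envelope estimate. The sustained copy rate is an assumption (the 100 MB/s in the example is not a measurement from this setup), and decimal units are used throughout.

```python
def copy_hours(size_tb, throughput_mb_s):
    """Hours needed to copy size_tb terabytes at throughput_mb_s MB/s.

    Decimal units: 1 TB = 1,000,000 MB.
    """
    size_mb = size_tb * 1_000_000
    return size_mb / throughput_mb_s / 3600


if __name__ == "__main__":
    # 2.5 TB is the size mentioned above; 100 MB/s is an assumed rate.
    print(f"{copy_hours(2.5, 100):.1f} hours")
```

At that assumed rate the copy would take roughly seven hours, which is why downtime is the main concern here.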

Moving disks out of the domain is currently not possible. oVirt says 'Source 
Storage Domain is not active'.

Thanks,
Martijn.


On 18-7-2019 at 17:44, Strahil Nikolov wrote:
Can you check /etc/lvm/backup and /etc/lvm/archive on your SPM host (and check 
the other hosts too, just in case you find anything useful)?
Usually LVM makes a backup of everything.

I would recommend that you:
1. Create a backup of the problematic VG.
2. Compare the backup file with a file from the backup/archive folders for the 
same VG.
Check what is different with diff/vimdiff. It might give you a clue.
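
Finding the candidate files for step 2 could look like the sketch below. It assumes the archive files follow the usual <vgname>_<sequence>-<random>.vg naming; the helper name archived_metadata is hypothetical.

```python
from pathlib import Path


def archived_metadata(vg_name, archive_dir="/etc/lvm/archive"):
    """Return archive files for vg_name, oldest first by modification time.

    Assumes the <vgname>_<sequence>-<random>.vg naming scheme; a VG whose
    name merely starts with "<vg_name>_" would match as well.
    """
    files = Path(archive_dir).glob(f"{vg_name}_*.vg")
    return sorted(files, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    # The VG name of an oVirt block storage domain is the domain UUID.
    for path in archived_metadata("875847b6-29a4-4419-be92-9315f4435429"):
        print(path)
```

The last entries in the list are the most recent versions, which are usually the first ones worth comparing against the current metadata.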

I had some issues (not related to oVirt) and restoring the VG from an older 
backup did help me. Still, any operation on block devices should be considered 
risky and a proper backup is needed.
You could try to move a less important VM's disks out of this storage domain 
to another one.

If that succeeds, you can evacuate all VMs before you start "breaking" the 
storage domain.

Best Regards,
Strahil Nikolov




[ovirt-users] Storage domain 'Inactive' but still functional

2019-07-18 Thread Martijn Grendelman
hon2.7/site-packages/vdsm/storage/sp.py", line 1127, in activateSD
    dom = sdCache.produce(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110, in produce
    domain.getRealDomain()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151, in _findDomain
    return findMethod(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockSD.py", line 1807, in findDomain
    return BlockStorageDomain(BlockStorageDomain.findDomainPath(sdUUID))
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockSD.py", line 1665, in findDomainPath
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'875847b6-29a4-4419-be92-9315f4435429',)
2019-07-18 09:34:50,629+0200 INFO  (jsonrpc/2) [storage.TaskManager.Task] 
(Task='51107845-d80b-47f4-aed8-345aaa49f0f8') aborting: Task is aborted: 
"Storage domain does not exist: (u'875847b6-29a4-4419-be92-9315f4435429',)" - 
code 358 (task:1181)
2019-07-18 09:34:50,629+0200 ERROR (jsonrpc/2) [storage.Dispatcher] FINISH 
activateStorageDomain error=Storage domain does not exist: 
(u'875847b6-29a4-4419-be92-9315f4435429',) (dispatcher:83)

I need help getting this storage domain back online. Can anyone here help me? 
If you need any additional information, please let me know!

Best regards,

Martijn Grendelman
ISAAC

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WHMR657BMKUA6XSQGU722Y2U5U4QJIZR/


[ovirt-users] Re: Storage domain 'Inactive' but still functional

2019-07-18 Thread Martijn Grendelman

On 18-7-2019 at 10:16, Martijn Grendelman wrote:
Hi,

For the first time in many months I have run into some trouble with oVirt 
(4.3.4.3) and I need some help.

Yesterday, I noticed one of my iSCSI storage domains was almost full, and tried 
to move a disk image off of it, to another domain. This failed, and somewhere 
in the process, the whole storage domain went to status 'Inactive'.

From engine.log:
2019-07-17 16:30:35,319+02 INFO  
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] 
(EE-ManagedThreadFactory-engine-Thread-1836383) [] starting 
processDomainRecovery for domain 
'875847b6-29a4-4419-be92-9315f4435429:HQST0_ISCSI02'.
2019-07-17 16:30:35,337+02 ERROR 
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] 
(EE-ManagedThreadFactory-engine-Thread-1836383) [] Domain 
'875847b6-29a4-4419-be92-9315f4435429:HQST0_ISCSI02' was reported by all hosts 
in status UP as problematic. Moving the domain to NonOperational.
2019-07-17 16:30:35,410+02 WARN  
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
(EE-ManagedThreadFactory-engine-Thread-1836383) [5f6fd35e] EVENT_ID: 
SYSTEM_DEACTIVATED_STORAGE_DOMAIN(970), Storage Domain HQST0_ISCSI02 (Data 
Center ISAAC01) was deactivated by system because it's not visible by any of 
the hosts.
The thing is, the domain is still functional on all my hosts. It carries over 
50 disks, and all involved VMs are up and running, and don't seem to have any 
problems. Also, 'iscsiadm' on all hosts seems to indicate that everything is 
fine with this specific target, and reading from the device with dd, or getting 
its size with 'blockdev', all works without issue.

When I try to reactivate the domain, these errors are logged:

2019-07-18 09:34:53,631+02 ERROR 
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
(EE-ManagedThreadFactory-engine-Thread-43475) [79e386e] EVENT_ID: 
IRS_BROKER_COMMAND_FAILURE(10,803), VDSM command ActivateStorageDomainVDS 
failed: Storage domain does not exist: 
(u'875847b6-29a4-4419-be92-9315f4435429',)
2019-07-18 09:34:53,631+02 ERROR 
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(EE-ManagedThreadFactory-engine-Thread-43475) [79e386e] 
IrsBroker::Failed::ActivateStorageDomainVDS: IRSGenericException: 
IRSErrorException: Failed to ActivateStorageDomainVDS, error = Storage domain 
does not exist: (u'875847b6-29a4-4419-be92-9315f4435429',), code = 358
2019-07-18 09:34:53,648+02 ERROR 
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
(EE-ManagedThreadFactory-engine-Thread-43475) [79e386e] EVENT_ID: 
USER_ACTIVATE_STORAGE_DOMAIN_FAILED(967), Failed to activate Storage Domain 
HQST0_ISCSI02 (Data Center ISAAC01) by martijn@-authz
On the SPM host, there are errors that indicate problems with the LVM volume 
group:
2019-07-18 09:34:50,462+0200 INFO  (jsonrpc/2) [vdsm.api] START 
activateStorageDomain(sdUUID=u'875847b6-29a4-4419-be92-9315f4435429', 
spUUID=u'aefd5844-6e01-4070-b3b9-c0d73cc40c78', options=None) 
from=:::172.17.1.140,56570, flow_id=197dadec, 
task_id=51107845-d80b-47f4-aed8-345aaa49f0f8 (api:48)
2019-07-18 09:34:50,464+0200 INFO  (jsonrpc/2) [storage.StoragePool] 
sdUUID=875847b6-29a4-4419-be92-9315f4435429 
spUUID=aefd5844-6e01-4070-b3b9-c0d73cc40c78 (sp:1125)
2019-07-18 09:34:50,629+0200 WARN  (jsonrpc/2) [storage.LVM] Reloading VGs 
failed (vgs=[u'875847b6-29a4-4419-be92-9315f4435429'] rc=5 out=[] err=['  
/dev/mapper/23536316636393463: Checksum error at offset 2748693688832', "  
Couldn't read volume group metadata from /dev/mapper/23536316636393463.", '  
Metadata location on /dev/mapper/23536316636393463 at 2748693688832 has invalid 
summary for VG.', '  Failed to read metadata summary from 
/dev/mapper/23536316636393463', '  Failed to scan VG from 
/dev/mapper/23536316636393463', '  Volume group 
"875847b6-29a4-4419-be92-9315f4435429" not found', '  Cannot process volume 
group 875847b6-29a4-4419-be92-9315f4435429']) (lvm:442)
2019-07-18 09:34:50,629+0200 INFO  (jsonrpc/2) [vdsm.api] FINISH 
activateStorageDomain error=Storage domain does not exist: 
(u'875847b6-29a4-4419-be92-9315f4435429',) from=:::172.17.1.140,56570, 
flow_id=197dadec, task_id=51107845-d80b-47f4-aed8-345aaa49f0f8 (api:52)
2019-07-18 09:34:50,629+0200 ERROR (jsonrpc/2) [storage.TaskManager.Task] 
(Task='51107845-d80b-47f4-aed8-345aaa49f0f8') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "", line 2, in activateStorageDomain
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1262, in activateStorageDomain
    pool.activateSD(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return metho
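
The storage.LVM warning in the vdsm log above names the affected device and the byte offset of the bad metadata. Pulling those two values out of the err list could be sketched as follows; the regular expression is written against the message text quoted above, and the helper name checksum_errors is hypothetical.

```python
import re

# Matches e.g. "/dev/mapper/23536316636393463: Checksum error at offset 2748693688832"
CHECKSUM_RE = re.compile(
    r"(?P<device>/dev/\S+): Checksum error at offset (?P<offset>\d+)"
)


def checksum_errors(err_lines):
    """Extract (device, byte offset) pairs from LVM stderr lines."""
    found = []
    for line in err_lines:
        match = CHECKSUM_RE.search(line)
        if match:
            found.append((match.group("device"), int(match.group("offset"))))
    return found


if __name__ == "__main__":
    err = [
        "  /dev/mapper/23536316636393463: Checksum error at offset 2748693688832",
        "  Failed to scan VG from /dev/mapper/23536316636393463",
    ]
    for device, offset in checksum_errors(err):
        print(device, offset)
```

Lines without a checksum message are simply skipped, so the whole err list can be passed in unchanged.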

[ovirt-users] Re: Storage domain 'Inactive' but still functional

2019-07-18 Thread Martijn Grendelman
Hi,

It appears that O365 has trouble delivering mails to this list, so two
earlier mails of mine are still somewhere in a queue and may yet be delivered.

This mail has all of the content of 3 successive mails. I apologize for this
format.


Re: [ovirt-users] Import VM from export domain fails

2016-11-15 Thread Martijn Grendelman
Strange, the import failed twice, and it succeeded when I tried a third
time.  I'll report back when I encounter this problem again. Thanks.

Best regards,
Martijn.


On 15-11-2016 at 08:33, Elad Ben Aharon wrote:
> Can you please attach engine.log?
> Thanks

[ovirt-users] Import VM from export domain fails

2016-11-14 Thread Martijn Grendelman
2016-11-14 17:24:27,980 ERROR [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] (org.ovirt.thread.pool-8-thread-39) [4a0b828a] Exception: java.lang.reflect.UndeclaredThrowableException
    at com.sun.proxy.$Proxy183.isMacInRange(Unknown Source)
    at java.util.function.Predicate.lambda$negate$1(Predicate.java:80) [rt.jar:1.8.0_111]
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174) [rt.jar:1.8.0_111]
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) [rt.jar:1.8.0_111]
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) [rt.jar:1.8.0_111]
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374) [rt.jar:1.8.0_111]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) [rt.jar:1.8.0_111]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) [rt.jar:1.8.0_111]
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) [rt.jar:1.8.0_111]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) [rt.jar:1.8.0_111]
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) [rt.jar:1.8.0_111]
    at org.ovirt.engine.core.bll.network.vm.ExternalVmMacsFinder.findExternalMacAddresses(ExternalVmMacsFinder.java:38) [bll.jar:]
    at org.ovirt.engine.core.bll.exportimport.ImportVmCommandBase.reportExternalMacs(ImportVmCommandBase.java:438) [bll.jar:]
    at org.ovirt.engine.core.bll.exportimport.ImportVmCommandBase.addVmInterfaces(ImportVmCommandBase.java:552) [bll.jar:]
    at org.ovirt.engine.core.bll.exportimport.ImportVmCommandBase.lambda$addVmToDb$0(ImportVmCommandBase.java:458) [bll.jar:]
    at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInNewTransaction(TransactionSupport.java:204) [utils.jar:]
    at org.ovirt.engine.core.bll.exportimport.ImportVmCommandBase.addVmToDb(ImportVmCommandBase.java:454) [bll.jar:]
    at org.ovirt.engine.core.bll.exportimport.ImportVmCommandBase.executeCommand(ImportVmCommandBase.java:425) [bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1305) [bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1447) [bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:2075) [bll.jar:]
    at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:166) [utils.jar:]
    at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:105) [utils.jar:]
    at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1490) [bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:398) [bll.jar:]
    at org.ovirt.engine.core.bll.PrevalidatingMultipleActionsRunner.executeValidatedCommand(PrevalidatingMultipleActionsRunner.java:204) [bll.jar:]
    at org.ovirt.engine.core.bll.PrevalidatingMultipleActionsRunner.runCommands(PrevalidatingMultipleActionsRunner.java:176) [bll.jar:]
    at org.ovirt.engine.core.bll.PrevalidatingMultipleActionsRunner.lambda$invokeCommands$3(PrevalidatingMultipleActionsRunner.java:182) [bll.jar:]
    at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:92) [utils.jar:]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_111]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_111]
    at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_111]
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_111]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_111]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_111]
    at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_111]
    at org.ovirt.engine.core.utils.lock.LockedObjectFactory$LockingInvocationHandler.invoke(LockedObjectFactory.java:59) [utils.jar:]
    ... 34 more
Caused by: java.lang.NumberFormatException: For input string: ""
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) [rt.jar:1.8.0_111]
    at java.lang.Long.parseLong(Long.java:601) [rt.jar:1.8.0_111]
    at org.ovirt.engine.core.utils.MacAddressRangeUtils.macToLong(MacAddressRangeUtils.java:123) [utils.jar:]
at
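
The innermost cause in the trace is a NumberFormatException for an empty
string inside MacAddressRangeUtils.macToLong, which suggests that a value
with no hex digits was parsed as a MAC address (where that value came from
is not visible in the log). The sketch below reproduces that failure mode in
Python. It is not the oVirt code: mac_to_long and is_parsable_mac are
hypothetical helpers, written only to show why an empty string cannot be
converted and how a guard could look.

```python
def mac_to_long(mac: str) -> int:
    """Convert a colon-separated MAC address to an integer.

    Illustrative only: this is not the oVirt MacAddressRangeUtils code.
    """
    # Remove the separators, then parse the remaining hex digits.
    digits = mac.replace(":", "")
    # int() raises ValueError when there are no digits at all, much like
    # Long.parseLong does with NumberFormatException in the trace.
    return int(digits, 16)


def is_parsable_mac(mac: str) -> bool:
    """Return True if mac_to_long() would succeed for this value."""
    try:
        mac_to_long(mac)
    except ValueError:
        return False
    return True
```

With this, is_parsable_mac("") is False while a normal address such as
"00:1a:4a:74:59:a2" converts without error.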

Re: [ovirt-users] VMware import, username with '.' not accepted

2016-10-25 Thread Martijn Grendelman
Done. https://bugzilla.redhat.com/show_bug.cgi?id=1388336

Cheers,
Martijn.

On 25-10-2016 at 00:41, Tomáš Golembiovský wrote:
> Hi,
>
> unfortunately the mentioned bug is related only to DC and Cluster
> fields. The problem with dot in user name is still present in oVirt 4.0.
>
> Martijn, can you please open a bug for us?
>
> Thanks,
>
> Tomas
>
>
> On Mon, 24 Oct 2016 16:33:25 +0200
> Martijn Grendelman <martijn.grendel...@isaac.nl> wrote:
>
>> On 24-10-2016 at 16:06, Michal Skrivanek wrote:
>>>> On 24 Oct 2016, at 11:21, Martijn Grendelman <martijn.grendel...@isaac.nl> 
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm trying to import a VM from VMware. On the "Import Virtual
>>>> Machine(s)" screen, my VCenter username is not accepted, because "Name
>>>> can only contain 'A-Z', 'a-z', '0-9', '_' or '-' characters" and it
>>>> contains a '.', which should be perfectly fine.  
>>> What is the exact version?
>>> There were improvements in bug 1377271 which is in 4.0.5 RC2  
>> This is 4.0.3.
>>
>> Thanks,
>> Martijn.
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] VMware import, username with '.' not accepted

2016-10-24 Thread Martijn Grendelman

On 24-10-2016 at 16:06, Michal Skrivanek wrote:
>> On 24 Oct 2016, at 11:21, Martijn Grendelman <martijn.grendel...@isaac.nl> 
>> wrote:
>>
>> Hi,
>>
>> I'm trying to import a VM from VMware. On the "Import Virtual
>> Machine(s)" screen, my VCenter username is not accepted, because "Name
>> can only contain 'A-Z', 'a-z', '0-9', '_' or '-' characters" and it
>> contains a '.', which should be perfectly fine.
> What is the exact version?
> There were improvements in bug 1377271 which is in 4.0.5 RC2

This is 4.0.3.

Thanks,
Martijn.


[ovirt-users] VMware import, username with '.' not accepted

2016-10-24 Thread Martijn Grendelman
Hi,

I'm trying to import a VM from VMware. On the "Import Virtual
Machine(s)" screen, my VCenter username is not accepted, because "Name
can only contain 'A-Z', 'a-z', '0-9', '_' or '-' characters" and it
contains a '.', which should be perfectly fine.

Now what?

Regards,
Martijn.
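
To illustrate the rejection described above: the error message lists the
allowed characters, and a dot is not among them. The sketch below rebuilds
that rule as a regular expression. The pattern is reconstructed from the
error message only, so it is an assumption about the actual oVirt
validation; RELAXED_NAME and accepted() are hypothetical names for
illustration.

```python
import re

# Pattern reconstructed from the error message only; the real oVirt
# validation rule is an assumption here.
STRICT_NAME = re.compile(r"[A-Za-z0-9_-]+")

# Hypothetical relaxed variant that also allows a dot.
RELAXED_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def accepted(pattern: re.Pattern, name: str) -> bool:
    """True if the whole name consists only of allowed characters."""
    return pattern.fullmatch(name) is not None
```

A user name such as "john.doe" (made up for this example) fails the strict
pattern and passes the relaxed one.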


Re: [ovirt-users] iSCSI domain on 4kn drives

2016-09-05 Thread Martijn Grendelman
On 7-8-2016 at 8:19, Yaniv Kaul wrote:
>
> On Fri, Aug 5, 2016 at 4:42 PM, Martijn Grendelman
> <martijn.grendel...@isaac.nl <mailto:martijn.grendel...@isaac.nl>> wrote:
>
> On 4-8-2016 at 18:36, Yaniv Kaul wrote:
>> On Thu, Aug 4, 2016 at 11:49 AM, Martijn Grendelman
>> <martijn.grendel...@isaac.nl
>> <mailto:martijn.grendel...@isaac.nl>> wrote:
>>
>> Hi,
>>
>> Does oVirt support iSCSI storage domains on target LUNs using
>> a block
>> size of 4k?
>>
>>
>> No, we do not - not if it exposes 4K blocks.
>> Y.
>
> Is this on the roadmap?
>
>
> Not in the short term roadmap.
> Of course, patches are welcome. It's mainly in VDSM.
> I wonder if it'll work in NFS.
> Y.

I don't think I ever replied to this, but I can confirm that in RHEV 3.6
it works with NFS.

Best regards,
Martijn.


Re: [ovirt-users] iSCSI domain on 4kn drives

2016-08-05 Thread Martijn Grendelman
On 4-8-2016 at 18:36, Yaniv Kaul wrote:
> On Thu, Aug 4, 2016 at 11:49 AM, Martijn Grendelman
> <martijn.grendel...@isaac.nl <mailto:martijn.grendel...@isaac.nl>> wrote:
>
> Hi,
>
> Does oVirt support iSCSI storage domains on target LUNs using a block
> size of 4k?
>
>
> No, we do not - not if it exposes 4K blocks.
> Y.

Is this on the roadmap?

I just bought a bunch of 4k native drives and spent a LOT of money. Now
it seems they are useless in my oVirt/RHEV environment...

Regards,
Martijn.


[ovirt-users] iSCSI domain on 4kn drives

2016-08-04 Thread Martijn Grendelman
Hi,

Does oVirt support iSCSI storage domains on target LUNs using a block
size of 4k?

Best regards,
Martijn Grendelman
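
For anyone wanting to check what block size a LUN actually exposes on a
Linux host, the kernel reports it in sysfs under
/sys/block/<device>/queue/logical_block_size. A minimal sketch, assuming a
Linux host; the function names and the sysfs_root parameter are only there
for illustration and testing, and the device name you pass in is up to you.

```python
from pathlib import Path


def logical_block_size(device: str, sysfs_root: str = "/sys/block") -> int:
    """Return the logical block size in bytes that the kernel reports."""
    path = Path(sysfs_root) / device / "queue" / "logical_block_size"
    return int(path.read_text().strip())


def is_4k_native(device: str, sysfs_root: str = "/sys/block") -> bool:
    """True when the device exposes 4096-byte logical blocks (4Kn)."""
    return logical_block_size(device, sysfs_root) == 4096
```

Note that 512e drives report a logical block size of 512 even though their
physical sectors are 4k, so only true 4Kn devices return 4096 here.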



[ovirt-users] iSCSI storage maintenance

2016-05-26 Thread Martijn Grendelman
Hi,

I have a simple oVirt setup, with one storage server and a couple of
hypervisors, running some 50 VMs. The storage server uses ZFS zvols,
exported over iSCSI with SCST. Now I want to do some maintenance on the
storage; specifically, I want to update SCST to a new version.

I expect that the normal procedure for this would be:
- shutdown all VMs
- put storage domain into maintenance
- perform maintenance
- get everything back online

Now I know I can also take the following ugly shortcut:
- stop SCST daemon
- see all VMs go to Paused
- perform maintenance
- restart SCST
- resume all VMs or wait for them to resume themselves

The win, of course, is that nothing has to be restarted or rebooted.
Extremely small-scale testing (one running VM on a 20 GB test domain)
indicates that this works like a charm. The VM resumes without a
problem and doesn't log anything storage related.

My question is: what are the risks involved in the shortcut scenario?

I understand that there are I/O requests that never reach the disk, so they
have to be queued somewhere (inside QEMU, I presume). What happens if this
occurs with 50 VMs at once?

Best regards,
Martijn.



Re: [ovirt-users] Bad performance with Windows 2012 guests

2015-05-04 Thread Martijn Grendelman

>> Hi,
>>
>> Ever since our first Windows Server 2012 deployment on oVirt (3.4 back
>> then, now 3.5.1), I have noticed that working on these VMs via RDP or on
>> the console via VNC is noticeably slower than on Windows 2008 guests on
>> the same oVirt environment.
>>
>> [snip]
>>
>> Does anyone share this experience?
>> Any idea why this could happen and how it can be fixed?
>> Any other information I should share to get a better idea?
>
> Hi Martijn,
> Can you please provide the QEMU command line, together with the KVM and
> QEMU versions?
>
> This information will be helpful for reproducing the problem.
> However, if the problem is not reproducible on a local setup, we will
> probably need to ask you to collect some performance information with
> the xperf tool.

Sure!

Command line is this:

/usr/libexec/qemu-kvm -name Getafix -S -M rhel6.5.0 -cpu 
Penryn,hv_relaxed -enable-kvm -m 2048 -realtime mlock=off -smp 
2,maxcpus=16,sockets=16,cores=1,threads=1 -uuid 
34951c25-9a37-4712-a16a-fdfc98f4febc -smbios 
type=1,manufacturer=oVirt,product=oVirt 
Node,version=6-6.el6.centos.12.2,serial=44454C4C-3400-1058-804C-B1C04F42344A,uuid=34951c25-9a37-4712-a16a-fdfc98f4febc 
-nodefconfig -nodefaults -chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/Getafix.monitor,server,nowait 
-mon chardev=charmonitor,id=monitor,mode=control -rtc 
base=2015-01-12T11:14:02,clock=vm,driftfix=slew -no-kvm-pit-reinjection 
-no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 
-device 
virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x4 
-drive 
if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial= 
-device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 
-drive 
file=/rhev/data-center/aefd5844-6e01-4070-b3b9-c0d73cc40c78/52678e67-a202-4306-b7ed-5fed8df10edf/images/28cc9a6c-6f2e-4b09-b361-f2a09f27dbc5/4c7b571e-4b29-47b9-ab4b-5799d64f28f9,if=none,id=drive-virtio-disk0,format=raw,serial=28cc9a6c-6f2e-4b09-b361-f2a09f27dbc5,cache=none,werror=stop,rerror=stop,aio=threads 
-device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 
-netdev tap,fd=41,id=hostnet0,vhost=on,vhostfd=43 -device 
virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:74:59:a2,bus=pci.0,addr=0x3 
-chardev 
socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/34951c25-9a37-4712-a16a-fdfc98f4febc.com.redhat.rhevm.vdsm,server,nowait 
-device 
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm 
-chardev 
socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/34951c25-9a37-4712-a16a-fdfc98f4febc.org.qemu.guest_agent.0,server,nowait 
-device 
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 
-device usb-tablet,id=input0 -vnc 172.17.6.14:7,password -k en-us -vga 
cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg 
timestamp=on
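
A command line of this length is hard to read by eye. As a rough aid, the
sketch below splits such a line into options and their values, so that
individual settings (for example the flags on -cpu, or the cache= setting
on -drive) can be looked up. split_options is a hypothetical helper, not
part of QEMU, libvirt or VDSM, and it assumes that every token starting
with '-' is an option.

```python
import shlex


def split_options(cmdline: str) -> dict:
    """Group a QEMU command line into {option: [values]}.

    Tokens starting with '-' are treated as options; a following token
    that does not start with '-' is taken as that option's value.
    Options without a value get None.
    """
    tokens = shlex.split(cmdline)
    options = {}
    i = 1  # skip the binary name
    while i < len(tokens):
        opt = tokens[i]
        value = None
        if i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
            value = tokens[i + 1]
            i += 1
        options.setdefault(opt, []).append(value)
        i += 1
    return options
```

For a shortened line such as
"/usr/libexec/qemu-kvm -name Getafix -S -cpu Penryn,hv_relaxed -m 2048",
the result maps "-cpu" to ["Penryn,hv_relaxed"] and "-S" to [None]. One
caveat: values that contain a space (like product=oVirt Node above) lose
their quoting in process listings, so the part after the space shows up as
a separate entry.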


Qemu version:

qemu-kvm-rhev-0.12.1.2-2.415.el6_5.14.x86_64

Please let me know if I can do more to help!

Best regards,
Martijn.


[ovirt-users] Bad performance with Windows 2012 guests

2015-04-30 Thread Martijn Grendelman

Hi,

Ever since our first Windows Server 2012 deployment on oVirt (3.4 back 
then, now 3.5.1), I have noticed that working on these VMs via RDP or on 
the console via VNC is noticeably slower than on Windows 2008 guests on 
the same oVirt environment.


Basic things like starting an application (even the Server Manager that
gets started automatically on login) take a very long time, sometimes
minutes. Everything is just... slow.


We have recently deployed Microsoft Exchange on a Windows Server 2012 
guest on RHEV, and it doesn't perform well at all.


I haven't been able to find the cause for this slowness; CPU usage is 
not excessive and it doesn't seem I/O related. Moreover, other types of 
guests (Linux and even Windows 2008) do not have this problem.


We have 3 different environments:
- oVirt 3.5.1, on old Dell servers with Penryn Family CPUs with fairly 
slow storage on replicated GlusterFS, running CentOS 6.6
- oVirt 3.5.1, on modern 6-core SandyBridge servers with local storage 
via NFS, running CentOS 7.0
- RHEV 3.4.4 on modern 10-core SandyBridge servers with an iSCSI SAN 
behind it, running on RHEV Hypervisor 6.5


All of these (very different) environments exhibit the same behaviour:
Linux, Windows 2008 fast (or as fast as can be expected given the 
hardware), Windows 2012 painfully slow.


All Windows 2012 servers use VirtIO disk and network. I think all 
drivers are from the virtio-win-0.1-74 ISO.


Does anyone share this experience?
Any idea why this could happen and how it can be fixed?
Any other information I should share to get a better idea?

Btw, for the guests on the RHEV environment, we have a case open with
Red Hat support, but that doesn't seem to lead to a quick solution, hence
I'm writing here, too.


Thanks for any help.

Regards,
Martijn Grendelman


Re: [ovirt-users] Ovirt Engine WAN security

2014-12-19 Thread Martijn Grendelman
On 18-12-2014 at 23:25, Donny Davis wrote:
> I would like to inquire if anyone is using the oVirt engine to control
> remote datacenters, and if so, how are you securing it? I realize you
> cannot divulge trade secrets or your actual setup. Just general info,
> like we are using VPN, or SSH..

We use a 'management VLAN', only reachable through VPN.

Best regards,
Martijn.


Re: [ovirt-users] vm has paused due to unknown storage error

2014-12-18 Thread Martijn Grendelman
Hi,

On a new host, I am running into exactly the same scenario.

I have a host with an oVirt-managed GlusterFS volume (single brick on
local disk in distribute mode) on an XFS file system.

I think I have found the root cause, but I doubt I can fix it.

Around the time of the VMs going paused, there seemed to be a glusterfsd
restart:

 [2014-12-18 01:43:27.272235] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (15), shutting down
 [2014-12-18 01:43:27.272279] I [fuse-bridge.c:5599:fini] 0-fuse: Unmounting '/rhev/data-center/mnt/glusterSD/onode3.isaac.local:data02'.
 [2014-12-18 01:49:36.854339] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.1 (args: /usr/sbin/glusterfs --volfile-server=onode3.isaac.local --volfile-id=data02 /rhev/data-center/mnt/glusterSD/onode3.isaac.local:data02)
 [2014-12-18 01:49:36.862887] I [dht-shared.c:337:dht_init_regex] 0-data02-dht: using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
 [2014-12-18 01:49:36.863749] I [client.c:2280:notify] 0-data02-client-0: parent translators are ready, attempting connect on transport

So I thought I'd check /var/log/messages for potential sources of the
SIGTERM, and I found this:

 Dec 18 02:43:26 onode3 kernel: supervdsmServer[1960]: segfault at 18 ip 7faa89951bca sp 7fa355b80f40 error 4 in libgfapi.so.0.0.0[7faa8994c000+18000]
 Dec 18 02:43:27 onode3 systemd: supervdsmd.service: main process exited, code=killed, status=11/SEGV
 Dec 18 02:43:27 onode3 systemd: Unit supervdsmd.service entered failed state.
 Dec 18 02:43:27 onode3 journal: vdsm jsonrpc.JsonRpcServer ERROR Internal server error
 Traceback (most recent call last):
   File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 486, in _serveRequest
     res = method(**params)
   File "/usr/share/vdsm/rpc/Bridge.py", line 266, in _dynamicMethod
     result = fn(*methodArgs)
   File "/usr/share/vdsm/gluster/apiwrapper.py", line 106, in status
     return self._gluster.volumeStatus(volumeName, brick, statusOption)
   File "/usr/share/vdsm/gluster/api.py", line 54, in wrapper
     rv = func(*args, **kwargs)
   File "/usr/share/vdsm/gluster/api.py", line 221, in volumeStatus
     data = self.svdsmProxy.glusterVolumeStatvfs(volumeName)
   File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
     return callMethod()
   File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>
     **kwargs)
   File "<string>", line 2, in glusterVolumeStatvfs
   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 759, in _callmethod
     kind, result = conn.recv()
 EOFError
 Dec 18 02:43:27 onode3 systemd: supervdsmd.service holdoff time over, scheduling restart.
 Dec 18 02:43:27 onode3 systemd: Stopping Virtual Desktop Server Manager...
 Dec 18 02:43:27 onode3 systemd: Stopping Auxiliary vdsm service for running helper functions as root...
 Dec 18 02:43:27 onode3 systemd: Starting Auxiliary vdsm service for running helper functions as root...
 Dec 18 02:43:27 onode3 systemd: Started Auxiliary vdsm service for running helper functions as root.
 Dec 18 02:43:27 onode3 journal: vdsm IOProcessClient ERROR IOProcess failure
 Traceback (most recent call last):
   File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 107, in _communicate
     raise Exception("FD closed")
 Exception: FD closed
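
The two logs look an hour apart at first sight (01:43 in the GlusterFS log,
02:43 in /var/log/messages). That is consistent with the GlusterFS log using
UTC and syslog using local time. The sketch below lines the two events up
under that assumption; both the UTC logging and the CET (UTC+1) host
timezone are assumptions, not something the logs state. Fractions of a
second are dropped.

```python
from datetime import datetime, timedelta, timezone

# Assumptions: the GlusterFS log is in UTC, the host clock is CET (UTC+1).
UTC = timezone.utc
CET = timezone(timedelta(hours=1))

# SIGTERM received by the gluster client (from the glusterfs log).
sigterm = datetime(2014, 12, 18, 1, 43, 27, tzinfo=UTC)
# supervdsmServer segfault (from /var/log/messages; syslog omits the year).
segfault = datetime(2014, 12, 18, 2, 43, 26, tzinfo=CET)

# Aware datetimes are compared on the same timeline regardless of zone.
gap = sigterm - segfault
print(gap.total_seconds())  # prints 1.0
```

Under those assumptions the SIGTERM follows the segfault by about one
second, which fits the supervdsmd restart sequence shown above.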


I guess I'll file a bug report.

Best regards,
Martijn Grendelman


On 12-12-2014 at 3:44, Punit Dambiwal wrote:
> Hi Dan,
>
> Yes..it's glusterfs
>
> glusterfs logs :- http://ur1.ca/j3b5f
>
> OS Version: RHEL - 7 - 0.1406.el7.centos.2.3
> Kernel Version: 3.10.0 - 123.el7.x86_64
> KVM Version: 1.5.3 - 60.el7_0.2
> LIBVIRT Version: libvirt-1.1.1-29.el7_0.3
> VDSM Version: vdsm-4.16.7-1.gitdb83943.el7
> GlusterFS Version: glusterfs-3.6.1-1.el7
> Qemu Version : QEMU emulator version 1.5.3 (qemu-kvm-1.5.3-60.el7_0.2)
>
> Thanks,
> punit
>
> On Thu, Dec 11, 2014 at 5:47 PM, Dan Kenigsberg <dan...@redhat.com
> <mailto:dan...@redhat.com>> wrote:
>
>> On Thu, Dec 11, 2014 at 03:41:01PM +0800, Punit Dambiwal wrote:
>>> Hi,
>>>
>>> Suddenly all of my VM on one host paused with the following error :-
>>>
>>> vm has paused due to unknown storage error
>>>
>>> I am using glusterfs storage with distributed replicate replica=2my
>>> storage and compute both running on the same node...
>>>
>>> engine logs :- http://ur1.ca/j31iu
>>> Host logs :- http://ur1.ca/j31kk (I grep it for one Failed VM)
>>
>> libvirtEventLoop::INFO::2014-12-11 15:00:48,627::vm::4780::vm.Vm::(_onIOError) vmId=`e84bb987-a817-436a-9417-8eab9148e57e`::abnormal vm stop device virtio-disk0 error eother
>>
>> Which type of storage is it? gluster? Do you have anything in particular
>> on glusterfs logs?
>>
>> Which glusterfs/qemu/libvirt/vdsm versions do you have installed?


Re: [ovirt-users] vm has paused due to unknown storage error

2014-12-18 Thread Martijn Grendelman
Oh I just found this:

https://bugzilla.redhat.com/show_bug.cgi?id=1162640

Cheers,
M.



On 18-12-2014 at 15:03, Martijn Grendelman wrote:
> [snip]

[ovirt-users] Live merge?

2014-11-12 Thread Martijn Grendelman
Hello,

The first thing the 3.5 release notes talk about is Live Merge of
snapshots and I've been dying to try that out.

Problem: I think my environment is completely up to date now, but the
Delete link on VM snapshots remains greyed out when the VM is running.

Going via Storage -> <Domain name> -> Disk Snapshots, clicking 'Remove'
is possible (UX inconsistency), but it fails with 'Error while
executing action: Cannot remove Disk Snapshot. At least one of the VMs
is not down.'

What is needed, besides the following, to make live merge work?

- oVirt Engine Version: 3.5.0.1-1.el6
- Data Center Compatibility Version: 3.5
- Host OS: RHEL - 6 - 6.el6.centos.12.2
- KVM Version: 0.12.1.2 - 2.415.el6_5.7
- Libvirt Version: libvirt-0.10.2-46.el6_6.1
- VDSM version: vdsm-4.16.7-1.gitdb83943.el6
- Live Snapsnot Support: Active

VM has been restarted to make sure it's running on the latest qemu-kvm.

Please advise, thank you in advance.

Best regards,
Martijn Grendelman


[Users] Confused about Gluster usage

2014-02-18 Thread Martijn Grendelman
Hi,

I have been running oVirt 3.3.3 with a single node in a NFS type data
center. Now, I would like to set up a second node. Both nodes have
plenty of storage, but they're only connected to each other over 1 Gbit.
I'm running nodes on CentOS 6.5.

What I would like to accomplish is:

* use a Gluster-backed DATA domain on my existing NFS datacenter
* load balancing by even spread of VMs over the two nodes
* leveraging the speed of local storage, so running a VM over NFS to the
other node is undesirable

So I was thinking I want the storage to be replicated, so that I can
take a node down for maintenance without having to migrate all the
storage to another node.

I was thinking: GlusterFS.

But I am confused on how to set it up. I understand I cannot use the
libgfapi native integration due to dependency problems on CentOS. I have
set up a replicated Gluster volume manually.

How can I use my two nodes with this Gluster volume? What are the
necessary steps?

I did try a couple of things; most notably I was able to create a 2nd
data center with POSIX storage, and mount the Gluster volume there, but
that doesn't work for the first node.

Alternatively, it would also be fine to migrate all existing VMs to the
POSIX datacenter and then move the existing node from the old NFS data
center to the new POSIX data center. Is that possible without
exporting/importing all the VMs?

Cheers,
Martijn.


[Users] oVirt or RHEV ?

2014-02-06 Thread Martijn Grendelman
Hi,

This may be the wrong place to ask, but I'm looking for input to form an
opinion on an oVirt or RHEV question within my company.

I have been running oVirt for about 5 months now, and I'm quite
comfortable with its features and maintenance procedures. We are now
planning to build a private virtualization cluster for hosting clients'
applications as well as our own. Some people in the company are
questioning whether we should buy RHEV, but at this point, I can't see
the benefits.

Can anyone on this list shed light on when RHEV might be a better
choice than oVirt? What are the benefits? The trade-offs?

I am looking for pragmatic, real-world things, not marketing mumbo
jumbo. That, I can get from redhat.com ;-)

Best regards,
Martijn.


Re: [Users] oVirt or RHEV ?

2014-02-06 Thread Martijn Grendelman
Hi,

On 6-2-2014 at 16:38, Dan Yasny wrote:
> This is the same question as in 'RHEL or Fedora' IMO: do you want the
> bleeding edge features and lower code stability and reliability, or do
> you want to have tech support (and that means a real SLA and an
> escalation path up to engineering, if need be) behind you, stable
> and reliable, well tested code, but fewer of the advanced features.

Thank you, this is what I thought.

It's still a hard decision. If the stability and testedness of RHEL is
anything to go by, it's not reassuring at all (it may be better than
Fedora, I don't know), although I must say that Red Hat support is
helpful at times.

Thanks again, I think I know enough :-)

Best regards,
Martijn Grendelman


Re: [Users] oVirt or RHEV ?

2014-02-06 Thread Martijn Grendelman
On 6-2-2014 at 17:02, Martijn Grendelman wrote:
> Hi,
>
> On 6-2-2014 at 16:38, Dan Yasny wrote:
>> This is the same question as in 'RHEL or Fedora' IMO: do you want the
>> bleeding edge features and lower code stability and reliability, or do
>> you want to have tech support (and that means a real SLA and an
>> escalation path up to engineering, if need be) behind you, stable
>> and reliable, well tested code, but fewer of the advanced features.
>
> Thank you, this is what I thought.
>
> It's still a hard decision. If the stability and testedness of RHEL is
> anything to go by, it's not reassuring at all (it may be better than
> Fedora, I don't know), although I must say that Red Hat support is
> helpful at times.
>
> Thanks again, I think I know enough :-)

Or not ;-)

Would it be possible (and doable) to migrate from oVirt to RHEV?

If we start out with oVirt, but after some time we decide that RHEV
would be a better fit after all, would it be possible to hook up
existing oVirt/VDSM hosts to a RHEV engine, or am I oversimplifying
things now?

Cheers,
Martijn.









 On Thu, Feb 6, 2014 at 8:06 AM, Martijn Grendelman
 martijn.grendel...@isaac.nl mailto:martijn.grendel...@isaac.nl wrote:

 Hi,

 This may be the wrong place to ask, but I'm looking for input to form an
 opinion on an oVirt or RHEV question within my company.

 I have been running oVirt for about 5 months now, and I'm quite
 comfortable with its features and maintenance procedures. We are now
 planning to build a private virtualization cluster for hosting clients'
 applications as well as our own. Some people in the company are
 questioning whether we should buy RHEV, but at this point, I can't see
 the benefits.

 Can anyone on this list shed a light on when RHEV might be a better
 choice than oVirt? What are the benefits? The trade-offs?

 I am looking for pragmatic, real-world things, not marketing mumbo
 jumbo. That, I can get from redhat.com ;-)

 Best regards,
 Martijn.


[Users] Failing migration, inconsistent state

2013-12-05 Thread Martijn Grendelman
Hi,

I tried to migrate several VMs from one host to another. Two VMs
migrated without issues, but for one VM, the migration didn't happen. It
seems to be hanging, but the UI is now in an inconsistent state:

- The 'Tasks' tab reports 0 active tasks, but the last task (the
migration in question) is still reported as 'Executing'.
- The VM status is 'Up' (not migrating)
- 'Migrate' action is choosable from menu, while 'Cancel Migration' is
greyed out, but when I choose 'Migrate' and pick a host, I am told
'Cannot migrate VM. VM name is being migrated.'.

What is the best way to fix this?

Kind regards,
Martijn Grendelman


Re: [Users] Failing migration, inconsistent state

2013-12-05 Thread Martijn Grendelman
Martijn Grendelman wrote on 5-12-2013 11:00:
 Hi,
 
 I tried to migrate several VMs from one host to another. Two VMs
 migrated without issues, but for one VM, the migration didn't happen. It
 seems to be hanging, but the UI is now in an inconsistent state:
 
 - The 'Tasks' tab reports 0 active tasks, but the last task (the
 migration in question) is still reported as 'Executing'.
 - The VM status is 'Up' (not migrating)
 - 'Migrate' action is choosable from menu, while 'Cancel Migration' is
 greyed out, but when I choose 'Migrate' and pick a host, I am told
 'Cannot migrate VM. VM name is being migrated.'.
 
 What is the best way to fix this?

And perhaps this information is useful to oVirt developers:


 Thread-600648::DEBUG::2013-12-03 
 12:37:47,926::vm::180::vm.Vm::(_setupVdsConnection) 
 vmId=`0669e3c2-9cfd-4d4e-a0a3-56070902a8c8`::Destination server is: 
 onode0.isaac.local:54321
 Thread-600648::DEBUG::2013-12-03 
 12:37:47,927::vm::182::vm.Vm::(_setupVdsConnection) 
 vmId=`0669e3c2-9cfd-4d4e-a0a3-56070902a8c8`::Initiating connection with 
 destination
 Thread-600648::DEBUG::2013-12-03 
 12:37:47,990::vm::232::vm.Vm::(_prepareGuest) 
 vmId=`0669e3c2-9cfd-4d4e-a0a3-56070902a8c8`::Migration started
 Thread-600648::DEBUG::2013-12-03 12:37:48,006::vm::299::vm.Vm::(run) 
 vmId=`0669e3c2-9cfd-4d4e-a0a3-56070902a8c8`::migration semaphore acquired
 Thread-600648::DEBUG::2013-12-03 
 12:37:48,115::vm::357::vm.Vm::(_startUnderlyingMigration) 
 vmId=`0669e3c2-9cfd-4d4e-a0a3-56070902a8c8`::starting migration to 
 qemu+tls://onode0.isaac.local/system with miguri tcp://onode0.isaac.local
 Thread-600648::DEBUG::2013-12-03 
 12:43:10,819::libvirtconnection::108::libvirtconnection::(wrapper) Unknown 
 libvirterror: ecode: 9 edom: 10 level: 2 message: operation failed: migration 
 job: unexpectedly failed
 Thread-600648::DEBUG::2013-12-03 12:43:10,819::vm::742::vm.Vm::(cancel) 
 vmId=`0669e3c2-9cfd-4d4e-a0a3-56070902a8c8`::canceling migration downtime 
 thread
 Thread-600648::DEBUG::2013-12-03 12:43:10,819::vm::812::vm.Vm::(stop) 
 vmId=`0669e3c2-9cfd-4d4e-a0a3-56070902a8c8`::stopping migration monitor thread
 Thread-600648::ERROR::2013-12-03 12:43:10,820::vm::238::vm.Vm::(_recover) 
 vmId=`0669e3c2-9cfd-4d4e-a0a3-56070902a8c8`::operation failed: migration job: 
 unexpectedly failed
 Thread-600648::ERROR::2013-12-03 12:43:11,276::vm::321::vm.Vm::(run) 
 vmId=`0669e3c2-9cfd-4d4e-a0a3-56070902a8c8`::Failed to migrate
 Traceback (most recent call last):
   File "/usr/share/vdsm/vm.py", line 308, in run
     self._startUnderlyingMigration()
   File "/usr/share/vdsm/vm.py", line 385, in _startUnderlyingMigration
     None, maxBandwidth)
   File "/usr/share/vdsm/vm.py", line 835, in f
     ret = attr(*args, **kwargs)
   File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
     ret = f(*args, **kwargs)
   File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1178, in migrateToURI2
     if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
 libvirtError: operation failed: migration job: unexpectedly failed

Cheers,
Martijn.



[Users] VM with stateless snapshot won't start

2013-12-05 Thread Martijn Grendelman
Hi,

After maintenance on a host, I am trying to start a VM that has been
running statelessly for a while. It refuses to start and Engine logs the
following:

2013-12-05 11:59:18,125 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(pool-6-thread-48) Correlation ID: 43d49965, Job ID:
0d943a6d-9d65-4ac1-89b7-139d30b4813c, Call Stack: null, Custom Event ID:
-1, Message: Failed to start VM WinXP, because exist snapshot for
stateless state. Snapshot will be deleted.

Should I submit a bug report for the poor English in this log line? ;-)

The 'Snapshots' tab for the VM doesn't show anything, but repeated
attempts to start the VM just show the same message in the log.

What can I do to start this VM?

Cheers,
Martijn.


Re: [Users] Failing migration, inconsistent state

2013-12-05 Thread Martijn Grendelman

Martijn Grendelman wrote on 5-12-2013 11:18:
 Martijn Grendelman wrote on 5-12-2013 11:00:
 Hi,

 I tried to migrate several VMs from one host to another. Two VMs
 migrated without issues, but for one VM, the migration didn't happen. It
 seems to be hanging, but the UI is now in an inconsistent state:

 - The 'Tasks' tab reports 0 active tasks, but the last task (the
 migration in question) is still reported as 'Executing'.
 - The VM status is 'Up' (not migrating)
 - 'Migrate' action is choosable from menu, while 'Cancel Migration' is
 greyed out, but when I choose 'Migrate' and pick a host, I am told
 'Cannot migrate VM. VM name is being migrated.'.

 What is the best way to fix this?

After a restart of Engine, the message that I got was different,
something about no host being available with enough memory (which was
correct). So I guess an Engine restart fixed it, even though the initial
migration task still shows as unfinished in the Tasks panel.

Cheers,
Martijn.


Re: [Users] Problem with python-cpopen dependency on f19 AIO stable

2013-12-05 Thread Martijn Grendelman
I have the same issue on a CentOS node after updating it to 6.5:

 Resolving Dependencies
 --> Running transaction check
 ---> Package python-cpopen.x86_64 0:1.2.3-4.el6 will be obsoleted
 ---> Package vdsm-python-cpopen.x86_64 0:4.13.0-11.el6 will be obsoleting
 --> Finished Dependency Resolution

 Dependencies Resolved

 ==========================================================================
  Package              Arch      Version          Repository        Size
 ==========================================================================
 Installing:
  vdsm-python-cpopen   x86_64    4.13.0-11.el6    ovirt-stable      19 k
      replacing  python-cpopen.x86_64 1.2.3-4.el6

 Transaction Summary
 ==========================================================================
 Install   1 Package(s)

 Total download size: 19 k
 Is this ok [y/N]: y

On a subsequent run of 'yum update', python-cpopen will replace
vdsm-python-cpopen, and so on.
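
The ping-pong above can be modelled in a few lines. This is only a sketch under an assumption: judging from the yum output, each package appears to declare that it obsoletes the other. The `OBSOLETES` table and the `next_update` helper are hypothetical and are not read from any real package metadata.

```python
# Hypothetical model: each package claims to obsolete the other one.
OBSOLETES = {
    "python-cpopen": "vdsm-python-cpopen",
    "vdsm-python-cpopen": "python-cpopen",
}


def next_update(installed):
    """Return the package that would replace `installed`, or None."""
    for candidate, obsoleted in OBSOLETES.items():
        if obsoleted == installed:
            return candidate
    return None


installed = "python-cpopen"
history = [installed]
for _ in range(4):  # four consecutive 'yum update' runs
    installed = next_update(installed)
    history.append(installed)

print(" -> ".join(history))
```

With a mutual relation like this, no run ever reaches a state in which nothing is left to replace, which matches the behaviour seen on both CentOS and Fedora.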

Cheers,
Martijn.







Vinzenz Feenstra wrote on 5-12-2013 8:34:
 Forwarding to vdsm-devel
 
 On 12/04/2013 08:59 AM, Gianluca Cecchi wrote:
 Hello,
 since yesterday evening I have this sort of dependency problem with updates

 yum update
 say

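 Resolving Dependencies
 --> Running transaction check
 ---> Package python-cpopen.x86_64 0:1.2.3-4.fc19 will be obsoleting
 ---> Package vdsm-python-cpopen.x86_64 0:4.13.0-11.fc19 will be obsoleted
 --> Finished Dependency Resolution

 Dependencies Resolved

 ==========================================================================
  Package              Arch      Version           Repository       Size
 ==========================================================================
 Installing:
  python-cpopen        x86_64    1.2.3-4.fc19      updates          19 k
      replacing  vdsm-python-cpopen.x86_64 4.13.0-11.fc19

 Transaction Summary
 ==========================================================================
 Install  1 Package

 If I go ahead and run yum update again I have:

 Dependencies Resolved

 ==========================================================================
  Package              Arch      Version           Repository       Size
 ==========================================================================
 Installing:
  vdsm-python-cpopen   x86_64    4.13.0-11.fc19    ovirt-stable     20 k
      replacing  python-cpopen.x86_64 1.2.3-4.fc19

 Transaction Summary
 ==========================================================================
 Install  1 Package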

 and so again in a loop

 Gianluca


Re: [Users] VM with stateless snapshot won't start

2013-12-05 Thread Martijn Grendelman
Hi,

 After maintenance on a host, I am trying to start a VM that has been
 running statelessly for a while. It refuses to start and Engine logs the
 following:
 
 2013-12-05 11:59:18,125 INFO
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
 (pool-6-thread-48) Correlation ID: 43d49965, Job ID:
 0d943a6d-9d65-4ac1-89b7-139d30b4813c, Call Stack: null, Custom Event ID:
 -1, Message: Failed to start VM WinXP, because exist snapshot for
 stateless state. Snapshot will be deleted.
 
 Should I submit a bug report for the poor English in this log line? ;-)
 
 The 'Snapshots' tab for the VM doesn't show anything, but repeated
 attempts to start the VM just show the same message in the log.
 
 What can I do to start this VM?

It seems I missed some info in the log that may well indicate the root
cause of this issue. Please see attached log excerpt. A database query
is failing due to a foreign key constraint violation.

Please advise how to fix the database inconsistency.

Regards,
Martijn.


2013-12-05 12:16:18,463 INFO  [org.ovirt.engine.core.bll.RunVmCommand]
(pool-6-thread-49) [7f341556] Running command: RunVmCommand internal: false.
Entities affected :  ID: de196133-0ccf-41c2-a91d-1760be442080 Type: VM
2013-12-05 12:16:18,475 ERROR [org.ovirt.engine.core.bll.RunVmCommand]
(pool-6-thread-49) [7f341556] RunVmAsStateless - WinXP - found stateless
snapshots for this vm  - skipped creating snapshots.
2013-12-05 12:16:18,476 INFO  [org.ovirt.engine.core.bll.VmPoolHandler]
(pool-6-thread-49) [7f341556] VdcBll.VmPoolHandler.ProcessVmPoolOnStopVm -
Deleting snapshot for stateless vm de196133-0ccf-41c2-a91d-1760be442080
2013-12-05 12:16:18,481 INFO
[org.ovirt.engine.core.bll.RestoreStatelessVmCommand] (pool-6-thread-49)
Running command: RestoreStatelessVmCommand internal: true.
Entities affected :  ID: de196133-0ccf-41c2-a91d-1760be442080 Type: VM
2013-12-05 12:16:18,487 INFO
[org.ovirt.engine.core.bll.RestoreAllSnapshotsCommand] (pool-6-thread-49)
Running command: RestoreAllSnapshotsCommand internal: true.
Entities affected :  ID: de196133-0ccf-41c2-a91d-1760be442080 Type: VM
2013-12-05 12:16:18,488 INFO
[org.ovirt.engine.core.bll.RestoreAllSnapshotsCommand] (pool-6-thread-49)
Locking VM(id = de196133-0ccf-41c2-a91d-1760be442080) without compensation.
2013-12-05 12:16:18,489 INFO
[org.ovirt.engine.core.vdsbroker.SetVmStatusVDSCommand] (pool-6-thread-49)
START, SetVmStatusVDSCommand( vmId = de196133-0ccf-41c2-a91d-1760be442080,
status = ImageLocked), log id: 7032f4da
2013-12-05 12:16:18,491 INFO
[org.ovirt.engine.core.vdsbroker.SetVmStatusVDSCommand] (pool-6-thread-49)
FINISH, SetVmStatusVDSCommand, log id: 7032f4da
2013-12-05 12:16:18,562 ERROR
[org.ovirt.engine.core.bll.RestoreAllSnapshotsCommand] (pool-6-thread-49)
Command org.ovirt.engine.core.bll.RestoreAllSnapshotsCommand throw exception:
org.springframework.dao.DataIntegrityViolationException: 
CallableStatementCallback; SQL [{call updatevmstatic(?, ?, ?,
 ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 
?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}]; ERROR: insert or
update on table vm_static violates foreign key constraint fk_vm_static_quota
  Detail: Key (quota_id)=(----) is not present 
in table quota.
  Where: SQL statement UPDATE vm_static SET description =  $1 , 
free_text_comment =  $2  ,mem_size_mb =  $3 ,os =  $4 ,vds_group_id =  $5 , 
VM_NAME =  $6 ,vmt_guid =  $7 , domain =  $8 ,creation_date =  $9 
,num_of_monitors =  $10 ,single_qxl_pci =  $11 , allow_console_reconnect =  $12 
, is_initialized =  $13 , num_of_sockets =  $14 ,cpu_per_socket =  $15 , 
usb_policy =  $16 ,time_zone =  $17 ,auto_startup =  $18 , is_stateless =  $19 
,dedicated_vm_for_vds =  $20 , fail_back =  $21 ,vm_type =  $22 , nice_level =  
$23 , cpu_shares =  $24 , _update_date = LOCALTIMESTAMP,default_boot_sequence = 
 $25 , default_display_type =  $26 , priority =  $27 ,iso_path =  $28 ,origin = 
 $29 , initrd_url =  $30 ,kernel_url =  $31 , kernel_params =  $32 
,migration_support =  $33 , predefined_properties =  $34 
,userdefined_properties =  $35 , min_allocated_mem =  $36 , quota_id =  $37 , 
cpu_pinning =  $38 , is_smartcard_enabled =  $39 , is_delete_protected =  $40 , 
host_cpu_flags =  $41 , tunnel_migration =  $42 , vnc_keyboard_layout =  $43 , 
is_run_and_pause =  $44 , created_by_user_id =  $45  WHERE vm_guid =  $46  AND 
entity_type = 'VM'
PL/pgSQL function updatevmstatic line 2 at SQL statement; nested exception is 
org.postgresql.util.PSQLException: ERROR: insert or update on table vm_static 
violates foreign key constraint fk_vm_static_quota
  Detail: Key (quota_id)=(----) is not present 
in table quota.
  Where: SQL statement UPDATE vm_static SET description =  $1 , 
free_text_comment =  $2  ,mem_size_mb =  $3 ,os =  $4 ,vds_group_id =  $5 , 
VM_NAME =  $6 ,vmt_guid =  $7 , domain =  $8 ,creation_date =  $9 

Re: [Users] VM with stateless snapshot won't start

2013-12-05 Thread Martijn Grendelman
Hello Dafna,

 If a failure happened and the stateless vm has shut down without delete 
 of the snapshot, the next time you will try to run it we will try to 
 delete the snapshot.
 from the engine log, it seems that there is a problem deleting the 
 snapshot because of quota.
 can you please try to disable the quota and try to run the vm again?

Quota was not enabled on this Data Center; I have never done anything
with quota on oVirt. The 'quota' table was empty, and the 'quota_id'
field on all VMs in the 'vm_static' table was NULL.

Since it was seemingly trying to set the quota_id for this particular VM
to '----', I manually inserted a record
into the quota table using this ID. After that, I was able to start the
VM. The quota_id field for the VM now contains a reference to this fake id.
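
For anyone following along, the failure and the workaround can be reproduced in miniature. The sketch below uses Python's built-in sqlite3 module rather than the Engine's PostgreSQL database. The two tiny tables and the value `placeholder-quota-id` are hypothetical stand-ins, not the real oVirt schema and not the ID from the log.

```python
import sqlite3

# Toy stand-in for the Engine database: NOT the real oVirt schema.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE quota (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE vm_static ("
    " vm_guid TEXT PRIMARY KEY,"
    " quota_id TEXT REFERENCES quota(id))"
)
conn.execute("INSERT INTO vm_static VALUES ('vm-1', NULL)")  # NULL is allowed
conn.commit()

FAKE_ID = "placeholder-quota-id"  # hypothetical value, for illustration only


def set_quota(vm_guid, quota_id):
    """Return True if the update succeeds, False on a constraint violation."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE vm_static SET quota_id = ? WHERE vm_guid = ?",
                (quota_id, vm_guid),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_try = set_quota("vm-1", FAKE_ID)   # parent row missing: rejected
conn.execute("INSERT INTO quota VALUES (?)", (FAKE_ID,))
conn.commit()
second_try = set_quota("vm-1", FAKE_ID)  # parent row present: accepted
print(first_try, second_try)
```

The update is rejected for as long as the referenced row is missing, and it goes through once a matching row exists in the parent table, which is the effect the manual insert had here.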

 Please note that the first time you will run it after disabling the 
 quota, the snapshot should be deleted but the vm will still not start. 
 only after the snapshot is deleted you will be able to run the vm again.

Indeed, it took two attempts to start the VM.

Question: is it harmful to leave the fake quota record with id
'----' and the reference to it in the
vm_static table in place?

Cheers,
Martijn.




 
 


Re: [Users] VM with stateless snapshot won't start

2013-12-05 Thread Martijn Grendelman
Gilad Chaplik wrote on 5-12-2013 13:48:
 hi Martijn,
 
 Indeed we have a bug there, and we've already solved it [1].
 As long as you're not using quota, your workaround is great.
 If you'll decide to use quota you can contact me and we will see how to 
 proceed.

Ok, thanks! I have no plans for using quota ATM.

Cheers,
Martijn.




 [1] http://gerrit.ovirt.org/#/c/21332/
 
 - Original Message -
 From: Martijn Grendelman martijn.grendel...@isaac.nl
 To: d...@redhat.com, Doron Fediuck dfedi...@redhat.com
 Cc: users@ovirt.org
 Sent: Thursday, December 5, 2013 2:01:44 PM
 Subject: Re: [Users] VM with stateless snapshot won't start

 Hello Dafna,

 If a failure happened and the stateless vm has shut down without delete
 of the snapshot, the next time you will try to run it we will try to
 delete the snapshot.
 from the engine log, it seems that there is a problem deleting the
 snapshot because of quota.
 can you please try to disable the quota and try to run the vm again?

 Quota were not enabled on this Data Center, I have never done anything
 with quota on oVirt. The 'quota' table was empty, and the 'quota_id'
 field on all VMs in the 'vm_static' table was NULL.

 Since it was seemingly trying to set the quota_id for this particular VM
 to '----', I manually inserted a record
 into the quota table using this ID. After that, I was able to start te
 VM. The quota_id field for the VM now contains a reference to this fake id.

 Please note that the first time you will run it after disabling the
 quota, the snapshot should be deleted but the vm will still not start.
 only after the snapshot is deleted you will be able to run the vm again.

 Indeed, it took two attempts to start the VM.

 Question: is it harmful to leave the fake quota record with id
 '----' and the reference to it in the
 vm_static table in place?

 Cheers,
 Martijn.








Re: [Users] Agents for Windows

2013-12-03 Thread Martijn Grendelman
Blaster wrote on 2-12-2013 21:15:
 
 I've been able to find prebuilt virt-io drivers and spice agents for 
 Windows.

It seems the Windows guest tools package from
http://www.spice-space.org/download.html is not installable on Windows
Server 2012 though :-(

Martijn.


Re: [Users] backups

2013-12-02 Thread Martijn Grendelman
Blaster wrote on 30-11-2013 21:40:
 Contrary to my other post, which was more educational than practical, 
 yes, you generally would not back up app data via a hypervisor 
 snapshot.  Generally you would only backup the OS disk and perhaps the 
 application binaries.  This would be for quick restore of the OS and 
 app, so you don't have to spend hours reconfiguring your OS.  
 (especially Windows based OSes)
 
 I also do an IN OS backup as well, for individual file restores in the 
 instances you accidentally destroy something in /etc for example.

If your backups are 1) recent and 2) consistent (to a level that suits
you), what does it matter how you make them?

Cheers,
Martijn.


Re: [Users] backups

2013-11-27 Thread Martijn Grendelman
Hi Charles,

 How are you folks doing your hypervisor level backups?
 
 Under ESXi I used GhettoVCB which basically took a snap shot, copied the 
 disk image to another location, then deleted the snap.

Thank you for this hint, I didn't know about GhettoVCB and I'm
definitely going to have a look at it.

 I haven't been able to find too much information on how this can be done 
 with ovirt.  I see discussions on the new backup API, but I'm not 
 interested in spending big bucks on an enterprise backup solution for a 
 home lab.
 
 Only discussion I saw on using snapshots for backups said don't do it 
 because the tools don't sync memory when the snapshots are taken.

The problem with snapshot-based backups is that they are usually only
crash-consistent, meaning that they contain the state of a system's
disks as they would be if you pulled the power plug on a server. If you
restore a system from this type of backup, you would see file system
recovery happening at the first boot, and you risk data loss from, for
example, database servers.

The process that GhettoVCB uses according to your description above is
the same. Your backups are only crash-consistent.

If you need application-level consistency, you need a mechanism to
inform applications that a backup is going to take place (or rather: a
snapshot will be taken) and that they should place themselves in a
consistent state. For example: sync data to disk, flush transaction
logs, stuff like that. Microsoft Windows has VSS for that. For Linux,
there is no such thing (that I know of). Common practice for quiescing
database servers and such on Linux is making consistent SQL dumps in a
pre-backup job.

In my case, for most guests a crash-consistent backup, containing a
recent MySQL or PostgreSQL dump is sufficient. I use LVM snapshots (not
oVirt snapshots) for backups, and I use Rsync to transfer the data. I
have been experimenting with Virtsync [1], but I'm having a bit of
trouble with that, so for the moment, it's just Rsync.

Efficiently backing up sparse images with Rsync can be a bit of a
challenge (that's why Virtsync was created in the first place, IIRC),
but using '--sparse' on the initial backup and '--inplace' on subsequent
backups seems to do the trick.
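
A rough sketch of that two-phase approach, in case it is useful. The helper name `rsync_args`, its `initial` flag and the paths are made up for illustration; only the `--sparse` and `--inplace` options come from the description above, and `-a` is added as a common default.

```python
import shlex


def rsync_args(src, dest, initial):
    """Build an rsync command line for backing up a sparse disk image.

    initial=True  -> --sparse: holes in the image are not written out.
    initial=False -> --inplace: the existing destination file is updated
                     directly instead of via a temporary copy and rename.
    """
    mode = "--sparse" if initial else "--inplace"
    return ["rsync", "-a", mode, src, dest]


# Hypothetical paths, for illustration only.
first = rsync_args("/mnt/snap/disk.img", "/backup/vm1/disk.img", initial=True)
later = rsync_args("/mnt/snap/disk.img", "/backup/vm1/disk.img", initial=False)
print(shlex.join(first))
print(shlex.join(later))
```

The two options are kept on separate runs on purpose: older rsync releases refuse to combine `--sparse` with `--inplace`, so it is worth checking the man page of the installed version before trying both at once.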

[1] http://www.virtsync.com/

I hope this helps.

Cheers,
Martijn.


Re: [Users] Where do you run the engine?

2013-11-27 Thread Martijn Grendelman
Sander Grendelman wrote on 27-11-2013 16:56:
 On Wed, Nov 27, 2013 at 3:51 PM, Ernest Beinrohr
 ernest.beinr...@axonpro.sk wrote:
 Just curious, where/how you run the engine. I run it in libvirt/kvm on one
 of my storage domains.
 
 I run it on our esx cluster (seriously).

Yep, me too...

Highly interested in running it alongside VDSM on one of the hosts,
though...

Cheers,
Martijn.


Re: [Users] backups

2013-11-27 Thread Martijn Grendelman
Blaster wrote on 27-11-2013 17:23:
 On 11/27/2013 4:24 AM, Martijn Grendelman wrote:
 The problem with snapshot-based backups is, that they are usually only 
 crash-consistent, meaning that they contain the state of a system's 
 disks as they would be if you pulled the power plug on a server. If 
 you restore a system from this type of backup, you would see file 
 system recovery happening at the first boot, and you risk data loss 
 from -for example- database servers. 
 
 The work-around for this is to SSH into the guest first, put the 
 database into backup mode(maybe run sync a time or two to flush out as 
 much from RAM as possible), take the snap shot, ssh back in to resume 
 the database, backup the snap, delete the snap.

Yes, for example for MySQL, you could

1. issue a FLUSH TABLES WITH READ LOCK statement
2. create a snapshot
3. issue an UNLOCK TABLES statement

before starting a backup from the snapshot, to get a consistent backup
of the binary table space.
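
The three steps above could be wrapped up roughly as follows. This is a sketch only: `read_locked`, `RecordingCursor` and `create_snapshot` are hypothetical names, and the stand-in cursor just records statements so the ordering can be checked without a MySQL server. In a real job, `cursor` would be a DB-API cursor on an open MySQL connection.

```python
from contextlib import contextmanager


@contextmanager
def read_locked(cursor):
    """Hold a global MySQL read lock for the duration of the with-block.

    The lock belongs to the session that took it, so the connection
    must stay open until the snapshot has been created.
    """
    cursor.execute("FLUSH TABLES WITH READ LOCK")
    try:
        yield
    finally:
        cursor.execute("UNLOCK TABLES")  # always release, even on failure


class RecordingCursor:
    """Stand-in cursor that records statements instead of running them."""

    def __init__(self):
        self.log = []

    def execute(self, statement):
        self.log.append(statement)


def create_snapshot(log):
    # Placeholder: a real job would create the LVM or oVirt snapshot here.
    log.append("<create snapshot>")


cur = RecordingCursor()
with read_locked(cur):
    create_snapshot(cur.log)
print(cur.log)
```

The point of the context manager is that the lock is released even if snapshot creation fails, and that lock, snapshot and unlock all happen while the same connection is still open.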

Cheers,
Martijn.


Re: [Users] oVirt 3.3.1 rlease

2013-11-22 Thread Martijn Grendelman
I'd just like to say that I just upgraded from 3.3.0.1 to 3.3.1 without
problems. It was a smooth experience, for both Engine and VDSM.

Cheers,
Martijn.





Kiril Nesenko wrote on 21-11-2013 16:43:
 The oVirt development team is very happy to announce the general
 availability of oVirt 3.3.1 as of November 21st 2013. This release
 solidifies oVirt as a leading KVM management application, and open
 source alternative to VMware vSphere.
 
 oVirt is available now for Fedora 19 and Red Hat Enterprise Linux 6.4
 (or similar).
 
 See release notes [1] for a list of the new features and bugs fixed.
 
 [1] http://www.ovirt.org/OVirt_3.3.1_release_notes
 
 - Kiril
 


Re: [Users] oVirt 3.4 planning

2013-11-13 Thread Martijn Grendelman
Patrick Lists wrote on 13-11-2013 16:37:
 Hi René,
 
 On 11/13/2013 04:16 PM, René Koch (ovido) wrote:
 [snip]
 The plugin is a Nagios monitoring plugin, but as mentioned above you
 should be able to use it with Zabbix when defining it as an external
 check.

 Download and documentation can be found here:
 https://github.com/ovido/check_rhev3
 
 Any idea if your plugin also works with Icinga?

If it works with Nagios, it works with Icinga.

Cheers,
Martijn.


Re: [Users] Live storage migration fails on CentOS 6.4 + ovirt3.3 cluster

2013-11-06 Thread Martijn Grendelman
Itamar Heim wrote on 6-11-2013 12:06:
 On 11/06/2013 10:42 AM, Sander Grendelman wrote:
 Can anyone reproduce / comment on this?

 Can this be caused by
 http://www.ovirt.org/Vdsm_Developers#Missing_dependencies_on_RHEL_6.4
 ?

 
 do you use qemu-kvm or qemu-kvm-rhev rpm?


What is the difference?

regards,
Martijn.


Re: [Users] vmware disks

2013-10-17 Thread Martijn Grendelman
On 17-10-2013 11:16, René Koch (ovido) wrote:
 On Thu, 2013-10-17 at 10:14 +0100, supo...@logicworks.pt wrote:
 Hi, it's possible to import a vmware disk into ovirt?
 
 Yes, you can import a virtual machine from VMware using virt-v2v tool...

That's not an answer to the question.

The answer, AFAIK, is: no, you can't. You need to have ESX running to
import the whole VM with virt-v2v.

Or am I wrong? In which case I'd really like to know.

Cheers,
Martijn.


Re: [Users] virt-io drivers for windows xp

2013-10-16 Thread Martijn Grendelman
Hi,

 I have been playing with the VirtIO drivers from mentioned ISO on
 Windows XP, but I experienced a lot of BSODs.

 In the end, I set NICs to emulate as rtl8139 and disks as IDE, and that
 seems to work.

 Many thanks,
 in fact during creation of the Windows VM I chose the Windows option in
 the VM wizard.
 I could not have guessed that the wizard would make the strange choice of
 VirtIO drivers when for Windows it is best to use IDE and RTL.
 
 The choice is not strange given:
 1) availability of virtual floppy with virtio drivers
 2) dreadful performance of emulated devices compared to virtio ones
 
 The minor annoyance during installation (need to attach a floppy and do
 F6/Load Drivers) really pays off.

Like I said, Windows XP kept crashing a lot (reproducible crash on
shutdown and more random crashes during normal operation) with the
VirtIO drivers. I'm not sure if it was network or disk, but it didn't
work well, that's for sure. With the IDE/RTL8139 combination, XP has
been running stable for a few weeks now.

Regards,
Martijn.


Re: [Users] Proposal for a fresh look and feel for Ovirt

2013-10-16 Thread Martijn Grendelman
Hi,

 I like the modernized look, but imo it still misses the simplistic feel.
 I've had a few people complain that it's hung their whole browser
 because of the amount of javascript, I didn't proceed to question why
 because they were VMware fan boys and I simply could never replicate it.

Yes, the admin portal's JavaScript hangs the browser frequently, at
least on FF. On Chrome it happens less, but it still happens.

Cheers,
Martijn.


Re: [Users] virt-io drivers for windows xp

2013-10-14 Thread Martijn Grendelman
On 13-10-2013 17:38, noc wrote:
 On 12-10-2013 18:06, Mario Giammarco wrote:
 Hello,

 Where can I find an ISO with all drivers for ovirt?


 http://alt.fedoraproject.org/pub/alt/virtio-win/latest/images/ is the 
 place to look. Insert the ISO when you need the drivers.
 
 The proxmox site/forum has a wiki page describing which folder belongs 
 to what component. Some are straight forward some are rather cryptic :-)
 Search for virtio-win, I think.
 
 Joop

I have been playing with the VirtIO drivers from mentioned ISO on
Windows XP, but I experienced a lot of BSODs.

In the end, I set NICs to emulate as rtl8139 and disks as IDE, and that
seems to work.

Cheers,
Martijn.


[Users] Vfd floppy images

2013-10-02 Thread Martijn Grendelman
Hi,

How does one upload/import VFD floppy images into oVirt?

Cheers,
Martijn.


Re: [Users] Vfd floppy images

2013-10-02 Thread Martijn Grendelman
Hi all,

 How does one upload/import VFD floppy images into oVirt?

 Hi,
 You can use iso uploader, check man ovirt-iso-uploader for more info.

I have to apologize. I tried ovirt-iso-uploader before I wrote to the
list, and I got an error message that I couldn't immediately place. It
appeared I did in fact make a mistake in the command (missing ISO
domain). Upload succeeded just now. Sorry for wasting your time.

Cheers,
Martijn.


Re: [Users] VMs and volumes disappearing

2013-10-01 Thread Martijn Grendelman
Hi,

 I have recently set up an oVirt environment, I think in a pretty
 standard fashion, with engine 3.3 on one host, one oVirt host on a
 physical machine, both running CentOS 6.4, using NFS for all storage
 domains.
 
 Please provide rpm -qa on the ovirt rpms (ovirt engine).

martijn@ovirt:~ rpm -qa | grep ovirt
ovirt-log-collector-3.3.0-1.el6.noarch
ovirt-engine-3.3.0-1.el6.noarch
ovirt-host-deploy-1.1.1-1.el6.noarch
ovirt-engine-cli-3.3.0.4-1.el6.noarch
ovirt-engine-userportal-3.3.0-1.el6.noarch
ovirt-engine-tools-3.3.0-1.el6.noarch
ovirt-engine-setup-3.3.0-4.el6.noarch
ovirt-engine-sdk-python-3.3.0.6-1.el6.noarch
ovirt-image-uploader-3.3.0-1.el6.noarch
ovirt-engine-restapi-3.3.0-1.el6.noarch
ovirt-engine-webadmin-portal-3.3.0-1.el6.noarch
ovirt-host-deploy-java-1.1.1-1.el6.noarch
ovirt-engine-backend-3.3.0-1.el6.noarch
ovirt-release-el6-8-1.noarch
ovirt-iso-uploader-3.3.0-1.el6.noarch
ovirt-engine-dbscripts-3.3.0-1.el6.noarch
ovirt-engine-lib-3.3.0-4.el6.noarch

 Today I was playing around with snapshots, when I noticed that the
 Snapshots panel didn't show any of the snapshots I created, not even the
 'Current - Active VM' snapshot that all VMs have.
 
 Not sure why this has happened. How do you know that snapshot
 creation was completed? Did you look at the events tab? (Asking to be
 sure) engine.log will be quite helpful here.

I find engine.log somewhat hard to read, to be honest, and documentation
is hard to find, but I think I found some clues.

I tried to create 4 snapshots of a certain VM, 2 of which completed
normally and 2 of which failed:

Failed with VDSM error SNAPSHOT_FAILED and code 48

However, what I find most upsetting, is that the VMs that disappeared
were not the subject of my experiments. I was creating snapshots of a
single VM, and the VMs that disappeared were unrelated. As a matter of
fact, the VM I was experimenting with IS THE ONLY ONE that survived.

By the way, the Snapshots panel has been displaying snapshots correctly
for a while, but when I logged in this morning, it appeared empty again,
for all VMs.

Is there anything I can check to see what causes this?

 Not sure what to do, I decided to restart the ovirt-engine process.

 When I logged back on to the administrator panel, I was shocked to
 see 2 of my 4 VMs completely missing from the inventory. I
 haven't been able to find back a single trace of either machine,
 neither in the portal nor on disk. It seems like they never
 existed. The storage of both VMs seems to be erased from the data
 domain.

 Not sure why storage domain was erased. About Vms disappeared - there
 were previous discussions on that at users@ovirt.org. In a nutshell,
 due to a bug (that was already fixed) prior to the restart you might
 have had records in the async_tasks table that contained the
 empty guid (a string in UUID format with only 0 and -) in the
 vdsm_task_id column. This means that the task is not associated with
 a real SPM task, and when the engine restarts, if for a given flow
 (let's say - snapshot creation) there are tasks with such
 vdsm_task_id, the flow will end with failure. For some flows,
 ending with failure means erasing the VM (for example - real failure
 of importing a VM). By the way, a similar issue can probably occur with
 disks as well, as there are flows that run async tasks that deal with
 disks.
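
As a rough illustration of the check described above, the sketch below
flags task records whose vdsm_task_id is the empty guid. The table and
column names (async_tasks, vdsm_task_id) come from this thread; the
task_id key, the rows-as-dicts layout, and the SQL string are
assumptions for illustration only, and the real engine schema may
differ.

```python
import uuid

# The "empty guid": a UUID-formatted string with only zeros and dashes.
EMPTY_GUID = str(uuid.UUID(int=0))

# Hypothetical query. Table and column names are taken from the thread;
# verify them against the actual engine database before running anything.
SUSPECT_TASKS_SQL = (
    "SELECT * FROM async_tasks "
    "WHERE vdsm_task_id = '" + EMPTY_GUID + "'"
)


def is_empty_guid(value):
    """Return True if value parses as a UUID and is all zeros."""
    try:
        return uuid.UUID(str(value)).int == 0
    except ValueError:
        return False


def find_suspect_tasks(rows):
    """Return the rows whose 'vdsm_task_id' is the empty guid.

    rows is an iterable of dicts, for example rows fetched from
    async_tasks by whatever database client is in use (an assumption).
    """
    return [row for row in rows if is_empty_guid(row.get("vdsm_task_id"))]


if __name__ == "__main__":
    sample = [
        {"task_id": "t1", "vdsm_task_id": EMPTY_GUID},
        {"task_id": "t2",
         "vdsm_task_id": "337a410f-1598-4a7f-9afd-c0160c329563"},
    ]
    for row in find_suspect_tasks(sample):
        print("suspect task:", row["task_id"])
```

This only reports such records; it does not modify or delete anything.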

I think I have an idea about what happened now.

The 2 disappeared VMs have been imported into oVirt using virt-v2v. The
3rd one that's now missing a disk volume was not, but I have been
playing with storage migration in the past.

Yesterday's engine.log seems to suggest that all of these tasks
(importing the 2 VMs and trying to move a volume) have been restarted
immediately after restarting Engine. After failure, the VMs and volume
were removed. It seems to fit the above description of the bug.

So...

What can I do to prevent this from happening again?

Should I periodically check the 'async_tasks' table for anomalies? Is
there a bugfix I can apply, or should I wait for a new release of oVirt?
If the latter, when is that expected to happen?

Thanks,
Martijn.


Re: [Users] VMs and volumes disappearing

2013-10-01 Thread Martijn Grendelman
Hello Yair,

Thank you for your answers, but I still have some questions left ;-)

 I find engine.log somewhat hard to read, to be honest, and documentation
 is hard to find, but I think I found some clues.
 
 Hi,
 I understand what you're saying about engine.log, when I asked for
 it, it was because I'm one of the maintainers of ovirt engine, so I
 thought I could give you a hand here, especially after reading your
 email and getting a sense that I saw a similar issue in the past.

If you want, I can send you my log. I wasn't sure if that's what you meant.

I just had the same problem again.
- Stopped oVirt engine
- Checked the 'async_tasks' table. It was empty!
- Started oVirt engine
- Same set of imported VMs as last time deleted!

I thought that lingering records from the 'async_tasks' table were to
blame, but apparently, that's not the case.

Can you tell me what I need to check/do/modify/update/delete before
restarting oVirt that will keep my VMs from being deleted? (Please see
below for a note on upgrading like you suggested)

 I think I have an idea about what happened now.

 The 2 disappeared VMs have been imported into oVirt using virt-v2v. The
 3rd one that's now missing a disk volume was not, but I have been
 playing with storage migration in the past.
 
 Then this is the reason, other users have complained about it at 
 users@ovirt.org

I have read the thread about disappearing VMs from August and indeed it
sounds like this might be the same problem.

 Upgrade: I just talked with Ofer (CC'ed), our release engineer, and he
 said that all packages should be 3.3.0-4 (notice ovirt-engine is
 not). I hope this helps you out,

There are no updates, at least Yum doesn't give me any. I enabled the
beta channel just now, but that doesn't make a difference. 3.3.0-1 is
the latest version. What am I missing?

martijn@ovirt:~ sudo yum list all|grep ovirt-engine.noarch
ovirt-engine.noarch    3.3.0-1.el6    @ovirt-beta
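
To make the version comparison discussed here concrete, this is a small
sketch that parses `rpm -qa` style lines and lists the packages at
version 3.3.0 whose release number is not 4. Splitting
name-version-release.arch from the right follows the usual RPM naming
layout; the function names and the default target values are
assumptions for illustration.

```python
def parse_nvra(line):
    """Split 'name-version-release.arch' into its four parts."""
    nvr, arch = line.strip().rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch


def packages_behind(lines, version="3.3.0", release="4"):
    """Return names of packages at `version` with a different release.

    Packages on another version track (for example 1.1.1 or 3.3.0.6)
    are skipped, since the expected release for them is not known here.
    """
    behind = []
    for line in lines:
        if not line.strip():
            continue
        name, ver, rel, _arch = parse_nvra(line)
        # rel looks like '1.el6'; compare only the leading release number.
        if ver == version and rel.split(".", 1)[0] != release:
            behind.append(name)
    return behind


if __name__ == "__main__":
    installed = [
        "ovirt-engine-3.3.0-1.el6.noarch",
        "ovirt-engine-setup-3.3.0-4.el6.noarch",
        "ovirt-host-deploy-1.1.1-1.el6.noarch",
    ]
    for name in packages_behind(installed):
        print("not at release 4:", name)
```

The input could be the saved output of `rpm -qa | grep ovirt`, one
package per line.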

Best regards,
Martijn Grendelman



Re: [Users] VMs and volumes disappearing

2013-10-01 Thread Martijn Grendelman
Hi,

 I just had the same problem again.
 - Stopped oVirt engine
 - Checked the 'async_tasks' table. It was empty!
 - Started oVirt engine
 - Same set of imported VMs as last time deleted!

 I thought that lingering records from the 'async_tasks' table were to
 blame, but apparently, that's not the case.

 Can you tell me what I need to check/do/modify/update/delete before
 restarting oVirt that will keep my VMs from being deleted? (Please see
 below for a note on upgrading like you suggested)
 
 make sure you don't have records at async_tasks with vdsm_task_id of
 empty guid - but I'm really not in favor of such hacks, this is a
 hack. I strongly suggest you solve the upgrade issue (please communicate
 with Ofer on this).

OK, I understand that, but for the record: 'async_tasks' was completely
empty before I started Engine, so this hack wouldn't have been of any use.


Cheers,
Martijn.


[Users] Resizing disks destroys contents

2013-10-01 Thread Martijn Grendelman
Hi,

I just tried out another feature of oVirt and again, I am shocked by the
results.

I did the following:
- create new VM based on an earlier created template, with 20 GB disk
- Run the VM - boots fine
- Shut down the VM
- Via Disks -> Edit -> Extend size by(GB): add 20 GB to the disk
- Run the VM

Result: no bootable device. Linux installation gone.

Just to be sure, I booted the VM with a gparted live iso, and gparted
reports the entire 40 GB as unallocated space.

Where's my data? What's wrong with my oVirt installation? What am I
doing wrong?

Regards,
Martijn.


[Users] VMs and volumes disappearing

2013-09-30 Thread Martijn Grendelman
Hi,

I have recently set up an oVirt environment, I think in a pretty
standard fashion, with engine 3.3 on one host, one oVirt host on a
physical machine, both running CentOS 6.4, using NFS for all storage
domains.

Today I was playing around with snapshots, when I noticed that the
Snapshots panel didn't show any of the snapshots I created, not even the
'Current - Active VM' snapshot that all VMs have.

Not sure what to do, I decided to restart the ovirt-engine process.

When I logged back on to the administrator panel, I was shocked to see 2
of my 4 VMs completely missing from the inventory. I haven't been able
to find back a single trace of either machine, neither in the portal nor
on disk. It seems like they never existed. The storage of both VMs seems
to be erased from the data domain.

A 3rd VM is down and refuses to start: Exit message: Volume
337a410f-1598-4a7f-9afd-c0160c329563 is corrupted or missing.

and in vdsm.log on the host:

OSError: [Errno 2] No such file or directory:
'/rhev/data-center/5849b030-626e-47cb-ad90-3ce782d831b3/d523a48d-7a34-4bb0-9d48-2092934af816/images/e803ad34-94e5-4180-b26f-7271bfca5923/337a410f-1598-4a7f-9afd-c0160c329563'
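
For reading a path like the one in this traceback, the sketch below
splits it into its UUID components. The labels used (storage pool,
storage domain, image, volume) reflect the commonly described layout
under /rhev/data-center and should be treated as an assumption; the
helper name is hypothetical.

```python
from pathlib import PurePosixPath


def split_volume_path(path):
    """Split a /rhev/data-center/... volume path into labelled UUIDs."""
    parts = PurePosixPath(path).parts
    # Expected shape:
    # ('/', 'rhev', 'data-center', pool, domain, 'images', image, volume)
    if (len(parts) != 8
            or parts[1:3] != ("rhev", "data-center")
            or parts[5] != "images"):
        raise ValueError("unexpected path layout: %s" % path)
    return {
        "storage_pool": parts[3],
        "storage_domain": parts[4],
        "image": parts[6],
        "volume": parts[7],
    }


if __name__ == "__main__":
    info = split_volume_path(
        "/rhev/data-center/5849b030-626e-47cb-ad90-3ce782d831b3/"
        "d523a48d-7a34-4bb0-9d48-2092934af816/images/"
        "e803ad34-94e5-4180-b26f-7271bfca5923/"
        "337a410f-1598-4a7f-9afd-c0160c329563"
    )
    for key, value in info.items():
        print(key, value)
```

The last component matches the volume ID named in the exit message,
which makes it easier to look for the image directory on the NFS export.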

So it seems something is seriously f*cked up. Now what? Any ideas what
may have caused this? And more importantly, how do I prevent something
like this from happening again?

Perhaps a needless addition, but I am very scared to host anything
remotely important on oVirt now.

Regards,
Martijn.