Re: [Users] Gluster network info
Hi Gianluca,

[snip]

On 10/02/2013 08:52 PM, Itamar Heim wrote:
> On 09/28/2013 08:33 PM, Gianluca Cecchi wrote:
>> Hello, I remember that in the past high usage of the ovirtmgmt network could be a problem, because the engine sometimes detects hosts as unresponsive. I believe this is the reason for the bandwidth limitation on VM migration, until a dedicated network for it was released.

We have a dedicated migration network in oVirt 3.3, which means all migration traffic uses a dedicated network and not the management network. The caveat is that we still did not add support for capping its bandwidth. The problem you detailed above can be bypassed by configuring the migration network on a dedicated interface.

>> So the question is: what about the ovirtmgmt network for gluster replication, when the gluster domain is provided by oVirt nodes? I suppose it could be a problem too, couldn't it?

Yes, that is correct; using ovirtmgmt for non-management traffic is not a good idea.

>> In case I have a dedicated network for gluster for the nodes, how can I configure it?

Generally there are two ways to go about this:
1. Add a new network role 'gluster replication' and then adjust the engine code to pass the network with this role as a parameter to the replication verb in VDSM.
2. In the replication verb in the UI, let the user choose which network to use for the replication.

I am not familiar with the replication verb, so I might be missing something, but otherwise I think that solution 2 is simpler and less invasive than requiring a new network role.

>> Can I configure in this case the Gluster storage domain from ovirt engine, or does it require to be on this network too? Thanks in advance for any suggestion. Gianluca
>> ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users

> livnat - would network QoS cover this?

Network QoS is needed in addition to what I have specified above, and it would hopefully be available in oVirt 3.4; but if you use a dedicated network as suggested above you can bypass the traffic-shaping issue by using a dedicated NIC.

Livnat

> vijay/sahina - does it make sense to define a 'gluster replication network' or something like that?

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Ovirt 3.3 Fedora 19 add gluster storage permissions error
The root cause seems to be a SANlock issue:

Thread-32382::ERROR:: 2013-09-20 13:16:34,126::clusterlock::145::initSANLock::(initSANLock) Cannot initialize SANLock for domain 17d21ac7-5859-4f25-8de7-2a9433d50c11
Traceback (most recent call last):
  File /usr/share/vdsm/storage/clusterlock.py, line 140, in initSANLock
    sanlock.init_lockspace(sdUUID, idsPath)
SanlockException: (22, 'Sanlock lockspace init failure', 'Invalid argument')
Thread-32382::WARNING:: 2013-09-20 13:16:34,127::sd::428::Storage.StorageDomain::(initSPMlease) lease did not initialize successfully
Traceback (most recent call last):
  File /usr/share/vdsm/storage/sd.py, line 423, in initSPMlease
    self._clusterLock.initLock()
  File /usr/share/vdsm/storage/clusterlock.py, line 163, in initLock
    initSANLock(self._sdUUID, self._idsPath, self._leasesPath)
  File /usr/share/vdsm/storage/clusterlock.py, line 146, in initSANLock
    raise se.ClusterLockInitError()
ClusterLockInitError: Could not initialize cluster lock: ()

Can you include /var/log/sanlock.log and /var/log/messages please?

----- Original Message -----
From: Steve Dainard sdain...@miovision.com
To: Deepak C Shetty deepa...@linux.vnet.ibm.com
Cc: users users@ovirt.org
Sent: Friday, September 20, 2013 8:30:59 PM
Subject: Re: [Users] Ovirt 3.3 Fedora 19 add gluster storage permissions error

Awesome, thanks guys. It's weird that that article tells you to set it with 'key=value' rather than 'key value'; must be some legacy stuff. Once those changes are in place I hit a different error. Deepak, maybe you've seen this one on a new storage domain add:

[root@ovirt-manager2 ~]# tail -f /var/log/ovirt-engine/engine.log
2013-09-20 13:16:36,226 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand] (ajp--127.0.0.1-8702-9) Command CreateStoragePoolVDS execution failed.
Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Cannot acquire host id: ('17d21ac7-5859-4f25-8de7-2a9433d50c11', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
2013-09-20 13:16:36,229 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand] (ajp--127.0.0.1-8702-9) FINISH, CreateStoragePoolVDSCommand, log id: 672635cc
2013-09-20 13:16:36,231 ERROR [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (ajp--127.0.0.1-8702-9) Command org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Cannot acquire host id: ('17d21ac7-5859-4f25-8de7-2a9433d50c11', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument')) (Failed with VDSM error AcquireHostIdFailure and code 661)
2013-09-20 13:16:36,296 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-9) Correlation ID: 11070337, Call Stack: null, Custom Event ID: -1, Message: Failed to attach Storage Domains to Data Center Default. (User: admin@internal)
2013-09-20 13:16:36,299 INFO [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (ajp--127.0.0.1-8702-9) Lock freed to object EngineLock [exclusiveLocks= key: 5849b030-626e-47cb-ad90-3ce782d831b3 value: POOL , sharedLocks= ]
2013-09-20 13:16:36,387 INFO [org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand] (ajp--127.0.0.1-8702-9) Command [id=293a1e97-e949-4c17-92c6-c01f2221204e]: Compensating CHANGED_ENTITY of org.ovirt.engine.core.common.businessentities.StoragePool; snapshot: id=5849b030-626e-47cb-ad90-3ce782d831b3.
2013-09-20 13:16:36,398 INFO [org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand] (ajp--127.0.0.1-8702-9) Command [id=293a1e97-e949-4c17-92c6-c01f2221204e]: Compensating NEW_ENTITY_ID of org.ovirt.engine.core.common.businessentities.StoragePoolIsoMap; snapshot: storagePoolId = 5849b030-626e-47cb-ad90-3ce782d831b3, storageId = 17d21ac7-5859-4f25-8de7-2a9433d50c11.
2013-09-20 13:16:36,425 INFO [org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand] (ajp--127.0.0.1-8702-9) Command [id=293a1e97-e949-4c17-92c6-c01f2221204e]: Compensating CHANGED_ENTITY of org.ovirt.engine.core.common.businessentities.StorageDomainStatic; snapshot: id=17d21ac7-5859-4f25-8de7-2a9433d50c11.
2013-09-20 13:16:36,464 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-9) Correlation ID: 302ae6eb, Job ID: 014ec59b-e6d7-4e5e-b588-4fb0dfa8f1c8, Call Stack: null, Custom Event ID: -1, Message: Failed to attach Storage Domain rep2-virt to Data Center Default. (User: admin@internal)

[root@ovirt001 ~]# tail -f /var/log/vdsm/vdsm.log
Thread-32374::DEBUG::2013-09-20 13:16:18,107::task::579::TaskManager.Task::(_updateState)
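For anyone hitting this again: the failing entries are easy to pull out of a large vdsm.log with a few lines of Python. This is just an illustrative sketch; the token list below is my own choice, not an official vdsm error taxonomy:

```python
def sanlock_errors(log_text):
    """Return the lines of a vdsm.log excerpt that mention a
    sanlock-related failure (token list is my own choice)."""
    tokens = ("SanlockException", "ClusterLockInitError", "AcquireHostIdFailure")
    return [line for line in log_text.splitlines()
            if any(token in line for token in tokens)]

# Example against the excerpt quoted above:
excerpt = """\
Thread-32382::ERROR:: 2013-09-20 13:16:34,126::clusterlock::145 Cannot initialize SANLock
SanlockException: (22, 'Sanlock lockspace init failure', 'Invalid argument')
Thread-32382::WARNING:: 2013-09-20 13:16:34,127 lease did not initialize successfully
ClusterLockInitError: Could not initialize cluster lock: ()
"""
for line in sanlock_errors(excerpt):
    print(line)
```

This only narrows down where to look; the actual diagnosis still needs /var/log/sanlock.log as requested above.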
Re: [Users] Gluster network info
On Thu, Oct 03, 2013 at 12:44:14PM +0300, Livnat Peer wrote:
> Hi Gianluca,
> [snip]
> On 10/02/2013 08:52 PM, Itamar Heim wrote:
> > On 09/28/2013 08:33 PM, Gianluca Cecchi wrote:
> > > Hello, I remember that in the past high usage of the ovirtmgmt network could be a problem, because the engine sometimes detects hosts as unresponsive. I believe this is the reason for the bandwidth limitation on VM migration, until a dedicated network for it was released.
> We have a dedicated migration network in oVirt 3.3, which means all migration traffic uses a dedicated network and not the management network. The caveat is that we still did not add support for capping its bandwidth. The problem you detailed above can be bypassed by configuring the migration network on a dedicated interface.
> > > So the question is: what about the ovirtmgmt network for gluster replication, when the gluster domain is provided by oVirt nodes? I suppose it could be a problem too, couldn't it?
> Yes, that is correct; using ovirtmgmt for non-management traffic is not a good idea.
> > > In case I have a dedicated network for gluster for the nodes, how can I configure it?
> Generally there are two ways to go about this:
> 1. Add a new network role 'gluster replication' and then adjust the engine code to pass the network with this role as a parameter to the replication verb in VDSM.
> 2. In the replication verb in the UI, let the user choose which network to use for the replication.
> I am not familiar with the replication verb, so I might be missing something, but otherwise I think that solution 2 is simpler and less invasive than requiring a new network role.

Bala, is there a means to select, for each node, which IP address is to be used for replication? AFAICT we use the host fqdn, which most probably resolves to the ovirtmgmt device.

> > > Can I configure in this case the Gluster storage domain from ovirt engine, or does it require to be on this network too? Thanks in advance for any suggestion. Gianluca
> > livnat - would network QoS cover this?
> Network QoS is needed in addition to what I have specified above, and it would hopefully be available in oVirt 3.4; but if you use a dedicated network as suggested above you can bypass the traffic-shaping issue by using a dedicated NIC.
> Livnat
> > vijay/sahina - does it make sense to define a 'gluster replication network' or something like that?

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] HA in oVirt 3.3
On Wed, Oct 2, 2013 at 3:45 PM, Itamar Heim ih...@redhat.com wrote:
> On 10/02/2013 09:04 PM, emi...@gmail.com wrote:
> > But, if I want HA: if the host where the VM is running lost connectivity with the manager, the VM should start on another host of the cluster, right?
> HA is currently controlled by the engine, which needs to know the VM isn't running on the host to launch it on another host. We're looking at relying on sanlock for the engine to be able to launch the VM on another host without fencing the first host.

Hi, on RHEV 3.x it works as emitor needs; does oVirt have another behavior?

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Gluster network info
On Thu, Oct 3, 2013 at 2:08 PM, Dan Kenigsberg wrote:
> > > So the question is: what about the ovirtmgmt network for gluster replication, when the gluster domain is provided by oVirt nodes? I suppose it could be a problem too, couldn't it?
> > Yes, that is correct; using ovirtmgmt for non-management traffic is not a good idea.
> > > In case I have a dedicated network for gluster for the nodes, how can I configure it?
> > Generally there are two ways to go about this:
> > 1. Add a new network role 'gluster replication' and then adjust the engine code to pass the network with this role as a parameter to the replication verb in VDSM.
> > 2. In the replication verb in the UI, let the user choose which network to use for the replication.
> > I am not familiar with the replication verb, so I might be missing something, but otherwise I think that solution 2 is simpler and less invasive than requiring a new network role.
> Bala, is there a means to select, for each node, which IP address is to be used for replication? AFAICT we use the host fqdn, which most probably resolves to the ovirtmgmt device.

What is the "replication verb in the UI"? Inside the UI, when I click the "create volume" and then "add Bricks" buttons, I can only select "Server", which in my case is a drop-down menu containing only the IP addresses of my 2 hypervisors. Or do you simply mean the development steps needed to integrate this missing functionality into oVirt, with steps 1. and 2.?

Thanks, Gianluca

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] unable to start vm in 3.3 and f19 with gluster
On Thu, Oct 3, 2013 at 12:21 AM, Gianluca Cecchi wrote:
> On Wed, Oct 2, 2013 at 9:16 PM, Itamar Heim wrote:
> > On 10/02/2013 12:57 AM, Gianluca Cecchi wrote:
> > > Today I was able to work again on this matter and it seems related to spice. Every time I start the VM (that is defined with spice) it goes in
> > and this doesn't happen if the VM is defined with vnc?
> No, reproduced both from oVirt and through virsh. With spice defined in the boot options or in the xml (for virsh), the VM remains in paused state and after a few minutes it seems the node hangs... with vnc the VM goes into running state. I'm going to put the same config on 2 physical nodes with only local storage and see what happens and report... Gianluca

So I was able to configure 2 HP BL685c G1 blades (Opteron G2 in oVirt) with only one internal 72 GB disk each. It seems GlusterFS works quite well here, from the initial tests done. I used the same config as the problematic nested one. Created a CentOS 6.4 VM, configured as "server" in oVirt but installed as default Desktop in anaconda. The 2 servers are on a Gigabit network. The install phase of the 1089 packages took about 17 minutes to complete. During the install I see (with the command bwm-ng -t 1000) that throughput reaches 55 MB/s, so it seems quite ok. Also, at the moment the distributed-replicated volume is on ovirtmgmt itself and I don't see any disconnection from hosts in webadmin. I'm going to test more on this infra, and check/compare with the nested environment that was configured quite the same... unfortunately not much disk space to use and stress... ;-(

Gianluca

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
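As a quick sanity check on those numbers: a Gigabit link tops out at roughly 125 MB/s of raw payload before protocol overhead, so 55 MB/s during the install is a plausible fraction once replica traffic and filesystem overhead are accounted for. A back-of-the-envelope calculation, nothing oVirt-specific:

```python
# Theoretical Gigabit Ethernet payload ceiling, ignoring protocol overhead.
def line_rate_mb_per_s(bits_per_second):
    return bits_per_second / 8 / 1e6

gige = line_rate_mb_per_s(1e9)   # 125.0 MB/s ceiling
utilization = 55 / gige          # observed 55 MB/s from bwm-ng
print("GigE ceiling: %.0f MB/s, observed utilization: %.0f%%"
      % (gige, utilization * 100))
```

So the observed rate is a bit under half the theoretical ceiling, which seems reasonable for a replicated gluster volume sharing the link with management traffic.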
Re: [Users] unable to start vm in 3.3 and f19 with gluster
On 10/03/2013 01:21 AM, Gianluca Cecchi wrote:
> On Wed, Oct 2, 2013 at 9:16 PM, Itamar Heim wrote:
> > On 10/02/2013 12:57 AM, Gianluca Cecchi wrote:
> > > Today I was able to work again on this matter and it seems related to spice. Every time I start the VM (that is defined with spice) it goes in
> > and this doesn't happen if the VM is defined with vnc?
> No, reproduced both from oVirt and through virsh. With spice defined in the boot options or in the xml (for virsh), the VM remains in paused state and after a few minutes it seems the node hangs... with vnc the VM goes into running state. I'm going to put the same config on 2 physical nodes with only local storage and see what happens and report... Gianluca

Adding the spice-devel mailing list, as the VM only hangs if started with spice and not with vnc, from virsh as well.

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[Users] Migrate existing guest (kvm/libvirtd) into ovirt environment
Hello! I have been wondering whether there is a way to take an existing VM (KVM) running on BoxA, which only uses libvirt, and deploy it on BoxB, which runs oVirt (oVirt Engine Version: 3.3.0-4.fc19). The existing VM is a Linux host, and the disk type is qcow2. Since it's a working configuration, I just want to move/migrate it to the oVirt environment.

Lasse Lindgren

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Migrate existing guest (kvm/libvirtd) into ovirt environment
On Thu, 2013-10-03 at 15:53 +0200, lasse lindgren wrote:
> Hello! I have been wondering whether there is a way to take an existing VM (KVM) running on BoxA, which only uses libvirt, and deploy it on BoxB, which runs oVirt (oVirt Engine Version: 3.3.0-4.fc19). The existing VM is a Linux host, and the disk type is qcow2. Since it's a working configuration, I just want to move/migrate it to the oVirt environment.

You can use virt-v2v to migrate your existing VM into oVirt. Just create an export domain in oVirt, shut down the VM and import it using virt-v2v.

Regards,
René

> Lasse Lindgren

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Migrate existing guest (kvm/libvirtd) into ovirt environment
On Thu, 2013-10-03 at 16:06 +0200, lasse lindgren wrote:
> I have been trying to find any good examples of how that could be done, but no luck there.

Here's an example of how to import a VM named test-solaris10 from a KVM machine (kvm-server) to the export domain located on server nfs-server:

$ virt-v2v -ic qemu+ssh://kvm-server/system -o rhev -os nfs-server:/nfs/exports -of raw -oa sparse -n ovirtmgmt test-solaris10

man virt-v2v explains all the possible options...

> Is that a supported way?

Yes, it's the way you should import VMs...

> On Thu, Oct 3, 2013 at 3:57 PM, René Koch (ovido) r.k...@ovido.at wrote:
> > You can use virt-v2v to migrate your existing VM into oVirt. Just create an export domain in oVirt, shut down the VM and import it using virt-v2v.
> > Regards, René

--
Lasse Lindgren

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
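If you end up scripting several of these imports, it can help to build the command line programmatically. The helper below only assembles the exact invocation shown above; the parameter names are mine, while the flags and values come straight from the example:

```python
def v2v_command(libvirt_uri, export_path, network, vm_name,
                out_format="raw", allocation="sparse"):
    """Assemble the virt-v2v invocation from the example above.
    Parameter names are mine; the flags come from the example."""
    return ["virt-v2v",
            "-ic", libvirt_uri,   # source libvirt connection
            "-o", "rhev",         # output to a RHEV/oVirt export domain
            "-os", export_path,   # the export storage domain (NFS)
            "-of", out_format,    # output disk format
            "-oa", allocation,    # output allocation policy
            "-n", network,        # network to attach the imported VM to
            vm_name]

print(" ".join(v2v_command("qemu+ssh://kvm-server/system",
                           "nfs-server:/nfs/exports",
                           "ovirtmgmt", "test-solaris10")))
```

You would then pass the list to subprocess on a machine where virt-v2v is installed and can reach both the source host and the export domain.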
Re: [Users] Quota for VMs created from templates
AFAIK, a user cannot create a VM that is associated with one (or more) quota objects on which he doesn't have consumer permissions. I.e., if the VM was created successfully by the user, and this VM is associated with TemplateQuota and with the quota that has been created for the user (let's call it UserQuota), it means that the user has consumer permissions on both TemplateQuota and UserQuota. If the user doesn't have permissions on one of these quota objects, the fact that the VM has been created successfully sounds like a bug to me.

Thanks, Einav

----- Original Message -----
From: Mitja Mihelič mitja.mihe...@arnes.si
To: users@ovirt.org
Sent: Thursday, October 3, 2013 9:59:06 AM
Subject: [Users] Quota for VMs created from templates

> Hi! We are running engine version 3.3.0 on CentOS6 and we have come across a problem, possibly a bug. When a user creates a VM from a template, the template's quota is assigned to the VM. Here is the setup:
> - quota is set to Enforced on the data center
> - a quota is created for template purposes (TemplateQuota)
> - a template is created from a sealed VM, with TemplateQuota assigned to it
> - a quota is created for a user, and the user is set as its consumer
> - the user creates a VM from the mentioned template and leaves the quota unchanged
> - the created VM consumes the user's storage quota but does not consume their memory and CPU quota
> This way a user can create and run an arbitrary number of VMs as long as they stay within their storage quota. No errors are reported in the logs.
> Kind regards, Mitja Mihelic
> --
> Mitja Mihelič, ARNES, Tehnološki park 18, p.p. 7, SI-1001 Ljubljana, Slovenia
> tel: +386 1 479 8877, fax: +386 1 479 88 78

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Migration issues with ovirt 3.3
On Wed, Oct 2, 2013 at 12:07 AM, Jason Brooks wrote:
> I'm having this issue on my ovirt 3.3 setup (two node, one is AIO, GlusterFS storage, both on F19) as well. Jason

Me too, with an oVirt 3.3 setup and a GlusterFS data center: one dedicated engine + 2 vdsm hosts, all Fedora 19 + ovirt stable. The VM I'm trying to migrate is CentOS 6.4, fully updated. After trying to migrate in the webadmin GUI I get:

VM c6s is down. Exit message: 'iface'.

For a few moments the VM appears as down in the GUI, but an ssh session I had open before is actually still alive. The qemu process on the source host is also still there. After a little while, the VM shows as Up again on the source node in the GUI. Actually it never stopped:

[root@c6s ~]# uptime
16:36:56 up 19 min, 1 user, load average: 0.00, 0.00, 0.00

At the target, many errors such as:

Thread-8609::ERROR::2013-10-03 16:17:13,086::task::850::TaskManager.Task::(_setError) Task=`a102541e-fbe5-46a3-958c-e5f4026cac8c`::Unexpected error
Traceback (most recent call last):
  File /usr/share/vdsm/storage/task.py, line 857, in _run
    return fn(*args, **kargs)
  File /usr/share/vdsm/logUtils.py, line 45, in wrapper
    res = f(*args, **kwargs)
  File /usr/share/vdsm/storage/hsm.py, line 2123, in getAllTasksStatuses
    allTasksStatus = sp.getAllTasksStatuses()
  File /usr/share/vdsm/storage/securable.py, line 66, in wrapper
    raise SecureError()
SecureError

But more relevant, perhaps, is what happens around the time of the migration (16:33-16:34):

Thread-9968::DEBUG::2013-10-03 16:33:38,811::task::1168::TaskManager.Task::(prepare) Task=`6c1d3161-edcd-4344-a32a-4a18f75f5ba3`::finished: {'taskStatus': {'code': 0, 'message': 'Task is initializing', 'taskState': 'running', 'taskResult': '', 'taskID': '0eaac2f3-3d25-4c8c-9738-708aba290404'}}
Thread-9968::DEBUG::2013-10-03 16:33:38,811::task::579::TaskManager.Task::(_updateState) Task=`6c1d3161-edcd-4344-a32a-4a18f75f5ba3`::moving from state preparing - state finished
Thread-9968::DEBUG::2013-10-03 16:33:38,811::resourceManager::939::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-9968::DEBUG::2013-10-03 16:33:38,812::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-9968::DEBUG::2013-10-03 16:33:38,812::task::974::TaskManager.Task::(_decref) Task=`6c1d3161-edcd-4344-a32a-4a18f75f5ba3`::ref 0 aborting False
0eaac2f3-3d25-4c8c-9738-708aba290404::ERROR::2013-10-03 16:33:38,847::task::850::TaskManager.Task::(_setError) Task=`0eaac2f3-3d25-4c8c-9738-708aba290404`::Unexpected error
Traceback (most recent call last):
  File /usr/share/vdsm/storage/task.py, line 857, in _run
    return fn(*args, **kargs)
  File /usr/share/vdsm/storage/task.py, line 318, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File /usr/share/vdsm/storage/sp.py, line 272, in startSpm
    self.masterDomain.acquireHostId(self.id)
  File /usr/share/vdsm/storage/sd.py, line 458, in acquireHostId
    self._clusterLock.acquireHostId(hostId, async)
  File /usr/share/vdsm/storage/clusterlock.py, line 189, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: ('d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291', SanlockException(5, 'Sanlock lockspace add failure', 'Input/output error'))
0eaac2f3-3d25-4c8c-9738-708aba290404::DEBUG::2013-10-03 16:33:38,847::task::869::TaskManager.Task::(_run) Task=`0eaac2f3-3d25-4c8c-9738-708aba290404`::Task._run: 0eaac2f3-3d25-4c8c-9738-708aba290404 () {} failed - stopping task
0eaac2f3-3d25-4c8c-9738-708aba290404::DEBUG::2013-10-03 16:33:38,847::task::1194::TaskManager.Task::(stop) Task=`0eaac2f3-3d25-4c8c-9738-708aba290404`::stopping in state running (force False)

Instead, at the source:

Thread-10402::ERROR::2013-10-03 16:35:03,713::vm::244::vm.Vm::(_recover) vmId=`4147e0d3-19a7-447b-9d88-2ff19365bec0`::migration destination error: Error creating the requested VM
Thread-10402::ERROR::2013-10-03 16:35:03,740::vm::324::vm.Vm::(run) vmId=`4147e0d3-19a7-447b-9d88-2ff19365bec0`::Failed to migrate
Traceback (most recent call last):
  File /usr/share/vdsm/vm.py, line 311, in run
    self._startUnderlyingMigration()
  File /usr/share/vdsm/vm.py, line 347, in _startUnderlyingMigration
    response['status']['message'])
RuntimeError: migration destination error: Error creating the requested VM
Thread-1161::DEBUG::2013-10-03 16:35:04,243::fileSD::238::Storage.Misc.excCmd::(getReadDelay) '/usr/bin/dd iflag=direct if=/rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/metadata bs=4096 count=1' (cwd None)
Thread-1161::DEBUG::2013-10-03 16:35:04,262::fileSD::238::Storage.Misc.excCmd::(getReadDelay) SUCCESS: err = '0+1 records in\n0+1 records out\n512 bytes (512 B) copied, 0.0015976 s, 320 kB/s\n'; rc = 0
Thread-1161::INFO::2013-10-03 16:35:04,269::clusterlock::174::SANLock::(acquireHostId) Acquiring host id for domain d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291 (id: 2)
Thread-1161::DEBUG::2013-10-03
Re: [Users] HA in oVirt 3.3
----- Original Message -----
From: Andres Gonzalez tuc...@gmail.com
To: Itamar Heim ih...@redhat.com
Cc: users@ovirt.org users@ovirt.org
Sent: Thursday, October 3, 2013 3:17:33 PM
Subject: Re: [Users] HA in oVirt 3.3

> On Wed, Oct 2, 2013 at 3:45 PM, Itamar Heim ih...@redhat.com wrote:
> > On 10/02/2013 09:04 PM, emi...@gmail.com wrote:
> > > But, if I want HA: if the host where the VM is running lost connectivity with the manager, the VM should start on another host of the cluster, right?
> > HA is currently controlled by the engine, which needs to know the VM isn't running on the host to launch it on another host. We're looking at relying on sanlock for the engine to be able to launch the VM on another host without fencing the first host.
> Hi, on RHEV 3.x it works as emitor needs; does oVirt have another behavior?

Hi Andres, oVirt behaves just like RHEV. What Itamar refers to are thoughts we have on using storage-based synchronization of the HA information, which would remove the need for fencing. This is still in the thinking phase, but some bits of it are already being implemented as part of the hosted engine feature.

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Quota for VMs created from templates
----- Original Message -----
From: Einav Cohen eco...@redhat.com
To: Mitja Mihelič mitja.mihe...@arnes.si
Cc: users@ovirt.org
Sent: Thursday, October 3, 2013 5:06:36 PM
Subject: Re: [Users] Quota for VMs created from templates

> AFAIK, a user cannot create a VM that is associated with one (or more) quota objects on which he doesn't have consumer permissions. I.e., if the VM was created successfully by the user, and this VM is associated with TemplateQuota and with the quota that has been created for the user (let's call it UserQuota), it means that the user has consumer permissions on both TemplateQuota and UserQuota. If the user doesn't have permissions on one of these quota objects, the fact that the VM has been created successfully sounds like a bug to me.
> Thanks, Einav

Thanks Einav.

Mitja, compute resources (CPU, RAM) are consumed when the VM is running. So yes, you can create as many VMs as you'd like as long as you don't exceed the storage limit, but you can run only as many VMs as your quota permits (CPU and memory limitations). The difference between storage and compute resources is that storage is consumed when the disks are copied, while memory and CPU are consumed when the VM is running.

Thanks, Gilad.

> ----- Original Message -----
> From: Mitja Mihelič mitja.mihe...@arnes.si
> To: users@ovirt.org
> Sent: Thursday, October 3, 2013 9:59:06 AM
> Subject: [Users] Quota for VMs created from templates
>
> Hi! We are running engine version 3.3.0 on CentOS6 and we have come across a problem, possibly a bug. When a user creates a VM from a template, the template's quota is assigned to the VM. Here is the setup:
> - quota is set to Enforced on the data center
> - a quota is created for template purposes (TemplateQuota)
> - a template is created from a sealed VM, with TemplateQuota assigned to it
> - a quota is created for a user, and the user is set as its consumer
> - the user creates a VM from the mentioned template and leaves the quota unchanged
> - the created VM consumes the user's storage quota but does not consume their memory and CPU quota
> This way a user can create and run an arbitrary number of VMs as long as they stay within their storage quota. No errors are reported in the logs.
> Kind regards, Mitja Mihelic
> --
> Mitja Mihelič, ARNES, Tehnološki park 18, p.p. 7, SI-1001 Ljubljana, Slovenia
> tel: +386 1 479 8877, fax: +386 1 479 88 78

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Gluster network info
On 10/03/2013 03:45 PM, Gianluca Cecchi wrote:
> On Thu, Oct 3, 2013 at 2:08 PM, Dan Kenigsberg wrote:
> > > > So the question is: what about the ovirtmgmt network for gluster replication, when the gluster domain is provided by oVirt nodes? I suppose it could be a problem too, couldn't it?
> > > Yes, that is correct; using ovirtmgmt for non-management traffic is not a good idea.
> > > > In case I have a dedicated network for gluster for the nodes, how can I configure it?
> > > Generally there are two ways to go about this:
> > > 1. Add a new network role 'gluster replication' and then adjust the engine code to pass the network with this role as a parameter to the replication verb in VDSM.
> > > 2. In the replication verb in the UI, let the user choose which network to use for the replication.
> > > I am not familiar with the replication verb, so I might be missing something, but otherwise I think that solution 2 is simpler and less invasive than requiring a new network role.
> > Bala, is there a means to select, for each node, which IP address is to be used for replication? AFAICT we use the host fqdn, which most probably resolves to the ovirtmgmt device.
> What is the "replication verb in the UI"? Inside the UI, when I click the "create volume" and then "add Bricks" buttons, I can only select "Server", which in my case is a drop-down menu containing only the IP addresses of my 2 hypervisors. Or do you simply mean the development steps needed to integrate this missing functionality into oVirt, with steps 1. and 2.?

I am probably missing some background around Gluster; could you share what triggers the gluster replication?

> Thanks, Gianluca

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[Users] ovirt 3.3 and ilo2 for power fencing?
Hello, is this combination supported? I'm trying to configure Power Management, but I get this:

Power Management test failed for Host f18ovn03. Parse error: Ignoring unknown option 'option=status' Unable to connect/login to fencing device

What would be the correct setup? Any way to test from the command line too? The hw is an HP blade BL685 G1 with iLO2 mgmt. Thanks in advance.

Gianluca

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] ovirt 3.3 and ilo2 for power fencing?
On Thu, Oct 3, 2013 at 5:58 PM, Gianluca Cecchi wrote:
> Hello, is this combination supported? I'm trying to configure Power Management, but I get this:
> Power Management test failed for Host f18ovn03. Parse error: Ignoring unknown option 'option=status' Unable to connect/login to fencing device
> What would be the correct setup? Any way to test from the command line too? The hw is an HP blade BL685 G1 with iLO2 mgmt. Thanks in advance. Gianluca

I forgot an important thing: I configured from the webgui and the test succeeded, but actually the blade was put into power off, not reboot. And above is the message I got inside the gui.

Gianluca

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] iSCSI domain
Hi Jakub,

What exactly are you trying to achieve here? Are you just trying to change the IP of your storage server, or do you really need to copy all the data too?

----- Original Message -----
From: Jakub Bittner j.bitt...@nbu.cz
To: users@ovirt.org
Sent: Monday, September 30, 2013 12:36:33 PM
Subject: [Users] iSCSI domain

> Hello, I have to change the iSCSI data domain (master) IP address. I am using oVirt 3.3 latest stable, and I wonder if there is another (easier) way to change the master domain IP than to export all VMs and import them into a newly created domain with the new IP. Could anyone please point me to how to do that? Thank you.

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] iSCSI domain
On 3.10.2013 18:16, Allon Mureinik wrote:
> Hi Jakub, what exactly are you trying to achieve here? Are you just trying to change the IP of your storage server, or do you really need to copy all the data too?
> > Hello, I have to change the iSCSI data domain (master) IP address. I am using oVirt 3.3 latest stable, and I wonder if there is another (easier) way to change the master domain IP than to export all VMs and import them into a newly created domain with the new IP. Could anyone please point me to how to do that? Thank you.

Hello Allon, I only need to change the IP address of our storage server. Data and connections won't be changed, just the IP address.

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
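A note for anyone attempting this: as far as I know, the iSCSI connection details live in the engine database (in my recollection of the 3.3 schema, in a table named storage_server_connections with a `connection` column), so one approach people have used is to update them there with the engine stopped and the database backed up first. The sketch below only builds the SQL text; the table and column names are assumptions that you must verify against your own schema before running anything:

```python
def update_connection_sql(old_ip, new_ip):
    """Build an UPDATE statement for the engine DB. The table and
    column names are ASSUMPTIONS about the 3.3 schema; verify them
    (and back up the DB) before executing anything."""
    return ("UPDATE storage_server_connections "
            "SET connection = replace(connection, '{0}', '{1}');"
            .format(old_ip, new_ip))

print(update_connection_sql("192.168.1.10", "192.168.1.20"))
```

Even if the statement is right for your schema, treat this as a last resort and test it on a copy of the database first.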
Re: [Users] Gluster network info
Gianluca Cecchi wrote: Hello, I remember that in the past there could be a problem with high usage of the ovirtmgmt network, because the engine sometimes detects hosts as unresponsive. And this should be the reason for the bandwidth limitation on VM migration, until the dedicated network for it was released. So the question is: what about the ovirtmgmt network for gluster replication when the gluster domain is provided by oVirt nodes? I suppose it could be a problem too, couldn't it? In case I have a dedicated network for gluster for the nodes, how can I configure it? Can I configure in this case the Gluster storage domain from ovirt engine, or does it require to be on this network too?

No, it doesn't. I have a setup where the two gluster servers are using 10G and ovirtmgmt is 1G. We use split-DNS, but the poor man's variety. The engine server (mgmt01) sees both gluster servers on its own network (192.168.x.x), but the gluster servers themselves have host entries for the same hostnames in the 172.19.x.x range; the same holds for the vmhosts. What happens is that you add the storage servers by fqdn, the gluster peer probe commands resolve to the 10G network, and all storage traffic is confined to this network. So management commands arrive on the 192 network and the actual work is done on the storage network. Hope that I haven't confused you beyond hope :-) Joop
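Joop's "poor man's split-DNS" can be sketched as /etc/hosts entries that resolve the same hostnames differently per machine; the names and addresses below are illustrative stand-ins, not his actual values:

```
# /etc/hosts on the engine server (management network, 1G)
192.168.10.58   gluster01.example.com
192.168.10.59   gluster02.example.com

# /etc/hosts on the gluster servers and vmhosts (storage network, 10G)
172.19.10.58    gluster01.example.com
172.19.10.59    gluster02.example.com
```

Because the storage domain and the peer probes are defined by fqdn, each host's own resolver decides which network the traffic takes.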
Re: [Users] ovirt 3.3 and ilo2 for power fencing?
On Thu, Oct 03, 2013 at 06:00:23PM +0200, Gianluca Cecchi wrote: On Thu, Oct 3, 2013 at 5:58 PM, Gianluca Cecchi wrote: Hello, is this combination supported? I'm trying to configure Power Management, but I get this: Power Management test failed for Host f18ovn03. Parse error: Ignoring unknown option 'option=status' Unable to connect/login to fencing device That's odd. status is indeed not an ilo option, it's an action. Do you have anything filled in the Options field of the host's power management dialog? Could you copy the thread starting with fenceNode from your vdsm.log? What would be the correct setup? Any way to test from the command line too? The hw is an HP blade BL685 G1 with iLO2 mgmt. Thanks in advance. Gianluca I forgot an important thing. I configured from the webgui and the test succeeded, but actually the blade was put in power off, not rebooted. When was the host powered off? During the test, or when there was a need for fencing? Dan.
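On the command-line question: the fence agent can be invoked directly, outside oVirt. A sketch, assuming the fence-agents package is installed; the address and credentials below are placeholders, not values from the thread:

```shell
# Hypothetical iLO2 address and credentials -- substitute your own.
ILO_ADDR=10.4.4.100
ILO_USER=Administrator
ILO_PASS=secret

# fence_ilo takes the action via -o (status/on/off/reboot), not as an
# "option=..." pair -- consistent with the parse error quoted above.
# Shown rather than executed here, since it needs real iLO hardware:
CMD="fence_ilo -a $ILO_ADDR -l $ILO_USER -p $ILO_PASS -o status"
echo "$CMD"
```

Running the same command with `-o reboot` would exercise the path oVirt uses for actual fencing.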
Re: [Users] ovirt 3.3 and ilo2 for power fencing?
On Thu, Oct 3, 2013 at 10:46 PM, Dan Kenigsberg wrote: On Thu, Oct 03, 2013 at 06:00:23PM +0200, Gianluca Cecchi wrote: On Thu, Oct 3, 2013 at 5:58 PM, Gianluca Cecchi wrote: Hello, is this combination supported? I'm trying to configure Power Management, but I get this: Power Management test failed for Host f18ovn03. Parse error: Ignoring unknown option 'option=status' Unable to connect/login to fencing device That's odd. status is indeed not an ilo option, it's an action. Do you have anything filled in the Options field of the host's power management dialog? Could you copy the thread starting with fenceNode from your vdsm.log? I don't find that string... neither in f18ovn03 (where I configured fencing and clicked the test button) nor in f18ovn01 (the node that oVirt should have used to test fencing, I suppose). What would be the correct setup? Any way to test from the command line too? The hw is an HP blade BL685 G1 with iLO2 mgmt. Thanks in advance. Gianluca I forgot an important thing. I configured from the webgui and the test succeeded, but actually the blade was put in power off, not rebooted. When was the host powered off? During the test, or when there was a need for fencing? Dan.

Starting point was f18ovn03 and f18ovn01 configured without power mgmt. I went into edit f18ovn03 and configured power mgmt. After clicking test, after several seconds it returned successful in the power mgmt window. But when I clicked the OK button to save, I saw the error in the events tray of the gui, and the host in reboot state in the GUI with an hourglass: is it expected behaviour that when you configure power mgmt for a host, it is rebooted? The host remained in reboot state with the hourglass, so I went to see the iLO console. And I saw that the server was in power off state. I waited a little but nothing happened. I noticed that if I powered on the server, after a few seconds it was powered off again, I presume by oVirt...
Gianluca
[Users] oVirt 3.3 gluster volume active but unable to activate domain
One engine with f19 and two nodes with f19. All with the ovirt stable repo for f19. DC defined as GlusterFS. The volume is ok, but I can't activate the domain. Relevant logs when I click activate are below.

On engine:

2013-10-03 23:05:10,332 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServicesListVDSCommand] (pool-6-thread-50) START, GlusterServicesListVDSCommand(HostName = f18ovn03, HostId = b67bcfd4-f868-49d5-8704-4936ee922249), log id: 5704c54f
2013-10-03 23:05:12,121 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-83) hostFromVds::selectedVds - f18ovn01, spmStatus Free, storage pool Gluster
2013-10-03 23:05:12,142 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-83) SpmStatus on vds 80188ccc-83b2-4bc8-9385-8d07f7458a3c: Free
2013-10-03 23:05:12,144 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-83) starting spm on vds f18ovn01, storage pool Gluster, prevId 1, LVER 0
2013-10-03 23:05:12,148 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServicesListVDSCommand] (pool-6-thread-46) FINISH, GlusterServicesListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@955283ba, org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@1ef87397, org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@c1b996b6, org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@30199726, org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@606c4879, org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@2b860d38, org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@f69fd1f7], log id: 4a1b4d33
2013-10-03 23:05:12,159 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-83) START, SpmStartVDSCommand(HostName = f18ovn01, HostId = 80188ccc-83b2-4bc8-9385-8d07f7458a3c, storagePoolId = eb679feb-4da2-4fd0-a185-abbe459ffa70, prevId=1, prevLVER=0, storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=false), log id: 62f11f2d
2013-10-03 23:05:12,169 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-83) spmStart polling started: taskId = ab9f2f84-f89b-44e9-b508-a904420635f4
2013-10-03 23:05:12,232 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServicesListVDSCommand] (pool-6-thread-50) FINISH, GlusterServicesListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@b624c19b, org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@3fcab178, org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@e28bd497, org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@50ebd507, org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@813e865a, org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@4c584b19, org.ovirt.engine.core.common.businessentities.gluster.GlusterServerService@17720fd8], log id: 5704c54f
2013-10-03 23:05:12,512 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler_Worker-6) START, GlusterVolumesListVDSCommand(HostName = f18ovn01, HostId = 80188ccc-83b2-4bc8-9385-8d07f7458a3c), log id: 39a3f45d
2013-10-03 23:05:12,595 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler_Worker-6) FINISH, GlusterVolumesListVDSCommand, return: {97873e57-0cc2-4740-ae38-186a8dd94718=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@a82da199, d055b38c-2754-4e53-af5c-69cc0b8bf31c=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@ef0c0180}, log id: 39a3f45d
2013-10-03 23:05:14,182 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-83) Failed in HSMGetTaskStatusVDS method
2013-10-03 23:05:14,184 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-83) Error code AcquireHostIdFailure and error message VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id
2013-10-03 23:05:14,186 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-83) spmStart polling ended: taskId = ab9f2f84-f89b-44e9-b508-a904420635f4 task status = finished
2013-10-03 23:05:14,188 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-83) Start SPM Task failed - result: cleanSuccess, message: VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id
2013-10-03 23:05:14,214 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-83) spmStart polling ended, spm
Re: [Users] ovirt 3.3 and ilo2 for power fencing?
On Thu, Oct 03, 2013 at 10:58:41PM +0200, Gianluca Cecchi wrote: On Thu, Oct 3, 2013 at 10:46 PM, Dan Kenigsberg wrote: On Thu, Oct 03, 2013 at 06:00:23PM +0200, Gianluca Cecchi wrote: On Thu, Oct 3, 2013 at 5:58 PM, Gianluca Cecchi wrote: Hello, is this combination supported? I'm trying to configure Power Management, but I get this: Power Management test failed for Host f18ovn03. Parse error: Ignoring unknown option 'option=status' Unable to connect/login to fencing device That's odd. status is indeed not an ilo option, it's an action. Do you have anything filled in the Options field of the host's power management dialog? Could you copy the thread starting with fenceNode from your vdsm.log? I don't find that string... neither in f18ovn03 (where I configured fencing and clicked the test button) nor in f18ovn01 (the node that oVirt should have used to test fencing, I suppose). fenceNode should be sought on the fencing host (f18ovn01), not the victim. Do you have any other host in the datacenter? Maybe your vdsm.log has been log-rotated? Could you look at your engine.log for hints? What would be the correct setup? Any way to test from the command line too? The hw is an HP blade BL685 G1 with iLO2 mgmt. Thanks in advance. Gianluca I forgot an important thing. I configured from the webgui and the test succeeded, but actually the blade was put in power off, not rebooted. When was the host powered off? During the test, or when there was a need for fencing? Dan. Starting point was f18ovn03 and f18ovn01 configured without power mgmt. I went into edit f18ovn03 and configured power mgmt. After clicking test, after several seconds it returned successful in the power mgmt window. But when I clicked the OK button to save, I saw the error in the events tray of the gui, and the host in reboot state in the GUI with an hourglass: is it expected behaviour that when you configure power mgmt for a host, it is rebooted? Not at all. The test ought to be harmless, and never cause an unintentional reboot.
The host remained in reboot state with the hourglass, so I went to see the iLO console. And I saw that the server was in power off state. I waited a little but nothing happened. I noticed that if I powered on the server, after a few seconds it was powered off again, I presume by oVirt... Gianluca
Re: [Users] ovirt 3.3 and ilo2 for power fencing?
On Thu, Oct 3, 2013 at 11:24 PM, Dan Kenigsberg wrote: fenceNode should be sought on the fencing host (f18ovn01), not the victim. Do you have any other host in the datacenter? Maybe your vdsm.log has been log-rotated? Could you look at your engine.log for hints? Nodes and engine were configured just today. I can send full logs for engine and vdsm hosts, as they have not been rotated yet. When was the host powered off? During the test, or when there was a need for fencing? After clicking OK in the window where you configure power mgmt. Dunno about an eventual need for fencing, but it could be, as I'm having problems right now activating the storage domain (see the other thread). Not at all. The test ought to be harmless, and never cause an unintentional reboot. OK for the test phase inside the configuration window. But what about when you then confirm the change and click the OK button? Is anything expected to happen?
Re: [Users] oVirt 3.3 gluster volume active but unable to activate domain
In /var/log/messages on the vdsm hosts:

Oct 3 23:05:57 f18ovn03 sanlock[1146]: 2013-10-03 23:05:57+0200 16624 [13543]: read_sectors delta_leader offset 512 rv -5 /rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids
Oct 3 23:05:58 f18ovn03 sanlock[1146]: 2013-10-03 23:05:58+0200 16625 [1155]: s2142 add_lockspace fail result -5
Oct 3 23:04:24 f18ovn01 sanlock[1166]: 2013-10-03 23:04:24+0200 16154 [1172]: s2688 add_lockspace fail result -5
Oct 3 23:04:24 f18ovn01 vdsm TaskManager.Task ERROR Task=`bd6b0848-4550-483e-9002-e3051a2e1074`::Unexpected error
Oct 3 23:04:25 f18ovn01 vdsm TaskManager.Task ERROR Task=`f2ac595d-cc9d-4125-a7be-7b8706cc9ee3`::Unexpected error
Oct 3 23:04:25 f18ovn01 vdsm TaskManager.Task ERROR Task=`6c03be35-57f1-405e-b127-bd708defad67`::Unexpected error
Oct 3 23:04:26 f18ovn01 sanlock[1166]: 2013-10-03 23:04:26+0200 16157 [21348]: read_sectors delta_leader offset 0 rv -5 /rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids
Oct 3 23:04:27 f18ovn01 sanlock[1166]: 2013-10-03 23:04:27+0200 16158 [1172]: s2689 add_lockspace fail result -5

But the origin is possibly a split brain detected at the gluster level. I see it in rhev-data-center-mnt-glusterSD-f18ovn01.mydomain:gvdata.log this afternoon, around the time I had installed the first guest and ran a shutdown and a power on. See: https://docs.google.com/file/d/0BwoPbcrMv8mvNHNOVlNrOFFabjQ/edit?usp=sharing Why are the gluster logs two hours behind? UTC? Any way to set them to the current system time? Gianluca
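On the timestamp question: GlusterFS writes its log timestamps in UTC regardless of the system timezone, which would explain a two-hour offset under CEST (the syslog lines above show +0200). The gap can be bridged with date(1); the timezone below is an assumption inferred from that offset:

```shell
# Convert a UTC gluster log timestamp to local time (CEST = UTC+2 in Oct 2013).
TZ=Europe/Rome date -d '2013-10-03 22:06:33 UTC' '+%Y-%m-%d %H:%M:%S'
# -> 2013-10-04 00:06:33
```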
Re: [Users] oVirt 3.3 gluster volume active but unable to activate domain
And in fact, after solving the split brain, the gluster domain automatically activated. From rhev-data-center-mnt-glusterSD-f18ovn01.mydomain:gvdata.log under /var/log/glusterfs I found that the ids file was the one not in sync. As the VM only started on f18ovn03 and I was not able to migrate it to f18ovn01, I decided to delete the file from f18ovn01. BTW: what does dom_md/ids contain?

[2013-10-03 22:06:33.543730] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-gvdata-replicate-0: Unable to self-heal contents of '/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 2 ] [ 2 0 ] ]
[2013-10-03 22:06:33.544013] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-gvdata-replicate-0: background data self-heal failed on /d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids
[2013-10-03 22:06:33.544522] W [afr-open.c:213:afr_open] 0-gvdata-replicate-0: failed to open as split brain seen, returning EIO
[2013-10-03 22:06:33.544603] W [page.c:991:__ioc_page_error] 0-gvdata-io-cache: page error for page = 0x7f4b80004910 waitq = 0x7f4b8001da60
[2013-10-03 22:06:33.544635] W [fuse-bridge.c:2049:fuse_readv_cbk] 0-glusterfs-fuse: 132995: READ = -1 (Input/output error)
[2013-10-03 22:06:33.545070] W [client-lk.c:367:delete_granted_locks_owner] 0-gvdata-client-0: fdctx not valid
[2013-10-03 22:06:33.545118] W [client-lk.c:367:delete_granted_locks_owner] 0-gvdata-client-1: fdctx not valid

I found that gluster creates hard links, so you have to delete all copies of the conflicting file from the brick directory of the node you choose to delete from.
Thanks very much to this link: http://inuits.eu/blog/fixing-glusterfs-split-brain So these were my steps.

Locate the hard links:

[root@f18ovn01 d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291]# find /gluster/DATA_GLUSTER/brick1/ -samefile /gluster/DATA_GLUSTER/brick1/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids -print
/gluster/DATA_GLUSTER/brick1/.glusterfs/ae/27/ae27eb8d-c653-4cc0-a054-ea376ce8097d
/gluster/DATA_GLUSTER/brick1/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids

and then delete both:

[root@f18ovn01 d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291]# find /gluster/DATA_GLUSTER/brick1/ -samefile /gluster/DATA_GLUSTER/brick1/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids -print -delete
/gluster/DATA_GLUSTER/brick1/.glusterfs/ae/27/ae27eb8d-c653-4cc0-a054-ea376ce8097d
/gluster/DATA_GLUSTER/brick1/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids

And after this step there were no more E lines in the gluster log, and the gluster domain was automatically activated by the engine. Gianluca
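The hard-link detail is easy to reproduce locally: GlusterFS keeps a second link to every brick file under the brick's .glusterfs directory, so deleting only the visible path leaves the split-brain data alive under the other name. A throwaway sketch (the paths mimic a brick layout but are just a temp directory, not a real brick):

```shell
#!/bin/sh
# Simulate a brick with a gluster-style hard link under .glusterfs
brick=$(mktemp -d)
mkdir -p "$brick/dom_md" "$brick/.glusterfs/ae/27"
echo data > "$brick/dom_md/ids"
ln "$brick/dom_md/ids" "$brick/.glusterfs/ae/27/ae27eb8d"

# find -samefile lists every name pointing at the same inode ...
find "$brick" -samefile "$brick/dom_md/ids"

# ... and -delete removes them all, as in the steps above
find "$brick" -samefile "$brick/dom_md/ids" -print -delete
rm -rf "$brick"
```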
Re: [Users] Migration issues with ovirt 3.3
On Wed, Oct 2, 2013 at 12:07 AM, Jason Brooks wrote: I'm having this issue on my ovirt 3.3 setup (two node, one is AIO, GlusterFS storage, both on F19) as well. Jason

I uploaded all my logs for the day to bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=1007980 I can reproduce the problem, and the migrate phase is apparently what generates files to heal in the gluster volume. The VM is running on f18ovn03 before the migration:

[root@f18ovn01 vdsm]# gluster volume heal gvdata info
Gathering Heal info on volume gvdata has been successful
Brick 10.4.4.58:/gluster/DATA_GLUSTER/brick1
Number of entries: 0
Brick 10.4.4.59:/gluster/DATA_GLUSTER/brick1
Number of entries: 0

Start the migration, which fails with the iface error, and now you see:

[root@f18ovn01 vdsm]# gluster volume heal gvdata info
Gathering Heal info on volume gvdata has been successful
Brick 10.4.4.58:/gluster/DATA_GLUSTER/brick1
Number of entries: 1
/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids
Brick 10.4.4.59:/gluster/DATA_GLUSTER/brick1
Number of entries: 1
/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids

While on f18ovn03:

[root@f18ovn03 vdsm]# gluster volume heal gvdata info
Gathering Heal info on volume gvdata has been successful
Brick 10.4.4.58:/gluster/DATA_GLUSTER/brick1
Number of entries: 0
Brick 10.4.4.59:/gluster/DATA_GLUSTER/brick1
Number of entries: 0

Gianluca