Hi Elad, why did you install vdsm-hook-allocate_net? Adding Dan, as I think the hook is not supposed to fail this badly in any case.
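For reference, a minimal sketch of how the hook could guard against the
missing custom property (the function and constant names are taken from the
traceback below; the .get() fallback and the early exit are an assumption
about how the hook could behave, not the shipped code):

    import os
    import sys

    AVAIL_NETS_KEY = 'equivnets'  # custom property listing candidate networks

    def _parse_nets():
        # os.environ[AVAIL_NETS_KEY] raises KeyError when the VM has no
        # 'equivnets' custom property set; .get() lets the hook detect that
        # case instead of failing the whole VM start with rc=2.
        return os.environ.get(AVAIL_NETS_KEY, '').split()

    def main():
        available_nets = _parse_nets()
        if not available_nets:
            # Nothing to allocate: exit cleanly and leave the device XML
            # untouched rather than aborting VM start.
            sys.exit(0)
        # ... proceed with allocate_random_network(device_xml) as the real
        # hook does.

    if __name__ == '__main__':
        main()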
Thanks,
michal

> On 5 May 2018, at 19:22, Elad Ben Aharon <[email protected]> wrote:
>
> Start VM fails on:
>
> 2018-05-05 17:53:27,399+0300 INFO (vm/e6ce66ce) [virt.vm]
> (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') drive 'vda' path:
> 'dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211'
> ->
> u'dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211'
> (storagexml:334)
> 2018-05-05 17:53:27,888+0300 INFO (jsonrpc/1) [vdsm.api] START
> getSpmStatus(spUUID='940fe6f3-b0c6-4d0c-a921-198e7819c1cc', options=None)
> from=::ffff:10.35.161.127,53512,
> task_id=c70ace39-dbfe-4f5c-ae49-a1e3a82c2758 (api:46)
> 2018-05-05 17:53:27,909+0300 INFO (vm/e6ce66ce) [root]
> /usr/libexec/vdsm/hooks/before_device_create/10_allocate_net: rc=2 err=vm net
> allocation hook: [unexpected error]: Traceback (most recent call last):
>   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>
>     main()
>   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main
>     allocate_random_network(device_xml)
>   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network
>     net = _get_random_network()
>   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network
>     available_nets = _parse_nets()
>   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets
>     return [net for net in os.environ[AVAIL_NETS_KEY].split()]
>   File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__
>     raise KeyError(key)
> KeyError: 'equivnets'
>
> (hooks:110)
> 2018-05-05 17:53:27,915+0300 ERROR (vm/e6ce66ce) [virt.vm]
> (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') The vm start process failed (vm:943)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, in _startUnderlyingVm
>     self._run()
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2861, in _run
>     domxml = hooks.before_vm_start(self._buildDomainXML(),
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2254, in _buildDomainXML
>     dom, self.id, self._custom['custom'])
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_preprocess.py", line 240, in replace_device_xml_with_hooks_xml
>     dev_custom)
>   File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 134, in before_device_create
>     params=customProperties)
>   File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 120, in _runHooksDir
>     raise exception.HookError(err)
> HookError: Hook Error: ('vm net allocation hook: [unexpected error]:
> Traceback (most recent call last):\n  File
> "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in
> <module>\n    main()\n  File
> "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in
> main\n    allocate_random_network(device_xml)\n  File
> "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in
> allocate_random_network\n    net = _get_random_network()\n  File
> "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in
> _get_random_network\n    available_nets = _parse_nets()\n  File
> "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in
> _parse_nets\n    return [net for net in
> os.environ[AVAIL_NETS_KEY].split()]\n  File
> "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__\n    raise
> KeyError(key)\nKeyError: \'equivnets\'\n\n\n',)
>
> Hence, the success rate was 28%, against 100% when running with the d/s
> build. If needed, I'll compare against the latest master, but I think you
> get the picture with d/s.
>
> vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64
> libvirt-3.9.0-14.el7_5.3.x86_64
> qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64
> kernel 3.10.0-862.el7.x86_64
> rhel7.5
>
> Logs attached
>
> On Sat, May 5, 2018 at 1:26 PM, Elad Ben Aharon <[email protected]> wrote:
> Never mind, found a gluster 3.12 repo and managed to install vdsm.
>
> On Sat, May 5, 2018 at 1:12 PM, Elad Ben Aharon <[email protected]> wrote:
> No, vdsm requires it:
>
> Error: Package: vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64
>        (/vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64)
>        Requires: glusterfs-fuse >= 3.12
>        Installed: glusterfs-fuse-3.8.4-54.8.el7.x86_64 (@rhv-4.2.3)
>
> Therefore, the vdsm package installation is skipped upon force install.
>
> On Sat, May 5, 2018 at 11:42 AM, Michal Skrivanek <[email protected]> wrote:
>
>> On 5 May 2018, at 00:38, Elad Ben Aharon <[email protected]> wrote:
>>
>> Hi guys,
>>
>> The vdsm build from the patch requires glusterfs-fuse >= 3.12, while the
>> latest 4.2.3-5 d/s build requires 3.8.4 (3.4.0.59rhs-1.el7).
>
> because it is still oVirt, not a downstream build. We can't really do
> downstream builds with unmerged changes :/
>
>> Trying to get this gluster-fuse build, so far no luck.
>> Is this requirement intentional?
>
> it should work regardless, I guess you can force install it without the
> dependency
>
>> On Fri, May 4, 2018 at 2:38 PM, Michal Skrivanek <[email protected]> wrote:
>> Hi Elad,
>> to make it easier to compare, Martin backported the change to 4.2, so it is
>> actually comparable with a run without that patch. Would you please try
>> that out?
>> It would be best to have 4.2 upstream and this [1] run to really minimize
>> the noise.
>>
>> Thanks,
>> michal
>>
>> [1] http://jenkins.ovirt.org/job/vdsm_4.2_build-artifacts-on-demand-el7-x86_64/28/
>>
>>> On 27 Apr 2018, at 09:23, Martin Polednik <[email protected]> wrote:
>>>
>>> On 24/04/18 00:37 +0300, Elad Ben Aharon wrote:
>>>> I will update with the results of the next tier1 execution on latest 4.2.3
>>>
>>> That isn't master but an old branch, though. Could you run it against
>>> *current* VDSM master?
>>>
>>>> On Mon, Apr 23, 2018 at 3:56 PM, Martin Polednik <[email protected]> wrote:
>>>>
>>>>> On 23/04/18 01:23 +0300, Elad Ben Aharon wrote:
>>>>>
>>>>>> Hi, I've triggered another execution [1] due to some issues I saw in
>>>>>> the first which are not related to the patch.
>>>>>>
>>>>>> The success rate is 78%, which is low compared to tier1 executions
>>>>>> with code from downstream builds (95-100% success rates) [2].
>>>>>
>>>>> Could you run the current master (without the dynamic_ownership patch)
>>>>> so that we have a viable comparison?
>>>>>
>>>>>> From what I could see so far, there is an issue with move and copy
>>>>>> operations to and from Gluster domains. For example [3].
>>>>>>
>>>>>> The logs are attached.
>>>>>>
>>>>>> [1] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/testReport/
>>>>>>
>>>>>> [2] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/
>>>>>>
>>>>>> [3]
>>>>>> 2018-04-22 13:06:28,316+0300 INFO (jsonrpc/7) [vdsm.api] FINISH
>>>>>> deleteImage error=Image does not exist in domain:
>>>>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33,
>>>>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'
>>>>>> from=::ffff:10.35.161.182,40936,
>>>>>> flow_id=disks_syncAction_ba6b2630-5976-4935,
>>>>>> task_id=3d5f2a8a-881c-409e-93e9-aaa643c10e42 (api:51)
>>>>>> 2018-04-22 13:06:28,317+0300 ERROR (jsonrpc/7) [storage.TaskManager.Task]
>>>>>> (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') Unexpected error (task:875)
>>>>>> Traceback (most recent call last):
>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
>>>>>>     return fn(*args, **kargs)
>>>>>>   File "<string>", line 2, in deleteImage
>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method
>>>>>>     ret = func(*args, **kwargs)
>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1503, in deleteImage
>>>>>>     raise se.ImageDoesNotExistInSD(imgUUID, sdUUID)
>>>>>> ImageDoesNotExistInSD: Image does not exist in domain:
>>>>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33,
>>>>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'
>>>>>>
>>>>>> 2018-04-22 13:06:28,317+0300 INFO (jsonrpc/7) [storage.TaskManager.Task]
>>>>>> (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') aborting: Task is aborted:
>>>>>> "Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33,
>>>>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'" - code 268 (task:1181)
>>>>>> 2018-04-22 13:06:28,318+0300 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH
>>>>>> deleteImage error=Image does not exist in domain:
>>>>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33,
>>>>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' (dispatcher:82)
>>>>>>
>>>>>> On Thu, Apr 19, 2018 at 5:34 PM, Elad Ben Aharon <[email protected]> wrote:
>>>>>>
>>>>>>> Triggered a sanity tier1 execution [1] using [2], which covers all
>>>>>>> the requested areas, on iSCSI, NFS and Gluster.
>>>>>>> I'll update with the results.
>>>>>>>
>>>>>>> [1] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/4.2_dev/job/rhv-4.2-ge-flow-storage/1161/
>>>>>>>
>>>>>>> [2] https://gerrit.ovirt.org/#/c/89830/
>>>>>>> vdsm-4.30.0-291.git77aef9a.el7.x86_64
>>>>>>>
>>>>>>> On Thu, Apr 19, 2018 at 3:07 PM, Martin Polednik <[email protected]> wrote:
>>>>>>>
>>>>>>>> On 19/04/18 14:54 +0300, Elad Ben Aharon wrote:
>>>>>>>>
>>>>>>>>> Hi Martin,
>>>>>>>>>
>>>>>>>>> I see [1] requires a rebase, can you please take care?
>>>>>>>>
>>>>>>>> Should be rebased.
>>>>>>>>
>>>>>>>>> At the moment, our automation is stable only on iSCSI, NFS, Gluster
>>>>>>>>> and FC.
>>>>>>>>> Ceph is not supported, and Cinder will be stabilized soon; AFAIR,
>>>>>>>>> it's not stable enough at the moment.
>>>>>>>>
>>>>>>>> That is still pretty good.
>>>>>>>>
>>>>>>>>> [1] https://gerrit.ovirt.org/#/c/89830/
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> On Wed, Apr 18, 2018 at 2:17 PM, Martin Polednik <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> On 18/04/18 11:37 +0300, Elad Ben Aharon wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi, sorry if I misunderstood; I waited for more input regarding
>>>>>>>>>>> what areas have to be tested here.
>>>>>>>>>>
>>>>>>>>>> I'd say that you have quite a bit of freedom in this regard.
>>>>>>>>>> GlusterFS should be covered by Dennis, so iSCSI/NFS/ceph/cinder
>>>>>>>>>> with some suite that covers basic operations (start & stop VM,
>>>>>>>>>> migrate it), snapshots and merging them, and whatever else would
>>>>>>>>>> be important for storage sanity.
>>>>>>>>>>
>>>>>>>>>> mpolednik
>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 18, 2018 at 11:16 AM, Martin Polednik <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On 11/04/18 16:52 +0300, Elad Ben Aharon wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> We can test this on iSCSI, NFS and GlusterFS. As for ceph and
>>>>>>>>>>>>> cinder, we will have to check, since usually we don't execute
>>>>>>>>>>>>> our automation on them.
>>>>>>>>>>>>
>>>>>>>>>>>> Any update on this? I believe the gluster tests were successful;
>>>>>>>>>>>> OST passes fine and unit tests pass fine, which makes the storage
>>>>>>>>>>>> backend tests the last required piece.
>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 4:38 PM, Raz Tamir <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> +Elad
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 4:28 PM, Dan Kenigsberg <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 12:34 PM, Nir Soffer <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 12:31 PM Eyal Edri <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please make sure to run as many OST suites on this patch as
>>>>>>>>>>>>>>>>> possible before merging (using 'ci please build')
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> But note that OST is not a way to verify the patch.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Such changes require testing with all storage types we support.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Nir
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 4:09 PM, Martin Polednik <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hey,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I've created a patch[0] that is finally able to activate
>>>>>>>>>>>>>>>>>> libvirt's dynamic_ownership for VDSM while not negatively
>>>>>>>>>>>>>>>>>> affecting functionality of our storage code.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> That of course comes with quite a bit of code removal,
>>>>>>>>>>>>>>>>>> mostly in the area of host devices, hwrng and anything that
>>>>>>>>>>>>>>>>>> touches devices; a bunch of test changes and one XML
>>>>>>>>>>>>>>>>>> generation caveat (storage is handled by VDSM, therefore
>>>>>>>>>>>>>>>>>> disk relabelling needs to be disabled on the VDSM level).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Because of the scope of the patch, I welcome
>>>>>>>>>>>>>>>>>> storage/virt/network people to review the code and consider
>>>>>>>>>>>>>>>>>> the implications this change has on current/future features.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [0] https://gerrit.ovirt.org/#/c/89830/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In particular: dynamic_ownership was set to 0 prehistorically
>>>>>>>>>>>>>>> (as part of https://bugzilla.redhat.com/show_bug.cgi?id=554961)
>>>>>>>>>>>>>>> because libvirt, running as root, was not able to play properly
>>>>>>>>>>>>>>> with root-squash nfs mounts.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Have you attempted this use case?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I join Nir's request to run this with storage QE.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Raz Tamir
>>>>>>>>>>>>>> Manager, RHV QE
>
> <logs.tar.gz>
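On the XML generation caveat mentioned in the quoted thread (disk
relabelling disabled on the VDSM level): with dynamic_ownership enabled,
libvirt changes ownership of disk paths at VM start unless the disk carries
a seclabel opting out. A minimal sketch of that opt-out, assuming
ElementTree-style XML generation; the helper name disable_dac_relabel and
the sample disk element are hypothetical, and the actual implementation
lives in the gerrit change [0] above:

    import xml.etree.ElementTree as ET

    def disable_dac_relabel(disk_elem):
        # With dynamic_ownership=1, libvirt would chown this disk's path at
        # VM start; <seclabel model='dac' relabel='no'/> tells it to leave
        # ownership alone, since VDSM manages storage permissions itself.
        ET.SubElement(disk_elem, 'seclabel',
                      {'model': 'dac', 'relabel': 'no'})

    # A dummy disk element for demonstration only.
    disk = ET.fromstring(
        "<disk type='block' device='disk'>"
        "<source dev='/rhev/data-center/mnt/blockSD/sd-uuid/images/img/vol'/>"
        "</disk>")
    disable_dac_relabel(disk)
    print(ET.tostring(disk).decode())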
