Can you enable vdsm debug-level logs[1] so we can see the exact command?

[1] - https://www.ovirt.org/develop/developer-guide/vdsm/log-files.html
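For reference, the usual way to get debug logs is to raise the level in vdsm's logger configuration and restart the daemon. A minimal sketch, shown against a trimmed sample stanza so it is safe to run anywhere (on a real host the file is /etc/vdsm/logger.conf and the stock file has more sections):

```shell
# Work on a sample copy; on a host, edit /etc/vdsm/logger.conf in place.
cat > /tmp/logger.conf <<'EOF'
[logger_root]
level=INFO
handlers=syslog,logfile
EOF

# Flip the root logger from INFO to DEBUG.
sed -i 's/^level=INFO$/level=DEBUG/' /tmp/logger.conf
grep '^level=' /tmp/logger.conf

# On the host, pick up the change with: systemctl restart vdsmd
```

After the restart, /var/log/vdsm/vdsm.log records the exact qemu-img command line vdsm spawns.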
On Sat, Apr 20, 2019 at 1:42 PM Callum Smith <[email protected]> wrote:
>
> Dear Benny,
>
> This is via VDSM, but a manual one fails too.
>
> Background: cloning of an image fails in any context, whether creating from
> template, using Clone VM, etc. Tested on 4.3+ against multiple NFS servers.
> Even tried allowing root access and not; neither fixes it.
>
> gdb is blank and has no relevant output.
>
> /var/log/messages
> Appears to log this regularly, about every 5 minutes:
> Apr 20 10:15:42 virthyp03 nfsidmap[57709]: nss_getpwnam: name 'nobody' does
> not map into domain 'virt.in.bmrc.ox.ac.uk'
> Apr 20 10:15:42 virthyp03 nfsidmap[57713]: nss_name_to_gid: name 'nobody'
> does not map into domain 'virt.in.bmrc.ox.ac.uk'
>
> The relevant area of /var/log/vdsm/vdsm.log:
> 2019-04-20 10:17:19,644+0000 INFO (jsonrpc/5) [vdsm.api] START
> copyImage(sdUUID=u'0e01f014-530b-4067-aa1d-4e9378626a9d',
> spUUID=u'df02d494-56b9-11e9-a05b-00163e4d2a92', vmUUID='',
> srcImgUUID=u'6597eede-9fa0-4451-84fc-9f9c070cb5f3',
> srcVolUUID=u'765fa48b-2e77-4637-b4ca-e1affcd71e48',
> dstImgUUID=u'e1c182ae-f25d-464c-b557-93088e894452',
> dstVolUUID=u'a1157ad0-44a8-4073-a20c-468978973f4f', description=u'',
> dstSdUUID=u'0e01f014-530b-4067-aa1d-4e9378626a9d', volType=8, volFormat=5,
> preallocate=2, postZero=u'false', force=u'false', discard=False)
> from=::ffff:10.141.31.240,44298,
> flow_id=82e5a31b-5010-4e50-a333-038b16738ecb,
> task_id=d6bf22cf-1ea3-419d-8248-6c8842537a2e (api:48)
> 2019-04-20 10:17:19,649+0000 INFO (jsonrpc/5) [storage.Image] image
> 6597eede-9fa0-4451-84fc-9f9c070cb5f3 in domain
> 0e01f014-530b-4067-aa1d-4e9378626a9d has vollist
> [u'765fa48b-2e77-4637-b4ca-e1affcd71e48'] (image:313)
> 2019-04-20 10:17:19,657+0000 INFO (jsonrpc/5) [storage.Image] Current
> chain=765fa48b-2e77-4637-b4ca-e1affcd71e48 (top) (image:702)
> 2019-04-20 10:17:19,658+0000 INFO (jsonrpc/5) [IOProcessClient] (Global)
> Starting client (__init__:308)
> 2019-04-20 10:17:19,668+0000 INFO (ioprocess/57971) [IOProcess] (Global)
> Starting ioprocess (__init__:434)
> 2019-04-20 10:17:19,679+0000 INFO (jsonrpc/5) [vdsm.api] FINISH copyImage
> return=None from=::ffff:10.141.31.240,44298,
> flow_id=82e5a31b-5010-4e50-a333-038b16738ecb,
> task_id=d6bf22cf-1ea3-419d-8248-6c8842537a2e (api:54)
> 2019-04-20 10:17:19,696+0000 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC
> call Volume.copy succeeded in 0.05 seconds (__init__:312)
> 2019-04-20 10:17:19,697+0000 INFO (tasks/7)
> [storage.ThreadPool.WorkerThread] START task
> d6bf22cf-1ea3-419d-8248-6c8842537a2e (cmd=<bound method Task.commit of
> <vdsm.storage.task.Task instance at 0x7f9f40033c68>>, args=None)
> (threadPool:208)
> 2019-04-20 10:17:19,713+0000 INFO (tasks/7) [storage.Image]
> sdUUID=0e01f014-530b-4067-aa1d-4e9378626a9d vmUUID=
> srcImgUUID=6597eede-9fa0-4451-84fc-9f9c070cb5f3
> srcVolUUID=765fa48b-2e77-4637-b4ca-e1affcd71e48
> dstImgUUID=e1c182ae-f25d-464c-b557-93088e894452
> dstVolUUID=a1157ad0-44a8-4073-a20c-468978973f4f
> dstSdUUID=0e01f014-530b-4067-aa1d-4e9378626a9d volType=8 volFormat=RAW
> preallocate=SPARSE force=False postZero=False discard=False (image:724)
> 2019-04-20 10:17:19,719+0000 INFO (tasks/7) [storage.VolumeManifest] Volume:
> preparing volume
> 0e01f014-530b-4067-aa1d-4e9378626a9d/765fa48b-2e77-4637-b4ca-e1affcd71e48
> (volume:567)
> 2019-04-20 10:17:19,722+0000 INFO (tasks/7) [storage.Image] copy source
> 0e01f014-530b-4067-aa1d-4e9378626a9d:6597eede-9fa0-4451-84fc-9f9c070cb5f3:765fa48b-2e77-4637-b4ca-e1affcd71e48
> size 104857600 blocks destination
> 0e01f014-530b-4067-aa1d-4e9378626a9d:e1c182ae-f25d-464c-b557-93088e894452:a1157ad0-44a8-4073-a20c-468978973f4f
> allocating 104857600 blocks (image:767)
> 2019-04-20 10:17:19,722+0000 INFO (tasks/7) [storage.Image] image
> e1c182ae-f25d-464c-b557-93088e894452 in domain
> 0e01f014-530b-4067-aa1d-4e9378626a9d has vollist [] (image:313)
> 2019-04-20 10:17:19,723+0000 INFO (tasks/7) [storage.StorageDomain] Create
> placeholder
> /rhev/data-center/mnt/10.141.15.248:_export_instruct_vm__storage/0e01f014-530b-4067-aa1d-4e9378626a9d/images/e1c182ae-f25d-464c-b557-93088e894452
> for image's volumes (sd:1288)
> 2019-04-20 10:17:19,731+0000 INFO (tasks/7) [storage.Volume] Creating volume
> a1157ad0-44a8-4073-a20c-468978973f4f (volume:1183)
> 2019-04-20 10:17:19,754+0000 INFO (tasks/7) [storage.Volume] Request to
> create RAW volume
> /rhev/data-center/mnt/10.141.15.248:_export_instruct_vm__storage/0e01f014-530b-4067-aa1d-4e9378626a9d/images/e1c182ae-f25d-464c-b557-93088e894452/a1157ad0-44a8-4073-a20c-468978973f4f
> with size = 20480 blocks (fileVolume:463)
> 2019-04-20 10:17:19,754+0000 INFO (tasks/7) [storage.Volume] Changing volume
> u'/rhev/data-center/mnt/10.141.15.248:_export_instruct_vm__storage/0e01f014-530b-4067-aa1d-4e9378626a9d/images/e1c182ae-f25d-464c-b557-93088e894452/a1157ad0-44a8-4073-a20c-468978973f4f'
> permission to 0660 (fileVolume:480)
> 2019-04-20 10:17:19,800+0000 INFO (tasks/7) [storage.VolumeManifest] Volume:
> preparing volume
> 0e01f014-530b-4067-aa1d-4e9378626a9d/a1157ad0-44a8-4073-a20c-468978973f4f
> (volume:567)
>
> I tried to filter the usual noise out of vdsm.log, so hopefully this is the
> relevant bit you need; let me know if the full thing would help.
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e.
[email protected] > > On 20 Apr 2019, at 10:54, Benny Zlotnik <[email protected]> wrote: > > Sorry, I kind of lost track of what the problem is > > The "KeyError: 'appsList'" issue is a known bug[1] > > If a manual (not via vdsm) run of qemu-img is actually stuck, then > let's involve the qemu-discuss list, with the version of the relevant > packages (qemu, qemu-img, kernel, your distro) and the output of gdb > commands > > [1] - https://bugzilla.redhat.com/show_bug.cgi?id=1690301 > > > > On Sat, Apr 20, 2019 at 1:36 AM Callum Smith <[email protected]> wrote: > > > Dear Benny and others, > > So it seems I wasn’t being patient with GDB and it does show me some output. > This error of qemu-img convert even is failing and preventing updating > ovirt-node version from 4.3.2 to 4.3.3.1. I get a feeling this is an > unrelated error, but I thought I’d be complete: > > Excuse any typos, im having to type this manually from a remote session, but > the error: > > [733272.427922] hid-generic 0003:0624:0249.0001: usb_submit_urb(ctrl) failed: > -19 > > If this bug is preventing even a local yum updatei can’t see how it’s any > issue other than somehow involved with the hardware of the hypervisor, our > network and storage configuration must be irrelevant to this fact at this > stage? > > Regards, > Callum > > -- > > Callum Smith > Research Computing Core > Wellcome Trust Centre for Human Genetics > University of Oxford > e. [email protected] > > On 11 Apr 2019, at 12:00, Callum Smith <[email protected]> wrote: > > Without the sudo and running in a dir where the root has access to, gdb has > zero output: > > <Screenshot 2019-04-11 at 12.00.27.png> > Regards, > Callum > > -- > > Callum Smith > Research Computing Core > Wellcome Trust Centre for Human Genetics > University of Oxford > e. 
[email protected] > > On 11 Apr 2019, at 11:54, Callum Smith <[email protected]> wrote: > > Some more information: > > running qemu-img convert manually having captured the failed attempt from the > previous: > > sudo -u vdsm /usr/bin/qemu-img convert -p -t none -T none -f raw > /rhev/data-center/mnt/10.141.15.248:_export_instruct_vm__storage/0e01f014-530b-4067-aa1d-4e9378626a9d/images/6597eede-9fa0-4451-84fc-9f9c070cb5f3/765fa48b-2e77-4637-b4ca-e1affcd71e48 > -O raw > /rhev/data-center/mnt/10.141.15.248:_export_instruct_vm__storage/0e01f014-530b-4067-aa1d-4e9378626a9d/images/9cc99110-70a2-477f-b3ef-1031a912d12b/c2776107-4579-43a6-9d60-93a5ea9c64c5 > -W > > Added the -W flag just to see what would happen: > > gdb -p 79913 -batch -ex "t a a bt" > [Thread debugging using libthread_db enabled] > Using host libthread_db library "/lib64/libthread_db.so.1". > 0x00007f528d7661f0 in __poll_nocancel () from /lib64/libc.so.6 > > Thread 1 (Thread 0x7f528e6bb840 (LWP 79913)): > #0 0x00007f528d7661f0 in __poll_nocancel () from /lib64/libc.so.6 > #1 0x00007f528dc510fb in sudo_ev_scan_impl () from > /usr/libexec/sudo/libsudo_util.so.0 > #2 0x00007f528dc49b44 in sudo_ev_loop_v1 () from > /usr/libexec/sudo/libsudo_util.so.0 > #3 0x000055e94aa0e271 in exec_nopty () > #4 0x000055e94aa0afda in sudo_execute () > #5 0x000055e94aa18a12 in run_command () > #6 0x000055e94aa0969e in main () > > > And without -W: > > gdb -p 85235 -batch -ex "t a a bt" > [Thread debugging using libthread_db enabled] > Using host libthread_db library "/lib64/libthread_db.so.1". 
> 0x00007fc0cc69b1f0 in __poll_nocancel () from /lib64/libc.so.6
>
> Thread 1 (Thread 0x7fc0cd5f0840 (LWP 85235)):
> #0 0x00007fc0cc69b1f0 in __poll_nocancel () from /lib64/libc.so.6
> #1 0x00007fc0ccb860fb in sudo_ev_scan_impl () from
> /usr/libexec/sudo/libsudo_util.so.0
> #2 0x00007fc0ccb7eb44 in sudo_ev_loop_v1 () from
> /usr/libexec/sudo/libsudo_util.so.0
> #3 0x00005610f4397271 in exec_nopty ()
> #4 0x00005610f4393fda in sudo_execute ()
> #5 0x00005610f43a1a12 in run_command ()
> #6 0x00005610f439269e in main ()
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e. [email protected]
>
> On 11 Apr 2019, at 09:57, Callum Smith <[email protected]> wrote:
>
> Dear Benny,
>
> It would seem that even cloning a VM is failing; creating a VM works on the
> same storage. This is the only error I could find:
>
> ERROR Internal server error
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request
>     res = method(**params)
>   File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 194, in _dynamicMethod
>     result = fn(*methodArgs)
>   File "<string>", line 2, in getAllVmStats
>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
>     ret = func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1388, in getAllVmStats
>     statsList = self._cif.getAllVmStats()
>   File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 567, in getAllVmStats
>     return [v.getStats() for v in self.vmContainer.values()]
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1766, in getStats
>     oga_stats = self._getGuestStats()
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1967, in _getGuestStats
>     stats = self.guestAgent.getGuestInfo()
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/guestagent.py", line 505, in getGuestInfo
>     del qga['appsList']
> KeyError: 'appsList'
>
> It's definitely the qemu-img convert that's failing to do anything; this is
> the command from the clone:
> /usr/bin/qemu-img convert -p -t none -T none -f raw
> /rhev/data-center/mnt/10.141.15.248:_export_instruct_vm__storage/0e01f014-530b-4067-aa1d-4e9378626a9d/images/6597eede-9fa0-4451-84fc-9f9c070cb5f3/765fa48b-2e77-4637-b4ca-e1affcd71e48
> -O raw
> /rhev/data-center/mnt/10.141.15.248:_export_instruct_vm__storage/0e01f014-530b-4067-aa1d-4e9378626a9d/images/f0700631-e60b-4c2a-a6f5-a6c818ae7651/d4fb05ec-7c78-4d89-9a66-614c093c6e16
>
> gdb has blank output for this, though. This makes 4.3.2 fairly unusable
> for us, so two questions: can I downgrade to 4.2, and is there a fix coming
> in 4.3.3 for this?
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e. [email protected]
>
> On 10 Apr 2019, at 11:22, Callum Smith <[email protected]> wrote:
>
> Creating a disk on the target share works fine, so this seems to
> specifically be an issue with moving a disk to/from a share.
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e. [email protected]
>
> On 10 Apr 2019, at 09:53, Callum Smith <[email protected]> wrote:
>
> gdb -p $(pidof qemu-img convert) -batch -ex "t a a bt"
> 289444: No such file or directory.
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e. [email protected]
>
> On 10 Apr 2019, at 09:36, Benny Zlotnik <[email protected]> wrote:
>
> Can you run:
> $ gdb -p $(pidof qemu-img convert) -batch -ex "t a a bt"
>
>
> On Wed, Apr 10, 2019 at 11:26 AM Callum Smith <[email protected]> wrote:
>
> Dear All,
>
> Further to this, I can't migrate a disk to different storage using the GUI.
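The "KeyError: 'appsList'" traceback quoted earlier in the thread comes down to vdsm's guestagent.py doing `del qga['appsList']` on a guest-agent stats dict that may lack that key (the known bug Benny linked). A minimal sketch of the tolerant deletion such a fix amounts to; the sample dict below is hypothetical, not real vdsm data:

```python
# Hypothetical guest-agent stats dict that is missing 'appsList',
# as happens in the reported bug.
qga = {"memUsage": 42, "status": "up"}

# `del qga['appsList']` would raise KeyError here; dict.pop with a
# default removes the key when present and is a no-op when it is not.
qga.pop("appsList", None)

print("appsList" in qga)  # → False
```

The same `pop(key, default)` pattern applies anywhere a dict key is only sometimes present.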
> Both disks are configured identically and on the same physical NFS
> provider.
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e. [email protected]
>
> On 9 Apr 2019, at 12:12, Callum Smith <[email protected]> wrote:
>
> Dear All,
>
> It would seem this is a bug in 4.3.x, as upgrading the old oVirt HE to 4.3
> (from 4.2.latest) now means that the export of VMs to the export domain no
> longer works.
>
> Again, qemu-img convert is using some CPU, but no network. Progress is 0.
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e. [email protected]
>
> On 8 Apr 2019, at 15:42, Callum Smith <[email protected]> wrote:
>
> Dear All,
>
> We've exported some VMs from our old oVirt infrastructure and want to
> import them into the new one, but qemu-img appears to be failing. We have
> mounted an export domain populated from the old oVirt in the new hosted
> engine and are using the GUI to import the VM. Manually running the command
> sits at 16% CPU, 0% network usage and no progress. It appears to lock the
> NFS mount, and both ls and lsof hang.
>
> sudo -u vdsm /usr/bin/qemu-img convert -p -t none -T none -f raw <path to
> source> -O raw <path to dest>
>
> Conversely, a simple cp works (ruling out file-permission errors):
> sudo -u vdsm cp <path to source> <path to test>
>
> What might we be doing wrong?
>
> Regards,
> Callum
>
> --
>
> Callum Smith
> Research Computing Core
> Wellcome Trust Centre for Human Genetics
> University of Oxford
> e.
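One difference between the two commands compared in the thread is worth noting: `-t none -T none` makes qemu-img open the images with O_DIRECT, while cp uses ordinary buffered I/O, so a working cp does not rule out a direct-I/O problem on the NFS mount. A quick way to exercise direct I/O in isolation; the mount path below is a placeholder, not a path from this thread:

```shell
# Placeholder: point MNT at the storage-domain mount on a real host.
MNT=${MNT:-/tmp}

# Write one 4 KiB block with O_DIRECT, the open mode qemu-img's
# "-t none" implies. If this hangs or fails while a plain cp works,
# the issue is direct I/O on the mount rather than permissions.
if dd if=/dev/zero of="$MNT/.direct-io-test" bs=4096 count=1 oflag=direct 2>/dev/null; then
    echo "direct I/O OK"
else
    echo "direct I/O failed"
fi
rm -f "$MNT/.direct-io-test"
```

Note that the test can also fail on filesystems that simply do not support O_DIRECT (e.g. tmpfs), so interpret the result against the actual NFS mount.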
> [email protected]
>
> _______________________________________________
> Users mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/[email protected]/message/ZUBKWQIGGSJETPVRWN42R4J7COPFV6GS/

_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/GMKPV5ZSNINMP4DVI5GX67XICRQQYPTE/

