On Mon, Jul 5, 2021 at 3:36 PM Gianluca Cecchi <gianluca.cec...@gmail.com> wrote:
>
> On Mon, Jul 5, 2021 at 2:13 PM Nir Soffer <nsof...@redhat.com> wrote:
>>
>> >
>> > vdsm 14342 3270 0 11:17 ? 00:00:03 /usr/bin/qemu-img convert
>> > -p -t none -T none -f raw
>> > /rhev/data-center/mnt/blockSD/679c0725-75fb-4af7-bff1-7c447c5d789c/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/d2a89b5e-7d62-4695-96d8-b762ce52b379
>> > -O raw -o preallocation=falloc
>> > /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/d2a89b5e-7d62-4695-96d8-b762ce52b379
>>
>> -o preallocation + NFS 4.0 + very slow NFS is your problem.
>>
>> qemu-img is using posix_fallocate() to preallocate the entire image at
>> the start of the copy. With NFS 4.2 this uses the Linux-specific
>> fallocate() syscall, which allocates the space very efficiently, in no
>> time. With older NFS versions, this becomes a very slow loop, writing
>> one byte for every 4k block.
>>
>> If you see -o preallocation, it means you are using an old vdsm
>> version; we stopped using -o preallocation in 4.4.2, see
>> https://bugzilla.redhat.com/1850267.
>
> OK. As I said at the beginning, the environment is latest 4.3.
> We are going to upgrade to 4.4, and we are making some complementary
> backups to be safe.
>
>> > On the hypervisor the ls commands basically hang, so from another
>> > hypervisor I can see that the disk size seems to remain at 4Gb even
>> > if the timestamp updates...
>> >
>> > # ll /rhev/data-center/mnt/172.16.1.137\:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/
>> > total 4260941
>> > -rw-rw----. 1 nobody nobody 4363202560 Jul 5 11:23 d2a89b5e-7d62-4695-96d8-b762ce52b379
>> > -rw-r--r--. 1 nobody nobody 261 Jul 5 11:17 d2a89b5e-7d62-4695-96d8-b762ce52b379.meta
>> >
>> > On the host console I see a throughput of 4 Mbit/s...
>> >
>> > # strace -p 14342
>>
>> This shows only the main thread. Use -f to show all threads.
>
> # strace -f -p 14342
> strace: Process 14342 attached with 2 threads
> [pid 14342] ppoll([{fd=9, events=POLLIN|POLLERR|POLLHUP}], 1, NULL, NULL, 8 <unfinished ...>
> [pid 14343] pwrite64(12, "\0", 1, 16474968063) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474972159) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474976255) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474980351) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474984447) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474988543) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474992639) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16474996735) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16475000831) = 1
> [pid 14343] pwrite64(12, "\0", 1, 16475004927) = 1
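The repeating one-byte pwrite64() calls are the signature of glibc emulating
posix_fallocate() on a filesystem that cannot preallocate natively. Below is a
minimal C sketch of that fallback path; it is illustrative rather than glibc's
actual source, and the 4k block size matches the offsets in the strace output:

/*
 * Minimal sketch (not glibc's actual source) of the posix_fallocate()
 * fallback: when fallocate(2) is unsupported -- NFS older than 4.2
 * returns EOPNOTSUPP -- preallocation is emulated by writing one zero
 * byte at the end of every block, one pwrite() per 4k block, exactly
 * the pwrite64(12, "\0", 1, ...) pattern in the strace output above.
 */
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

static int preallocate(int fd, off_t offset, off_t len)
{
    /* Fast path: a single syscall; NFS 4.2 maps it to the ALLOCATE op. */
    if (fallocate(fd, 0, offset, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP)
        return errno;

    /* Slow path: one synchronous 1-byte write per block. */
    struct stat st;
    if (fstat(fd, &st) != 0)
        return errno;
    off_t block = st.st_blksize > 0 ? st.st_blksize : 4096;

    for (off_t off = offset + block - 1; off < offset + len; off += block)
        if (pwrite(fd, "", 1, off) != 1)
            return errno ? errno : EIO;
    return 0;
}

int main(void)
{
    int fd = open("image.raw", O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    int err = preallocate(fd, 0, (off_t)1 << 30);  /* 1 GiB */
    if (err) {
        fprintf(stderr, "preallocate: %s\n", strerror(err));
        return 1;
    }
    return 0;
}

On a mount without fallocate() support, each iteration of the slow loop is a
synchronous NFS round trip, which is why throughput collapses to a few Mbit/s.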
qemu-img is busy in posix_fallocate(), writing one byte to every 4k
block. If you add -tt -T (as I suggested), we can see how much time each
write takes, which may explain why this takes so much time:

strace -f -p 14342 -tt -T

> . . . and so on . . .
>
>> >
>> > This is a test oVirt env, so I can wait and eventually test something...
>> > Let me know your suggestions.
>>
>> I would start by changing the NFS storage domain to version 4.2.
>
> I'm going to try. Right now I have set it to the default of autonegotiated...
>
>> 1. kill the hung qemu-img (it probably cannot be killed, but worth trying)
>> 2. deactivate the storage domain
>> 3. fix the ownership on the storage domain (should be vdsm:kvm, not nobody:nobody)
>
> Unfortunately it is an appliance. I have asked the guys that have it in
> charge whether we can set that.
> Thanks for the other concepts explained.
>
> Gianluca
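As a follow-up to the NFS 4.2 suggestion above: a quick way to confirm that a
remounted storage domain actually supports preallocation is to probe
fallocate(2) directly on a file inside the mount. A minimal sketch (the default
file name is only an example; pass a path on the NFS export as the first
argument):

/*
 * Probe: does this mount support fallocate(2)? NFS 4.2 implements it
 * via the ALLOCATE operation; NFS 4.0/4.1 fails with EOPNOTSUPP, which
 * forces qemu-img's preallocation into the slow fallback shown earlier.
 * The default file name is only an example -- pass a path on the mount.
 */
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "fallocate-probe.tmp";
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    if (fallocate(fd, 0, 0, 1 << 20) == 0)        /* try to allocate 1 MiB */
        puts("fallocate supported: preallocation is a single fast syscall");
    else if (errno == EOPNOTSUPP)
        puts("fallocate NOT supported: expect the 1-byte-per-block fallback");
    else
        perror("fallocate");

    close(fd);
    unlink(path);
    return 0;
}

Compile with "cc -o fallocate-probe fallocate-probe.c" and run it with a path
on the export-domain mount; EOPNOTSUPP means preallocation would still hit the
slow one-byte-per-block loop.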