[ovirt-devel] Re: qemu-img process stuck

Vojtech Juranek Fri, 01 Nov 2019 05:53:27 -0700

> On Thu, Oct 24, 2019 at 12:53 PM Vojtech Juranek <[email protected]>
> wrote:
> >
> >
> > Hi,
> > I'm doing some manual storage tests on a RHEL 8.1 host (kernel version
> > 4.18.0-107.el8.x86_64) and ran into the following issue: when I try to
> > move an image to a block SD and the RHEL 8.1 host is SPM, qemu-img gets
> > stuck. There are errors regarding blk_cloned_rq_check_limits on the RHEL
> > host and also max_write_same_len on the iSCSI target (see below). The
> > iSCSI target runs CentOS 7.6.
> >
> >
> > Before running the qemu-img command, everything works fine (e.g. cat a
> > file from the iSCSI target); after running qemu-img, all IO on the iSCSI
> > target is very slow.
> >
> >
> >
> > When I run the qemu-img command manually, it's very slow (a 1GB image
> > takes several minutes) but it finishes; when it's run from vdsm, it
> > seems to hang forever.
> 
> qemu-img is using BLKZEROOUT to write zeroes quickly, and this uses
> SCSI WRITE_SAME in the kernel.
> 
> 
> > In similar cases I found it could be fixed by adjusting various values in
> > /sys/block/#DEVICE/queue/, but in this case, AFAICT all values are correct
> > (same as on the CentOS 7.7 host where everything works). In other cases
> > it was identified as a kernel bug and upgrading to a newer kernel was
> > suggested.
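
For comparing the queue values between the two hosts, a rough sketch like the following may help. The attribute names are the standard block-layer sysfs files; the function name and the SYSFS_ROOT override are hypothetical and exist only for illustration. For a multipath device, pass the dm-N name rather than the /dev/mapper alias.

```shell
#!/bin/sh
# Print the request-queue limits relevant to zeroing and discard for one device.
# SYSFS_ROOT is a hypothetical override (defaults to /sys), not a kernel feature.
show_queue_limits() {
    dev=$1
    root=${SYSFS_ROOT:-/sys}
    for attr in max_sectors_kb max_hw_sectors_kb discard_max_bytes \
                write_same_max_bytes write_zeroes_max_bytes; do
        file="$root/block/$dev/queue/$attr"
        if [ -r "$file" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$file")"
        else
            # Attribute not exposed by this kernel or device
            printf '%s=<missing>\n' "$attr"
        fi
    done
}

# Example (device name is an assumption): show_queue_limits sdb
```

Running it on the working and the failing host and diffing the two outputs should show whether any limit differs.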
> >
> >
> >
> > Do you know of any workaround to make it work? Or does it sound to
> > you like a kernel bug which should be reported?
> 
> 
> The issue is that the client kernel sees the wrong value of
> max_write_same_len. For some reason it thinks that the value is
> 65535 sectors (32 MiB), but the value defined on the server is only
> 4096 sectors (2 MiB). When qemu-img issues WRITE_SAME with the wrong
> payload size, the server rejects the request and the client then
> treats this as a fatal error.
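
The sizes quoted above can be checked with a little shell arithmetic. This is only a sketch and assumes 512-byte logical sectors; the helper name is made up for illustration.

```shell
#!/bin/sh
# Convert a sector count to bytes, assuming 512-byte logical sectors.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

sectors_to_bytes 65535   # limit the client assumes: 33553920 bytes, just under 32 MiB
sectors_to_bytes 4096    # limit configured on the server: 2097152 bytes, exactly 2 MiB
```

So a single WRITE_SAME sized to the client's limit can be roughly 16 times larger than what the server is willing to accept.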
> 
> You can fix this by configuring these attributes on the server side:
> 
> $ targetcli /backstores/fileio/my-disk set attribute \
>     emulate_tpu=1 \
>     emulate_tpws=1 \
>     max_write_same_len=65535
> 
> - emulate_tpu enables discard
> - emulate_tpws enables write_same
> - max_write_same_len matches what the client sees
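
To confirm the attributes were actually applied, they can be read back with targetcli's "get attribute". A minimal sketch, assuming the same backstore path as in the command above; the wrapper function name is hypothetical:

```shell
#!/bin/sh
# Read back the three attributes from a backstore.
# $1 is the backstore path, e.g. /backstores/fileio/my-disk
show_backstore_attrs() {
    targetcli "$1" get attribute \
        emulate_tpu emulate_tpws max_write_same_len
}

# Example: show_backstore_attrs /backstores/fileio/my-disk
```

This needs to run as root on the target host. Whether the settings survive a reboot depends on the target configuration being saved, so that is worth checking separately.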


Thanks a lot!
I tried this at the beginning of the week and it didn't work, but I had no
time to investigate what was wrong. Today I returned to it and it works. In
the meantime I replaced one iSCSI target running on CentOS 7.6 with a new one
on CentOS 7.7; I'm not sure whether that matters, though.

> You can test the settings using blkdiscard like this:
> 
> 1. Set the max_write_same_len attribute on a backstore
> 2. Log in to the iSCSI server
> 3. Zero the LUN that you modified on the server
> 
>     blkdiscard -z /dev/mapper/xxxyyy
> 
> I think this is a kernel bug, but I'm not sure whether it is on the
> initiator or the target side.
> 
> Nir
> _______________________________________________
> Devel mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/[email protected]/message/ZJNDB7LABRW2XDYMKPXSHEIMWT23NBZM/


