On Tue, Aug 21, 2018 at 04:07:08PM -0500, Mike Christie wrote:
> On 08/08/2018 02:31 PM, Greg Edwards wrote:
>> When T10 PI is enabled on a backing device for the iblock backstore, the
>> PI SGL for the entire command is attached to the first bio only. This
>> works fine if the command is covered by a single bio, but results in
>> integrity verification errors for the other bios in a multi-bio command.
>>
>
> Did you hit this with an older distro kernel?
>
> It looks like iblock_get_bio will alloc a bio that has enough vecs for
> the entire cmd (bi_max_vecs will equal sgl_nents). So it is not clear
> to me how the bio_add_page call ever returns a value other than
> sg->length, such that we end up doing another iblock_get_bio call?

I hit it with the tip of Linus' tree, but it depended on some other
in-flight changes. Those other changes are now in Linus' tree for 4.19,
with the exception of [1].

Without [1], when doing a large read I/O through vhost + iblock to a T10
PI-enabled device (I used scsi_debug), you first hit the vhost
VHOST_SCSI_PREALLOC_PROT_SGLS limitation noted in [1].
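
For anyone following along, that limit is the preallocated prot SGL
count in drivers/vhost/scsi.c; paraphrased from my reading of the
4.18-era code (trimmed, not part of this patch):

/* Paraphrased from the 4.18-era drivers/vhost/scsi.c (trimmed). */
#define VHOST_SCSI_PREALLOC_PROT_SGLS 512

static int
vhost_scsi_calc_sgls(struct iov_iter *iter, size_t bytes, int max_sgls)
{
	int sgl_count = iov_iter_npages(iter, 0xffff);

	/*
	 * A large PI read needs more prot SG entries than the
	 * preallocated table holds, so the command fails here before
	 * it ever reaches the iblock backstore.
	 */
	if (sgl_count > max_sgls) {
		pr_err("%s: requested sgl_count: %d exceeds pre-allocated"
		       " max_sgls: %d\n", __func__, sgl_count, max_sgls);
		return -EINVAL;
	}
	return sgl_count;
}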

Once the I/O size is no longer gated by VHOST_SCSI_PREALLOC_PROT_SGLS,
the next issue I hit is the one this patch addresses. I should have been
more precise in my commit message. The failure is actually
bio_integrity_alloc() failing to allocate the bip_vec when
cmd->t_prot_nents exceeds BIO_MAX_PAGES (256), which results in the
following failure on the host (see the condensed pre-patch function
after the log output below):

[ 53.780723] Unable to allocate bio_integrity_payload

and the following failure on the client:

[ 28.724432] sd 0:0:1:0: [sda] tag#40 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 28.736127] sd 0:0:1:0: [sda] tag#40 Sense Key : Not Ready [current]
[ 28.744567] sd 0:0:1:0: [sda] tag#40 Add. Sense: Logical unit communication failure
[ 28.754724] sd 0:0:1:0: [sda] tag#40 CDB: Read(10) 28 20 00 00 00 00 00 38 00 00
[ 28.766190] print_req_error: I/O error, dev sda, sector 0
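
For context, here is the pre-patch iblock_alloc_bip(), condensed from my
reading of target_core_iblock.c (debug output and some error handling
trimmed). The bip is sized and filled for the entire command, no matter
how many bios iblock_execute_rw() built:

/*
 * Pre-patch iblock_alloc_bip(), condensed; not the literal source.
 */
static int
iblock_alloc_bip(struct se_cmd *cmd, struct bio *bio)
{
	struct iblock_dev *ib_dev = IBLOCK_DEV(cmd->se_dev);
	struct blk_integrity *bi;
	struct bio_integrity_payload *bip;
	struct scatterlist *sg;
	int i, rc;

	bi = bdev_get_integrity(ib_dev->ibd_bd);
	if (!bi)
		return -ENODEV;

	/*
	 * nr_vecs here is the prot SG count for the *entire* command.
	 * bio_integrity_alloc() -> bvec_alloc() has no biovec slab
	 * larger than BIO_MAX_PAGES (256) entries, so once
	 * cmd->t_prot_nents exceeds that, this returns an ERR_PTR and
	 * we log the "Unable to allocate bio_integrity_payload" error
	 * seen above.
	 */
	bip = bio_integrity_alloc(bio, GFP_NOIO, cmd->t_prot_nents);
	if (IS_ERR(bip)) {
		pr_err("Unable to allocate bio_integrity_payload\n");
		return PTR_ERR(bip);
	}

	bip->bip_iter.bi_size = bio_integrity_bytes(bi, bio_sectors(bio));
	bip_set_seed(bip, bio->bi_iter.bi_sector);

	/* ... and the whole PI SGL is attached to this one bio */
	for_each_sg(cmd->t_prot_sg, sg, cmd->t_prot_nents, i) {
		rc = bio_integrity_add_page(bip, sg_page(sg), sg->length,
					    sg->offset);
		if (rc != sg->length)
			return -ENOMEM;
	}

	return 0;
}
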
By splitting up the PI SGL across the bios, you avoid ever trying to
allocate a too-large bip_vec (I was testing with 32 MiB I/Os).
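
The shape of the change is roughly as follows; this is a condensed
sketch of the idea, not the literal diff. The caller starts a single
sg_mapping_iter over the command's prot SGL and passes it to each bio in
turn, so every bio gets only the PI that covers its own sectors:

/* Condensed sketch of the fix, not the literal diff. */
static int
iblock_alloc_bip(struct se_cmd *cmd, struct bio *bio,
		 struct sg_mapping_iter *miter)
{
	struct iblock_dev *ib_dev = IBLOCK_DEV(cmd->se_dev);
	struct blk_integrity *bi;
	struct bio_integrity_payload *bip;
	size_t resid, len;
	int rc;

	bi = bdev_get_integrity(ib_dev->ibd_bd);
	if (!bi)
		return -ENODEV;

	/* nr_vecs is now capped at BIO_MAX_PAGES, so bvec_alloc() succeeds */
	bip = bio_integrity_alloc(bio, GFP_NOIO,
			min_t(unsigned int, cmd->t_prot_nents, BIO_MAX_PAGES));
	if (IS_ERR(bip))
		return PTR_ERR(bip);

	/* only the PI bytes that cover this bio's sectors */
	bip->bip_iter.bi_size = bio_integrity_bytes(bi, bio_sectors(bio));
	bip_set_seed(bip, bio->bi_iter.bi_sector);

	resid = bip->bip_iter.bi_size;
	while (resid > 0 && sg_miter_next(miter)) {
		len = min_t(size_t, miter->length, resid);
		rc = bio_integrity_add_page(bip, miter->page, len,
					    offset_in_page(miter->addr));
		if (rc != len) {
			sg_miter_stop(miter);
			return -ENOMEM;
		}

		resid -= len;
		if (len < miter->length)
			miter->consumed = len;	/* rest goes to the next bio */
	}
	sg_miter_stop(miter);

	return 0;
}
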
Here is how I am testing:

L1 VM:

# modprobe scsi_debug dif=1 dix=1 guard=0 dev_size_mb=6144
# targetcli <<EOF
/backstores/block create dev=/dev/sda name=scsi_debug
/vhost create wwn=naa.50014055d17a5a87
/vhost/naa.50014055d17a5a87/tpg1/luns/ create /backstores/block/scsi_debug
EOF

The L2 VM is booted with the QEMU vhost option 't10_pi=on', which
depends on the QEMU patches [2]:
-device vhost-scsi-pci,wwpn=naa.50014055d17a5a87,t10_pi=on \

Then in the L2 VM:

# cat /sys/block/sda/queue/max_hw_sectors_kb > /sys/block/sda/queue/max_sectors_kb
# dd if=/dev/sda of=/dev/null bs=32M iflag=direct count=1

The end goal is to have a vehicle for testing large I/Os through
virtio_scsi to a PI-enabled device.

Greg
[1]
https://lists.linuxfoundation.org/pipermail/virtualization/2018-August/039040.html
[2] https://lists.nongnu.org/archive/html/qemu-devel/2018-08/msg01298.html