On Wednesday, 14 June 2023 Simon Rowe wrote:

> We've also seen a handful of similar reports. Again, just the MBR sector
> overwritten by what looks to be guest data (e.g. log messages). The
> common thread with our incidents is again a SATA disk under the AHCI
> controller, we have a network backend (iSCSI) which has experienced a
> failure.
>
> I've tried to repro this with blkdebug and simulated write errors,
> without success.

I’ve finally had some success in reproducing this issue. I have a test
environment set up as follows:
* QEMU 4.2
* guest booting from CD with a small SATA disk
* guest test harness that partitions the disk, then continually writes data to
the partition while checking the integrity of the MBR
* filter script that interposes between QEMU and the iSCSI backend; it drops
writes and then resets the connection after a period of time
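
To illustrate the MBR integrity check, here is a minimal Python sketch. It is not the actual harness: the function names and the idea of comparing a SHA-256 digest against a baseline taken right after partitioning are assumptions made for illustration.

```python
import hashlib

SECTOR_SIZE = 512  # size of the MBR sector in bytes


def read_mbr(path):
    """Read the first sector of a disk or image file."""
    with open(path, "rb") as dev:
        data = dev.read(SECTOR_SIZE)
    if len(data) != SECTOR_SIZE:
        raise IOError("short read of sector 0 from %s" % path)
    return data


def has_boot_signature(sector):
    """True if the sector ends with the MBR boot signature 0x55 0xAA."""
    return sector[510:512] == b"\x55\xaa"


def mbr_digest(path):
    """SHA-256 digest of sector 0, used as the baseline after partitioning."""
    return hashlib.sha256(read_mbr(path)).hexdigest()


def mbr_intact(path, baseline_digest):
    """True if sector 0 still has its signature and matches the baseline."""
    sector = read_mbr(path)
    return (has_boot_signature(sector)
            and hashlib.sha256(sector).hexdigest() == baseline_digest)
```

A harness along these lines would take `mbr_digest()` once after partitioning and call `mbr_intact()` between write bursts. Inside a guest it would probably also need to bypass the page cache (for example with O_DIRECT) so that each check reads from the disk rather than from memory.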

From tracing in the filter script I can see unsolicited writes to LBA 0 once
the SATA controller is reset:

Data in: iSCSI op 01 SCSI op 28 LBA 0 NOP count 5 wait for read False
Data in: iSCSI op 01 SCSI op 28 LBA 0 NOP count 6 wait for read False
Data in: iSCSI op 01 SCSI op 2a LBA 0 NOP count 0 wait for read True
Data in: iSCSI op 01 SCSI op 28 LBA 0 NOP count 0 wait for read False
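
For anyone reading the trace: iSCSI opcode 01 is a SCSI Command PDU, SCSI opcode 28 is READ(10) and 2a is WRITE(10), so the third line is the write to LBA 0. The sketch below shows how these fields can be decoded from a 10-byte CDB in Python. It is illustrative only, not the filter script itself, and the function names are hypothetical.

```python
import struct

SCSI_READ_10 = 0x28
SCSI_WRITE_10 = 0x2A


def decode_cdb10(cdb):
    """Return (opcode, lba, blocks) for a READ(10)/WRITE(10) CDB.

    Layout: opcode, flags, 4-byte big-endian LBA, group number,
    2-byte big-endian transfer length, control.
    """
    if len(cdb) < 10:
        raise ValueError("need at least 10 CDB bytes")
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(
        ">BBIBHB", cdb[:10])
    return opcode, lba, blocks


def is_write_to_lba0(cdb):
    """True for a WRITE(10) whose starting LBA is 0 (the MBR sector)."""
    opcode, lba, _blocks = decode_cdb10(cdb)
    return opcode == SCSI_WRITE_10 and lba == 0
```

In a SCSI Command PDU the CDB sits at bytes 32 to 47 of the 48-byte basic header segment, so a filter could pass `pdu[32:48]` to `decode_cdb10()`.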

I have a stack trace from the moment the write occurs:

#0  iscsi_co_writev (bs=0x564322ecc220, sector_num=<optimized out>,
    nb_sectors=1, iov=0x7fc20c045860, flags=<optimized out>)
    at block/iscsi.c:641
#1  0x00005643220e780b in bdrv_driver_pwritev (bs=bs@entry=0x564322ecc220,
    offset=offset@entry=0, bytes=bytes@entry=512,
    qiov=qiov@entry=0x7fc20c045860, qiov_offset=qiov_offset@entry=0,
    flags=flags@entry=0) at block/io.c:1216
#2  0x00005643220e9985 in bdrv_aligned_pwritev (
    child=child@entry=0x564322ecb050, req=req@entry=0x7fc2aa90bb00, offset=0,
    bytes=512, align=align@entry=512, qiov=0x7fc20c045860, qiov_offset=0,
    flags=flags@entry=0) at block/io.c:1980
#3  0x00005643220ea25b in bdrv_co_pwritev_part (child=0x564322ecb050,
    offset=offset@entry=0, bytes=bytes@entry=512,
    qiov=qiov@entry=0x7fc20c045860, qiov_offset=qiov_offset@entry=0, flags=0)
    at block/io.c:2137
#4  0x00005643220ea55b in bdrv_co_pwritev (child=<optimized out>,
    offset=offset@entry=0, bytes=bytes@entry=512,
    qiov=qiov@entry=0x7fc20c045860, flags=<optimized out>) at block/io.c:2087
#5  0x00005643220aa64d in raw_co_pwritev (bs=0x564322ec4a00, offset=0,
    bytes=512, qiov=0x7fc20c045860, flags=<optimized out>)
    at block/raw-format.c:258
#6  0x00005643220e7702 in bdrv_driver_pwritev (bs=bs@entry=0x564322ec4a00,
    offset=offset@entry=0, bytes=bytes@entry=512,
    qiov=qiov@entry=0x7fc20c045860, qiov_offset=qiov_offset@entry=0,
    flags=flags@entry=0) at block/io.c:1183
#7  0x00005643220e9985 in bdrv_aligned_pwritev (
    child=child@entry=0x564322ed28c0, req=req@entry=0x7fc2aa90be70, offset=0,
    bytes=512, align=align@entry=1, qiov=0x7fc20c045860, qiov_offset=0,
    flags=flags@entry=0) at block/io.c:1980
#8  0x00005643220ea25b in bdrv_co_pwritev_part (child=0x564322ed28c0,
    offset=offset@entry=0, bytes=bytes@entry=512,
    qiov=qiov@entry=0x7fc20c045860, qiov_offset=qiov_offset@entry=0, flags=0)
    at block/io.c:2137
#9  0x00005643220d63b4 in blk_do_pwritev_part (blk=0x564322ec4570, offset=0,
    bytes=512, qiov=0x7fc20c045860, qiov_offset=qiov_offset@entry=0,
    flags=<optimized out>) at block/block-backend.c:1231
#10 0x00005643220d650d in blk_aio_write_entry (opaque=0x7fc20c045520)
    at block/block-backend.c:1439
#11 0x000056432218706a in coroutine_trampoline (i0=<optimized out>,
    i1=<optimized out>) at util/coroutine-ucontext.c:115
#12 0x00007fc2afa20190 in ?? () from /lib64/libc.so.6
#13 0x00007fc2b3e01aa0 in ?? ()
#14 0x0000000000000000 in ?? ()

I’m not familiar with QEMU’s storage code. Any suggestions on how to
proceed with debugging this?

Regards
Simon