On Tue, Apr 16, 2024 at 7:54 AM Stefan Fritsch <s...@sfritsch.de> wrote:
> adding John Snow to CC because he investigated this in 2020. > > On Fri, 12 Apr 2024, Eric Blake wrote: > > > On Fri, Apr 12, 2024 at 10:06:17AM +0200, Stefan Fritsch wrote: > > > Commit 99868af3d0 changed the hardcoded constant BDRV_SECTOR_SIZE to a > > > dynamic field 'align' but introduced a bug. qemu_iovec_discard_back() > > > is now passed the wanted iov length instead of the actually required > > > amount that should be removed from the end of the iov. > > > > > > The bug can likely only be hit in uncommon configurations, e.g. with > > > icount enabled or when reading from disk directly to device memory. > > > > > > Fixes: 99868af3d0a75cf6 ("dma-helpers: explicitly pass alignment into > DMA helpers") > > > Signed-off-by: Stefan Fritsch <s...@sfritsch.de> > > > --- > > > system/dma-helpers.c | 3 +-- > > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > Wow, that bug has been latent for a while (2016, v2.8.0). As such, I > > don't think it's worth holding up 9.0 for, but it is definitely worth > > cc'ing qemu-stable (done now). > > we had some more internal discussions and did some more googling, and it > turns out that this is actually part of a known issue: > > https://lists.nongnu.org/archive/html/qemu-block/2020-07/msg01866.html > https://gitlab.com/qemu-project/qemu/-/issues/259 > > In the above mail to qemu-block, John Snow listed 4 problems in the code > but my patch only fixes the first one. Considering that the code may also > write data to the wrong place (problem 2), I wonder if an assert in the > same place would be better until the underlying issues have been fixed? > Whew, it's been such a long time since I've looked at this, I barely remember. I think my email now knows more about the problem than *I* currently do. I'm glad I wrote it ... I do not currently know what happens if you fix the first issue but not the other four; I don't know if it gets passably "less broken" or if it causes other bigger issues... and I don't have the bandwidth to test that out currently, my apologies. If there's a renewed interest in fixing this, I can try to start ramping back up on it, but I have a few other projects to finish first, but possibly I could discuss this at KVM Forum to help delineate the work into manageable chunks. ~js