Kevin, what do you think about it? What is the guest expected to receive when it requests multiple reads into the same buffer in a single DMA transaction?
Should it be the first SG part? The last one? Or just a random set of bytes? (Then why is it reading this data in that case?)

Pavel Dovgalyuk

> -----Original Message-----
> From: Vladimir Sementsov-Ogievskiy [mailto:vsement...@virtuozzo.com]
> Sent: Tuesday, February 25, 2020 12:19 PM
> To: dovgaluk
> Cc: qemu-devel@nongnu.org; mre...@redhat.com; kw...@redhat.com
> Subject: Re: Race condition in overlayed qcow2?
>
> 25.02.2020 10:56, dovgaluk wrote:
> > Vladimir Sementsov-Ogievskiy wrote 2020-02-25 10:27:
> >> 25.02.2020 8:58, dovgaluk wrote:
> >>> Vladimir Sementsov-Ogievskiy wrote 2020-02-21 16:23:
> >>>> 21.02.2020 15:35, dovgaluk wrote:
> >>>>> Vladimir Sementsov-Ogievskiy wrote 2020-02-21 13:09:
> >>>>>> 21.02.2020 12:49, dovgaluk wrote:
> >>>>>>> Vladimir Sementsov-Ogievskiy wrote 2020-02-20 12:36:
> >>>>>>
> >>>>>> So, preadv in file-posix.c returns different results for the same
> >>>>>> offset, for file which is always opened in RO mode? Sounds impossible
> >>>>>> :)
> >>>>>
> >>>>> True.
> >>>>> Maybe my logging is wrong?
> >>>>>
> >>>>> static ssize_t
> >>>>> qemu_preadv(int fd, const struct iovec *iov, int nr_iov, off_t offset)
> >>>>> {
> >>>>>     ssize_t res = preadv(fd, iov, nr_iov, offset);
> >>>>>     qemu_log("preadv %x %"PRIx64"\n", fd, (uint64_t)offset);
> >>>>>     int i;
> >>>>>     uint32_t sum = 0;
> >>>>>     int cnt = 0;
> >>>>>     for (i = 0 ; i < nr_iov ; ++i) {
> >>>>>         int j;
> >>>>>         for (j = 0 ; j < (int)iov[i].iov_len ; ++j)
> >>>>>         {
> >>>>>             sum += ((uint8_t*)iov[i].iov_base)[j];
> >>>>>             ++cnt;
> >>>>>         }
> >>>>>     }
> >>>>>     qemu_log("size: %x sum: %x\n", cnt, sum);
> >>>>>     assert(cnt == res);
> >>>>>     return res;
> >>>>> }
> >>>>>
> >>>>
> >>>> Hmm, I don't see any issues here..
> >>>>
> >>>> Are you absolutely sure, that all these reads are from backing file,
> >>>> which is read-only and never changed (may be by other processes)?
> >>>
> >>> Yes, I made a copy and compared the files with binwalk.
> >>>
> >>>> 2. guest modifies buffers during operation (you can catch it if
> >>>> allocate personal buffer for preadv, than calculate checksum, then
> >>>> memcpy to guest buffer)
> >>>
> >>> I added the following to the qemu_preadv:
> >>>
> >>> // do it again
> >>> unsigned char *buf = g_malloc(cnt);
> >>> struct iovec v = {buf, cnt};
> >>> res = preadv(fd, &v, 1, offset);
> >>> assert(cnt == res);
> >>> uint32_t sum2 = 0;
> >>> for (i = 0 ; i < cnt ; ++i)
> >>>     sum2 += buf[i];
> >>> g_free(buf);
> >>> qemu_log("--- sum2 = %x\n", sum2);
> >>> assert(sum2 == sum);
> >>>
> >>> These two reads give different results.
> >>> But who can modify the buffer while qcow2 workers filling it with data
> >>> from the disk?
> >>>
> >>
> >> As far as I know, it's guest's buffer, and guest may modify it during
> >> the operation. So, it may be winxp :)
> >
> > True, but normally the guest won't do it.
> >
> > But I noticed that DMA operation which causes the problems has the
> > following set of the buffers:
> > dma read sg size 20000 offset: c000fe00
> > --- sg: base: 2eb1000 len: 1000
> > --- sg: base: 3000000 len: 1000
> > --- sg: base: 2eb2000 len: 3000
> > --- sg: base: 3000000 len: 1000
> > --- sg: base: 2eb5000 len: b000
> > --- sg: base: 3040000 len: 1000
> > --- sg: base: 2f41000 len: 3000
> > --- sg: base: 3000000 len: 1000
> > --- sg: base: 2f44000 len: 4000
> > --- sg: base: 3000000 len: 1000
> > --- sg: base: 2f48000 len: 2000
> > --- sg: base: 3000000 len: 1000
> > --- sg: base: 3000000 len: 1000
> > --- sg: base: 3000000 len: 1000
> >
> >
> > It means that one DMA transaction performs multiple reads into the same
> > address.
> > And no races is possible, when there is only one qcow2 worker.
> > When there are many of them - they can fill this buffer simultaneously.
> >
>
> Hmm, actually if guest start parallel reads into same buffer from different
> offsets, races are possible anyway, as different requests run in parallel
> even with one worker, because MAX_WORKERS is per-request value, not total...
> But several workers may increase probability of races or introduce new ones.
>
> So, actually, several workers of one request can write to the same buffer
> only if guest provides broken iovec, which references the same buffer
> several times (if it is possible at all).
>
>
>
> --
> Best regards,
> Vladimir
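
For anyone who wants to check a request for this pattern, here is a minimal sketch in C of a debugging helper that flags an iovec whose entries point at overlapping memory, as in the SG list quoted above where base 3000000 appears several times. The name iov_find_overlap is hypothetical (it is not an existing QEMU function), and the sketch assumes the aliasing is visible as identical or overlapping iov_base ranges by the time the vector reaches the host-side read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical debugging helper, not part of QEMU.
 *
 * Returns 1 if any two non-empty entries of iov[] cover overlapping
 * memory, 0 otherwise. On a hit, the indices of the first overlapping
 * pair are stored in *first and *second when those pointers are non-NULL.
 *
 * The scan is O(n^2), which should be acceptable for a debugging aid
 * on short scatter-gather lists.
 */
static int iov_find_overlap(const struct iovec *iov, int nr_iov,
                            int *first, int *second)
{
    assert(iov != NULL || nr_iov == 0);

    for (int i = 0; i < nr_iov; ++i) {
        uintptr_t a = (uintptr_t)iov[i].iov_base;
        uintptr_t a_end = a + iov[i].iov_len;

        if (iov[i].iov_len == 0) {
            continue;   /* an empty entry cannot overlap anything */
        }
        for (int j = i + 1; j < nr_iov; ++j) {
            uintptr_t b = (uintptr_t)iov[j].iov_base;
            uintptr_t b_end = b + iov[j].iov_len;

            if (iov[j].iov_len == 0) {
                continue;
            }
            /* Half-open ranges [a, a_end) and [b, b_end) intersect. */
            if (a < b_end && b < a_end) {
                if (first) {
                    *first = i;
                }
                if (second) {
                    *second = j;
                }
                return 1;
            }
        }
    }
    return 0;
}
```

Calling this on the vector just before the read and logging the two indices would show whether a given request can have several workers writing to the same buffer. Adjacent entries such as base 2eb1000 len 1000 followed by base 2eb2000 are not reported, because the ranges are treated as half-open.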