On 27.04.2023 at 15:22, zhoushl wrote:
> Hi Kevin:
> I’m sorry for missing commit message, next time I will be careful. When
> the application in guest vm execute fsync, qemu will execute fsync too.
> But when aio + dio is enabled, pagecache is bypassed

As far as I can tell, you don't need AIO for that, only DIO.

> and we could assure the data is on disk

No.

> (at least on the disk cache),

In some cases, for a local file system on a physical disk, yes. But this
is not enough. The promise when a guest application calls fsync() is not
that the data is in a potentially volatile disk cache, but on disk.

If the image is on a network file system, there are other places where
the data could still be cached, like the page cache of the server.

> so there is no need to sync anymore. For example, we could execute the
> following python script in vm:
>
> #!/usr/bin/python
> import os
>
> fo = os.open("test.txt", os.O_RDWR|os.O_CREAT)
> while True:
>     os.write(fo, b"123\n")
>     os.fsync(fo)
>
> os.close(fo)
>
> In this case, each write will take an fsync operation, which will
> search the dirty page in pagecache, force flushing the metadata and
> data into disk, which is often useless and waste IO resource and maybe
> will cause write amplification in filesystem.

Yes, if you request an fsync(), you get an fsync(). This is necessary to
fulfill the guarantees that fsync() makes. If a guest application doesn't
want fsync() semantics, it shouldn't call it.

QEMU has an option cache.no-flush=on for block backends (cache=unsafe
includes this), which will skip flushes. This is unsafe and if your host
crashes, you may get a corrupted file system in the guest. But at the
risk of losing your filesystem, it does save the overhead of these
operations that you want to avoid.

Kevin
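
To illustrate the point that an application which does not want fsync()
semantics on every write should simply not call it that often, here is a
minimal sketch of a variant of the quoted script that batches writes and
only calls fsync() at chosen points. The function name write_batched, the
sync_every parameter and its default are assumptions made for this
example; they do not come from the thread or from any QEMU interface.

```python
import os


def write_batched(path, records, sync_every=100):
    """Write byte records to path, calling fsync() once per batch.

    Hypothetical helper: fsync() is issued after every `sync_every`
    records and once more at the end if unsynced records remain.
    Returns the number of fsync() calls that were made.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    syncs = 0
    pending = 0  # records written since the last fsync()
    try:
        for rec in records:
            # Small writes to a regular file; partial writes are not
            # handled in this sketch.
            os.write(fd, rec)
            pending += 1
            if pending >= sync_every:
                os.fsync(fd)  # durability point chosen by the application
                syncs += 1
                pending = 0
        if pending:
            os.fsync(fd)  # flush the final, incomplete batch
            syncs += 1
    finally:
        os.close(fd)
    return syncs
```

Records written since the last fsync() can still be lost on a crash. That
trade-off is made explicitly by the application here, rather than by the
layer below silently ignoring the flush requests it receives.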
