Hi,
I want to come back to the discussion from Oct. 2016 on tech@ about a
disk cache flush ioctl.
The problem we want to solve is that in case of power loss, there should
be no data loss on partitions that are either mounted read-only or are
unmounted. This should also be true if such a partition was previously
mounted r/w (this is what makes this difficult).
The most common situation where this is an issue is a device like a home
router, which has its root fs on a ramdisk and only stores
configuration data on the disk. When the configuration is changed, the r/o
mount is changed to r/w, the data is written, and then the mount is
changed to r/o again.
So, the proposal was to
* add a DIOCCACHESYNC ioctl that can be used to flush a disk's cache
* add code to the file systems that executes this ioctl when a mount
  is updated from r/w to r/o
* change the various disk devices to do a cache flush whenever a
  writable physio file descriptor is closed on a partition. Right now
  this is only done if the last such file descriptor for a complete
  disk is closed.
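
To illustrate the third point, here is a toy model of the two close
policies. It is only a sketch and not driver code: struct toy_disk,
TOY_NPART and the toy_close_*() functions are made-up names, and a
"flush" is just a counter.

```c
#define TOY_NPART 16	/* hypothetical partition count per disk */

/* Toy model of one disk; this is not real driver state. */
struct toy_disk {
	int writers[TOY_NPART];	/* writable descriptors open per partition */
	int flushes;		/* cache flushes issued so far */
};

/* Number of writable descriptors open on the whole disk. */
static int
toy_total_writers(const struct toy_disk *d)
{
	int i, n = 0;

	for (i = 0; i < TOY_NPART; i++)
		n += d->writers[i];
	return n;
}

/* Current behaviour: flush only when the last writer on the disk closes. */
static void
toy_close_current(struct toy_disk *d, int part)
{
	d->writers[part]--;
	if (toy_total_writers(d) == 0)
		d->flushes++;
}

/* Proposed behaviour: flush on every writable close of a partition. */
static void
toy_close_proposed(struct toy_disk *d, int part)
{
	d->writers[part]--;
	d->flushes++;
}
```

With one writer each on partitions 0 and 3, closing partition 0 issues
no flush under the current policy, because partition 3 is still open
for writing. Under the proposed policy it issues one.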
There was some argument that the cache flush should not be done by the
filesystems but in a central place. The problem with that is that
currently the file systems do not notify anyone if a mount is changed
between r/o and r/w. So it's quite possible that a file system calls
VOP_OPEN() and VOP_CLOSE() with different settings of F_WRITE. Or it
could be that, although the VOP_OPEN() call only had F_READ, the file
system is later changed to r/w. So, if we wanted to go this way, we
would need a new call (VOP_UPDATE?) to change F_WRITE in the flags and
make all file systems use it.
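
The flag mismatch can be shown with a small toy model. This is a sketch
under assumptions, not kernel code: TOY_READ and TOY_WRITE stand in for
F_READ and F_WRITE, toy_update() plays the role of the hypothetical
VOP_UPDATE, and the "cache" is a single dirty bit.

```c
#include <stdbool.h>

#define TOY_READ	0x1	/* stand-in for F_READ */
#define TOY_WRITE	0x2	/* stand-in for F_WRITE */

struct toy_dev {
	int flags;	/* flags the device layer believes are in effect */
	bool dirty;	/* unflushed data sitting in the disk cache */
};

static void
toy_open(struct toy_dev *dv, int flags)
{
	dv->flags = flags;
	dv->dirty = false;
}

/* The file system writes; the device layer learns nothing new. */
static void
toy_write(struct toy_dev *dv)
{
	dv->dirty = true;
}

/* Hypothetical VOP_UPDATE: tell the device layer about new flags. */
static void
toy_update(struct toy_dev *dv, int flags)
{
	dv->flags = flags;
}

/* Central close hook: flushes only if it thinks the open was writable. */
static void
toy_close(struct toy_dev *dv)
{
	if (dv->flags & TOY_WRITE)
		dv->dirty = false;
}

/* Returns true if dirty data is left in the cache after close. */
static bool
toy_scenario(bool use_update)
{
	struct toy_dev dv;

	toy_open(&dv, TOY_READ);		/* mounted r/o */
	if (use_update)				/* remount r/w is announced */
		toy_update(&dv, TOY_READ | TOY_WRITE);
	toy_write(&dv);				/* config data written */
	toy_close(&dv);
	return dv.dirty;
}
```

Without the update call the close hook still sees the read-only flags
and skips the flush, so toy_scenario(false) reports leftover dirty
data. With the update call the flush happens.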
Another issue voiced was the performance impact. I don't think that
unmounting or remounting file systems happens often enough for this to be a
problem. For SCSI it would actually be possible to do a cache flush only
for a single partition, but for ATA/NVMe there is only an API for a cache
flush for the whole disk.
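
For the SCSI case, a per-partition flush would come down to passing the
partition's block range in the SYNCHRONIZE CACHE command. The sketch
below builds a SYNCHRONIZE CACHE (16) CDB for a given start LBA and
block count. The field layout is written down as I understand it from
the SBC command set and should be checked against the spec. The
function name is made up, and no existing driver is claimed to do it
this way.

```c
#include <stdint.h>
#include <string.h>

#define TOY_SYNCHRONIZE_CACHE_16	0x91	/* SBC opcode */

/*
 * Fill in a 16-byte CDB asking the device to flush only the blocks
 * [lba, lba + nblocks).  Assumed layout: byte 0 opcode, bytes 2-9
 * LBA (big-endian), bytes 10-13 number of blocks (big-endian).
 * All other bytes (IMMED bit, group number, control) are left zero.
 */
static void
toy_build_sync_cache16(uint8_t cdb[16], uint64_t lba, uint32_t nblocks)
{
	int i;

	memset(cdb, 0, 16);
	cdb[0] = TOY_SYNCHRONIZE_CACHE_16;
	for (i = 0; i < 8; i++)
		cdb[2 + i] = (uint8_t)(lba >> (56 - 8 * i));
	for (i = 0; i < 4; i++)
		cdb[10 + i] = (uint8_t)(nblocks >> (24 - 8 * i));
}
```

The caller would pass the partition's start offset and size in blocks,
taken from the disklabel.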
I will re-send the patches that I have in separate mails.
Are there any other ideas on how to go forward with this?
Cheers,
Stefan