On 05/11/2010 06:40 PM, Anthony Liguori wrote:
But if the goal is to make sure that fsync's don't result in data
actually being on disk, there are many other ways to accomplish this.
First, for the vast majority of users, this is already the case
because ext3 defaults to disabling barriers. Alex is running into
this issue only because SuSE enables barriers by default for ext3 and
fsync()'s are horribly slow on ext3. The fact that this is measurable
is purely a statement of ext3 suckage and a conscious decision by the
SuSE team to live with that suckage. It shouldn't be nearly as bad on
ext4.
There is a huge difference between disabling barriers and cache=volatile.
With barrier=0, data is still forced out of host pagecache and into disk
cache; it's simply not forced from disk cache to the platter. But since
the disk cache is very limited, you'll still be running at disk speed.
With cache=volatile and enough memory you'll never wait for the disk to
write data.
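
As a rough illustration of why waiting for the disk matters, here is a
minimal Python sketch that times a run of small writes with and without
an fsync() after each one. It is not from this thread: the helper name
timed_writes and the file name probe.img are made up, and the numbers it
prints depend entirely on the host filesystem, its barrier setting, and
the disk.

```python
import os
import tempfile
import time


def timed_writes(path, count=50, size=4096, sync=False):
    """Write `count` blocks of `size` bytes, optionally fsync()ing each one.

    Returns the elapsed wall-clock time in seconds. Hypothetical helper,
    for illustration only.
    """
    block = b"\0" * size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
            if sync:
                # Ask the kernel to push this file's dirty pages out of the
                # host page cache; how far they go depends on barriers.
                os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "probe.img")
        cached = timed_writes(path, sync=False)
        synced = timed_writes(path, sync=True)
        print(f"page cache only: {cached:.4f}s, fsync per write: {synced:.4f}s")
```

The sync=False run only ever touches the host page cache, which is
roughly what the guest would see with cache=volatile and enough memory.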
On the other hand, the cost of adding another caching option is pretty
significant. Very few users really understand what caching options
they should use under what circumstance. Adding another option will
just make that worse. The fact that this new option can result in
data corruption that won't occur under normal circumstances is troubling.
I don't think it's a real problem with proper naming, but we can always
hide the option behind a ./configure --enable-developer-options.
Mounting a filesystem with barrier=0 is not a good answer because it's a
global setting. While my qemu VM may be disposable, it's unlikely that
the same is true of the rest of my machine.
You can have multiple mounts. In fact, you can just make a loopback
mount within your existing file system which means that you don't even
have to muck with your disk.
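
The loopback approach could look roughly like the following Python
sketch, which only builds and prints the commands rather than running
them (they need root). The image path, mount point, and size are
assumptions for illustration, and whether barrier=0 on the inner
filesystem behaves as hoped depends on the kernel and the loop driver.

```python
import shlex


def loopback_mount_commands(image, mountpoint, size_mb=1024):
    """Return argv lists for a barrier-free ext3 loopback mount.

    Nothing is executed here. Hypothetical helper, for illustration only.
    """
    return [
        # Create a file of the requested size to back the filesystem.
        ["dd", "if=/dev/zero", f"of={image}", "bs=1M", f"count={size_mb}"],
        # -F lets mkfs work on a regular file instead of a block device.
        ["mkfs.ext3", "-F", "-q", image],
        # Mount the image via a loop device with barriers disabled.
        ["mount", "-o", "loop,barrier=0", image, mountpoint],
    ]


if __name__ == "__main__":
    for argv in loopback_mount_commands("/var/tmp/scratch.img", "/mnt/scratch"):
        print(shlex.join(argv))
```

Only images stored under that mount point lose barriers; the rest of the
host keeps its normal mount options.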
If your VM is disposable, then shouldn't you be using -snapshot?
For my use case (autotest) the VM is not disposable (it's reused between
tests) but I don't care about the data in case of a host crash.
By your argument linux shouldn't be allowing me to do that in the first
place because a dumb sysadmin could use that option on the filesystem
containing the mail store. In fact with the average user that's *likely*
to happen because they'll only have a single partition on their machine.
We aren't preventing sophisticated users from doing sophisticated
things. But why should we simplify something that most people don't
need and if they accidentally use it, bad things will happen?
We don't have a real alternative.
--
Do not meddle in the internals of kernels, for they are subtle and quick to
panic.