Xen wrote:

<quote> So there are two different cases as mentioned: existing block writes, 
 and new block writes. What I was gabbing about earlier would be forcing 
 a filesystem to also be able to distinguish between them. You would 
 have a filesystem-level "no extend" mode or "no allocate" mode that gets 
 triggered. Initially my thought was to have this get triggered through 
 the FS-LVM interface. But, it could also be made operational not through 
 any membrane but simply by having a kernel (module) that gets passed 
 this information. In both cases the idea is to say: the filesystem can 
 do what it wants with existing blocks, but it cannot get new ones.
</quote>

You still have no earthly clue how the various layers work, apparently. For the 
FS to "know" which of its blocks can be scribbled on and which can't, it has to 
poll the block layer (the next layer down is NOT necessarily LVM) on every 
write. Goodbye performance.

<quote>
 However, it does mean the filesystem must know the 'hidden geometry' 
 beneath its own blocks, so that it can know about stuff that won't work 
 anymore.
</quote>

I'm pretty sure this was explained to you a couple of weeks ago: it's called 
"integration". For 50 years filesystems were DELIBERATELY written to be 
agnostic, if not outright ignorant, of the underlying block device's 
peculiarities. That's how modular software is written. Sure, some optimizations 
have been made by peeking into attributes exposed by the block layer, but those 
attributes don't change over time. They are probed at newfs() time and never 
consulted again.
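
To make the "probed once" point concrete, here is a rough sketch (Python, Linux 
only) of such a one-time probe. The file names are the usual queue attributes 
under /sys/block/<dev>/queue/; the function name probe_topology and its 
sysfs_root parameter are made up for illustration and are not taken from any 
mkfs tool.

```python
import os

# Topology hints the block layer exposes under /sys/block/<dev>/queue/.
TOPOLOGY_ATTRS = (
    "logical_block_size",
    "physical_block_size",
    "minimum_io_size",
    "optimal_io_size",
)


def probe_topology(dev, sysfs_root="/sys/block"):
    """Read the topology hints once, the way a mkfs-style tool might."""
    queue_dir = os.path.join(sysfs_root, dev, "queue")
    topology = {}
    for attr in TOPOLOGY_ATTRS:
        try:
            with open(os.path.join(queue_dir, attr)) as f:
                topology[attr] = int(f.read().strip())
        except (OSError, ValueError):
            topology[attr] = None  # attribute missing or unreadable
    return topology
```

Something like probe_topology("sda") would be called once when the filesystem 
is created. Nothing in the write path ever goes back to ask again.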

Chafing at the inherent tradeoffs caused by "lack of knowledge" was why BTRFS 
and ZFS were written. It is ignorant to keep pounding the "but I want 
XFS/EXT+LVM to have feature parity with BTRFS" drum. It's not supposed to, it 
was never intended to, and it will never happen. So go use the tool as it's 
designed or go use something else that tickles your fancy.
 
<quote>
 Will mention that I still haven't tested --errorwhenfull yet.
</quote>

But you conveniently overlook the fact that the FS is NOT remotely full 
according to any of the standard tools - all of a sudden the FS got signaled 
that the block layer was denying write BIO calls. Maybe there's a helpful 
kern.err in syslog that you wrote support for? 
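
For reference, the "standard tools" boil down to statvfs(). A minimal sketch in 
Python (fs_free_bytes is a name invented here):

```python
import os


def fs_free_bytes(path):
    """Free space as the filesystem itself accounts for it (what df reports)."""
    st = os.statvfs(path)
    # f_bavail: blocks available to unprivileged users; f_frsize: block size.
    return st.f_bavail * st.f_frsize
```

This number reflects only the filesystem's own bookkeeping. On a thin volume it 
says nothing about how much space is left in the pool underneath, which is why 
the FS can look far from full while writes are already being refused.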

<quote>
 In principle if you had the means to acquire such a flag/state/condition, and 
 the filesystem would be able to block new allocation wherever whenever, you 
 would already have a working system. So what is then non-trivial?
...
 It seems completely obvious to me at this point that, if anything from 
 LVM (or e.g. dmeventd) could signal every filesystem on every affected
 thin volume, to enter a do-not-allocate state, and filesystems would be 
 able to fail writes based on that, you would already have a solution
</quote>

And so, in order to acquire this "signal", every write has to be done 
synchronously, making sure strict data integrity is maintained vis-a-vis 
filesystem data and metadata. The kernel's dirty block size and flush intervals 
are knobs that can be turned to "signal" user-land that write errors are 
happening. There's no such thing as "immediate" unless you use synchronous 
function calls from userland.
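
A minimal sketch of what "synchronous from userland" means in practice 
(Python; write_and_flush is an invented name, and returning the errno name is 
just one way to report the failure):

```python
import errno
import os


def write_and_flush(path, data):
    """Write data and force it to the device; return None or an errno name."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # A buffered write typically only fills the page cache.
            written = os.write(fd, view)
            view = view[written:]
        # fsync() blocks until the data has been handed to the device, so
        # block-layer failures (e.g. EIO, ENOSPC) are typically reported here.
        os.fsync(fd)
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    finally:
        os.close(fd)
    return None
```

Without the fsync() the write would normally return success, and any error 
would only show up later, whenever the kernel got around to flushing the dirty 
pages.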

If you want to write your application to handle "mis-behaved" block layers, 
then use O_DIRECT+O_SYNC.
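
A rough sketch of that approach, in Python on Linux. The 4096-byte alignment is 
an assumption (a real program should query the device), direct_sync_write is an 
invented name, and not every filesystem accepts O_DIRECT, in which case the 
open fails with EINVAL.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; query the device in real code


def direct_sync_write(path, data):
    """Write data bypassing the page cache; raises OSError on failure."""
    # O_DIRECT needs an aligned buffer and a length that is a block multiple.
    padded_len = (-(-len(data) // BLOCK) * BLOCK) or BLOCK
    buf = mmap.mmap(-1, padded_len)  # anonymous mappings are page-aligned
    buf.write(data)                  # the rest of the buffer stays zeroed
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DIRECT | os.O_SYNC
    try:
        fd = os.open(path, flags, 0o644)
        try:
            # With O_SYNC the call returns only after the device took the data.
            written = os.write(fd, buf)
            os.ftruncate(fd, len(data))  # drop the alignment padding
        finally:
            os.close(fd)
    finally:
        buf.close()
    return written
```

The point is that a refused write surfaces as an OSError from the write call 
itself, so the application can react right there instead of finding out at 
some later flush.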
