> written as required. If the file system has particular areas
> of importance that need to be writable to prevent file
> system failure, perhaps the file system should have a way of
> communicating this to the volume layer. The naive approach
> here might be to preallocate these critical blocks before
> proceeding with any updates to these blocks, such that the
> failure situations can all be "safe" situations,
> where ENOSPC can be returned without a danger of the file
> system locking up or going read-only.
Why, all of a sudden, does each and every FS need added code to second-guess
the block layer? The quickest solution is to mount the FS in sync mode and pay
the performance piper. It's still not likely to be bulletproof, but it's a sure
step closer.
What you're saying is that when a file system is mounted, the block layer needs
to expose a "thin-mode" attribute (or the sysadmin sets such a flag via
tune2fs), analogous to the way mke2fs can "detect" LVM RAID geometry (does that
actually work reliably?).
Then there has to be code in every FS block de-stage path:
IF thin {
    tickle the block layer to allocate the block
        (aka write zeros to it? What about pre-existing data? Is there a
        "fake write" BIO call that does everything but actually write data
        to a block, yet would still trigger LVM thin's extent allocation
        logic?)
    IF success
        de-stage the dirty block to the block layer
    ELSE
        inform userland of ENOSPC
}
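
To make the control flow above concrete, here is a minimal toy model in Python.
It is only a sketch under stated assumptions: ToyThinPool, provision() and
destage() are hypothetical names invented for illustration, not the dm-thin or
any file system API, and "provisioning" here is just bookkeeping rather than a
real BIO.

```python
import errno


class ToyThinPool:
    """Hypothetical stand-in for a thin pool: a fixed number of physical
    extents, handed out the first time a virtual block is touched."""

    def __init__(self, total_extents):
        self.free = total_extents
        self.mapped = set()  # virtual blocks that already have backing

    def provision(self, block):
        """Make sure `block` has backing. Return 0, or -ENOSPC if the pool
        has no free extent left for a not-yet-mapped block."""
        if block in self.mapped:
            return 0  # already allocated, nothing more to reserve
        if self.free == 0:
            return -errno.ENOSPC
        self.free -= 1
        self.mapped.add(block)
        return 0


def destage(pool, dirty_blocks):
    """Provision each dirty block before writing it. Return two lists:
    blocks that were written and blocks refused with ENOSPC."""
    written, refused = [], []
    for block in dirty_blocks:
        if pool.provision(block) != 0:
            refused.append(block)  # report ENOSPC instead of writing
            continue
        written.append(block)  # the real write would be issued here
    return written, refused
```

With a two-extent pool, destage(pool, [10, 11, 12]) writes blocks 10 and 11 and
refuses 12, while a later rewrite of block 10 still succeeds because its extent
is already mapped.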
In a fully journaled FS (metadata AND data) the journal could be "pinned", and
likewise the main metadata areas, if for no other reason than that they are
zeroed at the outset and/or constantly being written to. Once an extent has
been written to, LVM thin isn't going to go back and yank it away.
This should at least maintain FS integrity, albeit you may end up in a situation
where the journal can never be properly de-staged, so you're stuck on any
further writes and need to force RO.
> just want a sanely behaving LVM + XFS...)
IMO if the system admin made a conscious decision to use thin AND overprovision
(thin by itself is not dangerous), it's up to HIM to actively manage his block
layer. Even on million-dollar SANs the expectation is that the engineer will do
his job and not drop the mic and walk away. Maybe the "easiest" implementation
would be an MD layer job that the admin can tailor to fail all allocation
requests once the free extent count drops below a set number, thus forcing
every FS mounted on the thin pool into RO mode.
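
As a rough illustration of that kind of admin-tunable watermark, here is a toy
Python model. It is an assumption-laden sketch, not an existing LVM or MD
feature: WatermarkPool, min_free and read_only are hypothetical names, and the
"RO mode" is just a flag that makes every later allocation fail.

```python
import errno


class WatermarkPool:
    """Hypothetical thin pool that refuses new allocations once the free
    extent count drops below an admin-chosen watermark."""

    def __init__(self, total_extents, min_free):
        self.free = total_extents
        self.min_free = min_free  # admin-tunable threshold
        self.read_only = False    # set once the watermark has been crossed

    def allocate(self):
        """Hand out one extent, or return -ENOSPC once free extents have
        dropped below the watermark (and on every call after that)."""
        if self.read_only or self.free < self.min_free:
            self.read_only = True
            return -errno.ENOSPC
        self.free -= 1
        return 0
```

With total_extents=5 and min_free=3, the first three allocations succeed and
the fourth fails with two extents still in reserve, which is the point: the
pool stops handing out space before it is truly empty.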
But in any event it won't prevent irate users from demanding to know why the
space they appear to have isn't actually there.
_______________________________________________
linux-lvm mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/