Xen wrote:

> I didn't know thin (or LVM) doesn't maintain maps of used blocks.

Right, so you're ignorant of basics like how the various subsystems work. Like 
I said, go find a text on OS and filesystem design. Hell, read the EXT and LVM 
code or even just the design docs.

> The recent DISCARD improvements apparently just signal some special case 
> (?) but SSDs DO maintain maps or it wouldn't even work (?).

Again, read up on the inner workings of SSDs. To over-simplify, SSDs have their 
own "LVM". That is really no different from what a hardware RAID controller 
does - admittedly most RAID controllers don't do anything particularly advanced.

> I don't know, it would seem that having a map of used extents in a thin 
> pool is in some way deeply important in being able to allocate unused 
> ones?

Clearly you are in need of much more studying. Out of all of its defined 
extents, LVM knows exactly which ones are free and which ones have been 
assigned to an LV - aka written to. Which individual blocks (aka ranges of 
bytes) inside those extents hold FS-managed data it neither knows nor cares.
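To make the extent-versus-block distinction concrete, here is a minimal toy 
model. It is not LVM's actual data structure; the class ToyPool, its methods 
and the 4 MiB extent size are all made up for illustration. The point is only 
that the map records an owner per extent and nothing about the bytes inside it.

```python
# Hypothetical toy model, NOT LVM code: ownership is tracked per extent only.
class ToyPool:
    def __init__(self, n_extents, extent_size):
        self.extent_size = extent_size      # bytes per extent (assumed value)
        self.owner = [None] * n_extents     # None means the extent is free

    def free_extents(self):
        """Return the indices of extents not assigned to any LV."""
        return [i for i, o in enumerate(self.owner) if o is None]

    def allocate(self, lv_name):
        """Assign the first free extent to lv_name, or fail if none is left."""
        for i, o in enumerate(self.owner):
            if o is None:
                self.owner[i] = lv_name
                return i
        raise RuntimeError("pool exhausted")


pool = ToyPool(n_extents=4, extent_size=4 * 1024 * 1024)
pool.allocate("lv_home")
pool.allocate("lv_var")
print(pool.free_extents())   # [2, 3]
# Note there is no per-block information anywhere: which bytes inside
# extent 0 the filesystem considers "used" is invisible to the pool.
```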
 
> I guess continuous polling would be deeply disrespectful of the hardware 
> and software resources.

Not to mention instantaneously invalid. So you poll LVM, "what is your 
allocation map and do you have any free extents?" You get the results. Then the 
FS, having been assured there is free space, issues writes. But oh no, in the 
round-trip some other LV has grabbed the extent you had intended to use! 
IO=FAIL.
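The poll-then-write race above can be sketched in a few lines. This is a 
deterministic simulation with invented names (RacyPool, has_free, 
write_new_extent), not any real LVM or kernel interface; it only shows why the 
answer to the poll is stale by the time the write arrives.

```python
# Hypothetical simulation of the check-then-use race; not a real LVM API.
class RacyPool:
    def __init__(self, free):
        self.free = free                    # number of unassigned extents

    def has_free(self):
        return self.free > 0

    def write_new_extent(self, lv_name):
        """A write that needs a fresh extent; fails if the pool is empty."""
        if self.free == 0:
            raise OSError("no free extent for " + lv_name)
        self.free -= 1


pool = RacyPool(free=1)

polled = pool.has_free()            # filesystem on lv_a asks: any space?
pool.write_new_extent("lv_b")       # meanwhile lv_b takes the last extent

try:
    pool.write_new_extent("lv_a")   # lv_a now acts on the stale answer
    result = "IO=OK"
except OSError:
    result = "IO=FAIL"

print(polled, result)               # True IO=FAIL
```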

The ONLY way for a FS to "reserve" a set of blocks (aka extent) for itself is 
to write to it - but mind, the FS has NO IDEA whether it needs to make a 
reservation in the first place, nor whether this IO just so happens to fit 
inside the allocated range while the next IO at offset +1 will require a new 
extent to be allocated from the THINP.
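The "offset +1" point is just integer arithmetic. A small sketch, assuming a 
64 KiB extent (chunk) size purely for illustration; extents_touched is a 
made-up helper, not something the FS or LVM exposes:

```python
EXTENT_SIZE = 64 * 1024   # assumed size, for illustration only


def extents_touched(offset, length, extent_size=EXTENT_SIZE):
    """Return the extent indices covered by an IO of `length` bytes at `offset`."""
    first = offset // extent_size
    last = (offset + length - 1) // extent_size
    return list(range(first, last + 1))


# A 4 KiB write that ends exactly at the end of extent 0.
inside = extents_touched(EXTENT_SIZE - 4096, 4096)

# The same write shifted by one byte spills into extent 1, which may
# not be allocated yet.
spills = extents_touched(EXTENT_SIZE - 4096 + 1, 4096)

print(inside, spills)     # [0] [0, 1]
```

The filesystem sees only the offsets; it has no way of telling from them which 
of the two writes triggers a new allocation underneath.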

I haven't checked, but it's perfectly possible for LVM THINP to respond to 
FS-issued DISCARD notices and thus build an allocation map of an extent, and, 
should an extent become fully empty, to return it to the thin pool - only to 
have to allocate a new extent if any IO hits the same block range in the 
future. This kind of extent churn is probably not very useful unless your 
workload is in the habit of writing tons of data, freeing it, waiting a 
reasonable amount of time and potentially doing it again. SSDs resort to it 
because they must - it's the nature of the silicon device itself.
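As a sketch of the churn described above: the toy below tracks live blocks per 
extent from writes and discards, and releases an extent once it is empty. As 
said, I haven't checked what THINP really does, so treat this as an 
illustration of the idea only; ToyThinLV and its counters are invented.

```python
# Hypothetical model of discard-driven extent release; not actual THINP logic.
class ToyThinLV:
    def __init__(self, blocks_per_extent):
        self.bpe = blocks_per_extent
        self.extents = {}        # extent index -> set of live block offsets
        self.allocations = 0     # how many times an extent was taken from the pool
        self.releases = 0        # how many times an extent was handed back

    def write(self, block):
        idx, off = divmod(block, self.bpe)
        if idx not in self.extents:          # first touch: allocate an extent
            self.extents[idx] = set()
            self.allocations += 1
        self.extents[idx].add(off)

    def discard(self, block):
        idx, off = divmod(block, self.bpe)
        live = self.extents.get(idx)
        if live is None:
            return                           # nothing mapped there
        live.discard(off)
        if not live:                         # extent fully empty: give it back
            del self.extents[idx]
            self.releases += 1


lv = ToyThinLV(blocks_per_extent=4)
for _ in range(3):               # write, free, write again to the same block
    lv.write(5)
    lv.discard(5)

print(lv.allocations, lv.releases)   # 3 3
```

Three write/discard cycles on a single block cost three allocations and three 
releases, which is the churn in question.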

> It would say to a filesystem: these regions are currently unavailable.
>
> You would even get more flags:
> 
> - this region is entirely unavailable
> - this region is now more expensive to allocate to
> - this region is the preferred place

All of this "inside knowledge" and "coordination" you so desperately seem to 
want is called integration. And again, it is spelled BTRFS, ZFS, et al. 
 
> In the theoretical system I proposed it would be a constant

Yeah, have fun with that theoretical system.

...

Xen, dude seriously. Go do a LOT more reading.

_______________________________________________
linux-lvm mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/