I'm one of many end users with highly dedupable pools that are held back by
DDT and spacemap read/write inefficiencies. There have been discussions and
presentations: Matt Ahrens' talk at BSDCan 2016 ("Dedup doesn't have to
suck") was especially useful, and allocation classes from the ZoL/ZoF work
will allow metadata-specific offload to SSD. But broad discussion of this
general area is not on the roadmap at the moment, probably because so much
else is a priority and it seems nobody has stepped up.
Speedy dedup would be a killer feature for OpenZFS on pools with appropriate
data (VMs, and similar full backups of multiple other systems, being
examples). Frankly, compression saves some modest percentage, whereas
dedup-appropriate pools can easily hold several times more data, a 4-10x
gain.
1. Has there been any discussion of putting this on the radar? If not, can
it be discussed and the value for the effort needed be assessed?
2. As (1) is likely to take some time, if it happens at all, are there any
readily identifiable smaller mitigations that could have a strong or
helpful effect and could fit on the radar sooner?
Examples: spacemaps/metaslabs have tunables to preload them on pool
mount/import ("debug_load"), to prevent eviction ("debug_unload"), and to
change their block size from 4k to a more efficient size (or to allow
experimenting). DDT has none of these. Any of these, created similarly for
DDT, would probably be a very helpful mitigation, or would at least allow
testing. Can any of these be put on the radar sooner as an interim help,
especially if they are not too complex?
3. Also efficiency related: there's no easy way to get stats on *what*
metadata is in ARC, or loaded/evicted/hit/missed, by subtype. This could be
key information for tuning a system or pool. Right now the only stats are
for combined metadata as a whole. That makes it hard for end users/admins to
diagnose what's lacking or slow (is it DDT or spacemaps doing those
loads/saves?), and to trial remedial settings or changes. Could a breakdown
of ARC stats into subtypes (spacemap/DDT/other metadata?) be helpful here?
Is it something worth fitting on the radar?
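To illustrate point 3: a minimal Python sketch, not an existing tool, of
what can be read today. It assumes the Linux-style "name type data" text
layout of /proc/spl/kstat/zfs/arcstats (on FreeBSD the same counters appear
under the kstat.zfs.misc.arcstats sysctl tree) and the existing counters
demand_metadata_hits, demand_metadata_misses, prefetch_metadata_hits and
prefetch_metadata_misses. The helper names parse_arcstats and
metadata_summary are hypothetical, and the sample numbers are made up purely
for illustration.

```python
# Sketch only: summarise the combined metadata counters from arcstats text.
# Assumption: Linux-style "name type data" rows; helper names are hypothetical.

def parse_arcstats(text):
    """Return {counter_name: int} parsed from arcstats-style text."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields with a numeric last field;
        # this skips the two header lines at the top of the kstat file.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def metadata_summary(stats):
    """Combine demand and prefetch metadata counters into one hit ratio."""
    hits = (stats.get("demand_metadata_hits", 0)
            + stats.get("prefetch_metadata_hits", 0))
    misses = (stats.get("demand_metadata_misses", 0)
              + stats.get("prefetch_metadata_misses", 0))
    total = hits + misses
    # No ratio can be reported when no metadata accesses were counted.
    ratio = hits / total if total else None
    return {"hits": hits, "misses": misses, "hit_ratio": ratio}


# Made-up sample values, for illustration only.
SAMPLE = """\
13 1 0x01 96 26112 8593475283 1131862536178
name                            type data
demand_metadata_hits            4    900
demand_metadata_misses          4    100
prefetch_metadata_hits          4    50
prefetch_metadata_misses        4    50
"""

if __name__ == "__main__":
    print(metadata_summary(parse_arcstats(SAMPLE)))
```

On a live Linux system the text would come from reading
/proc/spl/kstat/zfs/arcstats instead of SAMPLE. Either way the result is one
combined metadata figure, with nothing to say whether DDT or spacemap blocks
account for the misses.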
Thank you for discussion and any support or stepping up!
(Unfortunately I'm unable to do much more than raise this on the list. My
own knowledge is a grossly inadequate starting point for contemplating the
work myself, but I'm hoping others will find these worth a look.)
Stilez.
------------------------------------------
openzfs: openzfs-developer
Permalink:
https://openzfs.topicbox.com/groups/developer/Td9c7189186fd24f2-M4834c1708d942fe022e70f6f