I'm one of many end users with highly dedupable pools held back by DDT and spacemap read/write inefficiencies. There have been discussions and presentations - Matt Ahrens' talk at BSDCan 2016 ("Dedup doesn't have to suck") was especially useful, and allocation classes from the ZoL/ZoF work will allow metadata-specific offload to SSD. But broad discussion of this general area is not on the roadmap at the moment, probably because so much else is a priority and it seems nobody has stepped up.

Speedy dedup would be a killer feature for OpenZFS on pools with appropriate data (VMs and similar full backups of multiple other systems being examples). Frankly, compression saves some modest percentage, whereas dedup-appropriate pools can easily gain several times more, in the region of 4 - 10x.


1. Has there been any discussion of putting this on the radar? If not, can it be discussed, and its value for the effort needed assessed?


2. As (1) is likely to take some time if it happens at all, are there any readily identifiable smaller mitigations that could have a useful effect and fit on the radar sooner?


Examples - spacemaps/metaslabs have tunables to preload them on pool mount/import ("debug_load"), to prevent eviction ("debug_unload"), and to change their block size from 4k to a more efficient size (or to allow experimenting). DDT has none of these. Any of them, created similarly for DDT, would probably be a very helpful mitigation, or at least allow testing. Can any of these be put on the radar sooner as an interim help, especially if not too complex?
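For anyone wanting to check what their own system currently has set for the two metaslab tunables above, here is a minimal sketch. It assumes a Linux OpenZFS system, where module parameters show up as files under /sys/module/zfs/parameters and the two tunables are named metaslab_debug_load and metaslab_debug_unload; on FreeBSD they are sysctls instead, so the path would not apply there. The helper name read_tunables is made up for illustration. It only reads values and changes nothing.

```python
from pathlib import Path

# Assumption: Linux OpenZFS exposes module parameters as files in this directory.
# FreeBSD exposes the equivalent settings through sysctl instead.
PARAM_DIR = Path("/sys/module/zfs/parameters")

# The two metaslab tunables mentioned above, under their Linux parameter names.
METASLAB_TUNABLES = ("metaslab_debug_load", "metaslab_debug_unload")


def read_tunables(names, param_dir=PARAM_DIR):
    """Return {name: value} for each tunable; value is None if it cannot be read."""
    result = {}
    for name in names:
        path = Path(param_dir) / name
        try:
            result[name] = path.read_text().strip()
        except OSError:
            # Missing file (module not loaded, other platform) or no permission.
            result[name] = None
    return result


if __name__ == "__main__":
    for name, value in read_tunables(METASLAB_TUNABLES).items():
        shown = value if value is not None else "(not available)"
        print(f"{name} = {shown}")
```

There is nothing comparable to read for DDT, which is the gap this point is about.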


3. Also efficiency related: there's no easy way to get stats on *what* metadata is in ARC, or loaded/evicted/hit/miss, by subtype. This could be key info for tuning a system or pool. Right now the only stats are for metadata as a combined whole. That makes it hard for end users/admins to diagnose what's lacking or slow (is it DDT or spacemaps doing those loads/saves?), and to trial remedial settings or changes. Could a breakdown of ARC stats into subtypes (spacemap/DDT/other metadata?) be helpful here? Is it something worth fitting on the radar?
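To illustrate what is visible today, here is a rough sketch that pulls the metadata-related counters out of the ARC kstats. It assumes a Linux system where the stats are at /proc/spl/kstat/zfs/arcstats, laid out as two header lines followed by "name type data" rows; FreeBSD exposes the same counters through sysctl instead. The function names are made up for illustration, and the exact set of counters varies by OpenZFS version.

```python
from pathlib import Path

# Assumption: Linux kstat location for the ARC statistics.
ARCSTATS = Path("/proc/spl/kstat/zfs/arcstats")


def parse_arcstats(text):
    """Parse 'name type data' rows into {name: int}, skipping the header lines."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields with a numeric type and value.
        if len(parts) == 3 and parts[1].isdigit() and parts[2].lstrip("-").isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def metadata_counters(stats):
    """Keep only the counters whose name mentions metadata."""
    return {name: value for name, value in stats.items() if "meta" in name}


if __name__ == "__main__":
    if ARCSTATS.exists():
        counters = metadata_counters(parse_arcstats(ARCSTATS.read_text()))
        for name, value in sorted(counters.items()):
            print(f"{name:40s} {value}")
    else:
        print(f"{ARCSTATS} not found (not Linux, or zfs module not loaded)")
```

Whatever this prints covers metadata as a whole; nothing in it says how much is DDT and how much is spacemap, which is the breakdown being asked for.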






Thank you for any discussion, support, or stepping up!


(Unfortunately I'm unable to do much more than raise this on the list. My own knowledge is a grossly inadequate starting point for contemplating it, but I'm hoping others will find these worth a look.)


Stilez.


------------------------------------------
openzfs: openzfs-developer
Permalink: 
https://openzfs.topicbox.com/groups/developer/Td9c7189186fd24f2-M4834c1708d942fe022e70f6f