On Jun 15, 2020, at 1:07 PM, Pawel Jakub Dawidek <pa...@dawidek.net> wrote:
> Exactly. Plus on-disk BRT entry size is 24 bytes and in-memory is 80
> bytes for now (vs. 392 bytes of DDT entry). Smaller structure sizes just
> delay the problem, of course, but sorting can be a big win.
>
> Also note that this table only grows when you explicitly clone a block
> and not for every block as in dedup case.
>
> In addition to that when you move a file between datasets, the BRT
> entries are just created temporarily, as we create them to create
> destination copy, but remove them when we remove the old copy.
>
> All in all I'd expect this table to be much, much smaller than DDT.
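
To put the quoted entry sizes in perspective, here is a rough back-of-the-envelope sketch in Python. The 24, 80 and 392 byte figures come from the message above; the entry counts (one million and one billion) are only illustrative, and the quote does not say whether 392 bytes is the on-disk or in-memory DDT size, so it is treated simply as "one DDT entry". Real memory use would also include tree and allocator overhead, which is ignored here.

```python
# Per-entry sizes in bytes, as quoted in the thread.
BRT_ONDISK = 24
BRT_INCORE = 80
DDT_ENTRY = 392  # quoted only as "392 bytes of DDT entry"


def table_bytes(entries: int, entry_size: int) -> int:
    """Raw size of a table with `entries` fixed-size entries (no overhead)."""
    return entries * entry_size


def human(nbytes: int) -> str:
    """Format a byte count using binary units."""
    value = float(nbytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if value < 1024 or unit == "TiB":
            return f"{value:.1f} {unit}"
        value /= 1024


# Illustrative entry counts only; not measurements from any real pool.
for entries in (10**6, 10**9):
    print(f"{entries:>13,} entries:",
          "BRT on-disk", human(table_bytes(entries, BRT_ONDISK)), "|",
          "BRT in-memory", human(table_bytes(entries, BRT_INCORE)), "|",
          "DDT entry", human(table_bytes(entries, DDT_ENTRY)))
```

At a billion entries even the 80-byte in-memory figure adds up to tens of GiB, which is the "smaller sizes just delay the problem" point.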
Yes, I agree it would be smaller than the DDT, but given that it can be used for offline dedup, you need to plan for it having as many entries as a DDT has refcnt>1 entries, which could be millions or billions.

This means that the options are:
• Rewrite the block pointer. (ha ha!)
• On every free, check against the BRT to see if there is a refcount entry.
• Create a new block pointer, with a flag set, and copy the block if it’s the first duplication. This leaves the old one around, so there is no advantage to this approach if it is done infrequently.

Did I miss any there?

Sean.

------------------------------------------
openzfs: openzfs-developer
Permalink: https://openzfs.topicbox.com/groups/developer/Te62797341aee0806-Mf517e8f11a1772d73548f05c
Delivery options: https://openzfs.topicbox.com/groups/developer/subscription
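
For illustration, the "check against the BRT on every free" option could look roughly like the toy model below. This is a sketch under assumptions, not the actual OpenZFS code: `ToyBRT`, its methods, and the choice to store only the extra references beyond the first are all hypothetical, and a plain dict stands in for whatever on-disk and in-memory structure the real table uses.

```python
class ToyBRT:
    """Hypothetical stand-in for a block reference table.

    Only cloned blocks have an entry; the value is the number of
    extra references beyond the original one (an assumption made
    for this sketch).
    """

    def __init__(self):
        self.extra_refs = {}  # block address -> extra reference count
        self.released = []    # addresses whose space was really freed

    def clone(self, addr):
        # The table grows only when a block is explicitly cloned.
        self.extra_refs[addr] = self.extra_refs.get(addr, 0) + 1

    def free(self, addr):
        """Return True if the block's space was actually released."""
        refs = self.extra_refs.get(addr)  # lookup paid on every free
        if refs is None:
            # Never cloned, or all clones already dropped: really free it.
            self.released.append(addr)
            return True
        if refs == 1:
            del self.extra_refs[addr]     # last extra reference is gone
        else:
            self.extra_refs[addr] = refs - 1
        return False


brt = ToyBRT()
brt.clone(0x1000)        # e.g. a file moved between datasets
print(brt.free(0x1000))  # old copy removed: entry dropped, block kept
print(brt.free(0x1000))  # last reference removed: block released
print(brt.free(0x2000))  # never cloned, but the lookup still happened
```

The cost being debated is the `get` on every free: it is paid even for blocks that were never cloned, which is why the size and lookup behaviour of the table matter.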