On 6/15/20 09:18, Matthew Ahrens via openzfs-developer wrote:
> However, even so, looking up in the BRT for every single zio_free()
> would be a substantial cost. [...]

After giving it some more thought, I think we could avoid that cost by
leveraging the fact that we operate on offsets within VDEVs.

We could maintain a table of fixed-size regions for each VDEV. Each table
entry is a reference counter. Let's call it the Table of Regions (ToR).

For example, we divide a VDEV into 1GB regions. Each region gets its own
32-bit counter (about 21 bits would be enough, as a 1GB region holds at
most 2^21 512-byte blocks). Every time a _new_ entry shows up in BRT, we
increase the counter in the ToR entry for the region containing this
block. Every time we free a block, we look at ToR first to see if we
should check BRT. If the counter for this region is 0, there are no
entries in BRT for it, thus there is no need to consult BRT, so there is
no additional cost for zio_free().

ToR is extremely small. With 1GB regions and 32-bit counters it takes
4kB (four kilobytes) of RAM per 1TB of top-level VDEV (1024 regions
times 4 bytes).
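
To double-check the arithmetic, here is a small sketch in C. The names
(TOR_REGION_SHIFT, tor_nregions(), tor_size_bytes()) are made up for
illustration and are not existing OpenZFS identifiers; it assumes one
32-bit counter per 1GB region.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant for this sketch: 1GB regions (2^30 bytes). */
#define	TOR_REGION_SHIFT	30
#define	TOR_REGION_SIZE		(1ULL << TOR_REGION_SHIFT)

/* Number of regions needed to cover a VDEV of vdev_size bytes (rounds up). */
static uint64_t
tor_nregions(uint64_t vdev_size)
{
	/* Guard against overflow in the round-up below. */
	assert(vdev_size <= UINT64_MAX - (TOR_REGION_SIZE - 1));
	return ((vdev_size + TOR_REGION_SIZE - 1) >> TOR_REGION_SHIFT);
}

/* RAM needed for the table, assuming one 32-bit counter per region. */
static uint64_t
tor_size_bytes(uint64_t vdev_size)
{
	return (tor_nregions(vdev_size) * sizeof (uint32_t));
}
```

For a 1TB VDEV (1ULL << 40 bytes) tor_nregions() gives 1024 and
tor_size_bytes() gives 4096.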

Note that ToR is only updated when a new entry is added to BRT or when
an entry is removed from BRT. We don't update ToR when we increase the
counter on an existing BRT entry.
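
A rough sketch of the idea in C, to make the update rules concrete. This
is not code from any existing implementation: tor_t and all tor_*()
functions are hypothetical names, locking is left out entirely, and the
1GB region size is just the example value used above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define	TOR_REGION_SHIFT	30	/* hypothetical: 1GB regions */

typedef struct tor {
	uint64_t	tor_nregions;	/* number of regions in this VDEV */
	uint32_t	*tor_counts;	/* block-table entries per region */
} tor_t;

/* Allocate a zeroed table covering a VDEV of vdev_size bytes. */
static tor_t *
tor_create(uint64_t vdev_size)
{
	tor_t *tor = malloc(sizeof (*tor));

	if (tor == NULL)
		return (NULL);
	tor->tor_nregions = (vdev_size + (1ULL << TOR_REGION_SHIFT) - 1) >>
	    TOR_REGION_SHIFT;
	tor->tor_counts = calloc(tor->tor_nregions, sizeof (uint32_t));
	if (tor->tor_counts == NULL) {
		free(tor);
		return (NULL);
	}
	return (tor);
}

static void
tor_destroy(tor_t *tor)
{
	free(tor->tor_counts);
	free(tor);
}

/* Map an offset within the VDEV to its region index. */
static uint64_t
tor_region(const tor_t *tor, uint64_t offset)
{
	uint64_t region = offset >> TOR_REGION_SHIFT;

	assert(region < tor->tor_nregions);
	return (region);
}

/* Call only when a brand new entry is created in the block table. */
static void
tor_entry_added(tor_t *tor, uint64_t offset)
{
	tor->tor_counts[tor_region(tor, offset)]++;
}

/* Call only when an entry is removed from the block table. */
static void
tor_entry_removed(tor_t *tor, uint64_t offset)
{
	uint64_t region = tor_region(tor, offset);

	assert(tor->tor_counts[region] > 0);
	tor->tor_counts[region]--;
}

/*
 * Free path: returns false if the region has no entries at all, in which
 * case the block table lookup can be skipped.
 */
static bool
tor_may_have_entry(const tor_t *tor, uint64_t offset)
{
	return (tor->tor_counts[tor_region(tor, offset)] != 0);
}
```

The free path would call tor_may_have_entry() first and fall through to
the real table lookup only when it returns true. Bumping the reference
count on an already existing entry calls neither tor_entry_added() nor
tor_entry_removed().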

-- 
Pawel Jakub Dawidek
