On Thu, Dec 1, 2016 at 1:21 AM, Andriy Gapon <a...@freebsd.org> wrote:
> get_clones_stat() could be very slow if a snapshot has many (thousands)
> clones.  There seem to be two factors that contribute to that.
>
> Clone names are added to an nvlist that's created with NV_UNIQUE_NAME.
> So, each time a new name is appended to the list, the whole list is
> searched linearly to see if that name is not already in the list.  That
> results in quadratic complexity.
> That should be easy to fix, as we know in advance that we should not get
> any duplicate names, so we can drop NV_UNIQUE_NAME when creating the
> list.

Sounds good.

> Also, for each clone we call dsl_dataset_hold_obj() ->
> dsl_dir_hold_obj().  If the clone dataset is not yet instantiated, for
> example because the clones are not mounted, then we need to call
> zap_value_search(), which is expensive.  And if all the clones have the
> same parent, then the cost of zap_value_search() is proportional to the
> number of clones.
>
> I wonder if it would make sense to extend the on-disk information with
> reverse-lookup ZAPs, so that we could avoid having to use
> zap_value_search() in the hot paths.  That could also help with issues
> like https://www.illumos.org/issues/7606

Yeah, it would make sense to have a more efficient way to map from a
dataset object to its name.  Of course, the cost is the added complexity
of keeping this in sync with the name->object mapping (e.g. across
rename).  I can imagine several ways of doing this:

1. For each name->object ZAP, add an object->name ZAP.
2. For the entire storage pool, add a single object->name ZAP.
3. For each object, store its name (e.g. in the dsl_dataset_phys_t
   [impractical due to backwards compatibility and name length], or in
   the zapified dataset).
4. Read the existing name->object ZAP once and cache the reverse
   mapping, e.g.
   4a. as a separate data structure, e.g. an AVL or hash table mapping
       object->name;
   4b.
       by forcing all the dsl_dataset_t's it references to be loaded,
       and forcing them to stay in the cache until the pool is
       unloaded.

I think that option 3 is the cleanest architecturally, but option 2
might be more efficient.  Option 4 has the benefit of not requiring an
on-disk format change, which is a big plus.  4a would be pretty
straightforward to implement; it could even be done entirely in the ZAP
(e.g. the first call to zap_value_search() caches the reverse mapping,
hung off the zap_t, and future calls to zap_value_search() use the
cached mapping).

--matt

> --
> Andriy Gapon

-------------------------------------------
openzfs-developer
Archives: https://www.listbox.com/member/archive/274414/=now