[developer] sub-optimal performance of get_clones_stat()

Andriy Gapon Thu, 01 Dec 2016 01:24:43 -0800

get_clones_stat() can be very slow if a snapshot has many (thousands of) clones.
There seem to be two factors that contribute to that.


First, clone names are added to an nvlist that's created with NV_UNIQUE_NAME.
So, each time a new name is appended to the list, the whole list is searched
linearly to check whether that name is already in the list.  That results in
quadratic complexity.
That should be easy to fix: we know in advance that we should not get any
duplicate names, so we can drop NV_UNIQUE_NAME when creating the list.
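
To make the quadratic cost concrete, here is a minimal sketch of a toy name
list that counts string comparisons.  It is not the libnvpair implementation:
toy_list_t, toy_list_add() and toy_count_compares() are hypothetical names,
and the "unique" flag only models the linear duplicate search described above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a name list; NOT the real nvlist data structure. */
typedef struct toy_list {
	char **names;
	size_t count;
	size_t cap;
	int unique;		/* models the effect of NV_UNIQUE_NAME */
	unsigned long compares;	/* string comparisons performed so far */
} toy_list_t;

static void
toy_list_add(toy_list_t *l, const char *name)
{
	if (l->unique) {
		/* Linear scan of everything added so far. */
		for (size_t i = 0; i < l->count; i++) {
			l->compares++;
			if (strcmp(l->names[i], name) == 0)
				return;	/* duplicate: keep existing entry */
		}
	}
	if (l->count == l->cap) {
		size_t ncap = (l->cap == 0) ? 16 : l->cap * 2;
		char **n = realloc(l->names, ncap * sizeof (*n));
		assert(n != NULL);
		l->names = n;
		l->cap = ncap;
	}
	size_t len = strlen(name) + 1;
	char *copy = malloc(len);
	assert(copy != NULL);
	memcpy(copy, name, len);
	l->names[l->count++] = copy;
}

/* Add n distinct names and report how many comparisons that took. */
static unsigned long
toy_count_compares(size_t n, int unique)
{
	toy_list_t l = { NULL, 0, 0, unique, 0 };
	char buf[32];

	for (size_t i = 0; i < n; i++) {
		(void) snprintf(buf, sizeof (buf), "clone-%zu", i);
		toy_list_add(&l, buf);
	}
	unsigned long result = l.compares;
	for (size_t i = 0; i < l.count; i++)
		free(l.names[i]);
	free(l.names);
	return (result);
}
```

In this model, toy_count_compares(1000, 1) performs 1000 * 999 / 2 = 499500
comparisons, while toy_count_compares(1000, 0) performs none.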

Second, for each clone we call dsl_dataset_hold_obj() -> dsl_dir_hold_obj().  If
the clone dataset is not yet instantiated, for example, because the clones are
not mounted, then we need to call zap_value_search(), which is expensive.  And
if all the clones have the same parent, then the cost of each
zap_value_search() call is proportional to the number of clones.
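
A rough model of that second cost, assuming a search by value has to walk the
parent's entries until it finds a match.  The array below is only a stand-in
for the parent's child directory, and toy_value_search() and
toy_resolve_all() are hypothetical helpers, not ZFS functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Scan the entries for one whose value matches, counting every entry
 * visited.  Returns the index of the match, or -1 if there is none.
 */
static long
toy_value_search(const uint64_t *values, size_t n, uint64_t value,
    unsigned long *visited)
{
	for (size_t i = 0; i < n; i++) {
		(*visited)++;
		if (values[i] == value)
			return ((long)i);
	}
	return (-1);
}

/*
 * Resolve each of n children of a single parent by value and return
 * the total number of entries visited across all searches.
 */
static unsigned long
toy_resolve_all(size_t n)
{
	unsigned long visited = 0;

	if (n == 0)
		return (0);

	uint64_t *values = malloc(n * sizeof (*values));
	if (values == NULL)
		return (0);

	/* Arbitrary distinct object numbers for the children. */
	for (size_t i = 0; i < n; i++)
		values[i] = 1000 + i;

	for (size_t i = 0; i < n; i++)
		(void) toy_value_search(values, n, 1000 + i, &visited);

	free(values);
	return (visited);
}
```

With 1000 children under one parent, this model visits
1000 * 1001 / 2 = 500500 entries in total, so the overall cost grows
quadratically even though each individual search is only linear.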

I wonder if it would make sense to extend the on-disk information with
reverse-lookup ZAPs, so that we could avoid having to use zap_value_search() in
the hot paths.  That could also help with issues like
https://www.illumos.org/issues/7606
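
The idea of a reverse lookup can be sketched in memory as a table keyed by
object number.  This is only an illustration of the principle: it says nothing
about what an on-disk reverse-lookup ZAP would look like, and toy_rev_t and
its functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy reverse index: object number -> position in the forward directory. */
typedef struct toy_rev {
	uint64_t *keys;		/* object numbers; 0 marks an empty slot */
	size_t *idx;		/* forward-directory position per key */
	size_t nslots;		/* always a power of two */
} toy_rev_t;

static size_t
toy_rev_slot(const toy_rev_t *r, uint64_t obj)
{
	/* Multiplicative hash, reduced to the table size. */
	return ((size_t)((obj * 0x9E3779B97F4A7C15ULL) >> 32) &
	    (r->nslots - 1));
}

/* Size the table for at most nentries insertions (load factor <= 1/2). */
static void
toy_rev_init(toy_rev_t *r, size_t nentries)
{
	size_t n = 16;

	while (n < nentries * 2)
		n *= 2;
	r->keys = calloc(n, sizeof (*r->keys));
	r->idx = calloc(n, sizeof (*r->idx));
	assert(r->keys != NULL && r->idx != NULL);
	r->nslots = n;
}

static void
toy_rev_insert(toy_rev_t *r, uint64_t obj, size_t pos)
{
	assert(obj != 0);	/* 0 is reserved for empty slots */
	size_t i = toy_rev_slot(r, obj);

	/* Linear probing; terminates because the table is never full. */
	while (r->keys[i] != 0 && r->keys[i] != obj)
		i = (i + 1) & (r->nslots - 1);
	r->keys[i] = obj;
	r->idx[i] = pos;
}

/* Returns the recorded position for obj, or -1 if it is not present. */
static long
toy_rev_lookup(const toy_rev_t *r, uint64_t obj)
{
	size_t i = toy_rev_slot(r, obj);

	while (r->keys[i] != 0) {
		if (r->keys[i] == obj)
			return ((long)r->idx[i]);
		i = (i + 1) & (r->nslots - 1);
	}
	return (-1);
}

static void
toy_rev_fini(toy_rev_t *r)
{
	free(r->keys);
	free(r->idx);
}
```

The trade-off is the usual one for an extra index: each lookup no longer
scans all siblings, but the reverse mapping has to be kept in sync whenever
an entry is added, removed or renamed.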

-- 
Andriy Gapon

