> On Sept. 9, 2015, 10:40 p.m., Matthew Ahrens wrote:
> > usr/src/uts/common/fs/zfs/zfs_vfsops.c, line 564
> > <https://reviews.csiden.org/r/235/diff/1/?file=17030#file17030line564>
> >
> > Are you sure that it's OK to unregister the objset layer's callbacks?
> > This seems like it breaks the layering (DMU registers callbacks, ZPL
> > unregisters them). Could we combine the ZPL's unregistering of callbacks
> > with its disposing of the objset, such that there is one operation which is
> > "disown the objset and unregister all of my (ZPL) callbacks", which the DMU
> > can implement by unregistering all (DMU & ZPL) callbacks?
Looking at this more carefully, I don't think this is safe. It opens a race
window where the dsl_dataset_t could be kept alive by a reference from another
thread, causing an update to be missed. Some other operation could then
potentially see a stale property before all the references were dropped and the
dsl_dataset_t went away.
My first thought was to formally register the layer that owns a particular
property and then pass in a "layer type". Informally, this already occurs
because both the DMU and the ZPL use the same, unique to their respective
layer, callback args for their callbacks ("objset_t *" set for DMU, "zfsvfs_t
*" for ZPL). Changing dsl_prop_unregister_all() to filter by callback arg would
allow the two layers to operate independently. Is this an ok solution?
- Justin
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.csiden.org/r/235/#review774
-----------------------------------------------------------
On Aug. 29, 2015, 2:33 a.m., Justin Gibbs wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.csiden.org/r/235/
> -----------------------------------------------------------
>
> (Updated Aug. 29, 2015, 2:33 a.m.)
>
>
> Review request for OpenZFS Developer Mailing List and Christopher Siden.
>
>
> Bugs: 6171
> https://www.illumos.org/issues/6171
>
>
> Repository: illumos-gate
>
>
> Description
> -------
>
> A change to a property on a dataset must be propagated to its descendants
> in case that property is inherited. For datasets whose information is
> not currently loaded into memory (e.g. a snapshot that isn't currently
> mounted), there is nothing to do; the property change will take effect
> the next time that dataset is loaded. To handle updates to datasets that
> are in-core, ZFS registers a callback entry for each property of each
> loaded dataset with the dsl directory that holds that dataset. There
> is a dsl directory associated with each live dataset that references
> both the live dataset and any snapshots of the live dataset. A property
> change is effected by doing a traversal of the tree of dsl directories
> for a pool, starting at the directory sourcing the change, and invoking
> these callbacks.
>
> The current implementation both registers and de-registers properties
> individually for each loaded dataset. While registration for a property is
> O(1) (insert into a list), de-registration is O(n) (search list and then
> remove). The 'n' for de-registration, however, is not limited to the size
> (number of snapshots + 1) of the dsl directory. The eviction portion
> of the life cycle for the in core state of datasets is asynchronous,
> which allows multiple copies of the dataset information to be in-core
> at once. Only one of these copies is active at any time with the rest
> going through tear down processing, but all copies contribute to the
> cost of performing a dsl_prop_unregister().
>
> One way to create multiple, in-flight copies of dataset information
> is by performing "zfs list" operations from multiple threads
> concurrently. In-core dataset information is loaded on demand and then
> evicted when reference counts drops to zero. For datasets that are not
> mounted, there is no persistent reference count to keep them resident.
> So, a list operation will load them, compute the information required to
> do the list operation, and then evict them. When performing this operation
> from multiple threads it is possible that some of the in-core dataset
> information will be reused, but also possible to lose the race and load
> the dataset again, even while the same information is being torn down.
>
> Compounding the performance issue further is a change made for illumos
> issue 5056 which made dataset eviction single threaded. In environments
> using automation to manage ZFS datasets, it is now possible to create
> enough of a backlog of dataset evictions to consume excessive amounts
> of kernel memory and to bog down the system.
>
> The fix employed here is to make property de-registration O(1). With this
> change in place, it is hoped that a single thread is more than sufficient
> to handle eviction processing. If it isn't, the problem can be solved
> by increasing the number of threads devoted to the eviction taskq.
>
> uts/common/fs/zfs/dsl_dataset.c
> uts/common/fs/zfs/dsl_dir.c:
> uts/common/fs/zfs/dsl_prop.c:
> uts/common/fs/zfs/sys/dsl_dataset.h:
> uts/common/fs/zfs/sys/dsl_dir.h:
> uts/common/fs/zfs/sys/dsl_prop.h:
> Associate dsl property callback records with both the
> dsl directory and the dsl dataset that is registering the
> callback. Both connections are protected by the dsl directory's
> "dd_lock".
>
> When linking callbacks into a dsl directory, group them by
> the property type. This helps reduce the space penalty for the
> double association (the property name pointer is stored once
> per dsl_dir instead of in each record) and reduces the number of
> strcmp() calls required to do callback processing when updating
> a single property. Property types are stored in a linked list
> since currently ZFS registers a maximum of 10 property types
> for each dataset.
>
> Note that the property buckets/records associated with a dsl
> directory are created on demand, but only freed when the dsl
> directory is freed. Given the static nature of property types
> and their small number, there is no benefit to freeing the few
> bytes of memory used to represent the property record earlier.
> When a property record becomes empty, the dsl directory is either
> going to become unreferenced a little later in this thread of
> execution, or there is a high chance that another dataset is
> going to be loaded that would recreate the bucket anyway.
>
> Replace dsl_prop_unregister() with dsl_prop_unregister_all().
> All callers of dsl_prop_unregister() are trying to remove
> all property registrations for a given dsl dataset anyway. By
> changing the API, we can avoid doing any lookups of callbacks
> by property type and just traverse the list of all callbacks
> for the dataset and free each one.
>
> uts/common/fs/zfs/dmu_objset.c:
> uts/common/fs/zfs/zfs_vfsops.c:
> Replace use of dsl_prop_unregister() with the new
> dsl_prop_unregister_all() API.
>
>
> Diffs
> -----
>
> usr/src/uts/common/fs/zfs/sys/dsl_prop.h
> 5fe18d6a7c550b621e9df0ff8458b703c6f1bcf3
> usr/src/uts/common/fs/zfs/sys/dsl_dir.h
> f50014d95a8109e6d715371eee0af641e92730ae
> usr/src/uts/common/fs/zfs/sys/dsl_dataset.h
> 001bff566a888bd9ab1d9c0bc7a5243ae8ca4cb6
> usr/src/uts/common/fs/zfs/zfs_vfsops.c
> 3559b9b7b32db88326f06faa42c83907ab1344e6
> usr/src/uts/common/fs/zfs/dsl_prop.c
> 09eec21604f45aacecd6f87397954d50492a36e0
> usr/src/uts/common/fs/zfs/dsl_dir.c
> 6a20ab3ca2adabddc81c6199374ed99e7ec920fb
> usr/src/uts/common/fs/zfs/dsl_dataset.c
> 7f63cd97bcde4ba4a7708c1311c89ed93c8a8cfa
> usr/src/uts/common/fs/zfs/dmu_objset.c
> 9ad18f0d11fbaa1edb14ac5f2adfd215cb4716b3
>
> Diff: https://reviews.csiden.org/r/235/diff/
>
>
> Testing
> -------
>
> Ztest on illumos.
>
>
> Thanks,
>
> Justin Gibbs
>
>
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer