Re: [OpenZFS Developer] [developer] Review: zfs filesystem and snapshot limits

Jerry Jelinek Mon, 21 Apr 2014 10:20:46 -0700

Matt,

Thanks for reviewing this again. My responses are in-line below.



On Sat, Mar 29, 2014 at 6:17 PM, Matthew Ahrens <[email protected]> wrote:

>
>
>
> On Mon, Mar 10, 2014 at 5:16 PM, Jerry Jelinek 
> <[email protected]>wrote:
>
>> It has been quite a while since I last had a chance to work on this, but
>> I finally got some time to address the previous code review comments.  My
>> responses to Matt's feedback are in-line with each of his comments. There
>> is an updated webrev at:
>>
>>
>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897-3/index.html
>>
>
> This is looking great.  I only see one potential bug here (the first
> comment below), the rest of the comments are style / code simplification.
>
> dsl_dir.c:
> 541: The comment above (line 120) indicates that temporary snapshots are
> not counted.  Wouldn't this code count temporary snapshots?
>

yes, thanks for catching that, fixed


>
> 647: What would we need to do to allow the limits to be set in non-global
> zones (on datasets below the one that was delegated to the non-global
> zone), as with space usage quotas?
>

It should be possible to make that work for delegated datasets but I would
rather defer that to a later enhancement since this initial work has
already taken such a long time.


>
> 679: I think you mean  == 0 (obj is not a pointer)
>

fixed


>
> 708: I think "cnt" would be better named "delta" or "increment".  It's the
> amount we are adding to "count", which it could be easily confused with
> (cnt vs count). in dsl_fs_ss_count_adjust() you use "delta".
>

changed


>
> 793: "prop" can be "const char *".  Also, typically the "tx" argument
> comes last.
>

changed


>
> 1791: if the feature is enabled, then dd must be zapified and have the
> counts present.  You should make this more clear by using that is the
> condition, like:
>
> if (spa_feature_is_enabled(...)) {
>   VERIFY0(zap_lookup(...));
>   fs_cnt++;
>   ...
> }
>

changed


>
> 1813, 1820: dsl_fs_ss_count_adjust() already has a fast path for delta==0,
> you should probably call it unconditionally here.
>

changed


>
> dsl_dataset.c:
> 768: should be "it's" (= "it is")
>

fixed.

Also, in a follow-on email you asked about enhancing zdb for this work.
This is another thing I would like to defer. I haven't looked into zdb at
all and it will probably be quite a while before I can get a good block of
time to work on that.

I have posted a new webrev at:

http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897-4/index.html

The previous webrevs are still in place if you need to look back at any of
those.

Thanks again,
Jerry


>
>
>>
>>
>> The previous webrev is still available at:
>>
>>
>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897-2/index.html
>>
>> This can be used to correlate back to the previous comments if necessary.
>>
>> > Given some of the issues with initializing the counts (detailed below),
>> would
>> > it be reasonable to simply initialize all of the counts (on all
>> > filesystems/zvols in the pool) when the feature is activated?
>>  Obviously that
>> > has a performance impact, but no more than opening the pool (which
>> requires
>> > iterating over all filesystems/zvols to claim the ZIL log blocks).
>>
>> I kind of prefer the code the way it is. I believe I was able to easily
>> resolve
>> the issues below and with this implementation there is minimal impact, and
>> that occurs only when a limit is first being set. Also, we've done it
>> this way
>> since our first attempt at this at the 2012 hackathon, so, I'd rather keep
>> things using the current approach unless there is a fundamental problem
>> with
>> it.
>>
>> > It would be nice to avoid the extra txg_wait_synced associated with the
>> > dsl_dir_activate_fs_ss_limit() sync task whenever possible.  For
>> example, if
>> > the dataset already has its counts present.  You can do this by having
>> the
>> > check func return a nonzero value (e.g. EALREADY) if the counts are
>> present.
>> > Then in dsl_dir_activate_fs_ss_limit(), if it gets EALREADY, return 0.
>>  (this
>> > comment only applies if you stick with the current method of
>> initializing on
>> > demand rather than all at once).  Not a huge deal but something to
>> consider.
>>
>> Added a check for this and return EALREADY as suggested.
>>
>> > I think dsl_snapcount_check() and dsl_dir_fscount_check() can be
>> combined
>> > into one common function, which takes the count and limit zfs_prop_t's
>> as
>> > arguments. Perhaps keeping these 2 funcs as wrappers to pass in the
>> > zfs_prop_t's, if that's useful for convenience.  This is probably true
>> for
>> > dsl_snapcount_adjust() and dsl_dir_fscount_adjust() as well.
>>
>> Yes, thanks. The code has evolved enough now that these were practically
>> the same.
>> I rewrote things to be common. The new function names are
>> dsl_fs_ss_limit_check
>> and dsl_fs_ss_count_adjust.
>>
>> > specific (minor) issues:
>> >
>> > zpool-features.5: the new feature now depends on the extensible dataset
>> > feature.
>>
>> fixed
>>
>> > zfs.1m: I don't see documentation of the new read-only properties
>> (*_count).
>> > This documentation should include information about when the property
>> will be
>> > present (i.e. only if a limit has been set)
>>
>> added
>>
>> > dsl_dir.c:
>>
>> > 70: seems like we could now detect an uninitialized count by absence of
>> the
>> > entry in the ZAP object.
>>
>> We do behave that way (see code in the newly named dsl_fs_ss_limit_check
>> function). I updated the comment here as part of the other changes made,
>> so
>> this should be more accurate now.
>>
>> > 463: it would be nice to describe the primary purpose of this function
>> in the
>> > first sentence of the comment.  I.e. to initialize the count (not to
>> check the
>> > counts).
>>
>> Changed the first sentence to make this clearer.
>>
>> > 536: I think this comment should be "Count my snapshots." (since we
>> counted
>> > the children snaps above)
>>
>> changed
>>
>> > 546: this comment makes me think that there's some other way to update
>> the
>> > counts, but that isn't the case, is it?
>>
>> No, I rewrote the comment to clarify
>>
>> > 555-556: the convention is to prefix the member names (in this case, the
>> > rather unwieldy: ddafsla_name, ddafsla_removing).  It's annoying but at
>> least
>> > they are only used in a few places.
>>
>> This has been removed altogether, based on the next item.
>>
>> > 590-600: Do we need to do this sync task at all when removing?
>>
>> No. I rewrote things so it is only invoked when setting a limit.
>>
>> > 657: presumably this only applies to the *_limit props?  Would be nice
>> for the
>> > function name to reflect that, and to assert it.  Actually I wouldn't
>> include
>> > "secpolicy" in the name, because this function if very different from
>> the
>> > secpolicy funcs in zfs_ioctl.c.  How about "boolean_t
>> > dsl_enforce_ds_ss_limits"?
>>
>> changed
>>
>> > 697: I think "uint64_t cnt" should be "int64_t delta" (it's the change,
>> not
>> > the new count, but I guess we wouldn't bother checking for negative
>> deltas,
>> > so you could keep it uint64_t).
>>
>> Yes, we're always checking if its ok to increase, so I think what we have
>> now
>> is fine.
>>
>> > 707: is it possible that you are allowed to change the limit on a
>> filesystem,
>> > but not allowed to change the limit on an ancestor?  If so, then it
>> seems like
>> > you still need to recursively check the ancestors.
>>
>> Yes, thanks for catching this. I changed dsl_enforce_ds_ss_limits to
>> handle
>> this.
>>
>> > 720: and more pointedly, if the count is uninitialized, then there can
>> be no
>> > limit.
>>
>> changed the comment
>>
>> > 781: it's no longer necessary to dirty the dd_dbuf, since you aren't
>> modifying
>> > dd_phys (the zap_update() will take care of any necessary
>> dmu_buf_will_dirty()
>> > calls).
>>
>> removed
>>
>> > 1650: It's pretty unusual to be making on-disk changes from a checkfunc.
>> > I *think* it will work, but it at least deserves a comment.
>>
>> Added a comment and also placed in a conditional to address the next item.
>>
>> > 1661: Is it really possible for dd to not be zapified and have the
>> counts
>> > present, given that we just called dsl_dir_init_fs_ss_count()?
>>  Although, it
>> > seems like maybe you *shouldn't* be calling the init func, if the
>> feature
>> > isn't enabled.  And the init func should assert that the feature is
>> active.
>>
>> For the first question, if we're not syncing then dsl_dir_init_fs_ss_count
>> won't be called. I also changed the call to dsl_dir_init_fs_ss_count so
>> that
>> we first check if the feature is enabled and I added an assert in
>> dsl_dir_init_fs_ss_count to check that the feature is enabled.
>>
>> > 1823: why the special case for fs_cnt == 0, but not for ss_cnt?
>>
>> Falls out of my rewrite of the check code.
>>
>> > dsl_destroy.c: 665: if you relaxed the assertion in
>> dsl_dir_fscount_adjust(),
>> > you could simply always do "dsl_dir_fscount_adjust(dd, tx, -1)".  Up to
>> you if
>> > that's worth it.
>>
>> This seems to me like its clearer the way it is, but I could change it if
>> you think it makes sense.
>>
>> > dsl_dataset.c:
>> > 1192: I don't think you should ignore errors here (e.g. wrong value
>> type).
>> > "cnt = fnvpair_value_uint64(pair);" will do what you want.
>>
>> fixed
>>
>> > dmu_send.c:
>> >
>> > 802-850: why do you pre-check the fs/ss limits when creating a new fs +
>> snap,
>> > but not when creating just a new snap?
>>
>> Fixed, added the check to recv_begin_check_existing_impl
>>
>> > 1735: why is the fscount adjusted explicitly here?  I would have
>> thought it
>> > would be taken care of by dmu_recv_begin_sync()'s call to
>> > dsl_dataset_create_sync().  (if changed, also change/remove the call to
>> > fscount_check() in the checkfunc)  Ah, I see, we adjust the fscount in
>> the
>> > callers (e.g. dmu_objset_clone_sync()).  I think it would be better to
>> do it
>> > in dsl_dir_create_sync().  You already have the check for begins with
>> "%" in
>> > fscount_adjust().  That exception should ideally be implemented in as
>> few
>> > places as possible.  For example, in the current code if we abort the
>> receive
>> > (via dmu_recv_cleanup_ds()), you also need an exception to not
>> decrement the
>> > fscount.  This seems error-prone.
>>
>> Moved the check. I think I tested all of the code paths through here but I
>> would appreciate any extra confirmation that things are ok.
>>
>> Let me know if there are any new comments on the latest code.
>>
>> Thanks,
>> Jerry
>>
>>
>>
>> On Wed, Oct 30, 2013 at 2:42 PM, Matthew Ahrens <[email protected]>wrote:
>>
>>> On Fri, Oct 18, 2013 at 7:58 AM, Jerry Jelinek <[email protected]
>>> > wrote:
>>>
>>>> It has been a few months since the last round of review for this new
>>>> feature.
>>>>
>>>> https://illumos.org/issues/3897
>>>>
>>>> I have updated the code to address all of the previous comments
>>>> and (with Matt's help) rewritten it to use the new extensible dataset
>>>> feature instead of changing the on-disk format.
>>>>
>>>> There is a new webrev at:
>>>>
>>>>
>>>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897-2/index.html
>>>>
>>>
>>> Jerry, this code is looking very good.  I appreciated the comments.  I
>>> have some feedback below.  Let me know if you have any questions or if
>>> you'd like help implementing (or at least prototyping) my suggested changes.
>>>
>>> High-level comments:
>>>
>>> Given some of the issues with initializing the counts (detailed below),
>>> would it be reasonable to simply initialize all of the counts (on all
>>> filesystems/zvols in the pool) when the feature is activated?  Obviously
>>> that has a performance impact, but no more than opening the pool (which
>>> requires iterating over all filesystems/zvols to claim the ZIL log blocks).
>>>
>>> It would be nice to avoid the extra txg_wait_synced associated with the
>>> dsl_dir_activate_fs_ss_limit() sync task whenever possible.  For example,
>>> if the dataset already has its counts present.  You can do this by having
>>> the check func return a nonzero value (e.g. EALREADY) if the counts are
>>> present. Then in dsl_dir_activate_fs_ss_limit(), if it gets EALREADY,
>>> return 0.  (this comment only applies if you stick with the current method
>>> of initializing on demand rather than all at once).  Not a huge deal but
>>> something to consider.
>>>
>>> I think dsl_snapcount_check() and dsl_dir_fscount_check() can be
>>> combined into one common function, which takes the count and limit
>>> zfs_prop_t's as arguments.  Perhaps keeping these 2 funcs as wrappers to
>>> pass in the zfs_prop_t's, if that's useful for convenience.  This is
>>> probably true for dsl_snapcount_adjust() and dsl_dir_fscount_adjust() as
>>> well.
>>>
>>> specific (minor) issues:
>>>
>>> zpool-features.5: the new feature now depends on the extensible dataset
>>> feature.
>>>
>>> zfs.1m: I don't see documentation of the new read-only properties
>>> (*_count).  This documentation should include information about when the
>>> property will be present (i.e. only if a limit has been set)
>>>
>>> dsl_dir.c:
>>> 70: seems like we could now detect an uninitialized count by absence of
>>> the entry in the ZAP object.
>>>
>>> 463: it would be nice to describe the primary purpose of this function
>>> in the first sentence of the comment.  I.e. to initialize the count (not to
>>> check the counts).
>>>
>>> 536: I think this comment should be "Count my snapshots." (since we
>>> counted the children snaps above)
>>>
>>> 546: this comment makes me think that there's some other way to update
>>> the counts, but that isn't the case, is it?
>>>
>>> 555-556: the convention is to prefix the member names (in this case, the
>>> rather unwieldy: ddafsla_name, ddafsla_removing).  It's annoying but at
>>> least they are only used in a few places.
>>>
>>> 590-600: Do we need to do this sync task at all when removing?
>>>
>>> 657: presumably this only applies to the *_limit props?  Would be nice
>>> for the function name to reflect that, and to assert it.  Actually I
>>> wouldn't include "secpolicy" in the name, because this function if very
>>> different from the secpolicy funcs in zfs_ioctl.c.  How about "boolean_t
>>> dsl_enforce_ds_ss_limits"?
>>>
>>> 697: I think "uint64_t cnt" should be "int64_t delta" (it's the change,
>>> not the new count, but I guess we wouldn't bother checking for negative
>>> deltas, so you could keep it uint64_t).
>>>
>>> 707: is it possible that you are allowed to change the limit on a
>>> filesystem, but not allowed to change the limit on an ancestor?  If so,
>>> then it seems like you still need to recursively check the ancestors.
>>>
>>> 720: and more pointedly, if the count is uninitialized, then there can
>>> be no limit.
>>>
>>> 781: it's no longer necessary to dirty the dd_dbuf, since you aren't
>>> modifying dd_phys (the zap_update() will take care of any necessary
>>> dmu_buf_will_dirty() calls).
>>>
>>> 1650: It's pretty unusual to be making on-disk changes from a checkfunc.
>>>  I *think* it will work, but it at least deserves a comment.
>>>
>>> 1661: Is it really possible for dd to not be zapified and have the
>>> counts present, given that we just called dsl_dir_init_fs_ss_count()?
>>>  Although, it seems like maybe you *shouldn't* be calling the init func, if
>>> the feature isn't enabled.  And the init func should assert that the
>>> feature is active.
>>>
>>> 1823: why the special case for fs_cnt == 0, but not for ss_cnt?
>>>
>>> dsl_destroy.c: 665: if you relaxed the assertion in
>>> dsl_dir_fscount_adjust(), you could simply always do
>>> "dsl_dir_fscount_adjust(dd, tx, -1)".  Up to you if that's worth it.
>>>
>>> dsl_dataset.c:
>>> 1192: I don't think you should ignore errors here (e.g. wrong value
>>> type).  "cnt = fnvpair_value_uint64(pair);" will do what you want.
>>>
>>> dmu_send.c:
>>>
>>> 802-850: why do you pre-check the fs/ss limits when creating a new fs +
>>> snap, but not when creating just a new snap?
>>>
>>> 1735: why is the fscount adjusted explicitly here?  I would have thought
>>> it would be taken care of by dmu_recv_begin_sync()'s call to
>>> dsl_dataset_create_sync().  (if changed, also change/remove the call to
>>> fscount_check() in the checkfunc)  Ah, I see, we adjust the fscount in the
>>> callers (e.g. dmu_objset_clone_sync()).  I think it would be better to do
>>> it in dsl_dir_create_sync().  You already have the check for begins with
>>> "%" in fscount_adjust().  That exception should ideally be implemented in
>>> as few places as possible.  For example, in the current code if we abort
>>> the receive (via dmu_recv_cleanup_ds()), you also need an exception to not
>>> decrement the fscount.  This seems error-prone.
>>>
>>> --matt
>>>
>>>
>>>
>>>>
>>>>
>>>> The previous webrev is still available at:
>>>>
>>>>
>>>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897/index.html
>>>>
>>>> Thanks,
>>>> Jerry
>>>>
>>>>
>>>> On Wed, Jul 24, 2013 at 5:00 PM, Matthew Ahrens <[email protected]>wrote:
>>>>
>>>>> Two high-level comments:
>>>>>
>>>>> 1. I'd like to avoid adding fields to dsl_dir_phys_t,
>>>>> dsl_dataset_phys_t, etc. if at all possible.  The issue is that if another
>>>>> feature is developed independently and uses the same field (same offset in
>>>>> the struct) to mean something else, then illumos can't simultaneously
>>>>> support both features.
>>>>>
>>>>> We have features that would naturally use new fields in these structs,
>>>>> but realized that this would impact anyone developing another such 
>>>>> feature,
>>>>> such as this one.  Therefore I have implemented functionality to use the
>>>>> dsl_dir and dsl_dataset objects as zap objects, so that we can add new ZAP
>>>>> entries rather than adding fields to the structs.  I can make that code
>>>>> available later this week for you to take a look at.  I can also help you
>>>>> switch your code to use it, if you'd be willing to go that route.  See
>>>>> below for another alternative.
>>>>>
>>>>> 2. It would be nice to expose the filesystem and snapshot counts to
>>>>> the user.  This will help answer the question, "how close am I to my 
>>>>> limit"?
>>>>>
>>>>> We could solve both these issues at once by storing the counts as
>>>>> properties (in the dd_props_zapobj).
>>>>>
>>>>> --matt
>>>>>
>>>>>
>>>>> On Mon, Jul 22, 2013 at 10:49 AM, Jerry Jelinek <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> I'd like another code review for:
>>>>>>
>>>>>> 3897 zfs filesystem and snapshot limits
>>>>>>
>>>>>> There have already been a few code reviews for this last fall with
>>>>>> the last one being in January. I was in the process of finalizing the
>>>>>> responses from that feedback in February when the big zfs re-write was
>>>>>> pushed and I didn't have any time to deal with that then.
>>>>>>
>>>>>> I have finally gotten a chance to update the code to work with the
>>>>>> new zfs design and to also finish addressing the previous comments. Since
>>>>>> this code is somewhat different from what I had previously, and since so
>>>>>> much time has passed, it is probably best to start over and treat this 
>>>>>> as a
>>>>>> fresh review.
>>>>>>
>>>>>> There is a webrev at:
>>>>>>
>>>>>>
>>>>>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897/index.html
>>>>>>
>>>>>> Thanks,
>>>>>> Jerry
>>>>>>
>>>>>>  *illumos-developer* | 
>>>>>> Archives<https://www.listbox.com/member/archive/182179/=now>
>>>>>> <https://www.listbox.com/member/archive/rss/182179/21175174-cd73734d>|
>>>>>> Modify<https://www.listbox.com/member/?member_id=21175174&id_secret=21175174-792643f6>Your
>>>>>>  Subscription
>>>>>> <http://www.listbox.com>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer

Re: [OpenZFS Developer] [developer] Review: zfs filesystem and snapshot limits

Reply via email to