Ship it!  Thanks a lot for taking the time to go through the review process
and upstream these changes.  It might be helpful to file bugs for the
follow-on work (zdb verifications; making it work for delegated datasets)
-- I imagine these would be relatively small amounts of work that might be
interesting for developers new to ZFS or illumos.

--matt


On Mon, Apr 21, 2014 at 10:18 AM, Jerry Jelinek <[email protected]>wrote:

> Matt,
>
> Thanks for reviewing this again. My responses are in-line below.
>
>
> On Sat, Mar 29, 2014 at 6:17 PM, Matthew Ahrens <[email protected]>wrote:
>
>>
>>
>>
>> On Mon, Mar 10, 2014 at 5:16 PM, Jerry Jelinek 
>> <[email protected]>wrote:
>>
>>> It has been quite a while since I last had a chance to work on this, but
>>> I finally got some time to address the previous code review comments.  My
>>> responses to Matt's feedback are in-line with each of his comments. There
>>> is an updated webrev at:
>>>
>>>
>>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897-3/index.html
>>>
>>
>> This is looking great.  I only see one potential bug here (the first
>> comment below), the rest of the comments are style / code simplification.
>>
>> dsl_dir.c:
>> 541: The comment above (line 120) indicates that temporary snapshots are
>> not counted.  Wouldn't this code count temporary snapshots?
>>
>
> yes, thanks for catching that, fixed
>
>
>>
>> 647: What would we need to do to allow the limits to be set in non-global
>> zones (on datasets below the one that was delegated to the non-global
>> zone), as with space usage quotas?
>>
>
> It should be possible to make that work for delegated datasets but I would
> rather defer that to a later enhancement since this initial work has
> already taken such a long time.
>
>
>>
>> 679: I think you mean  == 0 (obj is not a pointer)
>>
>
> fixed
>
>
>>
>> 708: I think "cnt" would be better named "delta" or "increment".  It's
>> the amount we are adding to "count", which it could be easily confused with
>> (cnt vs count). in dsl_fs_ss_count_adjust() you use "delta".
>>
>
> changed
>
>
>>
>> 793: "prop" can be "const char *".  Also, typically the "tx" argument
>> comes last.
>>
>
> changed
>
>
>>
>> 1791: if the feature is enabled, then dd must be zapified and have the
>> counts present.  You should make this more clear by using that is the
>> condition, like:
>>
>> if (spa_feature_is_enabled(...)) {
>>   VERIFY0(zap_lookup(...));
>>   fs_cnt++;
>>   ...
>> }
>>
>
> changed
>
>
>>
>> 1813, 1820: dsl_fs_ss_count_adjust() already has a fast path for
>> delta==0, you should probably call it unconditionally here.
>>
>
> changed
>
>
>>
>> dsl_dataset.c:
>> 768: should be "it's" (= "it is")
>>
>
> fixed.
>
> Also, in a follow-on email you asked about enhancing zdb for this work.
> This is another thing I would like to defer. I haven't looked into zdb at
> all and it will probably be quite a while before I can get a good block of
> time to work on that.
>
> I have posted a new webrev at:
>
>
> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897-4/index.html
>
> The previous webrevs are still in place if you need to look back at any of
> those.
>
> Thanks again,
> Jerry
>
>
>>
>>
>>>
>>>
>>> The previous webrev is still available at:
>>>
>>>
>>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897-2/index.html
>>>
>>> This can be used to correlate back to the previous comments if necessary.
>>>
>>> > Given some of the issues with initializing the counts (detailed
>>> below), would
>>> > it be reasonable to simply initialize all of the counts (on all
>>> > filesystems/zvols in the pool) when the feature is activated?
>>>  Obviously that
>>> > has a performance impact, but no more than opening the pool (which
>>> requires
>>> > iterating over all filesystems/zvols to claim the ZIL log blocks).
>>>
>>> I kind of prefer the code the way it is. I believe I was able to easily
>>> resolve
>>> the issues below and with this implementation there is minimal impact,
>>> and
>>> that occurs only when a limit is first being set. Also, we've done it
>>> this way
>>> since our first attempt at this at the 2012 hackathon, so, I'd rather
>>> keep
>>> things using the current approach unless there is a fundamental problem
>>> with
>>> it.
>>>
>>> > It would be nice to avoid the extra txg_wait_synced associated with the
>>> > dsl_dir_activate_fs_ss_limit() sync task whenever possible.  For
>>> example, if
>>> > the dataset already has its counts present.  You can do this by having
>>> the
>>> > check func return a nonzero value (e.g. EALREADY) if the counts are
>>> present.
>>> > Then in dsl_dir_activate_fs_ss_limit(), if it gets EALREADY, return 0.
>>>  (this
>>> > comment only applies if you stick with the current method of
>>> initializing on
>>> > demand rather than all at once).  Not a huge deal but something to
>>> consider.
>>>
>>> Added a check for this and return EALREADY as suggested.
>>>
>>> > I think dsl_snapcount_check() and dsl_dir_fscount_check() can be
>>> combined
>>> > into one common function, which takes the count and limit zfs_prop_t's
>>> as
>>> > arguments. Perhaps keeping these 2 funcs as wrappers to pass in the
>>> > zfs_prop_t's, if that's useful for convenience.  This is probably true
>>> for
>>> > dsl_snapcount_adjust() and dsl_dir_fscount_adjust() as well.
>>>
>>> Yes, thanks. The code has evolved enough now that these were practically
>>> the same.
>>> I rewrote things to be common. The new function names are
>>> dsl_fs_ss_limit_check
>>> and dsl_fs_ss_count_adjust.
>>>
>>> > specific (minor) issues:
>>> >
>>> > zpool-features.5: the new feature now depends on the extensible dataset
>>> > feature.
>>>
>>> fixed
>>>
>>> > zfs.1m: I don't see documentation of the new read-only properties
>>> (*_count).
>>> > This documentation should include information about when the property
>>> will be
>>> > present (i.e. only if a limit has been set)
>>>
>>> added
>>>
>>> > dsl_dir.c:
>>>
>>> > 70: seems like we could now detect an uninitialized count by absence
>>> of the
>>> > entry in the ZAP object.
>>>
>>> We do behave that way (see code in the newly named dsl_fs_ss_limit_check
>>> function). I updated the comment here as part of the other changes made,
>>> so
>>> this should be more accurate now.
>>>
>>> > 463: it would be nice to describe the primary purpose of this function
>>> in the
>>> > first sentence of the comment.  I.e. to initialize the count (not to
>>> check the
>>> > counts).
>>>
>>> Changed the first sentence to make this clearer.
>>>
>>> > 536: I think this comment should be "Count my snapshots." (since we
>>> counted
>>> > the children snaps above)
>>>
>>> changed
>>>
>>> > 546: this comment makes me think that there's some other way to update
>>> the
>>> > counts, but that isn't the case, is it?
>>>
>>> No, I rewrote the comment to clarify
>>>
>>> > 555-556: the convention is to prefix the member names (in this case,
>>> the
>>> > rather unwieldy: ddafsla_name, ddafsla_removing).  It's annoying but
>>> at least
>>> > they are only used in a few places.
>>>
>>> This has been removed altogether, based on the next item.
>>>
>>> > 590-600: Do we need to do this sync task at all when removing?
>>>
>>> No. I rewrote things so it is only invoked when setting a limit.
>>>
>>> > 657: presumably this only applies to the *_limit props?  Would be nice
>>> for the
>>> > function name to reflect that, and to assert it.  Actually I wouldn't
>>> include
>>> > "secpolicy" in the name, because this function if very different from
>>> the
>>> > secpolicy funcs in zfs_ioctl.c.  How about "boolean_t
>>> > dsl_enforce_ds_ss_limits"?
>>>
>>> changed
>>>
>>> > 697: I think "uint64_t cnt" should be "int64_t delta" (it's the
>>> change, not
>>> > the new count, but I guess we wouldn't bother checking for negative
>>> deltas,
>>> > so you could keep it uint64_t).
>>>
>>> Yes, we're always checking if its ok to increase, so I think what we
>>> have now
>>> is fine.
>>>
>>> > 707: is it possible that you are allowed to change the limit on a
>>> filesystem,
>>> > but not allowed to change the limit on an ancestor?  If so, then it
>>> seems like
>>> > you still need to recursively check the ancestors.
>>>
>>> Yes, thanks for catching this. I changed dsl_enforce_ds_ss_limits to
>>> handle
>>> this.
>>>
>>> > 720: and more pointedly, if the count is uninitialized, then there can
>>> be no
>>> > limit.
>>>
>>> changed the comment
>>>
>>> > 781: it's no longer necessary to dirty the dd_dbuf, since you aren't
>>> modifying
>>> > dd_phys (the zap_update() will take care of any necessary
>>> dmu_buf_will_dirty()
>>> > calls).
>>>
>>> removed
>>>
>>> > 1650: It's pretty unusual to be making on-disk changes from a
>>> checkfunc.
>>> > I *think* it will work, but it at least deserves a comment.
>>>
>>> Added a comment and also placed in a conditional to address the next
>>> item.
>>>
>>> > 1661: Is it really possible for dd to not be zapified and have the
>>> counts
>>> > present, given that we just called dsl_dir_init_fs_ss_count()?
>>>  Although, it
>>> > seems like maybe you *shouldn't* be calling the init func, if the
>>> feature
>>> > isn't enabled.  And the init func should assert that the feature is
>>> active.
>>>
>>> For the first question, if we're not syncing then
>>> dsl_dir_init_fs_ss_count
>>> won't be called. I also changed the call to dsl_dir_init_fs_ss_count so
>>> that
>>> we first check if the feature is enabled and I added an assert in
>>> dsl_dir_init_fs_ss_count to check that the feature is enabled.
>>>
>>> > 1823: why the special case for fs_cnt == 0, but not for ss_cnt?
>>>
>>> Falls out of my rewrite of the check code.
>>>
>>> > dsl_destroy.c: 665: if you relaxed the assertion in
>>> dsl_dir_fscount_adjust(),
>>> > you could simply always do "dsl_dir_fscount_adjust(dd, tx, -1)".  Up
>>> to you if
>>> > that's worth it.
>>>
>>> This seems to me like its clearer the way it is, but I could change it if
>>> you think it makes sense.
>>>
>>> > dsl_dataset.c:
>>> > 1192: I don't think you should ignore errors here (e.g. wrong value
>>> type).
>>> > "cnt = fnvpair_value_uint64(pair);" will do what you want.
>>>
>>> fixed
>>>
>>> > dmu_send.c:
>>> >
>>> > 802-850: why do you pre-check the fs/ss limits when creating a new fs
>>> + snap,
>>> > but not when creating just a new snap?
>>>
>>> Fixed, added the check to recv_begin_check_existing_impl
>>>
>>> > 1735: why is the fscount adjusted explicitly here?  I would have
>>> thought it
>>> > would be taken care of by dmu_recv_begin_sync()'s call to
>>> > dsl_dataset_create_sync().  (if changed, also change/remove the call to
>>> > fscount_check() in the checkfunc)  Ah, I see, we adjust the fscount in
>>> the
>>> > callers (e.g. dmu_objset_clone_sync()).  I think it would be better to
>>> do it
>>> > in dsl_dir_create_sync().  You already have the check for begins with
>>> "%" in
>>> > fscount_adjust().  That exception should ideally be implemented in as
>>> few
>>> > places as possible.  For example, in the current code if we abort the
>>> receive
>>> > (via dmu_recv_cleanup_ds()), you also need an exception to not
>>> decrement the
>>> > fscount.  This seems error-prone.
>>>
>>> Moved the check. I think I tested all of the code paths through here but
>>> I
>>> would appreciate any extra confirmation that things are ok.
>>>
>>> Let me know if there are any new comments on the latest code.
>>>
>>> Thanks,
>>> Jerry
>>>
>>>
>>>
>>> On Wed, Oct 30, 2013 at 2:42 PM, Matthew Ahrens <[email protected]>wrote:
>>>
>>>> On Fri, Oct 18, 2013 at 7:58 AM, Jerry Jelinek <
>>>> [email protected]> wrote:
>>>>
>>>>> It has been a few months since the last round of review for this new
>>>>> feature.
>>>>>
>>>>> https://illumos.org/issues/3897
>>>>>
>>>>> I have updated the code to address all of the previous comments
>>>>> and (with Matt's help) rewritten it to use the new extensible dataset
>>>>> feature instead of changing the on-disk format.
>>>>>
>>>>> There is a new webrev at:
>>>>>
>>>>>
>>>>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897-2/index.html
>>>>>
>>>>
>>>> Jerry, this code is looking very good.  I appreciated the comments.  I
>>>> have some feedback below.  Let me know if you have any questions or if
>>>> you'd like help implementing (or at least prototyping) my suggested 
>>>> changes.
>>>>
>>>> High-level comments:
>>>>
>>>> Given some of the issues with initializing the counts (detailed below),
>>>> would it be reasonable to simply initialize all of the counts (on all
>>>> filesystems/zvols in the pool) when the feature is activated?  Obviously
>>>> that has a performance impact, but no more than opening the pool (which
>>>> requires iterating over all filesystems/zvols to claim the ZIL log blocks).
>>>>
>>>> It would be nice to avoid the extra txg_wait_synced associated with the
>>>> dsl_dir_activate_fs_ss_limit() sync task whenever possible.  For example,
>>>> if the dataset already has its counts present.  You can do this by having
>>>> the check func return a nonzero value (e.g. EALREADY) if the counts are
>>>> present. Then in dsl_dir_activate_fs_ss_limit(), if it gets EALREADY,
>>>> return 0.  (this comment only applies if you stick with the current method
>>>> of initializing on demand rather than all at once).  Not a huge deal but
>>>> something to consider.
>>>>
>>>> I think dsl_snapcount_check() and dsl_dir_fscount_check() can be
>>>> combined into one common function, which takes the count and limit
>>>> zfs_prop_t's as arguments.  Perhaps keeping these 2 funcs as wrappers to
>>>> pass in the zfs_prop_t's, if that's useful for convenience.  This is
>>>> probably true for dsl_snapcount_adjust() and dsl_dir_fscount_adjust() as
>>>> well.
>>>>
>>>> specific (minor) issues:
>>>>
>>>> zpool-features.5: the new feature now depends on the extensible dataset
>>>> feature.
>>>>
>>>> zfs.1m: I don't see documentation of the new read-only properties
>>>> (*_count).  This documentation should include information about when the
>>>> property will be present (i.e. only if a limit has been set)
>>>>
>>>> dsl_dir.c:
>>>> 70: seems like we could now detect an uninitialized count by absence of
>>>> the entry in the ZAP object.
>>>>
>>>> 463: it would be nice to describe the primary purpose of this function
>>>> in the first sentence of the comment.  I.e. to initialize the count (not to
>>>> check the counts).
>>>>
>>>> 536: I think this comment should be "Count my snapshots." (since we
>>>> counted the children snaps above)
>>>>
>>>> 546: this comment makes me think that there's some other way to update
>>>> the counts, but that isn't the case, is it?
>>>>
>>>> 555-556: the convention is to prefix the member names (in this case,
>>>> the rather unwieldy: ddafsla_name, ddafsla_removing).  It's annoying but at
>>>> least they are only used in a few places.
>>>>
>>>> 590-600: Do we need to do this sync task at all when removing?
>>>>
>>>> 657: presumably this only applies to the *_limit props?  Would be nice
>>>> for the function name to reflect that, and to assert it.  Actually I
>>>> wouldn't include "secpolicy" in the name, because this function if very
>>>> different from the secpolicy funcs in zfs_ioctl.c.  How about "boolean_t
>>>> dsl_enforce_ds_ss_limits"?
>>>>
>>>> 697: I think "uint64_t cnt" should be "int64_t delta" (it's the change,
>>>> not the new count, but I guess we wouldn't bother checking for negative
>>>> deltas, so you could keep it uint64_t).
>>>>
>>>> 707: is it possible that you are allowed to change the limit on a
>>>> filesystem, but not allowed to change the limit on an ancestor?  If so,
>>>> then it seems like you still need to recursively check the ancestors.
>>>>
>>>> 720: and more pointedly, if the count is uninitialized, then there can
>>>> be no limit.
>>>>
>>>> 781: it's no longer necessary to dirty the dd_dbuf, since you aren't
>>>> modifying dd_phys (the zap_update() will take care of any necessary
>>>> dmu_buf_will_dirty() calls).
>>>>
>>>> 1650: It's pretty unusual to be making on-disk changes from a
>>>> checkfunc.  I *think* it will work, but it at least deserves a comment.
>>>>
>>>> 1661: Is it really possible for dd to not be zapified and have the
>>>> counts present, given that we just called dsl_dir_init_fs_ss_count()?
>>>>  Although, it seems like maybe you *shouldn't* be calling the init func, if
>>>> the feature isn't enabled.  And the init func should assert that the
>>>> feature is active.
>>>>
>>>> 1823: why the special case for fs_cnt == 0, but not for ss_cnt?
>>>>
>>>> dsl_destroy.c: 665: if you relaxed the assertion in
>>>> dsl_dir_fscount_adjust(), you could simply always do
>>>> "dsl_dir_fscount_adjust(dd, tx, -1)".  Up to you if that's worth it.
>>>>
>>>> dsl_dataset.c:
>>>> 1192: I don't think you should ignore errors here (e.g. wrong value
>>>> type).  "cnt = fnvpair_value_uint64(pair);" will do what you want.
>>>>
>>>> dmu_send.c:
>>>>
>>>> 802-850: why do you pre-check the fs/ss limits when creating a new fs +
>>>> snap, but not when creating just a new snap?
>>>>
>>>> 1735: why is the fscount adjusted explicitly here?  I would have
>>>> thought it would be taken care of by dmu_recv_begin_sync()'s call to
>>>> dsl_dataset_create_sync().  (if changed, also change/remove the call to
>>>> fscount_check() in the checkfunc)  Ah, I see, we adjust the fscount in the
>>>> callers (e.g. dmu_objset_clone_sync()).  I think it would be better to do
>>>> it in dsl_dir_create_sync().  You already have the check for begins with
>>>> "%" in fscount_adjust().  That exception should ideally be implemented in
>>>> as few places as possible.  For example, in the current code if we abort
>>>> the receive (via dmu_recv_cleanup_ds()), you also need an exception to not
>>>> decrement the fscount.  This seems error-prone.
>>>>
>>>> --matt
>>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>> The previous webrev is still available at:
>>>>>
>>>>>
>>>>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897/index.html
>>>>>
>>>>> Thanks,
>>>>> Jerry
>>>>>
>>>>>
>>>>> On Wed, Jul 24, 2013 at 5:00 PM, Matthew Ahrens 
>>>>> <[email protected]>wrote:
>>>>>
>>>>>> Two high-level comments:
>>>>>>
>>>>>> 1. I'd like to avoid adding fields to dsl_dir_phys_t,
>>>>>> dsl_dataset_phys_t, etc. if at all possible.  The issue is that if 
>>>>>> another
>>>>>> feature is developed independently and uses the same field (same offset 
>>>>>> in
>>>>>> the struct) to mean something else, then illumos can't simultaneously
>>>>>> support both features.
>>>>>>
>>>>>> We have features that would naturally use new fields in these
>>>>>> structs, but realized that this would impact anyone developing another 
>>>>>> such
>>>>>> feature, such as this one.  Therefore I have implemented functionality to
>>>>>> use the dsl_dir and dsl_dataset objects as zap objects, so that we can 
>>>>>> add
>>>>>> new ZAP entries rather than adding fields to the structs.  I can make 
>>>>>> that
>>>>>> code available later this week for you to take a look at.  I can also 
>>>>>> help
>>>>>> you switch your code to use it, if you'd be willing to go that route.  
>>>>>> See
>>>>>> below for another alternative.
>>>>>>
>>>>>> 2. It would be nice to expose the filesystem and snapshot counts to
>>>>>> the user.  This will help answer the question, "how close am I to my 
>>>>>> limit"?
>>>>>>
>>>>>> We could solve both these issues at once by storing the counts as
>>>>>> properties (in the dd_props_zapobj).
>>>>>>
>>>>>> --matt
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 22, 2013 at 10:49 AM, Jerry Jelinek <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> I'd like another code review for:
>>>>>>>
>>>>>>> 3897 zfs filesystem and snapshot limits
>>>>>>>
>>>>>>> There have already been a few code reviews for this last fall with
>>>>>>> the last one being in January. I was in the process of finalizing the
>>>>>>> responses from that feedback in February when the big zfs re-write was
>>>>>>> pushed and I didn't have any time to deal with that then.
>>>>>>>
>>>>>>> I have finally gotten a chance to update the code to work with the
>>>>>>> new zfs design and to also finish addressing the previous comments. 
>>>>>>> Since
>>>>>>> this code is somewhat different from what I had previously, and since so
>>>>>>> much time has passed, it is probably best to start over and treat this 
>>>>>>> as a
>>>>>>> fresh review.
>>>>>>>
>>>>>>> There is a webrev at:
>>>>>>>
>>>>>>>
>>>>>>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897/index.html
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jerry
>>>>>>>
>>>>>>>  *illumos-developer* | 
>>>>>>> Archives<https://www.listbox.com/member/archive/182179/=now>
>>>>>>> <https://www.listbox.com/member/archive/rss/182179/21175174-cd73734d>|
>>>>>>> Modify<https://www.listbox.com/member/?member_id=21175174&id_secret=21175174-792643f6>Your
>>>>>>>  Subscription
>>>>>>> <http://www.listbox.com>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer

Reply via email to