On Mon, Mar 10, 2014 at 5:16 PM, Jerry Jelinek <[email protected]>wrote:
> It has been quite a while since I last had a chance to work on this, but I > finally got some time to address the previous code review comments. My > responses to Matt's feedback are in-line with each of his comments. There > is an updated webrev at: > > > http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897-3/index.html > This is looking great. I only see one potential bug here (the first comment below), the rest of the comments are style / code simplification. dsl_dir.c: 541: The comment above (line 120) indicates that temporary snapshots are not counted. Wouldn't this code count temporary snapshots? 647: What would we need to do to allow the limits to be set in non-global zones (on datasets below the one that was delegated to the non-global zone), as with space usage quotas? 679: I think you mean == 0 (obj is not a pointer) 708: I think "cnt" would be better named "delta" or "increment". It's the amount we are adding to "count", which it could be easily confused with (cnt vs count). in dsl_fs_ss_count_adjust() you use "delta". 793: "prop" can be "const char *". Also, typically the "tx" argument comes last. 1791: if the feature is enabled, then dd must be zapified and have the counts present. You should make this more clear by using that is the condition, like: if (spa_feature_is_enabled(...)) { VERIFY0(zap_lookup(...)); fs_cnt++; ... } 1813, 1820: dsl_fs_ss_count_adjust() already has a fast path for delta==0, you should probably call it unconditionally here. dsl_dataset.c: 768: should be "it's" (= "it is") --matt > > > The previous webrev is still available at: > > > http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897-2/index.html > > This can be used to correlate back to the previous comments if necessary. > > > Given some of the issues with initializing the counts (detailed below), > would > > it be reasonable to simply initialize all of the counts (on all > > filesystems/zvols in the pool) when the feature is activated? Obviously > that > > has a performance impact, but no more than opening the pool (which > requires > > iterating over all filesystems/zvols to claim the ZIL log blocks). > > I kind of prefer the code the way it is. I believe I was able to easily > resolve > the issues below and with this implementation there is minimal impact, and > that occurs only when a limit is first being set. Also, we've done it this > way > since our first attempt at this at the 2012 hackathon, so, I'd rather keep > things using the current approach unless there is a fundamental problem > with > it. > > > It would be nice to avoid the extra txg_wait_synced associated with the > > dsl_dir_activate_fs_ss_limit() sync task whenever possible. For > example, if > > the dataset already has its counts present. You can do this by having > the > > check func return a nonzero value (e.g. EALREADY) if the counts are > present. > > Then in dsl_dir_activate_fs_ss_limit(), if it gets EALREADY, return 0. > (this > > comment only applies if you stick with the current method of > initializing on > > demand rather than all at once). Not a huge deal but something to > consider. > > Added a check for this and return EALREADY as suggested. > > > I think dsl_snapcount_check() and dsl_dir_fscount_check() can be combined > > into one common function, which takes the count and limit zfs_prop_t's as > > arguments. Perhaps keeping these 2 funcs as wrappers to pass in the > > zfs_prop_t's, if that's useful for convenience. This is probably true > for > > dsl_snapcount_adjust() and dsl_dir_fscount_adjust() as well. > > Yes, thanks. The code has evolved enough now that these were practically > the same. > I rewrote things to be common. The new function names are > dsl_fs_ss_limit_check > and dsl_fs_ss_count_adjust. > > > specific (minor) issues: > > > > zpool-features.5: the new feature now depends on the extensible dataset > > feature. > > fixed > > > zfs.1m: I don't see documentation of the new read-only properties > (*_count). > > This documentation should include information about when the property > will be > > present (i.e. only if a limit has been set) > > added > > > dsl_dir.c: > > > 70: seems like we could now detect an uninitialized count by absence of > the > > entry in the ZAP object. > > We do behave that way (see code in the newly named dsl_fs_ss_limit_check > function). I updated the comment here as part of the other changes made, so > this should be more accurate now. > > > 463: it would be nice to describe the primary purpose of this function > in the > > first sentence of the comment. I.e. to initialize the count (not to > check the > > counts). > > Changed the first sentence to make this clearer. > > > 536: I think this comment should be "Count my snapshots." (since we > counted > > the children snaps above) > > changed > > > 546: this comment makes me think that there's some other way to update > the > > counts, but that isn't the case, is it? > > No, I rewrote the comment to clarify > > > 555-556: the convention is to prefix the member names (in this case, the > > rather unwieldy: ddafsla_name, ddafsla_removing). It's annoying but at > least > > they are only used in a few places. > > This has been removed altogether, based on the next item. > > > 590-600: Do we need to do this sync task at all when removing? > > No. I rewrote things so it is only invoked when setting a limit. > > > 657: presumably this only applies to the *_limit props? Would be nice > for the > > function name to reflect that, and to assert it. Actually I wouldn't > include > > "secpolicy" in the name, because this function if very different from the > > secpolicy funcs in zfs_ioctl.c. How about "boolean_t > > dsl_enforce_ds_ss_limits"? > > changed > > > 697: I think "uint64_t cnt" should be "int64_t delta" (it's the change, > not > > the new count, but I guess we wouldn't bother checking for negative > deltas, > > so you could keep it uint64_t). > > Yes, we're always checking if its ok to increase, so I think what we have > now > is fine. > > > 707: is it possible that you are allowed to change the limit on a > filesystem, > > but not allowed to change the limit on an ancestor? If so, then it > seems like > > you still need to recursively check the ancestors. > > Yes, thanks for catching this. I changed dsl_enforce_ds_ss_limits to handle > this. > > > 720: and more pointedly, if the count is uninitialized, then there can > be no > > limit. > > changed the comment > > > 781: it's no longer necessary to dirty the dd_dbuf, since you aren't > modifying > > dd_phys (the zap_update() will take care of any necessary > dmu_buf_will_dirty() > > calls). > > removed > > > 1650: It's pretty unusual to be making on-disk changes from a checkfunc. > > I *think* it will work, but it at least deserves a comment. > > Added a comment and also placed in a conditional to address the next item. > > > 1661: Is it really possible for dd to not be zapified and have the counts > > present, given that we just called dsl_dir_init_fs_ss_count()? > Although, it > > seems like maybe you *shouldn't* be calling the init func, if the feature > > isn't enabled. And the init func should assert that the feature is > active. > > For the first question, if we're not syncing then dsl_dir_init_fs_ss_count > won't be called. I also changed the call to dsl_dir_init_fs_ss_count so > that > we first check if the feature is enabled and I added an assert in > dsl_dir_init_fs_ss_count to check that the feature is enabled. > > > 1823: why the special case for fs_cnt == 0, but not for ss_cnt? > > Falls out of my rewrite of the check code. > > > dsl_destroy.c: 665: if you relaxed the assertion in > dsl_dir_fscount_adjust(), > > you could simply always do "dsl_dir_fscount_adjust(dd, tx, -1)". Up to > you if > > that's worth it. > > This seems to me like its clearer the way it is, but I could change it if > you think it makes sense. > > > dsl_dataset.c: > > 1192: I don't think you should ignore errors here (e.g. wrong value > type). > > "cnt = fnvpair_value_uint64(pair);" will do what you want. > > fixed > > > dmu_send.c: > > > > 802-850: why do you pre-check the fs/ss limits when creating a new fs + > snap, > > but not when creating just a new snap? > > Fixed, added the check to recv_begin_check_existing_impl > > > 1735: why is the fscount adjusted explicitly here? I would have thought > it > > would be taken care of by dmu_recv_begin_sync()'s call to > > dsl_dataset_create_sync(). (if changed, also change/remove the call to > > fscount_check() in the checkfunc) Ah, I see, we adjust the fscount in > the > > callers (e.g. dmu_objset_clone_sync()). I think it would be better to > do it > > in dsl_dir_create_sync(). You already have the check for begins with > "%" in > > fscount_adjust(). That exception should ideally be implemented in as few > > places as possible. For example, in the current code if we abort the > receive > > (via dmu_recv_cleanup_ds()), you also need an exception to not decrement > the > > fscount. This seems error-prone. > > Moved the check. I think I tested all of the code paths through here but I > would appreciate any extra confirmation that things are ok. > > Let me know if there are any new comments on the latest code. > > Thanks, > Jerry > > > > On Wed, Oct 30, 2013 at 2:42 PM, Matthew Ahrens <[email protected]>wrote: > >> On Fri, Oct 18, 2013 at 7:58 AM, Jerry Jelinek >> <[email protected]>wrote: >> >>> It has been a few months since the last round of review for this new >>> feature. >>> >>> https://illumos.org/issues/3897 >>> >>> I have updated the code to address all of the previous comments >>> and (with Matt's help) rewritten it to use the new extensible dataset >>> feature instead of changing the on-disk format. >>> >>> There is a new webrev at: >>> >>> >>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897-2/index.html >>> >> >> Jerry, this code is looking very good. I appreciated the comments. I >> have some feedback below. Let me know if you have any questions or if >> you'd like help implementing (or at least prototyping) my suggested changes. >> >> High-level comments: >> >> Given some of the issues with initializing the counts (detailed below), >> would it be reasonable to simply initialize all of the counts (on all >> filesystems/zvols in the pool) when the feature is activated? Obviously >> that has a performance impact, but no more than opening the pool (which >> requires iterating over all filesystems/zvols to claim the ZIL log blocks). >> >> It would be nice to avoid the extra txg_wait_synced associated with the >> dsl_dir_activate_fs_ss_limit() sync task whenever possible. For example, >> if the dataset already has its counts present. You can do this by having >> the check func return a nonzero value (e.g. EALREADY) if the counts are >> present. Then in dsl_dir_activate_fs_ss_limit(), if it gets EALREADY, >> return 0. (this comment only applies if you stick with the current method >> of initializing on demand rather than all at once). Not a huge deal but >> something to consider. >> >> I think dsl_snapcount_check() and dsl_dir_fscount_check() can be combined >> into one common function, which takes the count and limit zfs_prop_t's as >> arguments. Perhaps keeping these 2 funcs as wrappers to pass in the >> zfs_prop_t's, if that's useful for convenience. This is probably true for >> dsl_snapcount_adjust() and dsl_dir_fscount_adjust() as well. >> >> specific (minor) issues: >> >> zpool-features.5: the new feature now depends on the extensible dataset >> feature. >> >> zfs.1m: I don't see documentation of the new read-only properties >> (*_count). This documentation should include information about when the >> property will be present (i.e. only if a limit has been set) >> >> dsl_dir.c: >> 70: seems like we could now detect an uninitialized count by absence of >> the entry in the ZAP object. >> >> 463: it would be nice to describe the primary purpose of this function in >> the first sentence of the comment. I.e. to initialize the count (not to >> check the counts). >> >> 536: I think this comment should be "Count my snapshots." (since we >> counted the children snaps above) >> >> 546: this comment makes me think that there's some other way to update >> the counts, but that isn't the case, is it? >> >> 555-556: the convention is to prefix the member names (in this case, the >> rather unwieldy: ddafsla_name, ddafsla_removing). It's annoying but at >> least they are only used in a few places. >> >> 590-600: Do we need to do this sync task at all when removing? >> >> 657: presumably this only applies to the *_limit props? Would be nice >> for the function name to reflect that, and to assert it. Actually I >> wouldn't include "secpolicy" in the name, because this function if very >> different from the secpolicy funcs in zfs_ioctl.c. How about "boolean_t >> dsl_enforce_ds_ss_limits"? >> >> 697: I think "uint64_t cnt" should be "int64_t delta" (it's the change, >> not the new count, but I guess we wouldn't bother checking for negative >> deltas, so you could keep it uint64_t). >> >> 707: is it possible that you are allowed to change the limit on a >> filesystem, but not allowed to change the limit on an ancestor? If so, >> then it seems like you still need to recursively check the ancestors. >> >> 720: and more pointedly, if the count is uninitialized, then there can be >> no limit. >> >> 781: it's no longer necessary to dirty the dd_dbuf, since you aren't >> modifying dd_phys (the zap_update() will take care of any necessary >> dmu_buf_will_dirty() calls). >> >> 1650: It's pretty unusual to be making on-disk changes from a checkfunc. >> I *think* it will work, but it at least deserves a comment. >> >> 1661: Is it really possible for dd to not be zapified and have the counts >> present, given that we just called dsl_dir_init_fs_ss_count()? Although, >> it seems like maybe you *shouldn't* be calling the init func, if the >> feature isn't enabled. And the init func should assert that the feature is >> active. >> >> 1823: why the special case for fs_cnt == 0, but not for ss_cnt? >> >> dsl_destroy.c: 665: if you relaxed the assertion in >> dsl_dir_fscount_adjust(), you could simply always do >> "dsl_dir_fscount_adjust(dd, tx, -1)". Up to you if that's worth it. >> >> dsl_dataset.c: >> 1192: I don't think you should ignore errors here (e.g. wrong value >> type). "cnt = fnvpair_value_uint64(pair);" will do what you want. >> >> dmu_send.c: >> >> 802-850: why do you pre-check the fs/ss limits when creating a new fs + >> snap, but not when creating just a new snap? >> >> 1735: why is the fscount adjusted explicitly here? I would have thought >> it would be taken care of by dmu_recv_begin_sync()'s call to >> dsl_dataset_create_sync(). (if changed, also change/remove the call to >> fscount_check() in the checkfunc) Ah, I see, we adjust the fscount in the >> callers (e.g. dmu_objset_clone_sync()). I think it would be better to do >> it in dsl_dir_create_sync(). You already have the check for begins with >> "%" in fscount_adjust(). That exception should ideally be implemented in >> as few places as possible. For example, in the current code if we abort >> the receive (via dmu_recv_cleanup_ds()), you also need an exception to not >> decrement the fscount. This seems error-prone. >> >> --matt >> >> >> >>> >>> >>> The previous webrev is still available at: >>> >>> >>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897/index.html >>> >>> Thanks, >>> Jerry >>> >>> >>> On Wed, Jul 24, 2013 at 5:00 PM, Matthew Ahrens <[email protected]>wrote: >>> >>>> Two high-level comments: >>>> >>>> 1. I'd like to avoid adding fields to dsl_dir_phys_t, >>>> dsl_dataset_phys_t, etc. if at all possible. The issue is that if another >>>> feature is developed independently and uses the same field (same offset in >>>> the struct) to mean something else, then illumos can't simultaneously >>>> support both features. >>>> >>>> We have features that would naturally use new fields in these structs, >>>> but realized that this would impact anyone developing another such feature, >>>> such as this one. Therefore I have implemented functionality to use the >>>> dsl_dir and dsl_dataset objects as zap objects, so that we can add new ZAP >>>> entries rather than adding fields to the structs. I can make that code >>>> available later this week for you to take a look at. I can also help you >>>> switch your code to use it, if you'd be willing to go that route. See >>>> below for another alternative. >>>> >>>> 2. It would be nice to expose the filesystem and snapshot counts to the >>>> user. This will help answer the question, "how close am I to my limit"? >>>> >>>> We could solve both these issues at once by storing the counts as >>>> properties (in the dd_props_zapobj). >>>> >>>> --matt >>>> >>>> >>>> On Mon, Jul 22, 2013 at 10:49 AM, Jerry Jelinek < >>>> [email protected]> wrote: >>>> >>>>> I'd like another code review for: >>>>> >>>>> 3897 zfs filesystem and snapshot limits >>>>> >>>>> There have already been a few code reviews for this last fall with the >>>>> last one being in January. I was in the process of finalizing the >>>>> responses >>>>> from that feedback in February when the big zfs re-write was pushed and I >>>>> didn't have any time to deal with that then. >>>>> >>>>> I have finally gotten a chance to update the code to work with the new >>>>> zfs design and to also finish addressing the previous comments. Since this >>>>> code is somewhat different from what I had previously, and since so much >>>>> time has passed, it is probably best to start over and treat this as a >>>>> fresh review. >>>>> >>>>> There is a webrev at: >>>>> >>>>> >>>>> http://us-east.manta.joyent.com/jjelinek/public/webrevs/illumos3897/index.html >>>>> >>>>> Thanks, >>>>> Jerry >>>>> >>>>> *illumos-developer* | >>>>> Archives<https://www.listbox.com/member/archive/182179/=now> >>>>> <https://www.listbox.com/member/archive/rss/182179/21175174-cd73734d>| >>>>> Modify<https://www.listbox.com/member/?member_id=21175174&id_secret=21175174-792643f6>Your >>>>> Subscription >>>>> <http://www.listbox.com> >>>>> >>>> >>>> >>> >> >
_______________________________________________ developer mailing list [email protected] http://lists.open-zfs.org/mailman/listinfo/developer
