Meem,

Thanks for your comments. See inline:

>  > RCM:                    Meem
>  > 
>  > External webrev:
>  > 
>  >        http://cr.grommit.com/~yun/webrev_uv_09_28
>  > 
>  > Internal webrev:
>  > 
>  > 
> http://jurassic.sfbay/net/forwar.prc/builds/yz147064/clearview-uv-review/webrev_review/
>  
>  > 
>  > Cscope:
>  > 
>  >        /net/forwar.prc/builds/yz147064/clearview-uv-review/
> 
> Here's my feedback on everything except network_rcm.c (which I'm still
> looking at).  Despite the volume of my feedback, I found the code to be
> quite good -- most of my feedback consists of nits, and even several of
> the more substantive issues exist in other RCM modules.  Still, I'm a
> picky bastard, and you asked for my review, so this is what you get ;-)
> 
> General comments:
> 
>         * The RCM namespace of SUNW_network/<linkid> seems a bit generic
>           and hard to understand -- perhaps SUNW_network could be
>           SUNW_datalink?  

Accept.

>           Also, I'm a bit concerned about the use of link
>           identifiers here, since e.g. cfgadm will output them (right edge
>           truncated to avoid 80-column wrap):
> 
>           cfgadm: Component system is busy, try again:
>                          Resource                               Info
>           --------------------------------------  ------------------
>           /devices/pci at 9,700000/pci at 2/network at 0   Network interface 
>           SUNW_network/1                          ce0 AGGRs: aggr0
>           SUNW_network/2                          aggr0 hosts IP add
>           SUNW_ip/10.8.57.204                     Plumbed IP Address
>            
>           What would be the impact of using link names instead?
> 
Note that all datalink configuration are based on linkids, so a rename-link 
should not affect the datalink datalink configuration 
offline/online/configure. But if we use link names, there would be races 
between the RCM operation and a "dladm rename-link" operation, and would 
cause unexpected result.

>           If (for some reason) linkids must be used at the link layer,
>           would it be possible to use link names at a higher level?  I'm
>           really uncomfortable with e.g. ip_rcm working with linkids.
> 
That is possible. But that would mean link rcm modules (for example, 
aggr_rcm), needs to populate a event to its consumer twice (one in the form 
of SUNW_datalink/<linkid>, another in the form of SUNW_network/<linkname>). 
Because its consumer could be at or above the link layer (for example, IP 
and VLAN).

I am not sure whether that implementation is clearer.

>         * Similar to the above: please be sure that link identifiers do
>           not appear in diagnostic messages the system administrator could
>           see.  (If possible, I'd prefer to avoid link identifiers in all
>           diagnostic messages).
> 
You mean aggr_log_err() etc. right? Sure. For debug information, I think we 
would want the link id, as that is the resource name get populated.

>         * I know it's common practice in the RCM code to assert() that the
>           input arguments are correct, but I cannot see the use in this.
>           For instance, code like this is just tedious:
> 
>           static int
>           vlan_resume(rcm_handle_t *hd, char *rsrc, id_t id, uint_t flags,
>               char **errorp, rcm_info_t ** depend_info)
>           {
>                   /* Guard against bad arguments */
>                   assert(hd != NULL);
>                   assert(rsrc != NULL);
>                   assert(id == (id_t)0);
>                   assert(errorp != NULL);
>                   assert(depend_info != NULL);
>           
>                   rcm_log_message(RCM_TRACE1, _("VLAN: resume(%s)\n"), rsrc);
>           
>                   return (RCM_SUCCESS);
>           }
> 
>           Please remove all of these asserts -- the modules should be able
>           to trust the framework to pass them correct arguments.
> 
Okay. I will remove for the aggr_rcm and vlan_rcm, which I added.

>         * There is no need to internationalize DEBUG and TRACE messages
>           (and you'll note that other modules like ip_rcm do not bother).
>           So, please remove the _()'s from the various DEBUG and TRACE
>           rcm_log_message() invocations.
> 
>         * Speaking of messages, your implementation of the aggr_log_err()
>           and vlan_log_err() functions already prefix everything with
>           AGGR: or VLAN:, so there's no need to include those strings in
>           the calls to those functions.
> 
Sure.
>         * It's really a shame that every RCM module has its own
>           implementation of cache_insert(), cache_remove() and so forth.
>           But that's well-established practice at this point, so I guess
>           I'll just have to live with it :-O
> 
>         * A number of functions have comments stating "Call with
>           cache_lock held".  I'd prefer the comment be removed and an
>           assert() added to the function in its place.  (I've also noted
>           the places I've found where we call them without holding
>           cache_lock in the specific comments below.)
> 
free_node() could be called without holding the lock. It is fine. See below.

>         * Many references to "succeed" should be "succeeded" (e.g.,
>           "VLAN: offline succeeded").
> 
Changed.
>         * Are we sure it's safe to call *_consumer_offline() while holding
>           cache_lock?  Seems like it should be OK, but I'm slightly
>           nervous about calling other modules (via rcm_request_offline())
>           while holding locks.
> 
Yes. It should be fine to hold locks. Note that it is not safe to hold the 
lock when calling rcm_notify_event(RCM_RESOURCE_LINK_NEW), as it could 
result another RCM_RESOURCE_LINK_NEW event.

>         * Occasionally, the vlan_cache and aggr_cache structure variables
>           use the name `probe' rather than `node' -- why?  What does
>           `probe' mean in this context?
> 
I don't know. ip_rcm and network_rcm sometimes use "probe". I changed to use 
"node" in vlan_rcm and aggr_rcm.

> vlan_rcm.c:
> 
>         * Please replace all use of "device" with "link" or "VLAN" as
>           appropriate.
> 
Accept.
>         * Would be nice to provide some high-level comments which explain
>           the code flow when a link with VLANs is brought online or
>           offline.  I'd be happy to work with you on these.
> 
Sure. I will talk to you offline.

>         * The `vlan_cache' structure seems misnamed -- this appears to
>           represent a link, not a VLAN.  Likewise, it seems like
>           CACHE_VLAN_* should really be CACHE_LINK_*.
> 
Changed to link_cache_t. Changed to CACHE_NODE_*.

>         * On a related note: some of the functions have confusing names.
>           For instance, the oddly-named vlan_offline_vlan() is really
>           semantically link_offline_vlan() (but e.g. vlan_update() is
>           properly named).
> 
The first vlan is really just a prefix.

>         * 68: dv_vlanid seems like a confusing name, since it's not the
>           VLAN id on the wire.  Seems like this should be dv_linkid
>           instead (see following comment).
Accept.
> 
>         * 69: dv_linkid appears to be set but never used.
> 
Accept.
>         * 70: dv_vid appears to be set but never used.
> 
Accept.
>         * 71: dv_force appears to be set but never used.
> 
Accept.
>         * 73: Could we consistently use an enum for flags, rather than
>           using a combination of uint32_t and bitfields?  (enums are
>           debugger-friendly.)
> 
I changed to a enum type , but it is still bitfield.

>         * 86-87: Move `{' onto previous line as per cstyle rules.
> 
Accept.
>         * 172: Second parameter should be 0, not NULL.
> 
Accept.
>         * 197: Unclear how this cache_free() would ever have any work to
>           do, since if vlan_register() was never called the cache will be
>           empty, and if it was called, then upon unregistration
>           vlan_unregister() will have already emptied the cache.  (And if
>           items were left in the cache, is it safe to remove them without
>           calling rcm_unregister_interest()?)  If the cache_free() call is
>           needless, then the cache_free() routine can be removed as well.
> 
The xxx_unregister() function does not seem to be called at all. So that I 
am not sure whether cache_free() could have work to do. But at this point, 
since the framework does not provide the rcm_handle, it is not possible to 
call rcm_unregister_interest() here.

It seems a framework issue.

>         * 257: Would be useful to print an RCM_ERROR diagnostic if
>           rcm_unregister_interest() fails.
> 
Accept.
>         * 268: s/Need to unregister/Unregister/
> 
Accept
>         * 311: Assuming this errno assignment is actually needed, it
>           should be done after the call to mutex_unlock() to ensure
>           mutex_unlock() doesn't change it.
> 
It is not needed. Removed.

>         * 317, 398: s/on associated/of associated/
> 
Accept
>         * 338: This message seems unnecessary given line 347.
>           
Accept
>         * 373, 522, 736, 800, 1348, 1358, 1436: Remove extra blank line.
Accept.
> 
>         * 390: Not sure I see the need to report a debug message when
>           vlan_online_vlan() succeeds.
> 
Removed.
>         * 405: vlan_consumer_online() failures deserve at least an
>           RCM_WARNING, not RCM_DEBUG.
> 
Accept.
>         * 416, 1194: Can you explain the need for the `break_on_failure'
>           flag?  Why not always try to online all VLANs?
> 
The reason was:

one is recovery after offline fails, break_on_failure  would be set to 
B_FALSE, and vlan_online_vlan() and vlan_consumer_online_common() will try 
best to online all VLANs and consumers.

Another is undo_offline() which is called by the framework. break_on_failure 
would be set to B_FALSE, and stop online operation once a failure occurs.

But I looked the cfgadm code again, it seems in the second case, the 
framework does not have any mechanism to report any failure of 
undo_offline(), and what it does is to try best to recover from a failed 
offline operation. I agree it should do the same and the first case.

I will make the change: ignore any failure occurs and online all the VLANs.

As a result, vlan_consumer_online(), vlan_online_vlan() will all return void 
instead of int.
>         * 422: Comment seems a bit strange -- a comment before the loop
>           of "Recreate all VLANs used by `node'" seems more fitting.
> 
Accept
>         * 424-444: Indentation would be cleaner with:
> 
>                 if (!(vlan->dv_flags & VLAN_OFFLINED))
>                         continue;
> 
Accept.
>         * 441: Would be simpler to `return (RCM_FAILURE)' here, remove
>           lines 446-448, and change line 450 to `return (RCM_SUCCESS)'
> 
This function has been changed to return void.

>         * 425-426: This function is sufficiently small that the nested
>           declarations add little value; please hoist to the top of the
>           function.
> 
Accept.
>         * 432, 1428: s/Prompt the warning/Print a warning/
> 
Accept.
>         * 443: Why is VLAN_OFFLINED cleared even if the dladm_vlan_up()
>           call failed?
> 
Accept.
>         * 454: Based on the callers, seems like vlan_offline_vlan() should
>           also take the CACHE_* state flag to set on node->vc_state.
> 
Accept.
>         * 464-466: Comment seems a bit strange -- a comment before the
>           loop of "Delete all VLANs used by `node'" seems more fitting.
> 
Accept.
>         * 467-481, 1323-1335: Indentation would be cleaner with:
> 
>                 if (vlan->dv_implicit)
>                         continue;
> 
Accept.
>         * 474: Would be simpler to `return (RCM_FAILURE)' here, remove
>           lines 484-486, and change line 487 to `return (RCM_SUCCESS)'
> 
Accept.
>         * 513, 648, 877: Explicitly test against NULL.
> 
Accept.
>         * 521: Why not directly assign to *usagep?
> 
Accept.
>         * 522: Can we drop cache_lock at this point instead? 
> 
Accept.
>         * 529: Why is *errorp set to NULL?
> 
It is set to NULL by the caller already. Removed.
>         * 572: Stray space between `**' and `depend_info'.
> 
Accept.
>         * 602, 1203, 1252: Would simplify code to use alloca() instead.
> 
Changed to declare rsrc as an array.

>         * 610: Under what situation do we get here without
>           VLAN_CONSUMER_OFFLINED being set?  (When some force mode is
>           enabled?)  It seems like this would mean that we're about to
>           remove a link that still has consumers actively using one or
>           more of its VLANs, which would be problematic.
> 
I don't think this case can happen - xxx_remove() should be called (by 
confrim_rcm()) when the device is already successfully detached. I could 
change the check to an assertion if you prefer that.

>         * 614: If rcm_notify_remove() fails part-way through, what
>           restores the state of the VLANs that were removed?
> 
No, the framework does not provide such API, and each rcm plugin usually 
just remove the cached node and populate the REMOVE event here. It is really 
hard to say what to do to recover this.

>         * 620, 1229, 1275: Remove unused `done:' labels.
> 
Accept.
>         * 635: Needless initialization of `rv'.
> 
Accept.
>         * 660: The ordering here seems strange -- shouldn't we defer the
>           removal of the node until after the consumers have agreed on
>           the remove?
> 
I don't think it matters.

>         * 661: This call to node_free() is done without holding cache_lock.
> 
It is already removed from the cache, so no need to hold the lock.

>         * 691, 699, 709, 719, 730: Given that vlan_log_err() calls
>           rcm_log_message(), why do we need to explicitly call
>           rcm_log_message()?  If it's just to provide the linkid, seems
>           like that could be addressed by changing vlan_log_err() to take
>           a linkid rather than a `node'.
> 
Accept.
>         * 705: Indenting would be cleaner here if you did:
> 
>                 if (strcmp(nvpair_name(nvp), RCM_NV_LINKID) != 0)
>                         continue;
> 
Accept.
>         * 721, 732: Why not continue and try to bring up the other VLANs?
> 
Sure. Changed.
>         * 785: First part of this expression has an extra set of parens.
> 
Accept.
>         * 793: bzero() here seems needless.
> 
Accept.
>         * 794: Prefer snprintf().
> 
Accept.
>         * 817, 819: Prefer strlcat().
> 
Accept.
>         * 818, 820: Remove needless braces.
> 
Accept.
>         * 855-864, 1066-1071: These would be more natural as `for' loops
>           -- though to keep the formatting decent, you may need to do
>           something like:
> 
>           probe = cache_head.vc_next;
>           for (; probe != &cache_tail; probe = probe->vc_next) {
>                 /* ... */
>           }
> 
Accept.
>         * 857: How can `vc_resource' be NULL here?
> 
Removed the check.
>         * 878: Needless comparison against NULL.
> 
Accept.
>         * 882: Needless assignment of `vlan'.
> 
Accept.
>         * 928: Needless ARGSUSED.
> 
Accept.
>         * 932, 1405: No need to cast `arg'.
> 
Accept.
>         * 936: No need to initialize `rsrc'.
> 
Accept.
>         * 948, 1415: s/infomation/information/ -- also, perhaps it
>           would be clearer as "cannot get vlan information for"?
> 
Accept.
>         * 975, 1002: Remove unhelpful comments.
> 
Accept.
>         * 992, 1022, 1131: Remove needless parens.
> 
Accept.
>         * 1004: This doesn't seem safe: we've already inserted `probe'
>           into the cache on line 987 -- and it has a reference to `rsrc'.
> 
I moved cache_insert() after allocation of "vlan" succeeds.

>         * 1022: What's the harm in just always clearing CACHE_VLAN_STALE?
>           Seems like `update_stale_state' isn't needed.
> 
Accept.
>         * 1042-1044: `vlan_update_arg_t arg = { hd, 0 };' would be simpler.
> 
Accept.
>         * 1085-1101: This seems more natural as
>           for (vlan = probe->vc_vlan; vlan != NULL; vlan = next)
> 
Accept.
>         * 1102: Seems like it would be simpler to introduce a `prnext'
>           variable and assign `prnext = probe->vc_next' here, then remove
>           lines 1110, 1117, and 1133.  It would also allow `freeit' to
>           be eliminated.
> 
Accept.
>         * 1164: As per previous comments, seems like vlan_log_err() should
>           take a linkid rather than a node (and print a link name).  Also,
>           a format like:
> 
>                 VLAN: vlan2: cannot offline consumers
> 
>           ... seems preferable to the current:
> 
>                 VLAN: cannot offline consumers(vlan2)
> 
>           ... but I realize the format you chose is well-established
>           practice with other RCM modules.
>
Like you said, I will follow the conventions and will not change the format.

>         * 1180: Remove needless cast.  Also, it's unclear why calloc() is
>           being used.
> 
Accept.
>         * 1191: Remove "/* ARRGSUSED */" (the extra R wasn't causing it to
>           do anything anyway ;-)'
> 
errorp is not used, so I will just fix the typo.

>         * 1245: No need to initialize `vlan'.
> 
Accept.
>         * 1281: I don't quite understand this XXX comment; if this is a new
>           VLAN, shouldn't we have already passed it along to consumers via
>           vlan_notify_event(link) -> vlan_configure(link) -> vlan_up(vlan)
>           -> dladm_vlan_up(vlan) -> vlan_notify_event(vlan)?  If not,
>           there seems to be a hole here in notifying consumers about
>           new VLANs.
> 
It is a left over from the old implementation. I will remove the comments.

>         * 1295: No need for `ret'.
> 
This has been changed as now vlan_consumer_online return void.

>         * 1328: s/Notify the RCM_RESOURCE_LINK_NEW event/
>                   Send RCM_RESOURCE_LINK_NEW events/
> 
Accept.
>         * 1363: Why `cached_name' rather than `rsrc'?
> 
Accept.
>         * 1381: s/consumers the new link/consumers of the new link/
> 
Accept.
>         * 1406, 1420-1424: Seems like it would be simpler to do:
> 
>                 if (vlan_attr.dv_linkid != vlan_up_argp->linkid)
>                         return (DLADM_WALK_CONTINUE);
> 
Accept.
>         * 1425: Shouldn't this message be at the top of the function?
> 
I don't like to print out the message for every persistent VLAN.

>         * 1461: CACHE_REFRESH is passed here, but it's ignored since
>           the handle is NULL.  But given that this codepath is only
>           reached via vlan_notify_event(), and that's already done a
>           cache_update(), it's probably OK to just pass CACHE_NO_REFRESH,
>           and then remove the hd != NULL check.
> 
I will add rcm_handle_t argument to the vlan_configure() function. Since 
vlan_configure() itself changes the list of active VLANs, I'd prefer to add 
refresh the cache here.

> aggr_rcm.c:
> 
>         * As with vlan_rcm.c, it would be nice to provide some high-level
>           comments which explain the code flow when an aggregation is
>           brought online or offline.  I'd be happy to work with you on
>           these.
> 
Sure. I will discuss this with you offline.

>         * Also as with vlan_rcm.c, the `aggr_cache' structure seems
>           misnamed -- this represents a link using an aggregation,
>           not an aggregation itself.
> 
I've changed it to link_cache.

>         * 61: Seems like dl_portinfo_t isn't really necessary, and that
>           we could remove it and change da_ports to be:
> 
>                 datalink_id_t              *da_portids;
> 
Accept.
>         * 75: da_policy appears to be set but never used.
> 
>         * 76: da_mac appears to be set but never used.
> 
>         * 77: da_lacp_mode appears to be set but never used.
> 
>         * 78: da_lacp_timer appears to be set but never used.
> 
>         * 79: da_mac_fixed appears to be set but never used.
> 
>         * 80: da_force appears to be set but never used.
> 
Removed all the above fields.

>         * 97-98: Move `{' onto previous line as per cstyle rules.
> 
Accept.

>         * 188, 193: Second parameter should be 0, not NULL.
> 
Accept.
>         * 221-227: As with vlan_rcm.c:197, it's unclear how this would
>           ever have any work to do -- and if it did have work to do, it
>           seems like it would need to call rcm_unregister_interest().
> 
The aggr_unregister() function does not seem to be called at all. So that I 
am not sure whether 221-227 could have work to do. But at this point, since 
the framework does not provide the rcm_handle, it is not possible to call 
rcm_unregister_interest() here.

>         * 230: Likewise, this call to aggr_list_free() seems unnecessary
>           given the call to aggr_list_free in aggr_unregister().
> 
See above.

>         * 333: Would be useful to print an RCM_ERROR diagnostic if
>           rcm_unregister_interest() fails.
> 
Accept.
>         * 348: s/Need to unregister/Unregister/
> 
Accept.
>         * 392: Assuming this errno assignment is actually needed, it
>           should be done after the call to mutex_unlock() to ensure
>           mutex_unlock() doesn't change it.
> 
No. I don't think it is needed.

>         * 403, 525, 572, 1404: Given the descriptive name of the function,
>           the is_only_port variable seems needless -- just inline the calls
>           into the `if' statements.
> 
Accept.
>         * 425, 478: These messages seem unnecessary.
> 
Accept.
>         * 461, 635, 1322, 1370, 1429, 1548: Remove extra blank line.
> 
Accept.
>         * 491: aggr_consumer_online() failures deserve at least an
>           RCM_WARNING, not RCM_DEBUG.
> 
Accept.
>         * 494, 497: Should this be "IP/VLAN" consumers?  (Or perhaps we
>           should just refrain from guessing what the consumers might be.)
> 
Changed "IP consumers" to "Consumers"

>         * 508: Based on the callers, seems like aggr_offline_port() would
>           benefit from taking a CACHE_* state flag to set on node->vc_state.
> 
Accept.
>         * 514, 558: Would be slightly simpler to declare a normal variable
>           rather than an array with one member -- e.g.:
> 
>                 port.lp_linkid = node->vc_linkid;
>                 status = dladm_aggr_remove(aggr->da_aggrid, 1, &port,
>                     DLADM_OPT_ACTIVE);
> 
Accept.
>         * 564: Indentation would be cleaner with:
> 
>                 if (!(node->vc_state & CACHE_AGGR_PORT_OFFLINED))
>                         return (RCM_SUCCESS);
> 
Accept.

>         * 577: What's the rationale behind setting *up to B_TRUE even
>           if dladm_aggr_up() fails?
> 
Fixed.

Just be noted that it was not a problem, as *up is not used in the failure case.

>         * 588, 1536: s/Prompt the warning/Print a warning/
> 
Accept.
>         * 626, 723: Explicitly test against NULL.
> 
Accept.
>         * 634: Why not directly assign to *usagep?
> 
Changed.
>         * 642: Why is *errorp set to NULL?
> 
It is set to NULL by the caller already. Removed.

>         * 655: Seems like we can drop cache_lock after the call to
>           aggr_usage().
> 
Agreed. Changed.

>         * 685: Stray space between `**' and `depend_info'.
> 
Accepted.

>         * 610: Under what situation do we get here without
>           CAHCE_AGGR_CONSUMER_OFFLINED being set?  As with VLANs, it 
>           seems odd to remove a link that still has consumers actively
>           using it.
> 
Did you mean line 710?

If the link is not the only port in the aggregation. Then 
CAHCE_AGGR_CONSUMER_OFFLINED is not set.

>         * 736: Don't we need to hold cache_lock here to safely access
>           vc_aggr?  (If not, why is it needed at line 1401?)
> 
No. We don't, because this node has already been removed from the cache list.

>         * 737: assert() seems needless; we'll crash two lines later.
> 
Accept.
>         * 738: What does the name `exported' mean here?
> 
It seems a common practice through out the existing rcm modules, which means 
the resource names which would populate the RCM events.

>         * 741: The ordering here seems strange -- shouldn't we defer the
>           removal of the node until after the consumers have agreed on
>           the remove?
> 
This does not matter. This node can be removed no matter whether the 
consumers agreed on the remove.

>         * 749: This call to node_free() is done without holding cache_lock.
> 
Again, this node is already removed from the cache list.

>         * 778, 787, 797, 806, 816: Given that aggr_log_err() calls
>           rcm_log_message(), why do we need to explicitly call
>           rcm_log_message()?  If it's just to provide the linkid, seems
>           like that could be addressed by changing aggr_log_err() to take
>           a linkid rather than a `node'.
> 
Accept.
>         * 795: Indenting would be cleaner here if you did:
> 
>                 if (strcmp(nvpair_name(nvp), RCM_NV_LINKID) != 0)
>                         continue;
> 
Accept.
>         * 852: The plural "AGGRs" seems misleading, since a port can only
>           be in one aggregation.  Maybe a message like "ce0 is part of
>           aggregation aggr0" would be clearer?  (Or, if we'll only output
>           this message when ce0 is the last port in the aggregation, we
>           could be more explicit.)
> 
Accept.
>         * 862: No separators are used here.
> 
Removed.
>         * 863: Why is LINKID_STR_WIDTH used here?  We output a link name,
>           so it seems like it should be DLPI_LINKNAME_MAX.
> 
Accept.

>         * 870: bzero() here seems needless.
> 
Accept.
>         * 871: Prefer snprintf().
> 
Accept.
>         * 889: Prefer strlcat().
> 
Accept.
>         * 916: cache_lookup() is never called with hd == NULL.
> 
Accept.
>         * 925:  How can `vc_resource' be NULL here?

I agree. Removed the check.

>         * 925-932, 1168-1172, 1175-1179: These would be more natural 
>           as `for' loops.
> 
Accept.
>         * 943: assert() seems needless; we'll crash on the next line.
> 
Accept.
>         * 976: aggr_port_update() needs a comment (or better yet an
>           assert()) that cache_lock is held.
>         
Accept.
>         * 1005, 1122, 1215: Remove needless parens.
>           
Accept.
>         * 1021: Remove needless comment.
> 
Accept.
>         * 1049: Needless ARGSUSED.
> 
Accept.
>         * 1053, 1496: No need to cast arg (and in the latter case, the
>           assignment can be hoisted to line 1488).
> 
Accept.
>         * 1069: s/infomation/information -- also, perhaps it would be
>           clearer as "cannot get aggr information for"?
> 
Accept.
>         * 1100: If this calloc() fails and `exist' is true, we end up
>           leaving an aggr in aggr_head with a NULL da_ports which will
>           e.g. later cause problems in aggr_is_only_port().  

I changed cache_update() to continue to remove the staled node even 
aggr_update_all() fails.

Therefore, no matter what fails, aggr_update_all() would try to walk through 
all the existing active aggregations and update the cache and aggr list. If 
"exist" is B_TRUE, the aggr would still be marked as AGGR_STALE and will 
later be freed in cache_update().

>           Same with the
>           logic on lines 1115-1117.  If `exist' is false, then we end up
>           freeing aggr even though it's currently inserted on the list via
>           the aggr_list_insert() on line 1092.
> 
         if (!exist)
                 aggr_list_insert(aggr);

         aggr->da_flags &= ~AGGR_STALE;


>         * 1121: Seems simpler to always clear AGGR_STALE.
> 
Accept.
>         * 1128: No need to check if aggr != NULL.
> 
Accept.
>         * 1130, 1551: The notion that dladm_aggr_info() partially returns
>           dynamically-allocated memory is ugly (though this is not related
>           to you changes ;-).
> 
>         * 1145-1147: `aggr_update_arg_t arg = { hd, 0 };' would be simpler.
> 
Accept.

>         * 1188: Seems like it would be simpler to introduce a `prnext'
>           variable and assign `prnext = probe->vc_next' here, then remove
>           lines 1195, 1202, and 1217.  It would also allow `freeit' to
>           be eliminated.
> 
Accept, and change while() to for() loop.

>         * 1224: s/clear/delete/
> 
Accept.
>         * 1244: As per previous comments, seems like aggr_log_err() could
>           take a linkid rather than a node (and print a link name).  Same
>           comments also apply on the format of the messages.
> 
For the rcm_log_message() part of this function, I think we still should 
print a linkid. Otherwise, I accept your comments.

>         * 1260: Remove needless cast.  Also, it's unclear why calloc() is
>           being used.
> 
Accept
>         * 1278-1281: Would be simpler as:
> 
>           return (aggr->da_nports == 1 && aggr->da_ports->dpi_portid == 
> portid);
>  
Accept.
>         * 1293, 1338: `aggr' variable seems of limited use.
> 
Accept.
>         * 1294, 1339: No need to initialize `rsrc'.
> 
Accept.
>         * 1301, 1352: Would simplify code to use alloca() instead.
> 
Accept. I changed rsrc as an array.

>         * 1314-1323, 1362-1371: No real need for the `goto done' here,
>           especially if you change the code to use alloca().
> 
Accept.
>         * 1378: s/Notify the RCM_RESOURCE_LINK_NEW event/
>                   Send RCM_RESOURCE_LINK_NEW events/
> 
Accept.
>         * 1443: Why `cached_name' rather than `rsrc'?
> 
Changed.
>         * 1501: I'm a bit confused on the semantics of `retval' -- seems
>           like it provides the status of the last aggregation to be
>           examined, which doesn't seem like what we want.
> 
>         * 1553: If the configuration was successful and a port can only
>           be in one aggregation, why do we return DLADM_WALK_CONTINUE?
> 
I've made some change based on your above comments. Please take a look once 
I send out the updated webrev.

> ip_rcm.c:
> 
>         * Since this is IP configuration, it seems strange (if not simply
>           wrong) to have link identifiers here.  Is there a specific
>           reason why link/interface names can no longer be used?
> 
See above.
>         * 822: Remove extra blank line.
> 
Accept.

>         * 1470: `ifname' seems more appropriate than `link' here.
> 
Accept.
>         * 1937-1940: When would `link' be NULL here?
> 
It is not possible. I've removed those.

I will do a self review and send an updated webrev after that. If you are 
interested, here is the webrev for now:

     /net/aquila.prc/export/home/cathy/clearview/webrev_rcm

Thanks
- Cathy

Reply via email to