meem-094        nwamd/dlpi_events.c'Throughout: libdlpi does not use `errno' 
for errors. All of the dlpi-related log messages in this file are busted.

ACCEPT

meem-095        nwamd/dlpi_events.c'34: Unclear what net/route.h has to do with 
this file.      done

ACCEPT

meem-096        nwamd/dlpi_events.c'65: This `if' clause is needless: nwamd has 
requested to only receive events of these two types.    done

ACCEPT

meem-097        nwamd/dlpi_events.c'67: Use of 'IFF_UP' to have some 
relationship to the link state seems wrong. Seems like it would be simpler to 
change link_state.link_state to link_state.link_up and just have it be a 
boolean.    BUG

ACCEPT

meem-098        nwamd/dlpi_events.c'71: Shouldn't this message explain the 
administrative impact?       done

ACCEPT

meem-099        nwamd/dlpi_events.c'88-102: Unclear why this function is 
allocating a buffer to receive into since it does nothing with this buffer ? 
and even more unclear why this buffer is sized to DLPI_PHYSADDR_MAX. Seems like 
this whole function should just be:

           static void *
           nwamd_dlpi_thread(void *arg)
           {
                int rc;
                dlpi_handle_t *dh = arg;

                for (;;) {
                        rc = dlpi_recv(*dh, NULL, NULL, NULL, NULL, -1, NULL);
                        if (rc != DLPI_SUCCESS) {
                                nlog(LOG_ERR, "nwam_dlpi_thread: dlpi_recv "
                                    "failed: %s", dlpi_strerror(rc));
                                break;
                        }
                }
                return (NULL);
           }

        done

ACCEPT

meem-100        nwamd/dlpi_events.c'94: Bogus cast.     done

ACCEPT

meem-101        nwamd/dlpi_events.c'99: Unclear where the magic notion of three 
failures comes from; one would think any unexpected failure would be worth 
stopping for.

ACCEPT

meem-102        nwamd/dlpi_events.c'100: presumably *dh was meant?      bug

ACCEPT

meem-103        nwamd/dlpi_events.c'105: This function would be much easier to 
read if a local variable was declared for ncu->ncu_node.u_link.

ACCEPT

meem-104        nwamd/dlpi_events.c'123: Can this happen? If so, what prevents 
a situation where two thread simultaneously pass this check and race to both 
call pthread_create()?

ACCEPT

meem-105        nwamd/dlpi_events.c'135, 182: Comments only tell me what is 
clear already from the code.

ACCEPT

meem-106        nwamd/dlpi_events.c'136-146: dlpi_bind() is no longer needed to 
use dlpi_enabnotify().

ACCEPT

meem-107        nwamd/dlpi_events.c'150: Sure seems like `strdup(name)' ends up 
getting leaked here. Shouldn't this be NULL?

ACCEPT

meem-108        nwamd/dlpi_events.c'160: Presumably the intent is to record the 
error value returned from pthread_create() in the nlog() message?

ACCEPT

seb-021         README: Is there room for an nwam dtrace provider that can help 
track the various states of the objects maintained by nwamd to help with 
debugging? The strategy for debugging problems could include "give me the 
output of this dtrace script" instead of (or maybe in addition to) the current 
"set the debug SMF property, enable debugging in syslog, and give me the output 
of the syslog file".

That was discussed early in the project and the idea rejected due to
lack of time and familiarity by the developers.

seb-022         README'124: It would help to expand the NCP, NCU, and ENM 
acronyms at least once in the README.

ACCEPT

seb-023         Makefile: For a daemon as central and critical to the system's 
operation as nwamd, it should be compiled with CTF data so that it can be 
debugged with dtrace and mdb. See the in.mpathd Makefile for an example of how 
to do this.

ACCEPT

seb-024         objects.h'45-52: There are over 90 other declarations of 
"object_name" in ON. Please use nwam-specific prefixes in nwam data structures 
to help with navigation of the code.

ACCEPT

seb-025         objects.h'47,48: Use of "void *" for object data defeats 
compiler type-checking and is not generally good practice. At a guess, 
shouldn't object handle be of type "struct nwam_handle *"? object_data could be 
pointer to a union or just a union (does object_data need to be allocated 
separately?).

We changed one of these.  The other is used to point at a variant
object with no commonality.

seb-026         ncu.h'70-99: Members need some identifying prefix. E.g., I have 
no hope of getting at "flags" or "dhp" from cscope. Please go through all data 
structures and rectify this where necessary.

ACCEPT

seb-027         main.c'71: There's not much point in making this a global 
variable in main.c since it's only used in one place in conditions.c, but I'm 
wondering why it's not initialized by an SMF property like other timer-related 
intervals.

ACCEPT

seb-028         main.c'71: Related to wlan_scan_interval, if the user manually 
selected a specific WLAN link, I would think that the daemon doesn't scan every 
120 seconds. For some WiFi drivers, initiating a scan can have really strange 
side-effects, including blocking data for extended periods of time and 
experiencing link flaps. It would be nice to have a mode where the daemon 
doesn't continually scan for no reason.

I'm going to file an RFE for this.  We don't scan in the steady state.

seb-029         main.c'82: And that nice README that comes with the source...

ACCEPT

seb-030         main.c'143: This isn't used outside of main.c and should be 
static. In accordance to csctyle guidelines, it also needs to go at the top of 
the file with other globals.

Moved to the top of the file.  It ends up being used in other files.

seb-031         main.c'170: refresh() calls nwamd_init_ncus() along with other 
nwamd_init_<stuff>() functions. Do these init routines handle cleaning up the 
existing objects before initializing? If they clean up, then why is 
nwamd_fini_ncus() called on line 169? If they don't, then why don't we need to 
call nwamd_fini_<stuff>() before the refresh() call on 173?

ACCEPT (need to call fini)

seb-032         main.c'172: I'm wondering why SIGHUP is any different from 
SIGTHAW. Shouldn't both signals essentially result in re-creating the nwamd 
universe from scratch as if it were starting anew?

ACCEPT

seb-033         main.c'242: There's no need for any of these local variables; 
you can pass the addresses of the global variables into 
nwamd_lookup_boolean_property().

ACCEPT mod need to keep scan_level_tmp, active_ncp_tmp

seb-034         main.c'288: Consider renaming to nwamd_refresh() for easier 
cscope browsing.

ACCEPT

seb-035         main.c'347: I may be misunderstanding how this works, but what 
is the point of enqueuing an event here if the daemon is immediately going to 
exit() (the event might still be in the queue when that happens). Did I miss 
something?

The idea is to run an event through the system letting the various
subsystems doing whatever they need to do in order to shutdown.

seb-036         main.c'414: I don't see how zonename is used anywhere. It 
should probably be removed (along with nwamd_lookup_zonename()).

ACCEPT

seb-037         main.c'424: Using dladm_status2str() to include additional 
information would be good.

ACCEPT

seb-038         main.c'444: What's this SOFT_RESET_FILE about? It doesn't look 
like it's used anywhere.

ACCEPT

seb-039         main.c'480: door_initialize() should return an error code that 
can be converted to a printable string passed to pfail(). Another approach 
would be to have door_initialize() pfail() upon failure and have it return void 
(which is what you do for other init routines).

ACCEPT

seb-040         main.c'481: What kind of error handling did you have in mind? 
Killing the daemon seems perfectly reasonable since this looks like a fatal 
error condition.

ACCEPT

seb-061         coditions.c'49: Yes, conditions are usually conditional.

ACCEPT

seb-062         coditions.c'94: Need block comment explaining what this does.

ACCEPT

seb-063         coditions.c'111,113: I don't grok this code. What does 
NWAM_CONDITION_IS mean? This looks needlessly complicated. Why not just have a 
function that returns the state of the object and have the caller do what it 
wants with it...

The caller is just executing an operation (the condition) passed
through from the user.  This is the common place that is understood.

seb-064         coditions.c'565: "ret" is not descriptive enough given that it 
has deeper semantics. It represents the state of some object.

A block comment has been added to the file.  I think that is better
then trying to compress that into a variable name.

seb-065         coditions.c'608: There should be a comment somewhere (perhaps 
at the top of this file) explaining what CONDITIONAL_ANY and CONDITIONAL_ALL 
mean. I'm really not sure what this is designed to do.

ACCEPT (block comment at top of file)

seb-066         dlpi_events.c'58,87: Please give these more distinguished names.

ACCEPT

seb-067         dlpi_events.c'67: IFF_UP is an IP interface flag; it is not a 
link state. There is a link_state_t defined in <sys/mac.h> that you can use for 
this purpose.

ACCEPT

seb-068         dlpi_events.c'82-84: This needs further clarification in the 
comment. You're only interested in notifications, and notifications are only 
processed in a pending dlpi_recv(). As such, you need to continuously call 
dlpi_recv() even if you're not interested in any other DLPI messages.

ACCEPT

seb-069         dlpi_events.c'99: This needs some comments, as I'm not clear on 
what it implies. Does the DLPI notification thread for this link simply vanish 
if it gets three consecutive read errors? What happens to the rest of the 
handling for that link if that happens?

ACCEPT

seb-070         dlpi_events.c'103: Need block comment.

ACCEPT

seb-071         dlpi_events.c'116: This can't possibly happen, right? Shouldn't 
this be an assert()?

ACCEPT

seb-072         dlpi_events.c'131,144,156,167: It would be simpler and less 
error-prone to declare a name string of appropriate size on the stack, and have 
nwam_ncu_typed_name_to_name() (that's quite a mouthful) use it instead of 
having it dynamically allocate memory.

ACCEPT

seb-073         dlpi_events.c'130,142,154: The use of errno in these locations 
is a bug: dlpi_open() only sets errno if the error is DL_SYSERR. You should use 
dlpi_strerror() on the returned libdlpi error code to print something 
meaningful, it handles this cleanly.

ACCEPT

seb-084         events.c'476,509: Stashing the user context like this is 
obsoleted by dtrace and mdb, and neither of those need to have debug enabled to 
work. Can we lose this?

ACCEPT

seb-085         events.c'477: s/returned/return/

ACCEPT

seb-086         events.c'481,519,550,etc.: Since this isn't a 
PTHREAD_MUTEX_ERRORCHECK mutex, failure of pthread_mutex_lock() is impossible.

ACCEPT

seb-087         loc.c'201: Need block comment: check what?

ACCEPT (block comment)

seb-088         loc.c'224,227: This would be simpler as:

  err = nwam_value_get_uint64(activationval, &activation);
  nwam_value_free(activationval);
  if (err != NWAM_SUCCESS) {
        nlog(...);
        return (0);
  }


ACCEPT

seb-089         loc.c'349: This doesn't look right since another thread could 
be modifying active_loc.

ACCEPT

seb-090         ncu.c'51,1138: This doesn't look kosher at all. Basing 
functionality on link names is not correct. There looks to be some architecture 
missing here.

We don't have a way around this at the moment.  Writing a bug to remove
this as soon as we do.

seb-091         ncu.c'191: Holding the link open is a problem for DR. This will 
need to eventually be addressed. This is a more general problem than just the 
DLPI interaction, obviously (I'm not sure how the existing rcm code interacts 
with nwamd's configuration of IP interfaces). The overall NWAM interaction with 
DR will need to be investigated fully in order for these two features to 
interoperate. Please file an RFE to track this.

ACCEPT (filing an RFE)

seb-092         ncu.c'1427: It looks like some other thread can sneak in and 
grab a reference to this object after it's been unlocked and before it's blown 
away by nwamd_object_fini().

ACCEPT

seb-093         ncu.c'1513-1528: This could all be collapsed:

  case NWAM_STATE_OFFLINE_TO_ONLINE:
  case NWAM_STATE_ONLINE_TO_OFFLINE:
  case NWAM_STATE_ONLINE:
        nwamd_ncu_state_machine(event->event_object);
        break;

There is a block comment on one of them and I'd like to allow for the
cases separate for documentation when needed.

seb-094         ncu.c'1545: Won't doing this after the object lock is dropped 
result in potential out-of-order issues with these calls?

These calls relock the object and decide what to do.

seb-095         ncu_ip.c'89: This function leaks both "request" and "reply".

ACCEPT

seb-096         ncu_ip.c'271: I think this needs to be newh. Otherwise, you're 
setting the broadcast on logical unit 0, and not the address that was created.

ACCEPT

seb-097         ncu_ip.c'540-582: The strategy used to figure out if a logical 
interface is needed or if it's the first address seems error prone and 
certainly not thread safe (although I'm not sure if thread safety is a concern 
for this function). This seems to be needed as a side-effect of the distinction 
between the newly introduced icfg_add_ipaddr() vs. icfg_set_addr() functions. 
It should be possible to fix this by defining a single icfg_add_addr() function 
that just does the right thing.

rewritten significantly, less complex and covers add/set distinction in 
add_ip_address

On a related note, why is one _ipaddr() and the other _addr()?

We inherited libinetcfg.  Ask the bad boy who wrote it?

seb-098         ncu_ip.c'700: There's a XXX here.

ACCEPT

seb-099         ncu_ip.c'722: A cleaner interface would be two separate 
functions, a plumb function and an unplumb function. They can share the same 
implementation if necessary, but at least the callers won't be saddled with a 
huge ambiguous function name and needless boolean argument.

ACCEPT

seb-100         ncu_phys.c'69: The caller should pass in a buffer of size 
LIFNAMSIZ; there's no reason for this function to dynamically allocate memory. 
It could then return void and callers wouldn't need to handle failure.

ACCEPT

seb-101         ncu_phys.c'132: Similar to my comment on the plumb_unplumb() 
function, this really should be two functions whose names are indicative of 
what they do instead of one ambiguous one with a boolean argument that 
indicates what it should do.

ACCEPT

seb-102         ncu_phys.c'185: There's a XXX here.

ACCEPT

seb-103         objects.c'195: pthread_rwlock_wrlock() can't possibly fail. If 
it fails, then it's because there's a bug in your code and you might as well 
abort().

ACCEPT

seb-104         objects.c'396: nwamd_object_fini() must not fail. It is called 
in a context where failure is not an option (there is no way to recover from 
such a failure). This must be rectified. It looks like every caller ignores the 
return value, so it looks like someone already knew that this was wrong...

ACCEPT

seb-105         util.c'241: Isn't this a little too late? zone_check_datalink() 
verifies if a link belongs to a running zone, but nwamd holds all links open 
when it starts, which is before zones boot. As such, the zone that needs this 
link won't be able to use it (a link's zone property can't be set if the link 
is open). I don't see how this works. Maybe I'm not seeing the bigger picture. 
How can an exclusive stack zone boot propertly while nwamd is running?

If the user wants to keep links available for a zone he needs to use a
User ncp.  We only open links that are in our current ncp.  For the
automatic ncp that is everything.

seb-106         util.c'521: A safer model (one less prone to memory leaks) is 
one where the caller passes in an appropriately-sized buffer to store the 
answer.

ACCEPT

Reply via email to