meem-094 nwamd/dlpi_events.c'Throughout: libdlpi does not use `errno'
for errors. All of the dlpi-related log messages in this file are busted.
ACCEPT
meem-095 nwamd/dlpi_events.c'34: Unclear what net/route.h has to do with
this file. done
ACCEPT
meem-096 nwamd/dlpi_events.c'65: This `if' clause is needless: nwamd has
requested to only receive events of these two types. done
ACCEPT
meem-097 nwamd/dlpi_events.c'67: Use of 'IFF_UP' to have some
relationship to the link state seems wrong. Seems like it would be simpler to
change link_state.link_state to link_state.link_up and just have it be a
boolean. BUG
ACCEPT
meem-098 nwamd/dlpi_events.c'71: Shouldn't this message explain the
administrative impact? done
ACCEPT
meem-099 nwamd/dlpi_events.c'88-102: Unclear why this function is
allocating a buffer to receive into since it does nothing with this buffer ?
and even more unclear why this buffer is sized to DLPI_PHYSADDR_MAX. Seems like
this whole function should just be:
static void *
nwamd_dlpi_thread(void *arg)
{
int rc;
dlpi_handle_t *dh = arg;
for (;;) {
rc = dlpi_recv(*dh, NULL, NULL, NULL, NULL, -1, NULL);
if (rc != DLPI_SUCCESS) {
nlog(LOG_ERR, "nwam_dlpi_thread: dlpi_recv "
"failed: %s", dlpi_strerror(rc));
break;
}
}
return (NULL);
}
done
ACCEPT
meem-100 nwamd/dlpi_events.c'94: Bogus cast. done
ACCEPT
meem-101 nwamd/dlpi_events.c'99: Unclear where the magic notion of three
failures comes from; one would think any unexpected failure would be worth
stopping for.
ACCEPT
meem-102 nwamd/dlpi_events.c'100: presumably *dh was meant? bug
ACCEPT
meem-103 nwamd/dlpi_events.c'105: This function would be much easier to
read if a local variable was declared for ncu->ncu_node.u_link.
ACCEPT
meem-104 nwamd/dlpi_events.c'123: Can this happen? If so, what prevents
a situation where two thread simultaneously pass this check and race to both
call pthread_create()?
ACCEPT
meem-105 nwamd/dlpi_events.c'135, 182: Comments only tell me what is
clear already from the code.
ACCEPT
meem-106 nwamd/dlpi_events.c'136-146: dlpi_bind() is no longer needed to
use dlpi_enabnotify().
ACCEPT
meem-107 nwamd/dlpi_events.c'150: Sure seems like `strdup(name)' ends up
getting leaked here. Shouldn't this be NULL?
ACCEPT
meem-108 nwamd/dlpi_events.c'160: Presumably the intent is to record the
error value returned from pthread_create() in the nlog() message?
ACCEPT
seb-021 README: Is there room for an nwam dtrace provider that can help
track the various states of the objects maintained by nwamd to help with
debugging? The strategy for debugging problems could include "give me the
output of this dtrace script" instead of (or maybe in addition to) the current
"set the debug SMF property, enable debugging in syslog, and give me the output
of the syslog file".
That was discussed early in the project and the idea rejected due to
lack of time and familiarity by the developers.
seb-022 README'124: It would help to expand the NCP, NCU, and ENM
acronyms at least once in the README.
ACCEPT
seb-023 Makefile: For a daemon as central and critical to the system's
operation as nwamd, it should be compiled with CTF data so that it can be
debugged with dtrace and mdb. See the in.mpathd Makefile for an example of how
to do this.
ACCEPT
seb-024 objects.h'45-52: There are over 90 other declarations of
"object_name" in ON. Please use nwam-specific prefixes in nwam data structures
to help with navigation of the code.
ACCEPT
seb-025 objects.h'47,48: Use of "void *" for object data defeats
compiler type-checking and is not generally good practice. At a guess,
shouldn't object handle be of type "struct nwam_handle *"? object_data could be
pointer to a union or just a union (does object_data need to be allocated
separately?).
We changed one of these. The other is used to point at a variant
object with no commonality.
seb-026 ncu.h'70-99: Members need some identifying prefix. E.g., I have
no hope of getting at "flags" or "dhp" from cscope. Please go through all data
structures and rectify this where necessary.
ACCEPT
seb-027 main.c'71: There's not much point in making this a global
variable in main.c since it's only used in one place in conditions.c, but I'm
wondering why it's not initialized by an SMF property like other timer-related
intervals.
ACCEPT
seb-028 main.c'71: Related to wlan_scan_interval, if the user manually
selected a specific WLAN link, I would think that the daemon doesn't scan every
120 seconds. For some WiFi drivers, initiating a scan can have really strange
side-effects, including blocking data for extended periods of time and
experiencing link flaps. It would be nice to have a mode where the daemon
doesn't continually scan for no reason.
I'm going to file an RFE for this. We don't scan in the steady state.
seb-029 main.c'82: And that nice README that comes with the source...
ACCEPT
seb-030 main.c'143: This isn't used outside of main.c and should be
static. In accordance to csctyle guidelines, it also needs to go at the top of
the file with other globals.
Moved to the top of the file. It ends up being used in other files.
seb-031 main.c'170: refresh() calls nwamd_init_ncus() along with other
nwamd_init_<stuff>() functions. Do these init routines handle cleaning up the
existing objects before initializing? If they clean up, then why is
nwamd_fini_ncus() called on line 169? If they don't, then why don't we need to
call nwamd_fini_<stuff>() before the refresh() call on 173?
ACCEPT (need to call fini)
seb-032 main.c'172: I'm wondering why SIGHUP is any different from
SIGTHAW. Shouldn't both signals essentially result in re-creating the nwamd
universe from scratch as if it were starting anew?
ACCEPT
seb-033 main.c'242: There's no need for any of these local variables;
you can pass the addresses of the global variables into
nwamd_lookup_boolean_property().
ACCEPT mod need to keep scan_level_tmp, active_ncp_tmp
seb-034 main.c'288: Consider renaming to nwamd_refresh() for easier
cscope browsing.
ACCEPT
seb-035 main.c'347: I may be misunderstanding how this works, but what
is the point of enqueuing an event here if the daemon is immediately going to
exit() (the event might still be in the queue when that happens). Did I miss
something?
The idea is to run an event through the system letting the various
subsystems doing whatever they need to do in order to shutdown.
seb-036 main.c'414: I don't see how zonename is used anywhere. It
should probably be removed (along with nwamd_lookup_zonename()).
ACCEPT
seb-037 main.c'424: Using dladm_status2str() to include additional
information would be good.
ACCEPT
seb-038 main.c'444: What's this SOFT_RESET_FILE about? It doesn't look
like it's used anywhere.
ACCEPT
seb-039 main.c'480: door_initialize() should return an error code that
can be converted to a printable string passed to pfail(). Another approach
would be to have door_initialize() pfail() upon failure and have it return void
(which is what you do for other init routines).
ACCEPT
seb-040 main.c'481: What kind of error handling did you have in mind?
Killing the daemon seems perfectly reasonable since this looks like a fatal
error condition.
ACCEPT
seb-061 coditions.c'49: Yes, conditions are usually conditional.
ACCEPT
seb-062 coditions.c'94: Need block comment explaining what this does.
ACCEPT
seb-063 coditions.c'111,113: I don't grok this code. What does
NWAM_CONDITION_IS mean? This looks needlessly complicated. Why not just have a
function that returns the state of the object and have the caller do what it
wants with it...
The caller is just executing an operation (the condition) passed
through from the user. This is the common place that is understood.
seb-064 coditions.c'565: "ret" is not descriptive enough given that it
has deeper semantics. It represents the state of some object.
A block comment has been added to the file. I think that is better
then trying to compress that into a variable name.
seb-065 coditions.c'608: There should be a comment somewhere (perhaps
at the top of this file) explaining what CONDITIONAL_ANY and CONDITIONAL_ALL
mean. I'm really not sure what this is designed to do.
ACCEPT (block comment at top of file)
seb-066 dlpi_events.c'58,87: Please give these more distinguished names.
ACCEPT
seb-067 dlpi_events.c'67: IFF_UP is an IP interface flag; it is not a
link state. There is a link_state_t defined in <sys/mac.h> that you can use for
this purpose.
ACCEPT
seb-068 dlpi_events.c'82-84: This needs further clarification in the
comment. You're only interested in notifications, and notifications are only
processed in a pending dlpi_recv(). As such, you need to continuously call
dlpi_recv() even if you're not interested in any other DLPI messages.
ACCEPT
seb-069 dlpi_events.c'99: This needs some comments, as I'm not clear on
what it implies. Does the DLPI notification thread for this link simply vanish
if it gets three consecutive read errors? What happens to the rest of the
handling for that link if that happens?
ACCEPT
seb-070 dlpi_events.c'103: Need block comment.
ACCEPT
seb-071 dlpi_events.c'116: This can't possibly happen, right? Shouldn't
this be an assert()?
ACCEPT
seb-072 dlpi_events.c'131,144,156,167: It would be simpler and less
error-prone to declare a name string of appropriate size on the stack, and have
nwam_ncu_typed_name_to_name() (that's quite a mouthful) use it instead of
having it dynamically allocate memory.
ACCEPT
seb-073 dlpi_events.c'130,142,154: The use of errno in these locations
is a bug: dlpi_open() only sets errno if the error is DL_SYSERR. You should use
dlpi_strerror() on the returned libdlpi error code to print something
meaningful, it handles this cleanly.
ACCEPT
seb-084 events.c'476,509: Stashing the user context like this is
obsoleted by dtrace and mdb, and neither of those need to have debug enabled to
work. Can we lose this?
ACCEPT
seb-085 events.c'477: s/returned/return/
ACCEPT
seb-086 events.c'481,519,550,etc.: Since this isn't a
PTHREAD_MUTEX_ERRORCHECK mutex, failure of pthread_mutex_lock() is impossible.
ACCEPT
seb-087 loc.c'201: Need block comment: check what?
ACCEPT (block comment)
seb-088 loc.c'224,227: This would be simpler as:
err = nwam_value_get_uint64(activationval, &activation);
nwam_value_free(activationval);
if (err != NWAM_SUCCESS) {
nlog(...);
return (0);
}
ACCEPT
seb-089 loc.c'349: This doesn't look right since another thread could
be modifying active_loc.
ACCEPT
seb-090 ncu.c'51,1138: This doesn't look kosher at all. Basing
functionality on link names is not correct. There looks to be some architecture
missing here.
We don't have a way around this at the moment. Writing a bug to remove
this as soon as we do.
seb-091 ncu.c'191: Holding the link open is a problem for DR. This will
need to eventually be addressed. This is a more general problem than just the
DLPI interaction, obviously (I'm not sure how the existing rcm code interacts
with nwamd's configuration of IP interfaces). The overall NWAM interaction with
DR will need to be investigated fully in order for these two features to
interoperate. Please file an RFE to track this.
ACCEPT (filing an RFE)
seb-092 ncu.c'1427: It looks like some other thread can sneak in and
grab a reference to this object after it's been unlocked and before it's blown
away by nwamd_object_fini().
ACCEPT
seb-093 ncu.c'1513-1528: This could all be collapsed:
case NWAM_STATE_OFFLINE_TO_ONLINE:
case NWAM_STATE_ONLINE_TO_OFFLINE:
case NWAM_STATE_ONLINE:
nwamd_ncu_state_machine(event->event_object);
break;
There is a block comment on one of them and I'd like to allow for the
cases separate for documentation when needed.
seb-094 ncu.c'1545: Won't doing this after the object lock is dropped
result in potential out-of-order issues with these calls?
These calls relock the object and decide what to do.
seb-095 ncu_ip.c'89: This function leaks both "request" and "reply".
ACCEPT
seb-096 ncu_ip.c'271: I think this needs to be newh. Otherwise, you're
setting the broadcast on logical unit 0, and not the address that was created.
ACCEPT
seb-097 ncu_ip.c'540-582: The strategy used to figure out if a logical
interface is needed or if it's the first address seems error prone and
certainly not thread safe (although I'm not sure if thread safety is a concern
for this function). This seems to be needed as a side-effect of the distinction
between the newly introduced icfg_add_ipaddr() vs. icfg_set_addr() functions.
It should be possible to fix this by defining a single icfg_add_addr() function
that just does the right thing.
rewritten significantly, less complex and covers add/set distinction in
add_ip_address
On a related note, why is one _ipaddr() and the other _addr()?
We inherited libinetcfg. Ask the bad boy who wrote it?
seb-098 ncu_ip.c'700: There's a XXX here.
ACCEPT
seb-099 ncu_ip.c'722: A cleaner interface would be two separate
functions, a plumb function and an unplumb function. They can share the same
implementation if necessary, but at least the callers won't be saddled with a
huge ambiguous function name and needless boolean argument.
ACCEPT
seb-100 ncu_phys.c'69: The caller should pass in a buffer of size
LIFNAMSIZ; there's no reason for this function to dynamically allocate memory.
It could then return void and callers wouldn't need to handle failure.
ACCEPT
seb-101 ncu_phys.c'132: Similar to my comment on the plumb_unplumb()
function, this really should be two functions whose names are indicative of
what they do instead of one ambiguous one with a boolean argument that
indicates what it should do.
ACCEPT
seb-102 ncu_phys.c'185: There's a XXX here.
ACCEPT
seb-103 objects.c'195: pthread_rwlock_wrlock() can't possibly fail. If
it fails, then it's because there's a bug in your code and you might as well
abort().
ACCEPT
seb-104 objects.c'396: nwamd_object_fini() must not fail. It is called
in a context where failure is not an option (there is no way to recover from
such a failure). This must be rectified. It looks like every caller ignores the
return value, so it looks like someone already knew that this was wrong...
ACCEPT
seb-105 util.c'241: Isn't this a little too late? zone_check_datalink()
verifies if a link belongs to a running zone, but nwamd holds all links open
when it starts, which is before zones boot. As such, the zone that needs this
link won't be able to use it (a link's zone property can't be set if the link
is open). I don't see how this works. Maybe I'm not seeing the bigger picture.
How can an exclusive stack zone boot propertly while nwamd is running?
If the user wants to keep links available for a zone he needs to use a
User ncp. We only open links that are in our current ncp. For the
automatic ncp that is everything.
seb-106 util.c'521: A safer model (one less prone to memory leaks) is
one where the caller passes in an appropriately-sized buffer to store the
answer.
ACCEPT