Hi Sarah,
Thank you for getting these too me! My comments in-line and
webrevs at:
Full:
http://cr.opensolaris.org/~clayb/webrev.multihomed/sarah/
Differential:
http://cr.opensolaris.org/~clayb/webrev.multihomed/sarah/diff
Thank you,
Clay
On Mon, 19 Jul 2010, Sarah Jelinek wrote:
Hi Clay,
I used this webrev location for my review:
http://cr.opensolaris.org/~clayb/webrev.multihomed/john
I tried not to repeat other's comments...I reviewed all files, only noted
those for which I had comments.
thanks,
sarah
******
ai_sd.py:
404 # dhcpinfo(1) should return a string akin to '192.168.5.1\n'
405 svc_address = dhcp_stdout.strip()
406
407 # check if we got an error or if we got no useful response
408 if dhcp_stderr or not svc_address:
409 AISD_LOG.post(AILog.AI_DBGLVL_ERR,
410 "Couldn't find BootSrvA record in DHCP;
error %s",
411 dhcp_stderr)
412 return 2
Wouldn't it be better to check for dhcp_stderr first, before trying to strip
out the address?
It could be; but either way, strip() won't raise an exception and if
svc_address is blank after the strip, that's a problem just the same as if
we have some standard error output.
412 return 2
What does a '2' return mean to the caller?
Interestingly, 2 is usually an option error[1] when using OptParse in
Python, however, ai_sd.py is only called by manifest-locator.sh and the
return code checked against non-zero. I don't see a page in Solaris like
FreeBSD's sysexits(3)[2] to take guidance from either.
No mention is made as to what 2 means, however ai_sd.py returns it for a
couple of reasons:
No valid AI service found
Couldn't open %s for saving service list
I would be returning 2 for the following extra reason:
Unable to retrieve BootSrvA option from DHCP server: %s
By the way, the script uses -1 for the following (exception
catching reasons):
Popen() raised OSError exception
Popen() raised ValueError exception
(For an uncaught issue where not svc_info_found and not
svc_txt_rec_found)
I'm not sure that 2 makes sense, it seems all places where 2 is returned
should be 1 or someother less mysterious number?
[1]: From http://docs.python.org/library/optparse.html
In either case, optparse handles the error the same way: it prints the
program?s usage message and an error message to standard error and exits
with error status 2.
[2]: http://www.freebsd.org/cgi/man.cgi?query=sysexits&sektion=3
installadm-common.sh:
97 #<get all addresses> |
498 #<remove IPv6> |<remove lo0> |
499 #<remove unconfigured>
Why do you remove ipv6 addresses here?
At this time, installadm(1) does no handling of IPv6 interfaces, being
both GRUB and wanboot are IPv4 only.
# exclude will be true or false depending on if we are excluding
557 # the networks configured in all_services/networks
558 exclude=$($SVCPROP -cp all_services/exclude_networks
$SMF_INST_SERVER)
559
Why is the exclude list different than the all_services/networks list? Not
saying it's wrong, just wondering why it is this way and why we have to
traverse the networks list twice to get the final list.
I'm confused here. The exclude property is a boolean so either true or
false; the networks property is a list of networks to either exclude or
include.
584 for octet in $net; do
585 # XXX I must be missing a better way to do
s/^0*//
586 octet=${octet#0}
587 stripped_net="${stripped_net}${octet#0}."
588 done
Can't you use the ${parameter//pattern/string} capabilities of the scripting
to do this substitution?
The XXX comment should be removed if not better way exists.
Yes, this was just a flag to code reviewers.
A better non-looping way was implemented when hacking away for some
changes Joe requested. I found that the following works in a single
operation. (Notice the dots to see where each octet is, then see the
leading optional *(0) pattern for matching leading zeros, followed by the
required +([0-9]) pattern which mandates at least one zero or 1-9 to keep
at least one digit retained however, thanks to the + operator the pattern
will expand if all three values of the octet are in use (i.e. one zero of
padding or no-padding). Lastly the expression back substitutes each of
these required patterns to retain the data of the IP address.
775 print
${ip/*(0)+([0-9]).*(0)+([0-9]).*(0)+([0-9]).*(0)+([0-9])/\2.\4.\6.\8}
637 if (is_multihomed() != B_FALSE) {
I would think it would read better as "if is_multihomed == B_TRUE) {...
Joe felt that way too, so it's now == B_TRUE.
setup-dhcp.sh:
changes to add_macro for updating as opposed to overwriting.. Is there a bug
related to this change? It wasn't obvious to me in the bugs listed. Just
wondering which piece covers this change.
This is an artifact of needing to work with network macros now for
multihomed servers. As we need to provide a different BootFile or BootSrvA
IP address for each subnet. When modifying these network macros, it
doesn't make sense to stomp them and overwrite any specific customizations
which an administrator would have in the network macro (like a router,
netmask, etc.).
In a discussion with Ethan, he's since pointed out we should print out the
instructions to update the macro if an administrator already has a
BootSrvA or BootFile record in their macro which conflicts with our
proposed change. This way, the administrator can make their own decision
as to the corrective action to take.
112 for option in $value; do
113 # ignore blank matches (the leading
colon
114 # in the macro string)
115 if [[ -z "$option" ]]; then
116 continue
117 fi
118 IFS="$OIFS"
119
option=${option/\$boot_file/$boot_file}
120 $DHTADM -M -m $name -e $option
121 IFS=":"
Is it possible that an option specified would be invalid? If so, do we care?
Or will it fail in another place?
It is possible that the function could be called with a bogus option. If
it is, dhtadm(1) will error out and the error will be printed on the
user's screen. Further, it is possible that one could embed a colon in an
option's value (like the case with BOOTFILE which is specially handled),
in that case the macros will not be properly separated, so I've added the
following comment on line 99 to warn an unwarray code modifier in the
future before this is Python-ized:
# this is necessary as colon's embedded in a macro
# value will cause the for loop below to split on
# the option value instead of on the option boundary
setup-service.sh:
115 # see if the machine has more than one interface up or no
interfaces up
116 if (( $(ipadm show-addr -p -o ADDR|\
117 grep -v ':'|grep -v '127.0.0.1' |grep -v '0.0.0.0'|\
118 wc -l) != 1 )); then
119 # for multi-homed AI servers use the "$serverIP"
keyword
120 ip="\$serverIP"
This code block is repeated in installadm-common, can we create one function
and use it?
Yes, valid_networks() is now used everywhere.
addr=`$PNTADM -P ${net} | /usr/bin/nawk '{ print $3 }' \
455 2>/dev/null | $GREP "^${ip}\$"`
This code is repeated serveral times in setup-dhcp.sh. Can we create a
function and just call it? Might be easier to maintain. Also, /usr/bin/nawk
should be globally defined as NAWK and used this way.
As I was simply making these blocks pass shcomp -n (changing `...` to
$(...)), I'd prefer not to get intimate with them and move them into a
function; however, through a lot of the scripts I've been trying to
replace the various programs strewn throughout with fully qualified paths.
I've done this through all scripts now for all executable, I believe, so
that means I've now addressed bug: 6448 - installadm scripts should use
full pathnames
interface=$(route -n get \
485 ${network%*.[0-9]*}.$((${network##*.}+1)) | $GREP
\
486 '^[^a-zA-Z0-9]*interface: ' | $SED \
487 's/^[^a-zA-Z0-9]*interface: //')
So, I am wondering if we are trying to get the interface the ip address why
don't you just use ipadm show-addr, and grep through the output to find the
network address you are looking for then get the interface? Then you can
proceed to the code starting at line 498. I am concerned about using the
routing tables to get this data.
Unfortunately, I explicitly need to query the routing table in the case of
multiple interfaces on the same subnet. As the kernel keeps a default
outgoing interface for a subnet I believe, and I should use that same
interface too.
E.g. on one machine you can see:
--------------------------------
cl...@jurassic: ipadm show-addr
ADDROBJ TYPE STATE ADDR
[...]
ipmp0/? static ok 10.5.240.81/24
ipmp0/? static ok 10.5.240.82/24
aggr1/? static ok 10.5.240.83/24
aggr2/? static ok 10.5.240.84/24
[...]
So, of these four 10.5.240.0/24 addresses which should AI publish? For
that I'd like to use the routing table:
cl...@jurassic: netstat -rn
Routing Table: IPv4
Destination Gateway Flags Ref Use Interface
-------------------- -------------------- ----- ----- ---------- ---------
default 10.5.240.1 UG 487 2640491534
10.5.240.0 10.5.240.84 U 3 767087 aggr2
10.5.240.0 10.5.240.83 U 3 767087 aggr1
10.5.240.0 10.5.240.82 U 32 45484797 ipmp0
10.5.240.0 10.5.240.81 U 9 412811 ipmp0
[...]
Routing Table: IPv6
Destination/Mask Gateway Flags Ref Use If
--------------------------- --------------------------- ----- --- ------- -----
::1 ::1 UH 12 71294 lo0
cl...@jurassic: route get 10.5.240.1
route to: xxxx.SFBay.Sun.COM
destination: xxxx.SFBay.Sun.COM
mask: 255.255.255.0
interface: ipmp0
flags: <UP,DONE>
recvpipe sendpipe ssthresh rtt,ms rttvar,ms hopcount mtu expire
0 0 0 0 0 0 1500 0
Here I'd use the ipmi0 interface over the other three as that's what the
kernel uses for its outgoing, primary interface on 10.5.240.0/24.
setup-sparch.sh:
39 WANBOOTCGI_PATH="/usr/lib/inet/wanboot/"
40 WANBOOTCGI="wanboot-cgi"
Why did you separate these into two variables? It doesn't look like we use
them independently.
Thank you, we did but that's now vestigial. I've collapsed them back down.
[snip...]
_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss