Re: [caiman-discuss] code-review for AI multihomed

ClayB Wed, 21 Jul 2010 17:20:15 -0700

Hi Sarah,

Thank you for getting these too me! My comments in-line andwebrevs at:


Full:
http://cr.opensolaris.org/~clayb/webrev.multihomed/sarah/

Differential:
http://cr.opensolaris.org/~clayb/webrev.multihomed/sarah/diff

                                                        Thank you,
                                                        Clay

On Mon, 19 Jul 2010, Sarah Jelinek wrote:

Hi Clay,

I used this webrev location for my review:

http://cr.opensolaris.org/~clayb/webrev.multihomed/john
I tried not to repeat other's comments...I reviewed all files, only notedthose for which I had comments.
thanks,
sarah
******

ai_sd.py:
404         # dhcpinfo(1) should return a string akin to '192.168.5.1\n'
  405         svc_address = dhcp_stdout.strip()
  406
  407         # check if we got an error or if we got no useful response
  408         if dhcp_stderr or not svc_address:
  409             AISD_LOG.post(AILog.AI_DBGLVL_ERR,
410 "Couldn't find BootSrvA record in DHCP;error %s",
  411                           dhcp_stderr)
  412             return 2
Wouldn't it be better to check for dhcp_stderr first, before trying to stripout the address?

It could be; but either way, strip() won't raise an exception and ifsvc_address is blank after the strip, that's a problem just the same as ifwe have some standard error output.

  412             return 2

What does a '2' return mean to the caller?

Interestingly, 2 is usually an option error[1] when using OptParse inPython, however, ai_sd.py is only called by manifest-locator.sh and thereturn code checked against non-zero. I don't see a page in Solaris likeFreeBSD's sysexits(3)[2] to take guidance from either.

No mention is made as to what 2 means, however ai_sd.py returns it for acouple of reasons:

No valid AI service found
Couldn't open %s for saving service list

I would be returning 2 for the following extra reason:
Unable to retrieve  BootSrvA option from DHCP server: %s

By the way, the script uses -1 for the following (exceptioncatching reasons):

Popen() raised OSError exception
Popen() raised ValueError exception
(For an uncaught issue where not svc_info_found and not
 svc_txt_rec_found)

I'm not sure that 2 makes sense, it seems all places where 2 is returnedshould be 1 or someother less mysterious number?


[1]: From http://docs.python.org/library/optparse.html

In either case, optparse handles the error the same way: it prints theprogram?s usage message and an error message to standard error and exitswith error status 2.


[2]: http://www.freebsd.org/cgi/man.cgi?query=sysexits&sektion=3

installadm-common.sh:

97         #<get all addresses>  |
  498         #<remove IPv6>  |<remove lo0>  |
  499         #<remove unconfigured>

Why do you remove ipv6 addresses here?

At this time, installadm(1) does no handling of IPv6 interfaces, beingboth GRUB and wanboot are IPv4 only.

  # exclude will be true or false depending on if we are excluding
  557         # the networks configured in all_services/networks
558 exclude=$($SVCPROP -cp all_services/exclude_networks$SMF_INST_SERVER)
  559
Why is the exclude list different than the all_services/networks list? Notsaying it's wrong, just wondering why it is this way and why we have totraverse the networks list twice to get the final list.

I'm confused here. The exclude property is a boolean so either true orfalse; the networks property is a list of networks to either exclude orinclude.

  584                 for octet in $net; do
585 # XXX I must be missing a better way to dos/^0*//
  586                         octet=${octet#0}
  587                         stripped_net="${stripped_net}${octet#0}."
  588                 done
Can't you use the ${parameter//pattern/string} capabilities of the scriptingto do this substitution?
The XXX comment should be removed if not better way exists.


Yes, this was just a flag to code reviewers.

A better non-looping way was implemented when hacking away for somechanges Joe requested. I found that the following works in a singleoperation. (Notice the dots to see where each octet is, then see theleading optional *(0) pattern for matching leading zeros, followed by therequired +([0-9]) pattern which mandates at least one zero or 1-9 to keepat least one digit retained however, thanks to the + operator the patternwill expand if all three values of the octet are in use (i.e. one zero ofpadding or no-padding). Lastly the expression back substitutes each ofthese required patterns to retain the data of the IP address.


 775         print 
${ip/*(0)+([0-9]).*(0)+([0-9]).*(0)+([0-9]).*(0)+([0-9])/\2.\4.\6.\8}

  637                 if (is_multihomed() != B_FALSE) {

I would think it would read better as "if is_multihomed == B_TRUE) {...


Joe felt that way too, so it's now == B_TRUE.

setup-dhcp.sh:
changes to add_macro for updating as opposed to overwriting.. Is there a bugrelated to this change? It wasn't obvious to me in the bugs listed. Justwondering which piece covers this change.

This is an artifact of needing to work with network macros now formultihomed servers. As we need to provide a different BootFile or BootSrvAIP address for each subnet. When modifying these network macros, itdoesn't make sense to stomp them and overwrite any specific customizationswhich an administrator would have in the network macro (like a router,netmask, etc.).

In a discussion with Ethan, he's since pointed out we should print out theinstructions to update the macro if an administrator already has aBootSrvA or BootFile record in their macro which conflicts with ourproposed change. This way, the administrator can make their own decisionas to the corrective action to take.

112                         for option in $value; do
113 # ignore blank matches (the leadingcolon
  114                                 # in the macro string)
  115                                 if [[ -z "$option" ]]; then
  116                                         continue
  117                                 fi
  118                                 IFS="$OIFS"
119option=${option/\$boot_file/$boot_file}
  120                                 $DHTADM -M -m $name -e $option
  121                                 IFS=":"
Is it possible that an option specified would be invalid? If so, do we care?Or will it fail in another place?

It is possible that the function could be called with a bogus option. Ifit is, dhtadm(1) will error out and the error will be printed on theuser's screen. Further, it is possible that one could embed a colon in anoption's value (like the case with BOOTFILE which is specially handled),in that case the macros will not be properly separated, so I've added thefollowing comment on line 99 to warn an unwarray code modifier in thefuture before this is Python-ized:


# this is necessary as colon's embedded in a macro
# value will cause the for loop below to split on
# the option value instead of on the option boundary

setup-service.sh:
115 # see if the machine has more than one interface up or nointerfaces up
  116         if (( $(ipadm show-addr -p -o ADDR|\
  117                 grep -v ':'|grep -v '127.0.0.1' |grep -v '0.0.0.0'|\
  118                 wc -l) != 1 )); then
119 # for multi-homed AI servers use the "$serverIP"keyword
  120                 ip="\$serverIP"
This code block is repeated in installadm-common, can we create one functionand use it?


Yes, valid_networks() is now used everywhere.

addr=`$PNTADM -P ${net} | /usr/bin/nawk '{ print $3 }' \
  455                     2>/dev/null | $GREP "^${ip}\$"`
This code is repeated serveral times in setup-dhcp.sh. Can we create afunction and just call it? Might be easier to maintain. Also, /usr/bin/nawkshould be globally defined as NAWK and used this way.

As I was simply making these blocks pass shcomp -n (changing `...` to$(...)), I'd prefer not to get intimate with them and move them into afunction; however, through a lot of the scripts I've been trying toreplace the various programs strewn throughout with fully qualified paths.

I've done this through all scripts now for all executable, I believe, sothat means I've now addressed bug: 6448 - installadm scripts should usefull pathnames

interface=$(route -n get \
485 ${network%*.[0-9]*}.$((${network##*.}+1)) | $GREP\
  486                     '^[^a-zA-Z0-9]*interface: ' | $SED \
  487                     's/^[^a-zA-Z0-9]*interface: //')
So, I am wondering if we are trying to get the interface the ip address whydon't you just use ipadm show-addr, and grep through the output to find thenetwork address you are looking for then get the interface? Then you canproceed to the code starting at line 498. I am concerned about using therouting tables to get this data.

Unfortunately, I explicitly need to query the routing table in the case ofmultiple interfaces on the same subnet. As the kernel keeps a defaultoutgoing interface for a subnet I believe, and I should use that sameinterface too.


E.g. on one machine you can see:
--------------------------------
cl...@jurassic: ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
[...]
ipmp0/?           static   ok           10.5.240.81/24
ipmp0/?           static   ok           10.5.240.82/24
aggr1/?           static   ok           10.5.240.83/24
aggr2/?           static   ok           10.5.240.84/24
[...]

So, of these four 10.5.240.0/24 addresses which should AI publish? Forthat I'd like to use the routing table:

cl...@jurassic: netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              10.5.240.1           UG      487 2640491534
10.5.240.0           10.5.240.84          U         3     767087 aggr2
10.5.240.0           10.5.240.83          U         3     767087 aggr1
10.5.240.0           10.5.240.82          U        32   45484797 ipmp0
10.5.240.0           10.5.240.81          U         9     412811 ipmp0
[...]

Routing Table: IPv6
  Destination/Mask            Gateway                   Flags Ref   Use    If
--------------------------- --------------------------- ----- --- ------- -----
::1                         ::1                         UH     12   71294 lo0

cl...@jurassic: route get 10.5.240.1
   route to: xxxx.SFBay.Sun.COM
destination: xxxx.SFBay.Sun.COM
       mask: 255.255.255.0
  interface: ipmp0
      flags: <UP,DONE>
 recvpipe  sendpipe  ssthresh    rtt,ms rttvar,ms  hopcount      mtu     expire
       0         0         0         0         0         0      1500         0

Here I'd use the ipmi0 interface over the other three as that's what thekernel uses for its outgoing, primary interface on 10.5.240.0/24.

setup-sparch.sh:
39 WANBOOTCGI_PATH="/usr/lib/inet/wanboot/"
   40 WANBOOTCGI="wanboot-cgi"
Why did you separate these into two variables? It doesn't look like we usethem independently.


Thank you, we did but that's now vestigial. I've collapsed them back down.

[snip...]
_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss

Re: [caiman-discuss] code-review for AI multihomed

Reply via email to