Hi Thorsten,

Many thanks for the detailed review. I really appreciate the time you've 
taken to go through this.

After a few reviews now I will  produce a code refresh of this agent and 
will ensure that all the packaging requirements are in place so that it 
will build.  I'll email again when this is done.

If I may I have included some comments below.

Regards
Neil

Thorsten Frueauf wrote:

>Hi Neil,
>
>I still could not find the necessary changes within usr/src/pkgdefs/* in
>your webrev - so it is not complete and I doubt it will properly build 
>as-is.
>Which means my review is not yet complete.
>
>Apart from that, here is my review to the existing files:
>
>general:  most scripts do not have the .ksh extension, although your 
>Makefile does expect them!
>
>usr/src/cmd/ha-services/gds-agents/ids/control_ids
>         - your file is named control_ids, but the Makefile really expects
>           control_ids.ksh - need to rename!
>
>  
>

Agreed.

>         - line 108, since you later "return 1", doesn't this need to be
>           of servity daemon.error instead of daemon.notice?
>           The SCMSGS is missing because this is used with other agents 
>as well?
>  
>

Agreed.

>         - line 133, although you setup $OUTPUT (which logs into 
>${LOGFILE}),
>           where is that output ever been printed out? In a former gds 
>template
>           we had log_message(), which you don't have. What is logging
>           $LOGFILE to syslog in the end?
>
>  
>

My initial thought was to do away with log_message() and simply rely on 
the return code from whatever command was being issued. However, if I 
follow that thought through then within set_redirection() I could just 
have /dev/null instead of ${LOGFILE}, i.e. discard any output from 
whatever command is being run and just rely on the return code.

If I recall correctly we coded log_message() to capture all output from 
a command. The benefit from doing so was that we could output, via 
log_message(), to syslog. The further benefit of this is that we would 
also capture any undocumented error messages.

With this in mind I'll reinstate log_message(). Thanks for questioning this.

>usr/src/cmd/ha-services/gds-agents/ids/functions:
>         - your file is named functions, but the Makefile really expects
>           functions.ksh - need to rename!
>
>  
>

Agreed.

>         - line 95 and 109, your are hard coding "informix" as user and
>           group that must exist. Is that really a Informix requirement?
>           Wouldn't it make sense to make it configurable if this changes
>           for future versions? If not, can you please add an comment that
>           Informix really requires this, maybe with a pointer to the docs?
>
>  
>

Yes, group and userid informix is a requirement I'll add a pointer to 
the docs and will also consider making it configurable.

>         - line 103 and 117, the @user_action is not helpfull. Does this 
>mean the
>           user must create this user manually? Or is this something 
>performed
>           by the Informix installation procedure?
>  
>

Agreed, I'll update the @user_action.

>         - line 127, if the user does not exists, wouldn't that create an
>           error message to stderr (ie. show on the console)?
>  
>

I don't believe it will, but I will check.

>         - line 129, your if clause looks strange - what are you 
>checking for?
>           Maybe there is a "-n" missing?
>  
>

Looking at the man page for test I can see the following,

-n string      True if  the  length  of  string  is non-zero.

string          True if the string string is not the null string.

However, I'll amend this and other similar entries to include "-n".

>         - line 181, you should quote ${INFORMIXDIR}
>  
>

Agreed.

>         - line 192, missing \ at the end of the line!
>  
>

Thanks.

>         - line 193, you should quote ${INFORMIXDIR}
>  
>

Agreed.

>         - line 197, maybe a comment would be helpful as of how the
>           ONCONFIG file looks like as an example - that would help to
>           understand what you validate for in the following.
>
>  
>

Agreed.

>         - line 201 and 216 and 246, note that your grep would also match
>           names where the one you rep for is included, but does have a
>           different pre- or postfix - it is not checking for an exact 
>match.
>           Is that intended?
>  
>

No, I'll include "-w".

>         - line 281, doesn't the rm create a failure message to stderr if
>           ${LOGFILE} does not exist? Maybe redirect the output to 
>/dev/null?
>  
>

Okay.

>         - line 294 - when would check_ids actually return 1?
>           Looking at the function I can not see where.
>  
>

You're right. I'm going to rework check_ids() to remove tolerating 
"maintenance" and "startup" modes. I now feel this is maybe confusing by 
not offering a clear separation between how a cluster should manage an 
IDS service and how a DBA should manage an IDS service, i.e. if any DBA 
work is required then that should be coordinated with those managing IDS 
within a cluster.  I will however still tolerate some blocked states.

>         - line 32ff., I am a bit surprised by the user_action, especially
>           "The Solaris Cluster resource will attempt to restart the 
>Informix Server."
>           That is not always correct, is it? If "oninit -y" always 
>fails, then
>           this will result into failed or stop_failed?
>  
>

Agreed, I'll amend the @user_action.

>         - line 349, the servity for syslog should be "daemon.error"
>
>  
>

Agreed.

>         - line 408/410 - can you explain with a comment why you do the 
>"grep ."?
>  
>

Okay.

>         - general on check_ids() - you rely a lot on the output of onstat.
>           Is that a stable interface to parse? If not, any way to get the
>           various status via a stable interface?
>
>  
>

Yes, it's my belief that onstat is a stable interface.

>         - general on check_ids() - you nearly never quote the variables you
>           provide to scds_syslog - you should quote them!
>  
>

Agreed.

>         - line 469, missing " before HANG_SYSTEM"
>  
>

Agreed.

>         - line 485, in the comment you say with the "HANG_SYSTEM" or
>           "MEDIA FAILURE" the DBA intervention is necessary - but you keep
>           restarting the resource?
>  
>

I'd prefer to just disable the resource in this sceanrio. However with 
returning 0-100 or 201 from a GDS probe I couldn't determine a better 
alternative to 100 as I'd like to keep Failover_enabled=true.

>         - line 490-502, I find the comment confusing - you say 
>something about
>           return code 1 - but in the following code it well rc will never
>           be set to 1? In fact you either set 100 or 0. Is that intended?
>           If so, please correct the comment.
>  
>

Agreed. As outlined above I'll rework check_ids() to not tolerate 
"maintenance" and "startup" nodes as I now think it will be confusing.

>         - line 508-521, same thing as with 490-502 - I do not fully 
>understand
>           the comment return code 1.
>  
>

Agreed.

>         - line 567 and 568, the check is not specifying an operation,
>           don't you need -n in the [ ] test?
>
>  
>

Agreed.

>         - line 634ff - again the confusion about return 1 - I have not yet
>           understood when check_ids() does return 1 in the code.
>
>         - line 669-674 distincts between zone and non-zone systems,
>           while later (line 677ff) you always call the -mcopbZ to ipcs.
>           Is that intended? Doesn't this limit the use of the agent to
>           Solaris 10 and newer? The whole block line 677-704 is S10 only.
>
>  
>

Yes, the objective is to only deploy this agent on Solaris 10 or newer. 
I'll amend to code to reflect this.

>- usr/src/cmd/ha-services/gds-agents/ids/ids_register
>         - your file is named ids_register, but the Makefile really expects
>           ids_register.ksh - need to rename!
>
>  
>

Sigh!

>         - line 29, you should not put temp files into /opt. I suggest
>           you use /tmp or /var/tmp for that. but I guess you need it there
>           for the register within the non-global zone / HA Container case.
>           In that case, why not transfer it (again to /tmp or /var/tmp) of
>           the non-global zone via the method shown at line 224 of
> 
>http://src.opensolaris.org/source/xref/ohac/ohacds/usr/src/cmd/ha-services/gds-agents/mys/ha_mysql_register.ksh
>  
>

Agreed.

>         - line 81, 87, 93, 102, 137, 142 and 151 does always exit 1 without
>           cleaning up the temp file - which would disturb a clean pkgrm.
>           This is also an argument to place it on /tmp or /var/tmp.
>  
>

I'll amend this.

>         - line 104, you need to grep for "^Zonename=", otherwise you might
>           match a comment or old commented out entry.
>  
>

Thanks.

>         - line 110-118 - do you really want the leading tab there?
>  
>

I think it's easier to read this way but I'm also happy to amend it.

>         - line 129, you just check if the state of the resource is 
>"OFFLINE".
>           But the "clrs delete ${RS}" does only succeed if the resource
>           is also disabled - so you also need to check for -O ON_OFF_SWITCH
>           via scha_resource_get if the resource is DISABLED.
>  
>

Agreed.

>
>- usr/src/cmd/ha-services/gds-agents/ids/ids_smf_register
>         - your file is named ids_smf_register, but the Makefile really 
>expects
>           ids_smf_register.ksh - need to rename!
>
>  
>

Hmm.

>         - line 125, you should check if $1 is really existing.
>
>  
>

Agreed.

>- usr/src/cmd/ha-services/gds-agents/ids/ids_smf_remove
>         - your file is named ids_smf_remove, but the Makefile really 
>expects
>           ids_smf_remove.ksh - need to rename!
>  
>

Thanks.

>         - line 44 and 50, you should quote the variables
>  
>

Agreed.

>         - line 56, shouldn't the check be
>                 [ "${rc}" -ne 0 ] && exit ${rc}
>
>  
>

True.

>All for now.
>
>  
>

Many thanks again Thorsten.

>Greets
>       Thorsten
>
>Thorsten Frueauf wrote:
>  
>
>>Hi Neil et al,
>>
>>one early comment about the webrev, while browsing through it I think 
>>you are missing the package source part within usr/src/pkgdefs/* for the 
>>new ids package.
>>
>>Greets
>>       Thorsten
>>
>>Neil Garthwaite wrote:
>>    
>>
>>>Hi Everyone,
>>>
>>>The HA-Informix project page 
>>>http://opensolaris.org/os/project/ha-informix has been updated to 
>>>include the initial code for the HA-Informix agent.
>>>
>>>I'd welcome a code review for this agent, a webrev can be found here
>>>
>>> http://cr.opensolaris.org/~neilg/ohacds-informix
>>>
>>>Alternatively, the agent is available for download. Please use the 
>>>accompanying cheatsheet if you want to try out the agent.
>>>
>>> http://opensolaris.org/os/project/ha-informix/cheatsheet_ohac_ids.pdf
>>>
>>>Finally, many thanks to everyone that reviewed the Design Doc and Caller 
>>>Sequence Doc, I've updated those docs to reflect your comments.
>>>
>>>Regards
>>>Neil
>>>      
>>>
>
>  
>


Reply via email to