Hi Hemachandran, Ok from me too.
Jonathan Detlef Ulherr wrote: > Hi Hemachandran, > > I am fine with the changes as far as I am concerned. > > Detlef > > Hemachandran Namachivayam wrote: >> Hi Detlef, >> >> Thanks for spotting that. I have made the changes. >> >> -Regards >> Hemachandran >> >> On 06/29/09 20:39, Detlef Ulherr wrote: >>> Sorry but fond some more >>> control_xvm.ksh >>> 45 R*) exit 1;; >>> remove the R >>> >>> Rest looks fine. >>> >>> Again, Sorry I should have spotted it before. >>> >>> Detlef >>> >>> Hemachandran Namachivayam wrote: >>>> Hi Detlef, Neil, Tim , Jonathan and Sreejith, >>>> >>>> Thanks for your valuable comments. >>>> >>>> Detlef, Please see my replies inline. >>>> >>>> On 06/29/09 17:43, Detlef Ulherr wrote: >>>>> Hi Hemachandran, >>>>> >>>>> here is what I found: >>>>> >>>>> spatz:~ sunmicrosystems$ cat feedback >>>>> SUNW.ldom and SUNW.xvm >>>>> >>>>> I would suggest to change the tunability of the start stop, probe >>>>> and validate commands to WHEN_DISABLED. >>>>> >>>>> I know there is nothing to specify right now, but for further >>>>> enhancements it will make things more easy. >>>>> >>>> The start/stop/probe and validate commands are not expected to be >>>> tuned by the administrators. If they are being modified, the agent >>>> would not produce the desired results. Hence they are set to NONE. >>>>> control_xvm.ksh >>>>> >>>>> 28 MYNAME=`basename ${0}` >>>>> 29 MYDIR=`dirname ${0}` >>>>> >>>>> I would change the ` to $(), it improves the ksh93 compliance >>>>> >>>>> validate_xvm.ksh >>>>> 28 MYNAME=`basename ${0}` >>>>> 29 MYDIR=`dirname ${0}` >>>>> >>>>> I would change the ` to $(), it improves the ksh93 compliance >>>>> The same applies in line 90,95 and 100 >>>>> >>>> Incorporated the above two changes. >>>>> >>>>> functions.ksh >>>>> 40 PREP=/usr/bin/pgrep >>>>> >>>>> There is a typo in the variable change it to PGREP >>>>> >>>> Fixed it. >>>>> 550 sleep ${START_TIMEOUT} & >>>>> I would use the absolute path for ksh93 compliance, the behavior >>>>> changed. Probably use a command variable. >>>>> >>>> Fixed it. >>>>> 621 if ! ${CCRADM} delkey --key noop_${RESOURCE} >>>>> ${CCR_TABLE} >> $LOGFILE 2>&1 >>>>> 622 then >>>>> 623 # SCMSGS >>>>> 624 # @explanation >>>>> 625 # Failed to add the NO-OP flag to CCR. >>>>> 626 # @user_action >>>>> 627 # Check the syslog for further messages. >>>>> 628 # Determine why the NO-OP flag was not added to >>>>> the CCR. >>>>> 629 scds_syslog -p daemon.error -t $(syslog_tag) -m \ >>>>> 630 "Failed to delete NO-OP flag for %s domain" \ >>>>> 631 "${DOMAIN}" >>>>> >>>>> The explanation is obviously wrong, it fails to delete the NO-OP flag. >>>>> >>>> Fixed it. >>>>> 845 >>>>> PROBE_TIMEOUT=$(${SCHA_RESOURCE_GET}-O Extension -R ${RESOURCE} -G >>>>> ${RESOURCEGROUP} Probe_timeout) >>>>> >>>>> 846 847 >>>>> PROBE_TIMEOUT=$(/usr/bin/echo ${PROBE_TIMEOUT} | ${AWK} '{print $2}') >>>>> could be doen faster in in one statement: >>>>> PROBE_TIMEOUT=$(${SCHA_RESOURCE_GET}-O Extension -R ${RESOURCE} -G >>>>> ${RESOURCEGROUP} Probe_timeout|tail -1) >>>>> >>>>> It is the probe so efficiency is key. >>>>> >>>> Incorporated the changes. >>>>> 852 >>>>> HATIMERUN_TIMEOUT=$((PROBE_TIMEOUT*90/100)) >>>>> 853 854 >>>>> output=$(${HATIMERUN} -t ${HATIMERUN_TIMEOUT} -k 9 ${PLUGIN_PROBE} >>>>> ${HATIMERUN_TIMEOUT}) >>>>> 855 rc=${?} >>>>> >>>>> I know it was already in the code, but I fear that 90 percent of >>>>> the probe timeout might be too long. Given the fact that you have >>>>> been doing things before and you do not take this into account. I >>>>> would suggest to work with SECONDS and reduce the HATIMERUN_TIMEOUT >>>>> by the elapsed seconds. Then you are sure to have 10 seconds rest >>>>> for housekeeping. >>>>> >>>> Incorporated the changes. >>>>> 1269 debug_message "Function: migrate_xvm - End" >>>>> 1270 return ${?} >>>>> >>>>> I do not think that you want to return with the return code of >>>>> debug_message. I would have expected, that you stored the return >>>>> code of the ha_timerun command. >>>>> >>>>> 1287 debug_message "Function: migrate_ldom - End" >>>>> 1288 return ${?} >>>>> >>>>> Dito >>>>> >>>>> 1466 debug_message "Function: shutdown_xvm - End" >>>>> 1467 return ${?} >>>>> >>>>> Dito >>>>> >>>> Fixed the above three changes. >>>> >>>> The updated webrev is @ http://cr.opensolaris.org/~hnamachi/ha-ldom/ >>>> >>>> -Thanks and Regards >>>> Hemachandran >>>>> Rest looks good. >>>>> >>>>> Regards >>>>> Detlef >>>>> >>>>> Hemachandran Namachivayam wrote: >>>>>> Hi Neil, >>>>>> >>>>>> Thanks for your review comments. Please see my replies inline. >>>>>> >>>>>> On 06/29/09 13:24, Neil Garthwaite wrote: >>>>>>> Hi Hemachandran, >>>>>>> Many thanks for this. I have just a few comments below, however >>>>>>> please note that I have not been able to verify the x86 non-LDoms >>>>>>> deployment, i.e. I don't have xVM + OHAC installed to test. >>>>>>> - functions.ksh >>>>>>> >>>>>>> Lines 84-86, are you missing SET_DEBUG= >>>>>> Fixed it. >>>>>>> >>>>>>> Lines 543-540, the re-work has replaced ${ADMIN} with a CCR table >>>>>>> entry, as such this message needs to be changed as ${ADMIN} >>>>>>> doesn't exist now. >>>>>>> >>>>>> Fixed it. >>>>>>> Line 912, function stop_domain has variable rc specified as >>>>>>> global, i.e. there is no typeset -i rc=0 >>>>>>> >>>>>> Incorporated this. >>>>>>> Line 1219, /opt/SUNWscxvm/bin/ldm_migrate is being run. I can see >>>>>>> an entry within prototype_sparc however I can't see a webrev >>>>>>> entry for this file. >>>>>>> >>>>>> The webrev now is able to list this file in the review. >>>>>>> General comments for functions.ksh. >>>>>>> - Some functions do not have debug_message Begin|End, as such >>>>>>> ${SET_DEBUG} may not be set. >>>>>> I have included the debug_message for all the functions. >>>>>> >>>>>> -Thanks and Regards >>>>>> Hemachandran >>>>>>> The rest looks good. >>>>>>> >>>>>>> Regards >>>>>>> Neil >>>>>>> >>>>>>> On 29 Jun 2009, at 05:53, Hemachandran Namachivayam wrote: >>>>>>> >>>>>>>> Hi Neil, >>>>>>>> >>>>>>>> On 06/28/09 23:50, Neil Garthwaite wrote: >>>>>>>>> Hi Hemachandran, >>>>>>>>> I note that things have been redesigned and reworked >>>>>>>>> accordingly, as such there is a reasonable amount to review. >>>>>>>>> Nevertheless, I plan to go through that later. >>>>>>>>> I do note that the CCR is being used to store, amend and >>>>>>>>> retrieve information, as such if I refer to the design document >>>>>>>>> http://opensolaris.org/os/project/ha-xvm/ha-ldoms-design-doc_v2.pdf, >>>>>>>>> I can see that /usr/cluster/lib/sc/ccrutil is referenced >>>>>>>>> however within the code you have /usr/cluster/lib/sc/ccradm, is >>>>>>>>> this a typo within the design document? Either way, as this is >>>>>>>>> an Open HA Cluster review, the ability to use the CCR to store, >>>>>>>>> amend and retrieve information is likely to be extremely useful >>>>>>>>> to others as simply put it provides a synchronized repository >>>>>>>>> across the cluster. >>>>>>>> Thanks for pointing this out. The table file operations to >>>>>>>> store, amend and retrieve information are now contained in >>>>>>>> "ccradm" utility, to fulfill the HA-xVM needs to share the >>>>>>>> information across cluster nodes. I have changed the references >>>>>>>> of ccrutil to "ccradm" in the design document. Please find the >>>>>>>> updated document at >>>>>>>> http://opensolaris.org/os/project/ha-xvm/ha-ldoms-design-doc_v2.pdf >>>>>>>> >>>>>>>>> As such I feel you need to expand on how the ccradm utility can >>>>>>>>> be used in the context of storing, amending and retrieving >>>>>>>>> information. I could only find >>>>>>>>> http://src.opensolaris.org/source/xref/ohac/ohac/usr/src/cmd/dtk/man/man1m/ccradm.1m >>>>>>>>> >>>>>>>>> which does not contain the arguments you are using. >>>>>>>> A revised edition of the man page is yet to be committed to the >>>>>>>> gate. For the time being, You can refer to the man page changes >>>>>>>> at http://opensolaris.org/os/project/ha-xvm/ccradm.pdf >>>>>>>> >>>>>>>> -Thanks and Regards >>>>>>>> Hemachandran >>>>>>>>> Regards >>>>>>>>> Neil >>>>>>>>> >>>>>>>>> On 26 Jun 2009, at 07:05, Hemachandran Namachivayam wrote: >>>>>>>>> >>>>>>>>>> Hi Folks, >>>>>>>>>> >>>>>>>>>> I'd welcome the code review for the "Solaris cluster failover >>>>>>>>>> agent for LDoms Guest domains" >>>>>>>>>> >>>>>>>>>> The failover agent for LDoms Guest domains is an extension of >>>>>>>>>> the HA-xVM agent and introduces a new resource type SUNW.ldom. >>>>>>>>>> The agent is redesigned to use the Generic Data Service >>>>>>>>>> (SUNW.gds) binaries and hence the resource type (SUNW.xvm) >>>>>>>>>> have been reworked upon. The HA-xVM agent makes use of CCR to >>>>>>>>>> store the xVM/LDom guest domain configuration; thereby a need >>>>>>>>>> to configure a separate admin directory is eliminated. >>>>>>>>>> >>>>>>>>>> To refer to the requirements specification and design >>>>>>>>>> document, please visit http://opensolaris.org/os/project/ha-xvm/ >>>>>>>>>> >>>>>>>>>> The webrev, for review, is located at >>>>>>>>>> http://cr.opensolaris.org/~hnamachi/ha-ldom/ >>>>>>>>>> <http://cr.opensolaris.org/%7Ehnamachi/ha-ldom/> >>>>>>>>>> >>>>>>>>>> -Thanks and Regards >>>>>>>>>> Hemachandran >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> ha-clusters-discuss mailing list >>>>>>>>>> ha-clusters-discuss at opensolaris.org >>>>>>>>>> <mailto:ha-clusters-discuss at opensolaris.org> >>>>>>>>>> http://mail.opensolaris.org/mailman/listinfo/ha-clusters-discuss >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> ------------------------------------------------------------------------ >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> ha-clusters-discuss mailing list >>>>>> ha-clusters-discuss at opensolaris.org >>>>>> http://mail.opensolaris.org/mailman/listinfo/ha-clusters-discuss >>>>>> >>>>> >>>> >>> >> >