Hi Hemachandran,
Ok from me too.
Jonathan
Detlef Ulherr wrote:
> Hi Hemachandran,
>
> I am fine with the changes as far as I am concerned.
>
> Detlef
>
> Hemachandran Namachivayam wrote:
>> Hi Detlef,
>>
>> Thanks for spotting that. I have made the changes.
>>
>> -Regards
>> Hemachandran
>>
>> On 06/29/09 20:39, Detlef Ulherr wrote:
>>> Sorry but fond some more
>>> control_xvm.ksh
>>> 45 R*) exit 1;;
>>> remove the R
>>>
>>> Rest looks fine.
>>>
>>> Again, Sorry I should have spotted it before.
>>>
>>> Detlef
>>>
>>> Hemachandran Namachivayam wrote:
>>>> Hi Detlef, Neil, Tim , Jonathan and Sreejith,
>>>>
>>>> Thanks for your valuable comments.
>>>>
>>>> Detlef, Please see my replies inline.
>>>>
>>>> On 06/29/09 17:43, Detlef Ulherr wrote:
>>>>> Hi Hemachandran,
>>>>>
>>>>> here is what I found:
>>>>>
>>>>> spatz:~ sunmicrosystems$ cat feedback
>>>>> SUNW.ldom and SUNW.xvm
>>>>>
>>>>> I would suggest to change the tunability of the start stop, probe
>>>>> and validate commands to WHEN_DISABLED.
>>>>>
>>>>> I know there is nothing to specify right now, but for further
>>>>> enhancements it will make things more easy.
>>>>>
>>>> The start/stop/probe and validate commands are not expected to be
>>>> tuned by the administrators. If they are being modified, the agent
>>>> would not produce the desired results. Hence they are set to NONE.
>>>>> control_xvm.ksh
>>>>>
>>>>> 28 MYNAME=`basename ${0}`
>>>>> 29 MYDIR=`dirname ${0}`
>>>>>
>>>>> I would change the ` to $(), it improves the ksh93 compliance
>>>>>
>>>>> validate_xvm.ksh
>>>>> 28 MYNAME=`basename ${0}`
>>>>> 29 MYDIR=`dirname ${0}`
>>>>>
>>>>> I would change the ` to $(), it improves the ksh93 compliance
>>>>> The same applies in line 90,95 and 100
>>>>>
>>>> Incorporated the above two changes.
>>>>>
>>>>> functions.ksh
>>>>> 40 PREP=/usr/bin/pgrep
>>>>>
>>>>> There is a typo in the variable change it to PGREP
>>>>>
>>>> Fixed it.
>>>>> 550 sleep ${START_TIMEOUT} &
>>>>> I would use the absolute path for ksh93 compliance, the behavior
>>>>> changed. Probably use a command variable.
>>>>>
>>>> Fixed it.
>>>>> 621 if ! ${CCRADM} delkey --key noop_${RESOURCE}
>>>>> ${CCR_TABLE} >> $LOGFILE 2>&1
>>>>> 622 then
>>>>> 623 # SCMSGS
>>>>> 624 # @explanation
>>>>> 625 # Failed to add the NO-OP flag to CCR.
>>>>> 626 # @user_action
>>>>> 627 # Check the syslog for further messages.
>>>>> 628 # Determine why the NO-OP flag was not added to
>>>>> the CCR.
>>>>> 629 scds_syslog -p daemon.error -t $(syslog_tag) -m \
>>>>> 630 "Failed to delete NO-OP flag for %s domain" \
>>>>> 631 "${DOMAIN}"
>>>>>
>>>>> The explanation is obviously wrong, it fails to delete the NO-OP flag.
>>>>>
>>>> Fixed it.
>>>>> 845
>>>>> PROBE_TIMEOUT=$(${SCHA_RESOURCE_GET}-O Extension -R ${RESOURCE} -G
>>>>> ${RESOURCEGROUP} Probe_timeout)
>>>>>
>>>>> 846 847
>>>>> PROBE_TIMEOUT=$(/usr/bin/echo ${PROBE_TIMEOUT} | ${AWK} '{print $2}')
>>>>> could be doen faster in in one statement:
>>>>> PROBE_TIMEOUT=$(${SCHA_RESOURCE_GET}-O Extension -R ${RESOURCE} -G
>>>>> ${RESOURCEGROUP} Probe_timeout|tail -1)
>>>>>
>>>>> It is the probe so efficiency is key.
>>>>>
>>>> Incorporated the changes.
>>>>> 852
>>>>> HATIMERUN_TIMEOUT=$((PROBE_TIMEOUT*90/100))
>>>>> 853 854
>>>>> output=$(${HATIMERUN} -t ${HATIMERUN_TIMEOUT} -k 9 ${PLUGIN_PROBE}
>>>>> ${HATIMERUN_TIMEOUT})
>>>>> 855 rc=${?}
>>>>>
>>>>> I know it was already in the code, but I fear that 90 percent of
>>>>> the probe timeout might be too long. Given the fact that you have
>>>>> been doing things before and you do not take this into account. I
>>>>> would suggest to work with SECONDS and reduce the HATIMERUN_TIMEOUT
>>>>> by the elapsed seconds. Then you are sure to have 10 seconds rest
>>>>> for housekeeping.
>>>>>
>>>> Incorporated the changes.
>>>>> 1269 debug_message "Function: migrate_xvm - End"
>>>>> 1270 return ${?}
>>>>>
>>>>> I do not think that you want to return with the return code of
>>>>> debug_message. I would have expected, that you stored the return
>>>>> code of the ha_timerun command.
>>>>>
>>>>> 1287 debug_message "Function: migrate_ldom - End"
>>>>> 1288 return ${?}
>>>>>
>>>>> Dito
>>>>>
>>>>> 1466 debug_message "Function: shutdown_xvm - End"
>>>>> 1467 return ${?}
>>>>>
>>>>> Dito
>>>>>
>>>> Fixed the above three changes.
>>>>
>>>> The updated webrev is @ http://cr.opensolaris.org/~hnamachi/ha-ldom/
>>>>
>>>> -Thanks and Regards
>>>> Hemachandran
>>>>> Rest looks good.
>>>>>
>>>>> Regards
>>>>> Detlef
>>>>>
>>>>> Hemachandran Namachivayam wrote:
>>>>>> Hi Neil,
>>>>>>
>>>>>> Thanks for your review comments. Please see my replies inline.
>>>>>>
>>>>>> On 06/29/09 13:24, Neil Garthwaite wrote:
>>>>>>> Hi Hemachandran,
>>>>>>> Many thanks for this. I have just a few comments below, however
>>>>>>> please note that I have not been able to verify the x86 non-LDoms
>>>>>>> deployment, i.e. I don't have xVM + OHAC installed to test.
>>>>>>> - functions.ksh
>>>>>>>
>>>>>>> Lines 84-86, are you missing SET_DEBUG=
>>>>>> Fixed it.
>>>>>>>
>>>>>>> Lines 543-540, the re-work has replaced ${ADMIN} with a CCR table
>>>>>>> entry, as such this message needs to be changed as ${ADMIN}
>>>>>>> doesn't exist now.
>>>>>>>
>>>>>> Fixed it.
>>>>>>> Line 912, function stop_domain has variable rc specified as
>>>>>>> global, i.e. there is no typeset -i rc=0
>>>>>>>
>>>>>> Incorporated this.
>>>>>>> Line 1219, /opt/SUNWscxvm/bin/ldm_migrate is being run. I can see
>>>>>>> an entry within prototype_sparc however I can't see a webrev
>>>>>>> entry for this file.
>>>>>>>
>>>>>> The webrev now is able to list this file in the review.
>>>>>>> General comments for functions.ksh.
>>>>>>> - Some functions do not have debug_message Begin|End, as such
>>>>>>> ${SET_DEBUG} may not be set.
>>>>>> I have included the debug_message for all the functions.
>>>>>>
>>>>>> -Thanks and Regards
>>>>>> Hemachandran
>>>>>>> The rest looks good.
>>>>>>>
>>>>>>> Regards
>>>>>>> Neil
>>>>>>>
>>>>>>> On 29 Jun 2009, at 05:53, Hemachandran Namachivayam wrote:
>>>>>>>
>>>>>>>> Hi Neil,
>>>>>>>>
>>>>>>>> On 06/28/09 23:50, Neil Garthwaite wrote:
>>>>>>>>> Hi Hemachandran,
>>>>>>>>> I note that things have been redesigned and reworked
>>>>>>>>> accordingly, as such there is a reasonable amount to review.
>>>>>>>>> Nevertheless, I plan to go through that later.
>>>>>>>>> I do note that the CCR is being used to store, amend and
>>>>>>>>> retrieve information, as such if I refer to the design document
>>>>>>>>> http://opensolaris.org/os/project/ha-xvm/ha-ldoms-design-doc_v2.pdf,
>>>>>>>>> I can see that /usr/cluster/lib/sc/ccrutil is referenced
>>>>>>>>> however within the code you have /usr/cluster/lib/sc/ccradm, is
>>>>>>>>> this a typo within the design document? Either way, as this is
>>>>>>>>> an Open HA Cluster review, the ability to use the CCR to store,
>>>>>>>>> amend and retrieve information is likely to be extremely useful
>>>>>>>>> to others as simply put it provides a synchronized repository
>>>>>>>>> across the cluster.
>>>>>>>> Thanks for pointing this out. The table file operations to
>>>>>>>> store, amend and retrieve information are now contained in
>>>>>>>> "ccradm" utility, to fulfill the HA-xVM needs to share the
>>>>>>>> information across cluster nodes. I have changed the references
>>>>>>>> of ccrutil to "ccradm" in the design document. Please find the
>>>>>>>> updated document at
>>>>>>>> http://opensolaris.org/os/project/ha-xvm/ha-ldoms-design-doc_v2.pdf
>>>>>>>>
>>>>>>>>> As such I feel you need to expand on how the ccradm utility can
>>>>>>>>> be used in the context of storing, amending and retrieving
>>>>>>>>> information. I could only find
>>>>>>>>> http://src.opensolaris.org/source/xref/ohac/ohac/usr/src/cmd/dtk/man/man1m/ccradm.1m
>>>>>>>>>
>>>>>>>>> which does not contain the arguments you are using.
>>>>>>>> A revised edition of the man page is yet to be committed to the
>>>>>>>> gate. For the time being, You can refer to the man page changes
>>>>>>>> at http://opensolaris.org/os/project/ha-xvm/ccradm.pdf
>>>>>>>>
>>>>>>>> -Thanks and Regards
>>>>>>>> Hemachandran
>>>>>>>>> Regards
>>>>>>>>> Neil
>>>>>>>>>
>>>>>>>>> On 26 Jun 2009, at 07:05, Hemachandran Namachivayam wrote:
>>>>>>>>>
>>>>>>>>>> Hi Folks,
>>>>>>>>>>
>>>>>>>>>> I'd welcome the code review for the "Solaris cluster failover
>>>>>>>>>> agent for LDoms Guest domains"
>>>>>>>>>>
>>>>>>>>>> The failover agent for LDoms Guest domains is an extension of
>>>>>>>>>> the HA-xVM agent and introduces a new resource type SUNW.ldom.
>>>>>>>>>> The agent is redesigned to use the Generic Data Service
>>>>>>>>>> (SUNW.gds) binaries and hence the resource type (SUNW.xvm)
>>>>>>>>>> have been reworked upon. The HA-xVM agent makes use of CCR to
>>>>>>>>>> store the xVM/LDom guest domain configuration; thereby a need
>>>>>>>>>> to configure a separate admin directory is eliminated.
>>>>>>>>>>
>>>>>>>>>> To refer to the requirements specification and design
>>>>>>>>>> document, please visit http://opensolaris.org/os/project/ha-xvm/
>>>>>>>>>>
>>>>>>>>>> The webrev, for review, is located at
>>>>>>>>>> http://cr.opensolaris.org/~hnamachi/ha-ldom/
>>>>>>>>>> <http://cr.opensolaris.org/%7Ehnamachi/ha-ldom/>
>>>>>>>>>>
>>>>>>>>>> -Thanks and Regards
>>>>>>>>>> Hemachandran
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> ha-clusters-discuss mailing list
>>>>>>>>>> ha-clusters-discuss at opensolaris.org
>>>>>>>>>> <mailto:ha-clusters-discuss at opensolaris.org>
>>>>>>>>>> http://mail.opensolaris.org/mailman/listinfo/ha-clusters-discuss
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> ha-clusters-discuss mailing list
>>>>>> ha-clusters-discuss at opensolaris.org
>>>>>> http://mail.opensolaris.org/mailman/listinfo/ha-clusters-discuss
>>>>>>
>>>>>
>>>>
>>>
>>
>