On 09-Oct-13 6:09 AM, Greg Hurlman wrote:
> The other problem is:
>
> When a service unit is unlocked-in and then unlocked so that its SA-aware
> components run, and one of the components faults with
> 'csiSetcallbackFailed' (recovery is 'ComponentRestart'), an immediate
> lock of the SU fails with the reason "bad operation". The component then
> enters the escalation matrix, going to SuFailover and then node restart.
> There is no admin operation that I could find, while the error
> escalation is in progress and the SU has the states
> saAmfSUAdminState=UNLOCKED(1)
> saAmfSUOperState=ENABLED(1)
> saAmfSUPresenceState=INSTANTIATED(3)
> saAmfSUReadinessState=IN-SERVICE(2),
> to bring back the SU to locked-in or locked state. Only a service stop 
> could break out of the escalation matrix preventing a node failure.
>
> Is this correct behavior or is there any other way to break out while 
> in error escalation?
>
There is no way to break out of escalation. After each fault, AMF cleans up
the component and reinstantiates it, so the cleanup script must release
all the resources taken up by the component. If AMF gets a successful
cleanup status, it means the component has released all its system
resources. If cleanup is unsuccessful, it means some system resources were
not released. In that case the cleanup script returns an abnormal exit
status to AMF, and as a result both the component and the SU move to the
TERM_FAILED state. In this state escalation stops and admin intervention
is required for repair.
Please check why the component is not able to take assignments even after
each successful cleanup and instantiation.
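
As a rough illustration of the cleanup contract described above, here is a minimal sketch of a cleanup routine. It assumes the component tracks its resources through a PID file and a lock file; the function name and both paths are hypothetical and not taken from OpenSAF. Only the exit-status convention (zero when everything was released, non-zero otherwise) comes from the explanation above.

```shell
#!/bin/sh
# Hypothetical cleanup routine for an AMF component (illustrative only).
# Assumption: the component records its PID in $1 and holds a lock file $2.
cleanup_component() {
    pidfile="$1"
    lockfile="$2"

    # Kill the process if the PID file names a process that is still alive.
    if [ -f "$pidfile" ]; then
        pid="$(cat "$pidfile")"
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            kill -9 "$pid" 2>/dev/null
        fi
        rm -f "$pidfile" 2>/dev/null
    fi
    rm -f "$lockfile" 2>/dev/null

    # Report failure if anything was left behind, so that the caller
    # sees an abnormal exit status.
    if [ -e "$pidfile" ] || [ -e "$lockfile" ]; then
        return 1
    fi
    return 0
}

# A real cleanup script would end with something like:
#   cleanup_component /path/to/comp.pid /path/to/comp.lock
#   exit $?
```

Returning non-zero on purpose from such a script is one way a component could end up in the TERM_FAILED state described above; whether that is desirable outside of debugging is a separate question.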
Thanks
Praveen
> Greg
>
> ---------- Forwarded message ----------
> From: *Greg Hurlman* <hurlmang...@gmail.com>
> Date: Mon, Oct 7, 2013 at 2:42 PM
> Subject: Re: [users] Info regd imm configuration
> To: praveen malviya <praveen.malv...@oracle.com>
> Cc: Opensaf-users@lists.sourceforge.net
>
>
> Thanks praveen for the suggestion. However, I am not sure I understand 
> completely what you mean by "Otherwise let the component move to 
> INST/TERM failed state intentionally. In these two states AMF will not 
> take any recovery action."
>
> Could you please tell me how to intentionally change the state?
>
> Thanks,
> Greg
>
>
> On Mon, Oct 7, 2013 at 5:42 AM, praveen malviya
> <praveen.malv...@oracle.com> wrote:
>
>
>     On 04-Oct-13 10:16 AM, Greg Hurlman wrote:
>>     Thanks Praveen that answers my question.
>>
>>     Some other issues while testing in version 4.3.1 found that,
>>
>>     1. While a component is made to restart, for some valid error
>>     scenario, up to the defined saAmfSGCompRestartMax (value=10) times,
>>     with the recovery option set to component restart, it does not
>>     honor the attributes
>>
>>     saAmfCompNumMaxInstantiateWithoutDelay (value=2),
>>     saAmfCompNumMaxInstantiateWithDelay (value=8) and
>>     saAmfCompDelayBetweenInstantiateAttempts (value=10000000000).
>>
>     There are tickets for these: #107 and #374.
>
>>     Do these attributes depend on other configuration, or are they
>>     getting overridden by some other attributes?
>>
>>     2. While verifying compDisableRestart, I set
>>     compDisableRestart=true while the component was continuously
>>     restarting.
>>
>>     The result was that, instead of component restart, it went to
>>     SUFailover as the recovery option and then node reboot. I was
>>     expecting it to just pause restarting and then, when
>>     compDisableRestart is set back to false, to start instantiating
>>     again and, if that fails, to continue restarting with the restart
>>     count reset to zero. Am I interpreting this correctly? Is this
>>     attribute dependent on or overridden by some other configuration
>>     attribute?
>>
>     compDisableRestart=true means the component is not restart capable,
>     so AMF will perform the next level of recovery, which is suFailover.
>     It does not mean "stop the component from restarting". Please see the
>     spec for more details on such attributes.
>
>>     Basically, for debugging, I was looking for a configuration option
>>     to temporarily pause or stop the continuous attempts to instantiate
>>     this component, which result from the SU unlock admin operation but
>>     cannot come up for a valid error reason. Every further attempt
>>     fails and eventually escalates to a node reboot, which prevents me
>>     from looking into the issue. Is there a way to stop this?
>>
>     CallbackTimeout can be configured with a very large value, and a
>     breakpoint can be put on the csiset callback to wait there for debugging.
>     Otherwise let the component move to INST/TERM failed state
>     intentionally. In these two states AMF will not take any recovery
>     action.
>
>     Thanks
>     Praveen
>>     Am I making any obvious errors here?
>>
>>     Thanks
>>     Greg
>>
>>
>>
>>     On Thu, Oct 3, 2013 at 2:17 AM, praveen malviya
>>     <praveen.malv...@oracle.com> wrote:
>>
>>         Please see inline.
>>
>>         On 03-Oct-13 9:37 AM, Greg Hurlman wrote:
>>>         Thanks Praveen,
>>>
>>>         Precisely, I wanted the configuration to form different node
>>>         groups out of the payload nodes. For example, if PL-1, PL-2,
>>>         PL-3 and PL-4 are the payload nodes, how do we differentiate
>>>         the DN names given to saAmfNGNodeList when PL-1 and PL-2
>>>         form one group and PL-3 and PL-4 form the other? I
>>>         believe we cannot mention PLs in the DN, otherwise the DN names
>>>         for both groups will be the same.
>>         I tried to make a new group from the existing imm.xml, which
>>         has 2 SCs and 2 PLs.
>>         To put SC-1 and PL-3 in one group:
>>         1) immcfg -c SaAmfNodeGroup safAmfNodeGroup=SC-1andPL-3,safAmfCluster=myAmfCluster -a saAmfNGNodeList=safAmfNode=SC-1,safAmfCluster=myAmfCluster
>>         2) immcfg safAmfNodeGroup=SC-1andPL-3,safAmfCluster=myAmfCluster -a saAmfNGNodeList+=safAmfNode=PL-3,safAmfCluster=myAmfCluster
>>         3) immlist safAmfNodeGroup=SC-1andPL-3,safAmfCluster=myAmfCluster gives
>>         Name                       Type          Value(s)
>>         ========================================================================
>>         safAmfNodeGroup            SA_STRING_T   safAmfNodeGroup=SC-1andPL-3
>>         saAmfNGNodeList            SA_NAME_T     safAmfNode=SC-1,safAmfCluster=myAmfCluster (42)
>>                                                  safAmfNode=PL-3,safAmfCluster=myAmfCluster (42)
>>         SaImmAttrImplementerName   SA_STRING_T   safAmfService
>>         SaImmAttrClassName         SA_STRING_T   SaAmfNodeGroup
>>         SaImmAttrAdminOwnerName    SA_STRING_T   <Empty>
>>
>>         4) Now in AMF application in SG class:
>>         <attr>
>>         <name>saAmfSGSuHostNodeGroup</name>
>>         <value>safAmfNodeGroup=SC-1andPL-3,safAmfCluster=myAmfCluster</value>
>>         </attr>
>>
>>         So AMF will spawn SUs on SC-1 and PL-3 for this SG.
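
The node-group steps above can be sketched as a small helper that prints the immcfg commands for any group, so they can be reviewed before being run. The function name is hypothetical, and the helper only echoes the two command forms already shown in steps 1 and 2; it does not call immcfg itself.

```shell
#!/bin/sh
# Print (not run) the immcfg commands that create a node group and add
# nodes to it. Usage: print_node_group_cmds GROUP CLUSTER NODE [NODE...]
print_node_group_cmds() {
    group="$1"
    cluster="$2"
    shift 2
    group_dn="safAmfNodeGroup=${group},safAmfCluster=${cluster}"

    first=yes
    for node in "$@"; do
        node_dn="safAmfNode=${node},safAmfCluster=${cluster}"
        if [ "$first" = yes ]; then
            # The first node is supplied when the object is created.
            printf '%s\n' "immcfg -c SaAmfNodeGroup ${group_dn} -a saAmfNGNodeList=${node_dn}"
            first=no
        else
            # Further nodes are appended to the multi-valued attribute.
            printf '%s\n' "immcfg ${group_dn} -a saAmfNGNodeList+=${node_dn}"
        fi
    done
}

# Reproduces the two commands from the steps above.
print_node_group_cmds SC-1andPL-3 myAmfCluster SC-1 PL-3
```

For the payload-node question, calling the helper twice with different group names (for example the made-up names `PLgroupA` with PL-1 and PL-2, and `PLgroupB` with PL-3 and PL-4) gives two distinct node-group DNs, since the DN is built from the group name and not from the node names.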
>>
>>>
>>>         Also could you tell me the configurations for protection
>>>         group and service instance assignment with reference to both
>>>         examples for both 2N and N-way active models?
>>>
>>         I did not understand this question correctly. These configurations
>>         are already in the samples directory of the opensaf tar.
>>         Once these configurations are brought up, the command "amf-state
>>         siass" will show the HA states of SIs in SUs.
>>
>>         Thanks
>>         Praveen
>>
>>>         Thanks,
>>>         Greg
>>>
>>>
>>>         On Mon, Sep 23, 2013 at 10:41 PM, praveen malviya
>>>         <praveen.malv...@oracle.com> wrote:
>>>
>>>             Please see inline.
>>>
>>>             On 24-Sep-13 2:27 AM, Greg Hurlman wrote:
>>>
>>>                 Hi Guys,
>>>
>>>                 I need some help in understanding more about configuring
>>>                 the SG and protection
>>>                 groups for the example scenario below.
>>>
>>>                 Referring to spec SAI-AIS-AMF-B.04.01, section
>>>                 3.1.11 figure 2, and the sample
>>>                 example AppConfig-2N.xml, how can I configure
>>>                 the IMM model for the entities below:
>>>
>>>
>>>                 1. SG1 spanning node U and V? Do I need to mention
>>>                 something like this or
>>>                 something else?
>>>
>>>                 <attr>
>>>                 <name>saAmfSGSuHostNodeGroup</name>
>>>                 <value>safAmfNodeGroup=U,V
>>>                 ,safAmfCluster=myAmfCluster</value>
>>>                 </attr>
>>>                 2. PG A1 between components C1 and C3.
>>>                 3. Service instance A, being assigned active to S1
>>>                 and standby to S2
>>>
>>>             1) Suppose the above mentioned node group has been
>>>             created and it contains two nodes in its attribute NodeList:
>>>             saAmfNGNodeList        SA_NAME_T
>>>             safAmfNode=NodeU,safAmfCluster=myAmfCluster (42)
>>>             safAmfNode=NodeV,safAmfCluster=myAmfCluster (42)
>>>
>>>             2) Now, in order to map Service Unit S1 to Node U,
>>>             configure S1 with the attribute:
>>>             <attr>
>>>             <name>saAmfSUHostNodeOrNodeGroup</name>
>>>             <value>safAmfNode=NodeU,safAmfCluster=myAmfCluster</value>
>>>             </attr>
>>>
>>>             Similarly for S2.
>>>
>>>             So in this way S1 will come up on Node U and S2 will
>>>             come up on Node V.
>>>             In AppConfig-2N.xml:
>>>             For mapping SU1 to SC-1 add this attribute:
>>>             <attr>
>>>             <name>saAmfSUHostNodeOrNodeGroup</name>
>>>             <value>safAmfNode=SC-1,safAmfCluster=myAmfCluster</value>
>>>             </attr>
>>>
>>>             For SU2 replace SC-1 by SC-2.
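
Since the SU1 and SU2 fragments differ only in the node name, the mapping above can be sketched as a small generator. The function name is hypothetical; the attribute name and DN layout are the ones quoted above, and the output still has to be pasted into the right SU object in the XML by hand.

```shell
#!/bin/sh
# Print the <attr> fragment that pins a service unit to a single node.
# Usage: print_su_host_attr NODE CLUSTER
print_su_host_attr() {
    node="$1"
    cluster="$2"
    cat <<EOF
<attr>
<name>saAmfSUHostNodeOrNodeGroup</name>
<value>safAmfNode=${node},safAmfCluster=${cluster}</value>
</attr>
EOF
}

print_su_host_attr SC-1 myAmfCluster   # fragment for SU1
print_su_host_attr SC-2 myAmfCluster   # fragment for SU2
```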
>>>
>>>             Thanks
>>>             Praveen
>>>
>>>                 Thanks,
>>>                 Greg
>>>                 
>>> ------------------------------------------------------------------------------
>>>                 October Webinars: Code for Performance
>>>                 Free Intel webinars can help you accelerate
>>>                 application performance.
>>>                 Explore tips for MPI, OpenMP, advanced profiling,
>>>                 and more. Get the most from
>>>                 the latest Intel processors and coprocessors. See
>>>                 abstracts and register >
>>>                 
>>> http://pubads.g.doubleclick.net/gampad/clk?id=60133471&iu=/4140/ostg.clktrk
>>>                 _______________________________________________
>>>                 Opensaf-users mailing list
>>>                 Opensaf-users@lists.sourceforge.net
>>>                 <mailto:Opensaf-users@lists.sourceforge.net>
>>>                 https://lists.sourceforge.net/lists/listinfo/opensaf-users
>>>
>>>
>>>
>>
>>
>
>
>
