Wilson,

The AMF in openais 0.80.3 is experimental and really not suitable for deployment at all. There are a lot of bugs, and basically it is a "tech preview" for developers to get a taste of AMF. The AMF in this "whitetank" version won't be improved. The production services, however (CLM, CKPT, EVT, EVS, CPG, and libtotempg), are fully supported, and any issues found will result in new releases.

The AMF in trunk has been reworked quite a bit and shouldn't have these sorts of problems. Unfortunately, this version of AMF is not quite perfect yet, and trunk is a bit unstable at the moment.

Probably not the answer you were looking for, but I hope it answers your questions.

Regards
-steve

On Thu, 2008-02-21 at 03:50 -0800, Wilson Talaugon wrote:
> Hi,
>
> Has anyone successfully run 2-nodes using the openais version
> 0.80.3? For my case, I always get an assert failure the moment I run
> aisexec on the first node.
>
> -Wilson
>
> From: Wilson Talaugon [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, February 20, 2008 12:51 PM
> To: [email protected]
> Subject: assert in amfsg.c:332
>
> Hi,
>
> I’m trying to set up 2-node cluster, node1 as active and node2 as
> standby. When I run aisexec on node1 I always get an assert failure
> in amfsg.c at line 332 as soon as it instantiate the test application
> testamf1. I’m trying to figure out if it’s a configuration problem.
> The assert failure is in assign_si_assumed_cbfn() static function
> inside the switch statement where it checks for the sg->avail_state
> value whether it is in SG_AC_AssigningOnRequest or
> SG_AC_AssigningStandBy state. For my setup it went to the default
> case since the value of sg->avail_state is
> SG_AC_InstantiatingServiceUnits.
>
> Included in this email are the amf configuration that I was using and
> aisexec output. I would appreciate any help. Thanks.
>
> amf.conf:
> # AMF Test configuration file
> # - Times in milliseconds
> # - clccli_path can be set on any level from application and down and will be
> # added to the CLI commands if they are not already specified with an absolute
> # path (begins with /).
> # WL - WorkLoad
>
> safAmfCluster = CLUSTER {
>   saAmfClusterStartupTimeout=3000
>   safAmfNode = 192.168.10.210 {
>     saAmfNodeSuFailOverProb=2000
>     saAmfNodeSuFailoverMax=10
>   }
>   safAmfNode = 192.168.10.212 {
>     saAmfNodeSuFailOverProb=2000
>     saAmfNodeSuFailoverMax=10
>   }
>   safApp = DB {
>     safSg = DBSG {
>       saAmfSGRedundancyModel=nplusm
>       saAmfSGNumPrefActiveSUs=1
>       saAmfSGMaxActiveSIsperSUs=1
>       saAmfSGNumPrefStandbySUs=1
>       saAmfSGMaxStandbySIsperSUs=1
>       saAmfSGCompRestartProb=100000
>       saAmfSGCompRestartMax=0
>       saAmfSGSuRestartProb=20000
>       saAmfSGSuRestartMax=0
>       saAmfSGAutoAdjustProb=5000
>       safSu = SERVICE2 {
>         clccli_path=/home/twelly/ais/test
>         saAmfSUHostedByNode=192.168.10.212
>         saAmfSUNumComponents=1
>         safComp = SERVER2 {
>           saAmfCompCategory=sa_aware
>           saAmfCompCapability=x_active_or_y_standby
>           saAmfCompNumMaxActiveCsi=1
>           saAmfCompNumMaxStandbyCsi=1
>           saAmfCompDefaultClcCliTimeout = 500
>           saAmfCompDefaultCallbackTimeOut = 500
>           saAmfCompInstantiateCmd = /home/twelly/ais/test/clc_cli_script
>           saAmfCompInstantiateCmdArgv= instantiate /home/twelly/ais/test/testamf1
>           saAmfCompTerminateCmd = /home/twelly/ais/test/clc_cli_script
>           saAmfCompTerminateCmdArgv = terminate
>           saAmfCompCleanupCmd = /home/twelly/ais/test/clc_cli_script
>           saAmfCompCleanupCmdArgv = cleanup
>           saAmfCompCsTypes {
>             CSITYPE
>           }
>           saAmfCompCmdEnv {
>             var1=val1
>             var2=val2
>           }
>           saAmfCompRecoveryOnError=component_failover
>           safHealthcheckKey = key1 {
>             saAmfHealthcheckPeriod = 5000
>             saAmfHealthcheckMaxDuration = 350
>           }
>         }
>       }
>       safSu = SERVICE1 {
>         clccli_path=/home/twelly/ais/test
>         saAmfSUHostedByNode=192.168.10.210
>         saAmfSUNumComponents=1
>         safComp = SERVER1 {
>           saAmfCompCategory=sa_aware
>           saAmfCompCapability=x_active_or_y_standby
>           saAmfCompNumMaxActiveCsi=1
>           saAmfCompNumMaxStandbyCsi=1
>           saAmfCompDefaultClcCliTimeout = 500
>           saAmfCompDefaultCallbackTimeOut = 500
>           saAmfCompInstantiateCmd = /home/twelly/ais/test/clc_cli_script
>           saAmfCompInstantiateCmdArgv= instantiate /home/twelly/ais/test/testamf1
>           saAmfCompTerminateCmd = /home/twelly/ais/test/clc_cli_script
>           saAmfCompTerminateCmdArgv = terminate
>           saAmfCompCleanupCmd = /home/twelly/ais/test/clc_cli_script
>           saAmfCompCleanupCmdArgv = cleanup
>           saAmfCompCsTypes {
>             CSITYPE
>           }
>           saAmfCompCmdEnv {
>             var1=val1
>             var2=val2
>           }
>           saAmfCompRecoveryOnError=component_failover
>           safHealthcheckKey = key1 {
>             saAmfHealthcheckPeriod = 5000
>             saAmfHealthcheckMaxDuration = 350
>           }
>         }
>       }
>     }
>     safSi = SI {
>       saAmfSINumCSIs=1
>       safCsi = CSI {
>         saAmfCSTypeName = CSITYPE
>       }
>     }
>     safCSType = CSITYPE {
>     }
>   }
> }
>
> Aisexec output:
> # aisexec
> # Jan 1 k.118156 [MAIN ] AIS Executive Service RELEASE 'subrev 1358 version 0.80.3'
> Jan 1 k.119156 [MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
> Jan 1 k.119156 [MAIN ] Copyright (C) 2006 Red Hat, Inc.
> Jan 1 k.119156 [MAIN ] AIS Executive Service: started and ready to provide service.
> Jan 1 k.120156 [MAIN ] openais component openais_cpg loaded.
> Jan 1 k.120156 [MAIN ] Registering service handler 'openais cluster closed process group service v1.01'
> Jan 1 k.120156 [MAIN ] openais component openais_cfg loaded.
> Jan 1 k.122156 [MAIN ] Registering service handler 'openais configuration service'
> Jan 1 k.123156 [MAIN ] openais component openais_msg loaded.
> Jan 1 k.123156 [MAIN ] Registering service handler 'openais message service B.01.01'
> Jan 1 k.123156 [MAIN ] openais component openais_lck loaded.
> Jan 1 k.123156 [MAIN ] Registering service handler 'openais distributed locking service B.01.01'
> Jan 1 k.124156 [MAIN ] openais component openais_evt loaded.
> Jan 1 k.124156 [MAIN ] Registering service handler 'openais event service B.01.01'
> Jan 1 k.124156 [MAIN ] openais component openais_ckpt loaded.
> Jan 1 k.125156 [MAIN ] Registering service handler 'openais checkpoint service B.01.01'
> Jan 1 k.125156 [MAIN ] openais component openais_amf loaded.
> Jan 1 k.125156 [MAIN ] Registering service handler 'openais availability management framework B.01.01'
> Jan 1 k.126156 [MAIN ] openais component openais_clm loaded.
> Jan 1 k.126156 [MAIN ] Registering service handler 'openais cluster membership service B.01.01'
> Jan 1 k.127156 [MAIN ] openais component openais_evs loaded.
> Jan 1 k.127156 [MAIN ] Registering service handler 'openais extended virtual synchrony service'
> Jan 1 k.127156 [print.c:0344] log setup
> Jan 1 k.128156 [MAIN ] Scheduler priority left to default value (no OS support)
> Jan 1 k.139155 [TOTEM] Token Timeout (1000 ms) retransmit timeout (238 ms)
> Jan 1 k.140155 [TOTEM] token hold (180 ms) retransmits before loss (4 retrans)
> Jan 1 k.140155 [TOTEM] join (50 ms) send_join (0 ms) consensus (800 ms) merge (200 ms)
> Jan 1 k.140155 [TOTEM] downcheck (1000 ms) fail to recv const (50 msgs)
> Jan 1 k.140155 [TOTEM] seqno unchanged const (30 rotations) Maximum network MTU 1500
> Jan 1 k.141155 [TOTEM] window size per rotation (50 messages) maximum messages per rotation (17 messages)
> Jan 1 k.141155 [TOTEM] send threads (0 threads)
> Jan 1 k.141155 [TOTEM] RRP token expired timeout (238 ms)
> Jan 1 k.142155 [TOTEM] RRP token problem counter (2000 ms)
> Jan 1 k.142155 [TOTEM] RRP threshold (10 problem count)
> Jan 1 k.142155 [TOTEM] RRP mode set to none.
> Jan 1 k.142155 [TOTEM] heartbeat_failures_allowed (0)
> Jan 1 k.143155 [TOTEM] max_network_delay (50 ms)
> Jan 1 k.144155 [TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
> Jan 1 k.147155 [TOTEM] Receive multicast socket recv buffer size (144000 bytes).
> Jan 1 k.148155 [TOTEM] Transmit multicast socket send buffer size (144000 bytesturn off loopback).
> : Invalid argument
> Jan 1 k.148155 [TOTEM] The network interface [192.168.10.210] is now up.
> Jan 1 k.149155 [TOTEM] Created or loaded sequence id 344.192.168.10.210 for this ring.
> Jan 1 k.149155 [TOTEM] entering GATHER state from 15.
> Jan 1 k.150155 [SERV ] Initialising service handler 'openais extended virtual synchrony service'
> Jan 1 k.151155 [SERV ] Initialising service handler 'openais cluster membership service B.01.01'
> Jan 1 k.152155 [SERV ] Initialising service handler 'openais availability management framework B.01.01'
> Jan 1 k.155155 [SERV ] Initialising service handler 'openais checkpoint service B.01.01'
> Jan 1 k.156155 [SERV ] Initialising service handler 'openais event service B.01.01'
> Jan 1 k.156155 [SERV ] Initialising service handler 'openais distributed locking service B.01.01'
> Jan 1 k.156155 [SERV ] Initialising service handler 'openais message service B.01.01'
> Jan 1 k.156155 [SERV ] Initialising service handler 'openais configuration service'
> Jan 1 k.156155 [SERV ] Initialising service handler 'openais cluster closed process group service v1.01'
> Jan 1 k.157155 [SYNC ] Not using a virtual synchrony filter.
> Jan 1 k.159155 [TOTEM] Creating commit token because I am the rep.
> Jan 1 k.159155 [TOTEM] Saving state aru 0 high seq received 0
> Jan 1 k.159155 [TOTEM] entering COMMIT state.
> Jan 1 k.160155 [TOTEM] entering RECOVERY state.
> Jan 1 k.161155 [TOTEM] position [0] member 192.168.10.210:
> Jan 1 k.161155 [TOTEM] previous ring seq 344 rep 192.168.10.210
> Jan 1 k.161155 [TOTEM] aru 0 high delivered 0 received flag 0
> Jan 1 k.161155 [TOTEM] Did not need to originate any messages in recovery.
> Jan 1 k.162155 [TOTEM] Storing new sequence id for ring 15c
> Jan 1 k.162155 [TOTEM] Sending initial ORF token
> Jan 1 k.164155 [CLM ] CLM CONFIGURATION CHANGE
> Jan 1 k.164155 [CLM ] New Configuration:
> Jan 1 k.164155 [CLM ] Members Left:
> Jan 1 k.165155 [CLM ] Members Joined:
> Jan 1 k.165155 [SYNC ] This node is within the primary component and will provide service.
> Jan 1 k.165155 [CLM ] CLM CONFIGURATION CHANGE
> Jan 1 k.166155 [CLM ] New Configuration:
> Jan 1 k.166155 [CLM ] r(0) ip(192.168.10.210)
> Jan 1 k.166155 [CLM ] Members Left:
> Jan 1 k.166155 [CLM ] Members Joined:
> Jan 1 k.167155 [CLM ] r(0) ip(192.168.10.210)
> Jan 1 k.167155 [SYNC ] This node is within the primary component and will provide service.
> Jan 1 k.167155 [TOTEM] entering OPERATIONAL state.
> Jan 1 k.175154 [CLM ] got nodejoin message 192.168.10.210
> Jan 1 k.167074 [AMF ] AMF Cluster: starting applications.
> 3735579: Hello world from safComp=SERVER1,safSu=SERVICE1,safSg=DBSG,safApp=DB
> Jan 1 k.211072 [IPC ] Scheduler priority left to default value (no OS support)
> Jan 1 k.216072 [IPC ] Scheduler priority left to default value (no OS support)
> Jan 1 k.222072 [AMF ] Lib comp register: comp 'badname' not found
> Jan 1 k.223072 [AMF ] Setting SU 'SERVICE1' operational state: ENABLED
> Jan 1 k.223072 [AMF ] Setting SU 'SERVICE1' readiness state: IN-SERVICE
> Jan 1 k.223072 [AMF ] Setting SU 'SERVICE1' presence state: INSTANTIATED
> 3735579: Component 'safComp=SERVER1,safSu=SERVICE1,safSg=DBSG,safApp=DB' requested to enter hastate SA_AMF_ACTIVE for
> CSI 'safCsi=CSI,safSi=SI,safApp=DB'
> Jan 1 k.179992 [AMF ] SU HA state changed to 'ACTIVE' for:
> SI 'SI', SU 'safSu=SERVICE1,safSg=DBSG,safApp=DB'
> Jan 1 k.179992 [AMF ] SI Assignment state changed to 'PARTIALLY-ASSIGNED' for:
> SI 'SI', SU 'safSu=SERVICE1,safSg=DBSG,safApp=DB'
> Jan 1 k.179992 [MAIN ] AMF runtime attributes:
> Jan 1 k.180992 [MAIN ] ===================================================
> Jan 1 k.180992 [MAIN ] safCluster=CLUSTER
> Jan 1 k.180992 [MAIN ] admin state: UNLOCKED
> Jan 1 k.180992 [MAIN ] safNode=192.168.10.212
> Jan 1 k.181992 [MAIN ] admin state: UNLOCKED
> Jan 1 k.181992 [MAIN ] oper state: UNKNOWN
> Jan 1 k.181992 [MAIN ] safNode=192.168.10.210
> Jan 1 k.181992 [MAIN ] admin state: UNLOCKED
> Jan 1 k.182992 [MAIN ] oper state: ENABLED
> Jan 1 k.182992 [MAIN ] safApp=DB
> Jan 1 k.182992 [MAIN ] admin state: UNLOCKED
> Jan 1 k.182992 [MAIN ] num_sg: 0
> Jan 1 k.183992 [MAIN ] safSG=DBSG
> Jan 1 k.183992 [MAIN ] admin state: UNLOCKED
> Jan 1 k.183992 [MAIN ] assigned SUs 1
> Jan 1 k.183992 [MAIN ] non inst. spare SUs 0
> Jan 1 k.184992 [MAIN ] inst. spare SUs 0
> Jan 1 k.184992 [MAIN ] safSU=SERVICE1
> Jan 1 k.184992 [MAIN ] oper state: ENABLED
> Jan 1 k.184992 [MAIN ] admin state: UNLOCKED
> Jan 1 k.185992 [MAIN ] readiness state: IN-SERVICE
> Jan 1 k.185992 [MAIN ] presence state: INSTANTIATED
> Jan 1 k.185992 [MAIN ] hosted by node 192.168.10.210
> Jan 1 k.185992 [MAIN ] num active SIs 1
> Jan 1 k.186992 [MAIN ] num standby SIs 0
> Jan 1 k.186992 [MAIN ] restart count 0
> Jan 1 k.186992 [MAIN ] restart control state 0
> Jan 1 k.186992 [MAIN ] SU failover cnt 0
> Jan 1 k.187992 [MAIN ] assigned SIs:
> Jan 1 k.187992 [MAIN ] safSi=SI
> Jan 1 k.187992 [MAIN ] HA state: ACTIVE
> Jan 1 k.187992 [MAIN ] safComp=SERVER1
> Jan 1 k.188992 [MAIN ] oper state: ENABLED
> Jan 1 k.188992 [MAIN ] readiness state: IN-SERVICE
> Jan 1 k.188992 [MAIN ] presence state: INSTANTIATED
> Jan 1 k.188992 [MAIN ] num active CSIs 1
> Jan 1 k.189992 [MAIN ] num standby CSIs 0
> Jan 1 k.189992 [MAIN ] restart count 0
> Jan 1 k.189992 [MAIN ] assigned CSIs:
> Jan 1 k.189992 [MAIN ] safCSI=CSI
> Jan 1 k.190992 [MAIN ] HA state: ACTIVE
> Jan 1 k.190992 [MAIN ] safSU=SERVICE2
> Jan 1 k.190992 [MAIN ] oper state: DISABLED
> Jan 1 k.190992 [MAIN ] admin state: UNLOCKED
> Jan 1 k.190992 [MAIN ] readiness state: OUT-OF-SERVICE
> Jan 1 k.191992 [MAIN ] presence state: UNINSTANTIATED
> Jan 1 k.191992 [MAIN ] hosted by node 192.168.10.212
> Jan 1 k.191992 [MAIN ] num active SIs 0
> Jan 1 k.191992 [MAIN ] num standby SIs 0
> Jan 1 k.192992 [MAIN ] restart count 0
> Jan 1 k.192992 [MAIN ] restart control state 0
> Jan 1 k.192992 [MAIN ] SU failover cnt 0
> Jan 1 k.192992 [MAIN ] assigned SIs:
> Jan 1 k.193992 [MAIN ] safComp=SERVER2
> Jan 1 k.193992 [MAIN ] oper state: DISABLED
> Jan 1 k.193992 [MAIN ] readiness state: OUT-OF-SERVICE
> Jan 1 k.193992 [MAIN ] presence state: INSTANTIATING
> Jan 1 k.194992 [MAIN ] num active CSIs 0
> Jan 1 k.194992 [MAIN ] num standby CSIs 0
> Jan 1 k.194992 [MAIN ] restart count 0
> Jan 1 k.194992 [MAIN ] assigned CSIs:
> Jan 1 k.195992 [MAIN ] safSi=SI
> Jan 1 k.195992 [MAIN ] admin state: UNLOCKED
> Jan 1 k.195992 [MAIN ] assignm. state: PARTIALLY-ASSIGNED
> Jan 1 k.195992 [MAIN ] active assignments: 1
> amfsg.c:332 0 -- assertion failed
> Jan 1 k.196992 [MAIN ] standby assignments: 0
> Jan 1 k.196992 [MAIN ] safCsi=CSI
> Jan 1 k.196992 [MAIN ] ===================================================
> 3735579: saAmfResponse failed: 2
>
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
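
For readers following the thread, the failure described in the quoted report can be pictured with a small C sketch. It is not the openais source: only the function name assign_si_assumed_cbfn() and the three SG_AC_* state names come from the report, while the enum layout, the sg_sketch struct, and both helper functions are hypothetical and exist only to illustrate a switch whose default case asserts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the service group states. Only these three names
 * appear in the report; the real enum in amfsg.c may differ. */
enum sg_avail_state {
    SG_AC_InstantiatingServiceUnits,
    SG_AC_AssigningOnRequest,
    SG_AC_AssigningStandBy
};

/* Hypothetical stand-in for the real service group structure. */
struct sg_sketch {
    enum sg_avail_state avail_state;
};

/* Returns true for the two states the report says the switch checks for,
 * and false for anything that would fall through to the default case. */
bool si_assumed_state_is_expected(const struct sg_sketch *sg)
{
    switch (sg->avail_state) {
    case SG_AC_AssigningOnRequest:
    case SG_AC_AssigningStandBy:
        return true;
    default:
        return false;
    }
}

/* Mirrors the reported behaviour: any unexpected state trips an assert,
 * which aborts the process unless NDEBUG is defined. */
void assign_si_assumed_sketch(const struct sg_sketch *sg)
{
    assert(si_assumed_state_is_expected(sg));
}
```

Under these assumptions, calling assign_si_assumed_sketch() while avail_state is still SG_AC_InstantiatingServiceUnits aborts, which matches the situation reported above, where SERVER2 on the second node is still INSTANTIATING when the callback for SERVER1 arrives.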
