There is a (probably not so well documented :-) assumption that the
system controllers are configured with a lower node_id than the
payloads. From the logs you sent, it looks like you have configured the
payload with a lower node_id than the system controllers.
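
As a rough illustration of that assumption, a sanity check could look like
the Python sketch below. It is not an OpenSAF tool: the helper name
check_node_id_order, the node names and the node_id values are all made up
for the example, and the real values have to come from your own
configuration.

```python
def check_node_id_order(nodes):
    """Return the payloads whose node_id is not above every SC node_id.

    `nodes` maps a node name to a (role, node_id) tuple, where role is
    "SC" (system controller) or "PL" (payload). This layout is
    hypothetical and chosen for this sketch only.
    """
    sc_ids = [node_id for role, node_id in nodes.values() if role == "SC"]
    if not sc_ids:
        raise ValueError("no system controllers in the node list")
    highest_sc = max(sc_ids)
    # A payload violates the assumption if its node_id is not strictly
    # higher than the highest system controller node_id.
    return sorted(
        name
        for name, (role, node_id) in nodes.items()
        if role == "PL" and node_id <= highest_sc
    )


# Made-up example: the payload has a lower node_id than both controllers.
example = {
    "SC-1": ("SC", 0x2020F),
    "SC-2": ("SC", 0x2030F),
    "PL-3": ("PL", 0x2010F),
}
print(check_node_id_order(example))
```

An empty list means every payload has a higher node_id than every system
controller; any name in the list is a payload that breaks the ordering.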

By the way, the headless feature has been improved in OpenSAF 5.1.0 so I 
would suggest that you upgrade to that version if possible.

regards,

Anders Widell


On 10/10/2016 06:04 PM, Jianfeng Dong wrote:
> I tried with sufficient drive space but got the same result: neither of the
> two SCs can be promoted to controller until the payload reboots.
>
> I also checked the network link between the SCs and the payload; they can
> PING each other when this issue happens. I also suspect that the problem is
> caused by the IMMD/IMMND link among those nodes, but I don't know how to
> prove it.
>
> From: Neelakanta Reddy [mailto:reddy.neelaka...@oracle.com]
> Sent: Monday, October 10, 2016 8:39 PM
> To: Jianfeng Dong <jd...@juniper.net>; opensaf-users@lists.sourceforge.net
> Subject: Re: [users] OpenSAF release 5.0.1 can not promote SC after enable 
> "headless cluster" feature
>
> Hi,
>
> After the "headless" state, once any of the controllers has started, the
> IMMND on the payload will send the intro message to IMMD.
> It looks like this did not happen; the following is the log from the payload:
>
> 2016-10-10T11:09:18.507851+08:00 pld0101 osafimmnd[3141]: message repeated 2 times: [ logtrace: write failed, No space left on device]
> 2016-10-10T11:09:18.507883+08:00 pld0101 osafimmnd[3141]: NO Re-introduce-me highestProcessed:23839 highestReceived:23839
> 2016-10-10T11:09:18.508011+08:00 pld0101 osafimmnd[3141]: logtrace: write failed, No space left on device
> 2016-10-10T11:09:18.508129+08:00 pld0101 osafimmnd[3141]: logtrace: write failed, No space left on device
> 2016-10-10T11:09:18.508501+08:00 pld0101 osafimmnd[3141]: WA MDS Send Failed to service:IMMD rc:2
>
>
> Please retry with sufficient disk space on the payload.
>
> /Neel.
>
> On 2016/10/10 03:59 PM, Jianfeng Dong wrote:
>
> Hi,
>
>
>
> For several years we have used OpenSAF (currently 4.5.2) to provide HA
> service in our product (which includes 2 SCs and several payload cards), but
> our customers keep asking that the payload cards NOT be rebooted even if
> both SCs reload or hang.
>
>
>
> We recently learned that release 5.0.0 provides this feature (i.e.
> "headless cluster"), so we installed 5.0.0 in our product and enabled the
> "headless" feature by setting "IMMSV_SC_ABSENCE_ALLOWED" to 900 seconds.
> After installation it worked fine: our system with the new OpenSAF release
> starts successfully, all SCs and payload cards come "UP", and the payload
> cards do NOT reboot immediately after we reload both SCs.
>
>
>
> However, we hit a problem: neither of the two SCs can be promoted to
> controller after reboot until the "headless" payload reboots due to the
> 'IMMSV_SC_ABSENCE_ALLOWED' timeout after 900 seconds. It seems the OpenSAF
> modules on both SCs just wait and do nothing until the payload reboots due
> to the timeout; then OpenSAF on the SCs continues to run and the whole
> system finally recovers.
>
>
>
> We thought ticket #1828 might have resolved this issue, so we tried again
> with release 5.0.1 but got the same result.
>
>
>
> Could you please tell us why, in our case, OpenSAF on both SCs could not run
> until the payload card (in "headless" status) rebooted due to the timeout?
>
> Besides 'IMMSV_SC_ABSENCE_ALLOWED', is there any other variable or parameter
> that needs to be set or modified to enable the 'headless cluster' feature?
> Are we missing anything?
>
> Attached are the syslogs of the SCs and the payload card from when this
> problem happened; we hope the log files help to find the root cause.
>
>
>
> Any comments are much appreciated, thanks!
>
>
>
> Regards,
>
> Jianfeng Dong
>
>
>
>
>
>
> ------------------------------------------------------------------------------
>
> Check out the vibrant tech community on one of the world's most
>
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
>
>
>
>
> _______________________________________________
>
> Opensaf-users mailing list
>
> Opensaf-users@lists.sourceforge.net
>
> https://lists.sourceforge.net/lists/listinfo/opensaf-users

