I can send you a patch within the next few days and let you try it out.

regards,

Anders Widell


On 10/11/2016 11:36 AM, Jianfeng Dong wrote:
> Do you have a clear plan to remove this requirement?
> If our architecture prevents us from changing node_id, when could we expect 
> a release without this limitation that we can upgrade to? Our products have 
> already been deployed to many customers, so we have to think about upgrade 
> and compatibility issues.
>
> Thanks,
> Jianfeng
>
> -----Original Message-----
> From: Anders Widell [mailto:anders.wid...@ericsson.com]
> Sent: Tuesday, October 11, 2016 4:10 PM
> To: Jianfeng Dong <jd...@juniper.net>; Neelakanta Reddy 
> <reddy.neelaka...@oracle.com>; opensaf-users@lists.sourceforge.net
> Subject: Re: [users] OpenSAF release 5.0.1 can not promote SC after enable 
> "headless cluster" feature
>
> Yes, this is required with the current implementation. It might be possible 
> to remove this requirement - I will think about how it can be done.
>
> regards,
>
> Anders Widell
>
>
> On 10/11/2016 09:06 AM, Jianfeng Dong wrote:
>> Is it mandatory that the controllers have a lower slot_id than the payloads 
>> if we want to enable the "headless" feature?
>> If so, it seems to be a big change to our architecture, but I will at least 
>> give it a try.
>>
>> Thanks,
>> Jianfeng
>>
>> -----Original Message-----
>> From: Anders Widell [mailto:anders.wid...@ericsson.com]
>> Sent: Tuesday, October 11, 2016 2:30 PM
>> To: Jianfeng Dong <jd...@juniper.net>; Neelakanta Reddy
>> <reddy.neelaka...@oracle.com>; opensaf-users@lists.sourceforge.net
>> Subject: Re: [users] OpenSAF release 5.0.1 can not promote SC after
>> enable "headless cluster" feature
>>
>> There is a one-to-one mapping between /etc/opensaf/slot_id and the node_id. 
>> Simply make sure that all your system controller nodes have lower slot_id 
>> than any of your payloads. This file is read when the node is booted. You 
>> should be able to do an in-service renumbering of your nodes - just be 
>> careful so that you never have two nodes with the same node_id at the same 
>> time.
>>
>> Yes, the assumption is there in 5.1.0 as well.
>>
>> regards,
>>
>> Anders Widell
>>
>>
>> On 10/11/2016 04:29 AM, Jianfeng Dong wrote:
>>> Yes, in our product the payload's node_id is lower than the SCs'. Could you 
>>> please tell us how to configure it?
>>>
>>> And, does this assumption exist in OpenSAF 5.1.0 as well?
>>>
>>> Thanks,
>>> Jianfeng
>>>
>>> -----Original Message-----
>>> From: Anders Widell [mailto:anders.wid...@ericsson.com]
>>> Sent: Tuesday, October 11, 2016 12:55 AM
>>> To: Jianfeng Dong <jd...@juniper.net>; Neelakanta Reddy
>>> <reddy.neelaka...@oracle.com>; opensaf-users@lists.sourceforge.net
>>> Subject: Re: [users] OpenSAF release 5.0.1 can not promote SC after
>>> enable "headless cluster" feature
>>>
>>> There is a (probably not so well documented :-) assumption that the system 
>>> controllers are configured with a lower node_id than the payloads. From 
>>> what I can see in the logs you sent, I think it looks like you have 
>>> configured the payload with a lower node_id than the system controllers.
>>>
>>> By the way, the headless feature has been improved in OpenSAF 5.1.0 so I 
>>> would suggest that you upgrade to that version if possible.
>>>
>>> regards,
>>>
>>> Anders Widell
>>>
>>>
>>> On 10/10/2016 06:04 PM, Jianfeng Dong wrote:
>>>> I tried with sufficient drive space but got the same result: neither of the 
>>>> two SCs can be promoted to controller until the payload reboots.
>>>>
>>>> I also checked the network link between the SCs and the payload; they can 
>>>> ping each other while the issue is happening. I also suspect that the 
>>>> problem is caused by the IMMD/IMMND link among those nodes, but I don't 
>>>> know how to prove it.
>>>>
>>>> From: Neelakanta Reddy [mailto:reddy.neelaka...@oracle.com]
>>>> Sent: Monday, October 10, 2016 8:39 PM
>>>> To: Jianfeng Dong <jd...@juniper.net>;
>>>> opensaf-users@lists.sourceforge.net
>>>> Subject: Re: [users] OpenSAF release 5.0.1 can not promote SC after
>>>> enable "headless cluster" feature
>>>>
>>>> Hi,
>>>>
>>>> After a "headless" period, once any of the controllers has started, the 
>>>> IMMND on the payload sends the intro message to IMMD.
>>>> It looks like this did not happen. The following is the log from the payload:
>>>>
>>>> 2016-10-10T11:09:18.507851+08:00 pld0101 osafimmnd[3141]: message repeated 2 times: [ logtrace: write failed, No space left on device]
>>>> 2016-10-10T11:09:18.507883+08:00 pld0101 osafimmnd[3141]: NO Re-introduce-me highestProcessed:23839 highestReceived:23839
>>>> 2016-10-10T11:09:18.508011+08:00 pld0101 osafimmnd[3141]: logtrace: write failed, No space left on device
>>>> 2016-10-10T11:09:18.508129+08:00 pld0101 osafimmnd[3141]: logtrace: write failed, No space left on device
>>>> 2016-10-10T11:09:18.508501+08:00 pld0101 osafimmnd[3141]: WA MDS Send Failed to service:IMMD rc:2
>>>>
>>>>
>>>> Retry with sufficient space on the payload.
>>>>
>>>> /Neel.
>>>>
>>>> On 2016/10/10 03:59 PM, Jianfeng Dong wrote:
>>>>
>>>> Hi,
>>>>
>>>>
>>>>
>>>> For several years we have used OpenSAF (4.5.2 now) to provide HA service 
>>>> in our product (which includes 2 SCs and several payload cards), but our 
>>>> customers keep requesting that the payload cards NOT reboot even if both 
>>>> SCs reload or hang.
>>>>
>>>>
>>>>
>>>> We recently learned that the new release 5.0.0 provides this feature (i.e. 
>>>> "headless cluster"), so we installed 5.0.0 in our product and enabled the 
>>>> "headless" feature by setting "IMMSV_SC_ABSENCE_ALLOWED" to 900 seconds. 
>>>> After installation it worked fine: our system with the new OpenSAF release 
>>>> starts successfully, all SCs and payload cards come "UP", and the payload 
>>>> card does NOT reboot immediately after we reload both SCs.
>>>>
>>>>
>>>>
>>>> However, we hit a problem: after the SCs reboot, neither of the two can be 
>>>> promoted to controller until the "headless" payload reboots because the 
>>>> 'IMMSV_SC_ABSENCE_ALLOWED' timeout expires after 900 seconds. It seems the 
>>>> OpenSAF modules on both SCs just wait and do nothing until the payload 
>>>> reboots due to the timeout; then OpenSAF on the SCs continues to run and 
>>>> the whole system finally recovers.
>>>>
>>>>
>>>>
>>>> We thought ticket #1828 might have resolved this issue, so we tried again 
>>>> with release 5.0.1 but got the same result.
>>>>
>>>>
>>>>
>>>> Could you please tell us why, in our case, OpenSAF on both SCs could not 
>>>> run until the payload card (in "headless" state) rebooted due to the timeout?
>>>>
>>>> Besides 'IMMSV_SC_ABSENCE_ALLOWED', is there any other variable or 
>>>> parameter that needs to be set or modified to enable the 'headless 
>>>> cluster' feature? Are we missing anything?
>>>>
>>>> Attached are the syslogs of the SC and the payload card from when this 
>>>> problem happened; we hope the log files help find the root cause.
>>>>
>>>>
>>>>
>>>> Any comments are much appreciated, thanks!
>>>>
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Jianfeng Dong
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Check out the vibrant tech community on one of the world's most
>>>> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
>>>> _______________________________________________
>>>> Opensaf-users mailing list
>>>> Opensaf-users@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-users
>>>>
>>>>
>
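
The slot_id rule described earlier in this thread (every system controller 
has a lower slot_id than any payload, and no two nodes share a node_id) can 
be checked before renumbering. The following is only a minimal sketch in 
Python under stated assumptions: the inventory format, the role strings, the 
node names and the slot_id numbers are all hypothetical and not part of 
OpenSAF; the script does not read /etc/opensaf/slot_id itself.

```python
from typing import Dict, List, Tuple

# Hypothetical inventory format: node name -> (role, slot_id).
# The role strings and node names are illustrative, not OpenSAF identifiers.
Inventory = Dict[str, Tuple[str, int]]


def check_slot_ids(nodes: Inventory) -> List[str]:
    """Return a list of problems; an empty list means the rule is satisfied."""
    problems: List[str] = []

    # No two nodes may use the same slot_id (and hence the same node_id).
    owner: Dict[int, str] = {}
    for name in sorted(nodes):
        _, slot_id = nodes[name]
        if slot_id in owner:
            problems.append(f"{owner[slot_id]} and {name} share slot_id {slot_id}")
        else:
            owner[slot_id] = name

    # Every system controller must have a lower slot_id than any payload.
    controllers = [s for role, s in nodes.values() if role == "controller"]
    payloads = [s for role, s in nodes.values() if role == "payload"]
    if controllers and payloads and max(controllers) >= min(payloads):
        problems.append(
            f"highest controller slot_id {max(controllers)} is not lower "
            f"than lowest payload slot_id {min(payloads)}"
        )
    return problems


if __name__ == "__main__":
    # Made-up numbers: a payload numbered below the controllers.
    before = {"SC-1": ("controller", 2), "SC-2": ("controller", 3),
              "PL-1": ("payload", 1)}
    # Made-up numbers: both controllers numbered below the payload.
    after = {"SC-1": ("controller", 1), "SC-2": ("controller", 2),
             "PL-1": ("payload", 3)}
    for label, layout in (("before", before), ("after", after)):
        print(label, check_slot_ids(layout) or "OK")
```

For an in-service renumbering, the same check could be run on each 
intermediate layout so that no step leaves two nodes with the same slot_id.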


