Hi Adrian Szwej,

On 10/2/2014 12:09 AM, Adrian Szwej wrote:

I have now applied the patch for #1032 on top of 4.6 changeset *5969:ead18326c13b*.

Do you mean [#1036]?
<http://sourceforge.net/p/opensaf/tickets/1036>

[devel] [PATCH 1 of 1] mds: use correct buff-length to distinguish mcast or multi-unicast [#1036] <http://sourceforge.net/p/opensaf/tickets/1036>

This patch does *not* resolve the problem.

This patch is not related to `TCP`; it is exclusively for `TIPC`.
Please provide the following so that I can reproduce the problem:

1. Reproducible steps
2. dtmd.conf file
3. imm.xml configuration details (it seems you prepared a 70-node configuration)
4. Your system buffer info; see the link below for how to get this data from your nodes:

    http://www.cyberciti.biz/faq/linux-tcp-tuning/
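
For item 4, the relevant values on Linux usually live under /proc/sys/net. The following is only a minimal sketch in C of how they could be collected, assuming the standard Linux paths /proc/sys/net/ipv4/tcp_rmem, /proc/sys/net/ipv4/tcp_wmem, /proc/sys/net/core/rmem_max and /proc/sys/net/core/wmem_max. The helper names parse_buf_triplet and print_tcp_buffers are hypothetical and are not part of OpenSAF.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse a "min default max" line as found in
 * /proc/sys/net/ipv4/tcp_rmem. Returns 0 on success, -1 on failure. */
int parse_buf_triplet(const char *line, long out[3])
{
    if (line == NULL || out == NULL)
        return -1;
    return sscanf(line, "%ld %ld %ld", &out[0], &out[1], &out[2]) == 3 ? 0 : -1;
}

/* Hypothetical helper: print the raw contents of the usual Linux TCP
 * buffer tunables to dst. Returns how many of the files could be read. */
int print_tcp_buffers(FILE *dst)
{
    static const char *paths[] = {
        "/proc/sys/net/ipv4/tcp_rmem",
        "/proc/sys/net/ipv4/tcp_wmem",
        "/proc/sys/net/core/rmem_max",
        "/proc/sys/net/core/wmem_max",
    };
    char line[128];
    int found = 0;

    assert(dst != NULL);
    for (size_t i = 0; i < sizeof(paths) / sizeof(paths[0]); i++) {
        FILE *f = fopen(paths[i], "r");
        if (f == NULL)
            continue; /* tunable not present on this system */
        if (fgets(line, sizeof(line), f) != NULL) {
            fprintf(dst, "%s: %s", paths[i], line);
            found++;
        }
        fclose(f);
    }
    return found;
}
```

Calling print_tcp_buffers(stdout) on each node and attaching the output to the ticket would cover this item; parse_buf_triplet is only needed if the values are to be compared programmatically.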

-AVM

The immnd on SC-1 gets the TRY_AGAIN message because of too many outstanding messages.
PL-3 to PL-6 join without problems.
PL-7, which is the node causing this condition, has the following entries in the trace log:

Oct 1 18:25:45.749109 osafimmnd [472:immnd_mds.c:0127] >> immnd_mds_register
Oct 1 18:25:45.749505 osafimmnd [472:immnd_mds.c:0192] T2 cb->node_id:2070f
Oct 1 18:25:45.749525 osafimmnd [472:immnd_mds.c:0194] << immnd_mds_register
Oct 1 18:25:45.749557 osafimmnd [472:immnd_main.c:0238] << immnd_initialize
Oct 1 18:25:45.850504 osafimmnd [472:ImmModel.cc:3381] << protocol43Allowed
Oct 1 18:25:45.850601 osafimmnd [472:immnd_proc.c:1626] T5 tmout:100 ste:1 ME:0 RE:0 crd:0 rim:FROM_FILE 4.3A:0 2Pbe:0 VetA/B: 0/0 othsc:0/0
Oct 1 18:25:45.850631 osafimmnd [472:immnd_proc.c:0393] TR First immnd_introduceMe, sending pbeEnabled:3 WITH params
Oct 1 18:25:45.850653 osafimmnd [472:immnd_proc.c:0413] TR Possibly extended intro from this IMMND pbeEnabled: 3 dirsize:22
Oct 1 18:25:45.951519 osafimmnd [472:immnd_proc.c:0393] TR First immnd_introduceMe, sending pbeEnabled:3 WITH params
Oct 1 18:25:45.951618 osafimmnd [472:immnd_proc.c:0413] TR Possibly extended intro from this IMMND pbeEnabled: 3 dirsize:22

*It keeps looping for a long time on the last two messages.*

------------------------------------------------------------------------

*[tickets:#1072] <http://sourceforge.net/p/opensaf/tickets/1072> Sync stop after few payload nodes joining the cluster (TCP)*

*Status:* unassigned
*Milestone:* 4.3.3
*Created:* Fri Sep 12, 2014 09:20 PM UTC by Adrian Szwej
*Last Updated:* Thu Sep 18, 2014 06:27 PM UTC
*Owner:* nobody

Communication is MDS over TCP. The cluster is 2+3, and the scenario is:
start the SCs; start the first payload; wait for sync; start the second payload; wait for sync; start the third payload. The third one fails, or sometimes it is the fourth.

No more than two or three payloads can be synchronized, because this sequence triggers the bug consistently.

The following is triggered in the loading immnd, causing the joining node to time out or fail to start:

Sep 6 6:58:02.096550 osafimmnd [502:immsv_evt.c:5382] T8 Received: IMMND_EVT_A2ND_SEARCHNEXT (17) from 2020f
Sep 6 6:58:02.096575 osafimmnd [502:immnd_evt.c:1443] >> immnd_evt_proc_search_next
Sep 6 6:58:02.096613 osafimmnd [502:immnd_evt.c:1454] T2 SEARCH NEXT, Look for id:1664
Sep 6 6:58:02.096641 osafimmnd [502:ImmModel.cc:1366] T2 ERR_TRY_AGAIN: Too many pending incoming fevs messages (> 16) rejecting sync iteration next request
Sep 6 6:58:02.096725 osafimmnd [502:immnd_evt.c:1676] << immnd_evt_proc_search_next
Sep 6 6:58:03.133230 osafimmnd [502:immnd_proc.c:1980] IN Sync Phase-3: step:540
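
The ERR_TRY_AGAIN line in the trace comes from a limit on pending FEVS messages. As a rough illustration of that kind of check (this is not the actual ImmModel.cc code), the sketch below rejects a request once a pending counter exceeds the limit. The names sketch_cb, pending_incoming, SKETCH_FEVS_MAX_PENDING, SKETCH_OK, SKETCH_ERR_TRY_AGAIN and sketch_search_next are all hypothetical; only the limit of 16 and the "> 16" comparison are taken from the trace.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_FEVS_MAX_PENDING 16 /* limit quoted in the trace above */

enum sketch_rc { SKETCH_OK = 0, SKETCH_ERR_TRY_AGAIN = 1 };

/* Hypothetical control block; the field name and type are assumptions. */
struct sketch_cb {
    unsigned int pending_incoming; /* pending incoming fevs messages */
};

/* Reject the sync iteration request while too many messages are pending
 * (strictly more than the limit, matching "> 16" in the trace). */
enum sketch_rc sketch_search_next(const struct sketch_cb *cb)
{
    assert(cb != NULL);
    if (cb->pending_incoming > SKETCH_FEVS_MAX_PENDING)
        return SKETCH_ERR_TRY_AGAIN;
    return SKETCH_OK;
}
```

With a check of this shape, a caller that keeps retrying while the backlog never drains below the limit would see TRY_AGAIN indefinitely, which is consistent with the stalled sync described here.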

I have managed to work around this bug temporarily with the following patch:

+++ b/osaf/libs/common/immsv/include/immsv_api.h         Sat Sep 06 08:38:16 2014 +0000
@@ -70,7 +70,7 @@

 /*Max # of outstanding fevs messages towards director.*/
 /*Note max-max is 255. cb->fevs_replies_pending is an uint8_t*/
-#define IMMSV_DEFAULT_FEVS_MAX_PENDING 16
+#define IMMSV_DEFAULT_FEVS_MAX_PENDING 255

 #define IMMSV_MAX_OBJECTS 10000
 #define IMMSV_MAX_ATTRIBUTES 128
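
The comment in the patch notes that 255 is the hard ceiling because cb->fevs_replies_pending is a uint8_t. A small sketch of why that holds: an 8-bit counter wraps to 0 when incremented past 255, so a larger limit could never be reached. The names SKETCH_MAX_PENDING, sketch_inc_pending and sketch_limit_fits are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_PENDING 255 /* value used in the workaround patch */

/* Compile-time check (C11) that the limit is representable in uint8_t. */
static_assert(SKETCH_MAX_PENDING <= UINT8_MAX, "limit must fit in uint8_t");

/* Hypothetical helper: increment an 8-bit pending counter the way plain
 * unsigned arithmetic does, i.e. modulo 256. */
uint8_t sketch_inc_pending(uint8_t pending)
{
    return (uint8_t)(pending + 1u);
}

/* Hypothetical helper: returns 1 if a limit is representable in uint8_t. */
int sketch_limit_fits(unsigned int limit)
{
    return limit <= UINT8_MAX;
}
```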
------------------------------------------------------------------------

Sent from sourceforge.net because [email protected] is subscribed to https://sourceforge.net/p/opensaf/tickets/ <https://sourceforge.net/p/opensaf/tickets>

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.



_______________________________________________
Opensaf-tickets mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
