I went back to MAX_SYNC_TRANSFER_SIZE (30 * 1024 * 1024) and made the MDS changes you suggested.
This works great, guys, even up at 40k and 50k sections. Thanks!

Alex

On 01/09/2014 04:43 AM, A V Mahesh wrote:
> Hi Alex,
>
> Use the patch below as a workaround so you can proceed with your testing.
> This patch just increases the MDS internal fragmentation value to
> roughly TIPC_MAX_USER_MSG_SIZE, defined in tipc.h.
>
> I will work with Hans on a final patch that considers both the
> TIPC and TCP transports, along with the testing involved, as part of
> ticket `#654 MDS improvements`
> (https://sourceforge.net/p/opensaf/tickets/654/).
>
> I tested this patch with 10K sections; checkpoint memory used was
> 10136000 on the TIPC transport.
>
> ==================================================================================
>
> diff --git a/osaf/libs/core/mds/include/mds_dt.h b/osaf/libs/core/mds/include/mds_dt.h
> --- a/osaf/libs/core/mds/include/mds_dt.h
> +++ b/osaf/libs/core/mds/include/mds_dt.h
> @@ -32,6 +32,7 @@
>  #include "ncs_main_papi.h"
>  #include "ncssysf_mem.h"
>  #include "ncspatricia.h"
> +#include <linux/tipc.h>
>
>
>  /* This file is private to the MDTM layer. */
> @@ -109,7 +110,7 @@ typedef struct mdtm_reassembly_queue {
>
>  #define MDTM_MAX_DIRECT_BUFF_SIZE MDTM_MAX_SEGMENT_SIZE
>
> -#define MDTM_NORMAL_MSG_FRAG_SIZE 1400
> +#define MDTM_NORMAL_MSG_FRAG_SIZE (TIPC_MAX_USER_MSG_SIZE-1000) /* TIPC_MAX_USER_MSG_SIZE = 66000 define <linux/tipc.h> */
>
>  #define MDTM_RECV_BUFFER_SIZE ((MDS_DIRECT_BUF_MAXSIZE>MDTM_NORMAL_MSG_FRAG_SIZE)? \
>  (MDS_DIRECT_BUF_MAXSIZE+SUM_MDS_HDR_PLUS_MDTM_HDR_PLUS_LEN):(MDTM_NORMAL_MSG_FRAG_SIZE+SUM_MDS_HDR_PLUS_MDTM_HDR_PLUS_LEN))
>
> ==================================================================================
>
> -AVM
>
>
> On 1/8/2014 10:42 PM, Alex Jones wrote:
>> Hi Hans,
>>
>> Changing rmem_default and rmem_max has no effect on the problem.
>> I even tried up to 2M to no avail.
>>
>> However, after looking at the cpnd_transfer_replica function in
>> cpnd_evt.c, I found the following in cpsv_evt.h, which controls how
>> large the packets sent through MDS are:
>>
>> #define MAX_SYNC_TRANSFER_SIZE (30 * 1024 * 1024)
>>
>> 30M? What is the rationale for this number? This seems way too
>> high. When I change it to (4*1024*1024) (4M) it solves my problem,
>> and doesn't appear to affect performance.
>>
>> Alex
>>
>> On 01/08/2014 08:30 AM, Hans Feldt wrote:
>>> sysctl -a | grep rmem
>>>
>>> set rmem_default to 256K or so
>>>
>>> /Hans
>>>
>>>> -----Original Message-----
>>>> From: Hans Feldt [mailto:hans.fe...@ericsson.com]
>>>> Sent: 8 January 2014 14:01
>>>> To: A V Mahesh; Alex Jones
>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>> Subject: Re: [devel] checkpoint problems
>>>>
>>>> The socket receive buffer size used is the system default. It can
>>>> be too small, pump it up.
>>>> I plan to do some changes in MDS for this (and other stuff).
>>>> /Hans
>>>>
>>>>> -----Original Message-----
>>>>> From: A V Mahesh [mailto:mahesh.va...@oracle.com]
>>>>> Sent: 8 January 2014 11:29
>>>>> To: Alex Jones
>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>> Subject: Re: [devel] checkpoint problems
>>>>>
>>>>> Hi Alex,
>>>>>
>>>>> I suggest you increase the following TIPC value (in the tipc code),
>>>>> rebuild `tipc.ko`, and try again:
>>>>>
>>>>> net/tipc/tipc_socket.c:#define OVERLOAD_LIMIT_BASE 5000
>>>>>
>>>>> You can increase it to 50000 and try again.
>>>>>
>>>>> - AVM.
>>>>>
>>>>> On 1/8/2014 4:16 AM, Alex Jones wrote:
>>>>>> After doing some deep debugging I am seeing the following in the MDS
>>>>>> log on node B. This is when the CPND_EVT_ND2ND_CKPT_ACTIVE_SYNC is
>>>>>> sent from the active replica on node A to the replica on node B.
>>>>>> The sync message never gets up to the CPND layer on node B because
>>>>>> it is dropped.
>>>>>>
>>>>>> This is with 10k sections, each section 1k.
>>>>>>
>>>>>> Jan 7 21:32:32.772347 <1789648919> ERR |MDTM: Frag recd is not next frag so dropping adest=<0x010010023922604c>
>>>>>> Jan 7 21:32:32.772399 <1789648919> ERR |MDTM: Message is dropped as msg is out of seq TRANSPOR-ID=<0x010010023922604c>
>>>>>>
>>>>>> I've turned on MDS debug on node B, and the packet being sent over
>>>>>> is gigantic. It starts failing at fragment number 2703. The next
>>>>>> fragment that comes in is 2707, then 2722. The last fragment that
>>>>>> comes in is 7444.
>>>>>>
>>>>>> I've done a cursory look at the hardware stats, and nothing is being
>>>>>> rate-limited or dropped.
>>>>>>
>>>>>> I'm going to take a deeper look at this, but I'm mentioning it in
>>>>>> case it rings any bells. I am using TIPC as the transport.
>>>>>>
>>>>>> Alex
>>>>>>
>>>>>> On 01/07/2014 07:24 AM, Alex Jones wrote:
>>>>>>> AVM,
>>>>>>>
>>>>>>> I get SA_AIS_ERR_TIMEOUT even when I pass SA_TIME_END as the
>>>>>>> timeout value. Is this not a bug? The synchronous CheckpointOpen
>>>>>>> call doesn't work at all in this scenario. It never succeeds.
>>>>>>>
>>>>>>> I can reproduce the problem with
>>>>>>> sectionCreationAttributes.expirationTime set to SA_TIME_ONE_DAY.
>>>>>>>
>>>>>>> You should be able to reproduce the problem with the code I sent
>>>>>>> in the last e-mail.
>>>>>>>
>>>>>>> Alex
>>>>>>>
>>>>>>> On 01/06/2014 10:31 PM, A V Mahesh wrote:
>>>>>>>> Hi Alex,
>>>>>>>>
>>>>>>>> The CheckpointOpen call failing with SA_AIS_ERR_TIMEOUT is NOT a
>>>>>>>> bug; it is expected if you pass too small a timeout value
>>>>>>>> (`timeout = 1000000000`) to the
>>>>>>>> saCkptCheckpointOpen(....,timeout ...) call when the ckpt has very
>>>>>>>> large data/sections. Just increasing the timeout will avoid the
>>>>>>>> SA_AIS_ERR_TIMEOUT.
>>>>>>>>
>>>>>>>> Let us focus on your original issue/scenario: are you able to
>>>>>>>> reproduce the problem with
>>>>>>>> sectionCreationAttributes.expirationTime
>>>>>>>> set to SA_TIME_ONE_DAY?
>>>>>>>>
>>>>>>>> -AVM
>>>>>>>>
>>>>>>>> On 1/7/2014 1:17 AM, Alex Jones wrote:
>>>>>>>>> AVM,
>>>>>>>>>
>>>>>>>>> I've been playing around with your test program, and have
>>>>>>>>> gotten it to fail.
>>>>>>>>>
>>>>>>>>> I made the following changes:
>>>>>>>>>
>>>>>>>>> 1. Change init_dataX to be 1024k bytes, so that you are
>>>>>>>>>    initializing the section to be 1024k.
>>>>>>>>> 2. Also, don't start the program on node B until A has finished
>>>>>>>>>    writing/creating all the sections.
>>>>>>>>> 3. Before hitting the enter key on node B, wait for the
>>>>>>>>>    OpenAsync call to finish.
>>>>>>>>>
>>>>>>>>> You might notice the CheckpointOpen call failing now with
>>>>>>>>> SA_AIS_ERR_TIMEOUT. I had to turn this into OpenAsync, and add a
>>>>>>>>> thread to process CkptDispatch messages. This uncovers another
>>>>>>>>> bug in OpenAsync. I've attached the mods to your program here.
>>>>>>>>>
>>>>>>>>> The OpenAsync callback will be called twice, both times with
>>>>>>>>> error == SA_AIS_ERR_TIMEOUT. If I call OpenAsync again when I get
>>>>>>>>> this error, the next callback returns success, but the callback
>>>>>>>>> gets called twice with success and with two different checkpoint
>>>>>>>>> handles!
>>>>>>>>>
>>>>>>>>> Alex
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 01/06/2014 06:18 AM, A V Mahesh wrote:
>>>>>>>>>> Hi Alex,
>>>>>>>>>>
>>>>>>>>>> I have created 10K sections (please find the attached test
>>>>>>>>>> applications `Alex_test_node_A_app.c` &
>>>>>>>>>> `Alex_test_node_B_app.c`) with your specified scenario &
>>>>>>>>>> configuration, and I haven't observed any issue with sections
>>>>>>>>>> on another node.
>>>>>>>>>>
>>>>>>>>>> Try to reproduce the problem on your setup & let me know the
>>>>>>>>>> result.
>>>>>>>>>>
>>>>>>>>>> One more important point: what value did you configure for
>>>>>>>>>> `sectionCreationAttributes.expirationTime`?
>>>>>>>>>> I configured SA_TIME_ONE_DAY.
>>>>>>>>>>
>>>>>>>>>> Steps to run the application:
>>>>>>>>>>
>>>>>>>>>> ======================================================================================================
>>>>>>>>>> Compile :
>>>>>>>>>>
>>>>>>>>>> NODE-A# gcc Alex_test_node_A_app.c -o checkpoint_A -lSaCkpt
>>>>>>>>>> NODE-A# gcc Alex_test_node_B_app.c -o checkpoint_B -lSaCkpt
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Run :
>>>>>>>>>>
>>>>>>>>>> 1) saCkptCheckpointOpen() on node A
>>>>>>>>>>
>>>>>>>>>> NODE-A# ./checkpoint_A
>>>>>>>>>>
>>>>>>>>>> CPSV:CPA:ONsaCkptSectionCreate Waiting to Create Sections
>>>>>>>>>> safCkpt=test_checkpoint_name1,safApp=safCkptService....
>>>>>>>>>> saCkptSectionCreate Press <Enter> key to continue...
>>>>>>>>>>
>>>>>>>>>> 2) saCkptCheckpointOpen() of the same ckpt on node B
>>>>>>>>>>
>>>>>>>>>> NODE-B# ./checkpoint_B
>>>>>>>>>>
>>>>>>>>>> CPSV:CPA:ONsaCkptSectionIterationInitialize Waiting to read Sections
>>>>>>>>>> safCkpt=test_checkpoint_name1,safApp=safCkptService....
>>>>>>>>>> saCkptActiveReplicaSet saCkptSectionIterationInitialize Press <Enter> key to continue...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 3) saCkptSectionCreate() on node A and read saCkptCheckpointStatusGet()
>>>>>>>>>>
>>>>>>>>>> NODE-A#
>>>>>>>>>> checkpointStatus.numberOfSections : 10000
>>>>>>>>>> checkpointStatus.memoryUsed :756000
>>>>>>>>>> checkpointCreationAttributes.creationFlags;10
>>>>>>>>>> checkpointCreationAttributes.checkpointSize;10240000
>>>>>>>>>> checkpointCreationAttributes.retentionDuration;60000000000
>>>>>>>>>> checkpointCreationAttributes.maxSections;10000
>>>>>>>>>> checkpointCreationAttributes.maxSectionSize;1024
>>>>>>>>>> checkpointCreationAttributes.maxSectionIdSize;64
>>>>>>>>>> ================================
>>>>>>>>>> saCkptCheckpointUnlink / saCkptCheckpointClose / saCkptFinalize Press <Enter> key to continue...
>>>>>>>>>> saCkptCheckpoint Press <Enter> key to continue...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 4) saCkptActiveReplicaSet() on node B and saCkptCheckpointStatusGet()
>>>>>>>>>>
>>>>>>>>>> NODE-B#
>>>>>>>>>> checkpointStatus.numberOfSections : 10000
>>>>>>>>>> checkpointStatus.memoryUsed :756000
>>>>>>>>>> checkpointCreationAttributes.creationFlags;10
>>>>>>>>>> checkpointCreationAttributes.checkpointSize;10240000
>>>>>>>>>> checkpointCreationAttributes.retentionDuration;60000000000
>>>>>>>>>> checkpointCreationAttributes.maxSections;10000
>>>>>>>>>> checkpointCreationAttributes.maxSectionSize;1024
>>>>>>>>>> checkpointCreationAttributes.maxSectionIdSize;64
>>>>>>>>>>
>>>>>>>>>> saCkptCheckpointUnlink / saCkptCheckpointClose / saCkptFinalize Press <Enter> key to continue...
>>>>>>>>>> saCkptCheckpoint Press <Enter> key to continue..
>>>>>>>>>>
>>>>>>>>>> ======================================================================================================
>>>>>>>>>> -AVM
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 1/6/2014 12:32 PM, A V Mahesh wrote:
>>>>>>>>>>> Hi Alex,
>>>>>>>>>>>
>>>>>>>>>>> We have never tested with 7500 sections; we will test and let
>>>>>>>>>>> you know. Can you please share your test application? That
>>>>>>>>>>> would allow us to respond quickly.
>>>>>>>>>>>
>>>>>>>>>>> -AVM
>>>>>>>>>>>
>>>>>>>>>>> On 1/3/2014 8:23 PM, Alex Jones wrote:
>>>>>>>>>>>> Hello All,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm experimenting with the checkpoint service, and some things
>>>>>>>>>>>> don't appear to work.
>>>>>>>>>>>>
>>>>>>>>>>>> The saCkptActiveReplicaSet and
>>>>>>>>>>>> saCkptCheckpointSynchronize[Async] don't appear to work when
>>>>>>>>>>>> the checkpoint has section numbers greater than around 5500.
>>>>>>>>>>>>
>>>>>>>>>>>> I've created a checkpoint with 7500 sections, each section being
>>>>>>>>>>>> 1024 bytes.
>>>>>>>>>>>> The checkpoint is co-located and the "active replica"
>>>>>>>>>>>> bit is set.
>>>>>>>>>>>>
>>>>>>>>>>>> I can create and write all the sections. And from another node
>>>>>>>>>>>> I run saCkptCheckpointStatusGet, and the information all looks
>>>>>>>>>>>> good. Everything is there. I see no errors from any CKPT API
>>>>>>>>>>>> calls.
>>>>>>>>>>>>
>>>>>>>>>>>> The problem comes when I call saCkptActiveReplicaSet from this
>>>>>>>>>>>> other node. After I do this, saCkptCheckpointStatusGet now
>>>>>>>>>>>> returns all the same information except the number of sections
>>>>>>>>>>>> is no longer 7500 but 0. If I do this test with 50,000 sections
>>>>>>>>>>>> only about 3,000 entries get synced. And iterating through the
>>>>>>>>>>>> sections shows that there are only 3,000 sections.
>>>>>>>>>>>>
>>>>>>>>>>>> Calling saCkptCheckpointSynchronize[Async] in this situation
>>>>>>>>>>>> has no effect, either.
>>>>>>>>>>>>
>>>>>>>>>>>> After looking through the code I see a comment in
>>>>>>>>>>>> cpnd_evt_proc_ckpt_arep_set that says "/* ###TBD sync up is
>>>>>>>>>>>> missing with old active if now this fellow is becoming
>>>>>>>>>>>> active. */" So, it doesn't appear that syncing is being done in
>>>>>>>>>>>> the saCkptActiveReplicaSet, which it should be.
>>>>>>>>>>>>
>>>>>>>>>>>> Can someone comment?
>>>>>>>>>>>>
>>>>>>>>>>>> I'm going to fix this and post a patch unless someone else is
>>>>>>>>>>>> already working on it, but I didn't see a bug for it.
>>>>>>>>>>>>
>>>>>>>>>>>> Alex
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Opensaf-devel mailing list
>>>>>>>>>>>> Opensaf-devel@lists.sourceforge.net
>>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel