AVM,
I found another performance issue in the checkpoint subsystem.
I was still not able to replicate 40k sections from each of 5 active blades
to the 6th (standby) blade, which ends up with 200k sections in 5 checkpoints.
The writes on the active blades were fine, but the standby couldn't keep
up. On the standby the CPU was pegged and all ckpt API functions were
returning SA_AIS_ERR_TIMEOUT, including ActiveReplicaSet and
CheckpointClose!
So I ran oprofile on the standby and found that 90% of
the time was being spent in cpnd_ckpt_sec_get_create(). That
function shows that the section database is implemented as
a linked list, so every section lookup is a linear scan. This is horribly
expensive with large numbers of sections, especially when these sections
are being replicated at high rates.
As a proof of concept, I reimplemented the section database using
the C++ STL map. This improved performance tenfold.
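
To illustrate the difference, here is a minimal sketch of the two lookup
strategies. It is not the actual patch: Section, ListSectionDb and
MapSectionDb are hypothetical names, and the real section structures in
cpnd carry more state than this. With the list, each get-or-create walks
the existing sections, so creating n sections costs on the order of n^2
comparisons in total; with std::map each operation is O(log n).

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a checkpoint section; not the OpenSAF struct.
struct Section {
    std::string id;
    std::string data;
};

// List-backed database: every lookup is a linear scan over all sections.
class ListSectionDb {
public:
    Section& get_or_create(const std::string& id) {
        for (Section& s : sections_) {
            if (s.id == id) return s;  // found after walking the list
        }
        sections_.push_back(Section{id, ""});
        return sections_.back();
    }
    std::size_t size() const { return sections_.size(); }

private:
    std::list<Section> sections_;
};

// Map-backed database: lookup and insert are O(log n) each.
class MapSectionDb {
public:
    Section& get_or_create(const std::string& id) {
        auto it = sections_.find(id);
        if (it == sections_.end()) {
            // emplace returns {iterator, inserted}; keep the iterator.
            it = sections_.emplace(id, Section{id, ""}).first;
        }
        return it->second;
    }
    std::size_t size() const { return sections_.size(); }

private:
    std::map<std::string, Section> sections_;
};
```

Both classes expose the same get_or_create() call, so the caller does not
change; only the cost per call does.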
With the following changes I can now easily replicate 40k sections
each from 5 blades to the standby (200k sections being replicated
simultaneously on the standby blade).
1. Make the section create message asynchronous when
SA_CKPT_WR_ACTIVE_REPLICA is specified.
2. Change the section database data structure from linked list to STL map.
3. Change MAX_SYNC_TRANSFER_SIZE in cpsv_evt.h from 30M to 3M.
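
For change 1, the decision could be sketched roughly as below. This is
only an illustration of the rule in the spec text quoted further down
(synchronous for SA_CKPT_WR_ALL_REPLICAS, asynchronous otherwise), not the
code in cpnd_evt_proc_ckpt_sect_create. The flag constants and the
section_create_mode() helper are hypothetical; the real flag values come
from the SA Forum saCkpt.h header.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative flag values only; use the definitions from saCkpt.h in real code.
enum CreationFlags : std::uint32_t {
    WR_ALL_REPLICAS        = 0x1,
    WR_ACTIVE_REPLICA      = 0x2,
    WR_ACTIVE_REPLICA_WEAK = 0x4,
};

enum class ReplicationMode { Synchronous, Asynchronous };

// Hypothetical helper: choose how a section create is propagated to the
// non-active replicas, based on the checkpoint's creation flags.
ReplicationMode section_create_mode(std::uint32_t creation_flags) {
    if (creation_flags & WR_ALL_REPLICAS) {
        // All replicas must hold the section before the call returns.
        return ReplicationMode::Synchronous;
    }
    // Only the active replica must be updated before returning; the
    // other replicas can be updated in the background.
    return ReplicationMode::Asynchronous;
}
```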
I'll reimplement the section database patch using the internal
patricia tree code and post the patch, unless you feel there's a better
way to do it.
Alex
On 01/14/2014 05:13 PM, Alex Jones wrote:
> 3.7.1 saCkptSectionCreate()
> "If the checkpoint was created with the SA_CKPT_WR_ALL_REPLICAS
> property, the section is created in all of the checkpoint replicas
> when the invocation returns; otherwise, the section has been created
> at least in the active checkpoint replica when the invocation returns
> and will be created asynchronously in the other checkpoint replicas."
>
> It looks like, for section creates, the implementation behaves as if
> the checkpoint was created with SA_CKPT_WR_ALL_REPLICAS, even if
> SA_CKPT_WR_ACTIVE_REPLICA or SA_CKPT_WR_ACTIVE_REPLICA_WEAK is
> specified.
>
> I've been digging into the code, and it looks like
> "cpnd_evt_proc_ckpt_sect_create" sends a synchronous message to each
> replica to create the section regardless of the property.
>
> I realize this is spec compliant, but performance could be greatly
> enhanced for large numbers of sections when using the
> SA_CKPT_WR_ACTIVE_REPLICA or SA_CKPT_WR_ACTIVE_REPLICA_WEAK property
> by making this asynchronous.
>
> Any chance we can change this behaviour, and make it like
> saCkptCheckpointWrite?
>
> Alex
>
> On 01/14/2014 03:16 PM, Alex Jones wrote:
>> AVM,
>>
>> In my 5+1 setup, when I have the standby node open all the
>> checkpoints and read from them, as well as open the hot-standby
>> callback, the section creates done on the other active nodes can take
>> a very long time. (For 40k sections, it can sometimes take over 2
>> minutes).
>>
>> Once the sections have been created, however, subsequent writes
>> and overwrites are very fast. (Writing 1k data into 40k sections
>> takes 10 seconds).
>>
>> But, if I don't open the checkpoints on the standby, the section
>> creates on the active nodes are fast (about 22 seconds for 40k
>> sections), and the write and overwrite performance is basically
>> unchanged.
>>
>> This suggests that there is some kind of synchronous mechanism
>> going on between replicas when creating sections.
>>
>> Can you explain why I am seeing this performance degradation when
>> creating sections when a standby replica is opened, but there is no
>> performance hit for writing and overwriting?
>>
>> Thanks!
>>
>> Alex
>>
>
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel