Hi Alex,

On 10/1/2015 10:34 PM, Alex Jones wrote:
> When a collocated checkpoint replica is opened, and the active replica has
>     large numbers of sections (~200k),

Can you please share the size of each section?

-AVM


On 10/5/2015 9:31 AM, A V Mahesh wrote:
> Hi Alex,
>
> If you have a ready-to-use test application, can you please attach it?
>
> -AVM
>
> On 10/1/2015 10:34 PM, Alex Jones wrote:
>> Summary: CKPT: fix crash in cpnd when opening replica times out [#1510]
>> Review request for Trac Ticket(s): 1510
>> Peer Reviewer(s): AVM
>> Pull request to: AVM
>> Affected branch(es): default, 4.7, 4.6, 4.5
>> Development branch: <<IF ANY GIVE THE REPO URL>>
>>
>> --------------------------------
>> Impacted area       Impact y/n
>> --------------------------------
>>   Docs                    n
>>   Build system            n
>>   RPM/packaging           n
>>   Configuration files     n
>>   Startup scripts         n
>>   SAF services            y
>>   OpenSAF services        n
>>   Core libraries          n
>>   Samples                 n
>>   Tests                   n
>>   Other                   n
>>
>>
>> Comments (indicate scope for each "y" above):
>> ---------------------------------------------
>>   <<EXPLAIN/COMMENT THE PATCH SERIES HERE>>
>>
>> changeset 923566e6c96312c15330b4e8ed0c81a80a2701f0
>> Author:    Alex Jones <[email protected]>
>> Date:    Thu, 01 Oct 2015 12:56:53 -0400
>>
>>     ckptnd: fix crash when checkpoint open sync to active times out [#1510]
>>
>>     ckptnd core dumps with many different stack traces
>>
>>     When a collocated checkpoint replica is opened, and the active replica
>>     has large numbers of sections (~200k), the sync from the active to the
>>     replica can time out. If the MDS sync succeeds, but the error code in
>>     the out_evt is not SA_AIS_OK, the current code jumps to the
>>     ckpt_shm_node_free_error label. The code under this label assumes that
>>     the node was not successfully created in the database, so it doesn't
>>     remove it. But in this case it was created. The node memory is freed,
>>     but the node is not removed from the database. The next time this
>>     checkpoint is accessed, cpnd will access freed memory and crash.
>>
>>     Set a flag after the node has been added to the database. And in the
>>     ckpt_node_free_error label, remove the node from the database if it was
>>     added.
>>
>>
>> Complete diffstat:
>> ------------------
>>   osaf/services/saf/cpsv/cpnd/cpnd_evt.c |  10 ++++++++++
>>   1 files changed, 10 insertions(+), 0 deletions(-)
>>
>>
>> Testing Commands:
>> -----------------
>> 1) create a collocated checkpoint with 200k sections, and continue updating
>>     the sections
>> 2) open the same checkpoint on another node (this creates a replica)
>>
>>
>> Testing, Expected Results:
>> --------------------------
>> 1) cpnd on the replica node should not crash, and sync should succeed
>>
>>
>> Conditions of Submission:
>> -------------------------
>>   <<HOW MANY DAYS BEFORE PUSHING, CONSENSUS ETC>>
>>
>>
>> Arch      Built     Started    Linux distro
>> -------------------------------------------
>> mips        n          n
>> mips64      n          n
>> x86         n          n
>> x86_64      y          y
>> powerpc     n          n
>> powerpc64   n          n
>>
>>
>> Reviewer Checklist:
>> -------------------
>> [Submitters: make sure that your review doesn't trigger any checkmarks!]
>>
>>
>> Your checkin has not passed review because (see checked entries):
>>
>> ___ Your RR template is generally incomplete; it has too many blank entries
>>      that need proper data filled in.
>>
>> ___ You have failed to nominate the proper persons for review and push.
>>
>> ___ Your patches do not have proper short+long header
>>
>> ___ You have grammar/spelling in your header that is unacceptable.
>>
>> ___ You have exceeded a sensible line length in your headers/comments/text.
>>
>> ___ You have failed to put in a proper Trac Ticket # into your commits.
>>
>> ___ You have incorrectly put/left internal data in your comments/files
>>      (i.e. internal bug tracking tool IDs, product names etc)
>>
>> ___ You have not given any evidence of testing beyond basic build tests.
>>      Demonstrate some level of runtime or other sanity testing.
>>
>> ___ You have ^M present in some of your files. These have to be removed.
>>
>> ___ You have needlessly changed whitespace or added whitespace crimes
>>      like trailing spaces, or spaces before tabs.
>>
>> ___ You have mixed real technical changes with whitespace and other
>>      cosmetic code cleanup changes. These have to be separate commits.
>>
>> ___ You need to refactor your submission into logical chunks; there is
>>      too much content into a single commit.
>>
>> ___ You have extraneous garbage in your review (merge commits etc)
>>
>> ___ You have giant attachments which should never have been sent;
>>      Instead you should place your content in a public tree to be pulled.
>>
>> ___ You have too many commits attached to an e-mail; resend as threaded
>>      commits, or place in a public tree for a pull.
>>
>> ___ You have resent this content multiple times without a clear indication
>>      of what has changed between each re-send.
>>
>> ___ You have failed to adequately and individually address all of the
>>      comments and change requests that were proposed in the initial review.
>>
>> ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)
>>
>> ___ Your computer has a badly configured date and time, confusing the
>>      threaded patch review.
>>
>> ___ Your changes affect IPC mechanism, and you don't present any results
>>      for in-service upgradability test.
>>
>> ___ Your changes affect user manual and documentation, your patch series
>>      do not contain the patch that updates the Doxygen manual.
>>
>
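
For reference, the fix described in the changeset comment quoted above (set a
flag once the node is in the database, and remove the node in the error path
if the flag is set) can be sketched roughly as follows. This is only a minimal
illustration, not the actual cpnd_evt.c code: the types, the fixed-size
"database", and the names open_replica, db_add, db_remove, db_find and
added_to_db are all hypothetical stand-ins, and the sync_ok parameter merely
simulates the error code returned in out_evt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the cpnd checkpoint node and its database. */
#define DB_SLOTS 8

typedef struct ckpt_node {
    uint64_t ckpt_id;
} ckpt_node_t;

typedef struct ckpt_db {
    ckpt_node_t *slots[DB_SLOTS];
} ckpt_db_t;

/* Store the node in the first free slot; returns false if the db is full. */
bool db_add(ckpt_db_t *db, ckpt_node_t *node)
{
    assert(db != NULL && node != NULL);
    for (size_t i = 0; i < DB_SLOTS; i++) {
        if (db->slots[i] == NULL) {
            db->slots[i] = node;
            return true;
        }
    }
    return false;
}

/* Drop the db's reference to the node; does not free the node itself. */
void db_remove(ckpt_db_t *db, const ckpt_node_t *node)
{
    for (size_t i = 0; i < DB_SLOTS; i++) {
        if (db->slots[i] == node)
            db->slots[i] = NULL;
    }
}

/* Look a node up by checkpoint id; returns NULL if it is not in the db. */
ckpt_node_t *db_find(const ckpt_db_t *db, uint64_t ckpt_id)
{
    for (size_t i = 0; i < DB_SLOTS; i++) {
        if (db->slots[i] != NULL && db->slots[i]->ckpt_id == ckpt_id)
            return db->slots[i];
    }
    return NULL;
}

/*
 * Open a replica. sync_ok simulates whether the sync from the active
 * replica reported success. Returns 0 on success, -1 on failure.
 */
int open_replica(ckpt_db_t *db, uint64_t ckpt_id, bool sync_ok)
{
    bool added_to_db = false;
    ckpt_node_t *node = calloc(1, sizeof(*node));

    if (node == NULL)
        return -1;
    node->ckpt_id = ckpt_id;

    if (!db_add(db, node))
        goto node_free_error;
    added_to_db = true; /* from here on, the db holds a pointer to node */

    if (!sync_ok)
        goto node_free_error; /* sync failed after the node was added */

    return 0;

node_free_error:
    /* Without this removal the db would keep a dangling pointer. */
    if (added_to_db)
        db_remove(db, node);
    free(node);
    return -1;
}
```

With this shape, a failed open leaves no entry behind, so a later lookup of
the same checkpoint id returns NULL instead of a pointer to freed memory.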


