Hi Hoang,

Can you please provide a snapshot of "df -h /dev/shm" taken when the issue
occurs?

Also, can you please provide the corresponding core dump file and the
"osafckptnd" process file (the process file is not required if this image was
built for SLES x86_64)?
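
If it is hard to run df at the exact moment of the failure, the same numbers
can be captured from inside a process with statvfs(). The following is only a
minimal sketch: print_shm_usage() is a hypothetical helper name, not an
existing OpenSAF function, and a daemon would probably use syslog() rather
than printf().

```c
#include <stdio.h>
#include <sys/statvfs.h>

/* Hypothetical helper: print total, free and available bytes for the
 * filesystem that holds `path`. Returns 0 on success, -1 if statvfs() fails. */
int print_shm_usage(const char *path)
{
    struct statvfs st;

    if (statvfs(path, &st) != 0) {
        perror("statvfs");
        return -1;
    }

    /* Block counts are expressed in units of f_frsize bytes. */
    unsigned long long frsize = st.f_frsize;
    printf("%s: total=%llu free=%llu avail=%llu bytes\n", path,
           (unsigned long long)st.f_blocks * frsize,
           (unsigned long long)st.f_bfree * frsize,
           (unsigned long long)st.f_bavail * frsize);
    return 0;
}
```

Calling print_shm_usage("/dev/shm") just before the write would give roughly
the same information as the df snapshot.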

Thanks,
Ramesh.

On 11/30/2016 12:10 PM, Vo Minh Hoang wrote:
> Dear Mahesh,
>
> I will do as you request.
> Please note that the test case takes a very long time to execute, so the
> result might not be available today.
>
> Thank you and best regards,
> Hoang
>
> -----Original Message-----
> From: A V Mahesh [mailto:mahesh.va...@oracle.com]
> Sent: Wednesday, November 30, 2016 1:35 PM
> To: Vo Minh Hoang <hoang.m...@dektech.com.au>
> Cc: opensaf-devel@lists.sourceforge.net; ramesh.bet...@oracle.com
> Subject: Re: [PATCH 0 of 3] Review Request for leap : now leap library
> ensure shm availability before writing [#2202]
>
> Hi Hoang,
>
> Please apply these changes, test, and provide the syslog from the time the
> issue occurred.
>
> ====================================================================================
>
> diff --git a/osaf/libs/core/leap/os_defs.c b/osaf/libs/core/leap/os_defs.c
> --- a/osaf/libs/core/leap/os_defs.c
> +++ b/osaf/libs/core/leap/os_defs.c
> @@ -865,11 +865,15 @@ uint32_t ncs_os_posix_shm(NCS_OS_POSIX_S
>                           }
>
>                           /* Checking whether sufficient shared memory is available to write the data, to be safer side ffree reduced to 1 block size */
> -                       if (req->info.write.i_write_size > (statsvfs.f_bfree * statsvfs.f_frsize)) {
> +                       if (req->info.write.i_write_size > ((statsvfs.f_bfree - 1) * statsvfs.f_frsize)) {
>                                   syslog(LOG_ERR, "Insufficient shared memory space (%ld) available to write the data of size: %ld \n",
>                                                   (statsvfs.f_bfree * statsvfs.f_frsize), req->info.write.i_write_size);
>                                   return NCSCC_RC_FAILURE;
> +                       } else {
> +                               syslog(LOG_ERR, "Sufficient shared memory space (%ld) available to write the data of size: %ld \n",
> +                                                (statsvfs.f_bfree * statsvfs.f_frsize), req->info.write.i_write_size);
>                           }
> +
>                   }
>                   memcpy((void *)((char *)req->info.write.i_addr + req->info.write.i_offset), req->info.write.i_from_buff,
>                                   req->info.write.i_write_size);
>
> ====================================================================================
>
> -AVM
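One thing that may be worth checking in the modified condition: POSIX defines
fsblkcnt_t (the type of f_bfree) as an unsigned integer type, so when
f_bfree is 0 the expression (f_bfree - 1) wraps around to a very large value
and the "insufficient space" branch is not taken. A minimal sketch of a check
that handles that case explicitly is below; shm_has_room() is a hypothetical
helper name used only for illustration, not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical helper: true if `write_size` bytes fit in the free blocks
 * reported by `st`, while keeping one block in reserve. */
bool shm_has_room(const struct statvfs *st, uint64_t write_size)
{
    /* f_bfree is unsigned, so (f_bfree - 1) would wrap when f_bfree == 0.
     * Treat "no free blocks" as "no room" before subtracting. */
    if (st->f_bfree == 0)
        return false;

    uint64_t usable = (uint64_t)(st->f_bfree - 1) * (uint64_t)st->f_frsize;
    return write_size <= usable;
}
```

With this shape, a completely full /dev/shm is reported as having no room
instead of passing the check and going on to memcpy().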
>
> On 11/30/2016 11:56 AM, A V Mahesh wrote:
>> Hi Hoang,
>>
>> Thanks for the test.
>>
>> It then looks like the issue is not related to SHM deficiency (100% used
>> by other applications). Can you please re-test with the changes below?
>> That will confirm whether it is completely unrelated to SHM free size.
>>
>> Replace:
>>
>> `if (req->info.write.i_write_size > (statsvfs.f_bfree *
>> statsvfs.f_frsize)) {`
>>
>> with the following, which reduces f_bfree by one block to be on the safe
>> side:
>>
>> `if (req->info.write.i_write_size > ((statsvfs.f_bfree - 1) *
>> statsvfs.f_frsize)) {`
>>
>>
>> -AVM
>>
>>
>> On 11/30/2016 11:37 AM, Vo Minh Hoang wrote:
>>> Dear Mahesh,
>>>
>>> Unfortunately, I have just received information that the same core
>>> dump still occurs after applying the patch.
>>>
>>> Here is the dump information in short; please tell me if I can do
>>> anything to support:
>>>
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>> #0  0x00007fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
>>> Missing separate debuginfos, use: zypper install opensaf-ckpt-nodedirector-debuginfo-5.1.0-9999.0.4997518.sle12.x86_64
>>> (gdb) where
>>> #0  0x00007fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
>>> #1  0x00007fe315c26082 in memcpy (__len=<optimized out>, __src=<optimized out>, __dest=<optimized out>)
>>>       at /usr/include/x86_64-linux-gnu/bits/string3.h:51
>>> #2  ncs_os_posix_shm (req=req@entry=0x7ffecb80adb0) at os_defs.c:874
>>> #3  0x0000000000415a80 in cpnd_sec_hdr_update (cb=cb@entry=0x9e57f0, sec_info=sec_info@entry=0xb8ff60,
>>>       cp_node=cp_node@entry=0xb8e8c0) at cpnd_proc.c:1880
>>> #4  0x0000000000406047 in cpnd_ckpt_sec_add (cb=cb@entry=0x9e57f0, cp_node=0xb8e8c0, id=0x7fe30c002390,
>>>       exp_time=1480480471343486000, gen_flag=gen_flag@entry=0) at cpnd_db.c:457
>>> #5  0x000000000040d17c in cpnd_evt_proc_ckpt_sect_create (cb=cb@entry=0x9e57f0,
>>>       evt=evt@entry=0x7fe30c01e1d0, sinfo=sinfo@entry=0x7fe30c01e828) at cpnd_evt.c:2267
>>> #6  0x000000000040eaf4 in cpnd_process_evt (evt=0x7fe30c01e1c0) at cpnd_evt.c:227
>>> #7  0x00000000004106cd in cpnd_main_process (cb=cb@entry=0x9e57f0) at cpnd_init.c:579
>>> #8  0x0000000000405383 in main (argc=<optimized out>, argv=<optimized out>) at cpnd_main.c:79
>>>
>>> Sincerely,
>>> Hoang
>>>
>>> -----Original Message-----
>>> From: mahesh.va...@oracle.com [mailto:mahesh.va...@oracle.com]
>>> Sent: Tuesday, November 29, 2016 5:37 PM
>>> To: hoang.m...@dektech.com.au; ramesh.bet...@oracle.com
>>> Cc: opensaf-devel@lists.sourceforge.net
>>> Subject: [PATCH 0 of 3] Review Request for leap : now leap library
>>> ensure shm availability before writing [#2202]
>>>
>>> Summary: leap : now leap library ensure shm availability before writing [#2202]
>>> Review request for Trac Ticket(s): #2202
>>> Peer Reviewer(s): Hoang / Ramesh
>>> Pull request to: <<LIST THE PERSON WITH PUSH ACCESS HERE>>
>>> Affected branch(es): <<LIST ALL AFFECTED BRANCH(ES)>>
>>> Development branch: <<IF ANY GIVE THE REPO URL>>
>>>
>>> --------------------------------
>>> Impacted area       Impact y/n
>>> --------------------------------
>>>    Docs                    n
>>>    Build system            n
>>>    RPM/packaging           n
>>>    Configuration files     n
>>>    Startup scripts         n
>>>    SAF services            n
>>>    OpenSAF services        y
>>>    Core libraries          y
>>>    Samples                 n
>>>    Tests                   n
>>>    Other                   n
>>>
>>>
>>> Comments (indicate scope for each "y" above):
>>> ---------------------------------------------
>>>
>>> changeset 7b53e1b3754622fe90c22c801adeb7df6d808c30
>>> Author:    A V Mahesh <mahesh.va...@oracle.com>
>>> Date:    Tue, 29 Nov 2016 15:59:21 +0530
>>>
>>>      leap : now leap library ensure shm availability before writing
>>> [#2202]
>>>    Issue    :
>>>
>>>      If OSAF_CKPT_SHM_ALLOC_GUARANTEE is NOT set and SHM is 100% used
>>> in system
>>>      , pnd Segmentation fault (core dumped) at LEAP memcpy().
>>>
>>> Fix :
>>>
>>>      Now LEAP library ensures shm free space before writing This may
>>> degrade
>>>      some performance of cpsv , if OSAF_CKPT_SHM_ALLOC_GUARANTEE is
>>> set, cpsv
>>>      give natural performance.
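For context on why a guaranteed allocation avoids the crash: one common way to
make sure the pages behind a mapping really exist is to reserve them with
posix_fallocate() before mmap(), so that a full filesystem shows up as an
error return (for example ENOSPC) instead of a fault inside memcpy(). The
sketch below only illustrates that general technique. That
OSAF_CKPT_SHM_ALLOC_GUARANTEE works this way is an assumption here, and
map_reserved() is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Hypothetical helper: reserve `size` bytes of backing store for `fd`, then
 * map it read/write and shared. On failure returns NULL and stores an errno
 * value in *err; on success stores 0 in *err. */
void *map_reserved(int fd, size_t size, int *err)
{
    /* posix_fallocate() returns the error number directly (it does not set
     * errno), so a full filesystem is reported here, before any write. */
    int rc = posix_fallocate(fd, 0, (off_t)size);
    if (rc != 0) {
        *err = rc;
        return NULL;
    }

    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        *err = errno;
        return NULL;
    }

    *err = 0;
    return addr;
}
```

The trade-off is the one the fix description mentions: checking or reserving
space up front costs something on every operation, while a one-time
reservation moves that cost to creation time.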
>>>
>>> changeset 083114e13c00c9c4267ffe65a86c1a97a951b876
>>> Author:    A V Mahesh <mahesh.va...@oracle.com>
>>> Date:    Tue, 29 Nov 2016 16:02:06 +0530
>>>
>>>      cpsv : update cpsv error handing based on leap changes [#2202]
>>>
>>> changeset fb509abb1d1583315f585663fd75bf73e35211a6
>>> Author:    A V Mahesh <mahesh.va...@oracle.com>
>>> Date:    Tue, 29 Nov 2016 16:02:58 +0530
>>>
>>>      mqsv : update mqsv error handing based on leap changes [#2202]
>>>
>>>
>>> Complete diffstat:
>>> ------------------
>>>    osaf/libs/common/cpsv/include/cpnd_cb.h   |   4 ++--
>>>    osaf/libs/common/cpsv/include/cpnd_init.h |   8 ++++----
>>>    osaf/libs/common/cpsv/include/cpnd_sec.h  |   2 +-
>>>    osaf/libs/core/include/ncs_osprm.h        |   2 +-
>>>    osaf/libs/core/leap/os_defs.c             |  20 ++++++++++++++++++--
>>>    osaf/services/saf/cpsv/cpnd/cpnd_db.c     |  12 ++++++------
>>>    osaf/services/saf/cpsv/cpnd/cpnd_evt.c    |  82 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------
>>>    osaf/services/saf/cpsv/cpnd/cpnd_proc.c   |  31 ++++++++++++++++++-------------
>>>    osaf/services/saf/cpsv/cpnd/cpnd_res.c    |  24 ++++++++----------------
>>>    osaf/services/saf/cpsv/cpnd/cpnd_sec.cc   |  12 ++++++------
>>>    osaf/services/saf/glsv/glnd/glnd_shm.c    |   2 +-
>>>    osaf/services/saf/mqsv/mqnd/mqnd_shm.c    |   2 +-
>>>    12 files changed, 123 insertions(+), 78 deletions(-)
>>>
>>>
>>> Testing Commands:
>>> -----------------
>>> Create situation that node SHM  reaches 100% usage and then perform
>>> any CPSV operation which writes to SHM
>>>
>>> Testing, Expected Results:
>>> --------------------------
>>>    <<PASTE COMMAND OUTPUTS / TEST RESULTS>>
>>>
>>>
>>> Conditions of Submission:
>>> -------------------------
>>>    <<HOW MANY DAYS BEFORE PUSHING, CONSENSUS ETC>>
>>>
>>>
>>> Arch      Built     Started    Linux distro
>>> -------------------------------------------
>>> mips        n          n
>>> mips64      n          n
>>> x86         n          n
>>> x86_64      y          y
>>> powerpc     n          n
>>> powerpc64   n          n
>>>
>>>
>>> Reviewer Checklist:
>>> -------------------
>>> [Submitters: make sure that your review doesn't trigger any
>>> checkmarks!]
>>>
>>>
>>> Your checkin has not passed review because (see checked entries):
>>>
>>> ___ Your RR template is generally incomplete; it has too many blank
>>> entries
>>>       that need proper data filled in.
>>>
>>> ___ You have failed to nominate the proper persons for review and push.
>>>
>>> ___ Your patches do not have proper short+long header
>>>
>>> ___ You have grammar/spelling in your header that is unacceptable.
>>>
>>> ___ You have exceeded a sensible line length in your
>>> headers/comments/text.
>>>
>>> ___ You have failed to put in a proper Trac Ticket # into your commits.
>>>
>>> ___ You have incorrectly put/left internal data in your comments/files
>>>       (i.e. internal bug tracking tool IDs, product names etc)
>>>
>>> ___ You have not given any evidence of testing beyond basic build tests.
>>>       Demonstrate some level of runtime or other sanity testing.
>>>
>>> ___ You have ^M present in some of your files. These have to be removed.
>>>
>>> ___ You have needlessly changed whitespace or added whitespace crimes
>>>       like trailing spaces, or spaces before tabs.
>>>
>>> ___ You have mixed real technical changes with whitespace and other
>>>       cosmetic code cleanup changes. These have to be separate commits.
>>>
>>> ___ You need to refactor your submission into logical chunks; there is
>>>       too much content into a single commit.
>>>
>>> ___ You have extraneous garbage in your review (merge commits etc)
>>>
>>> ___ You have giant attachments which should never have been sent;
>>>       Instead you should place your content in a public tree to be
>>> pulled.
>>>
>>> ___ You have too many commits attached to an e-mail; resend as threaded
>>>       commits, or place in a public tree for a pull.
>>>
>>> ___ You have resent this content multiple times without a clear
>>> indication
>>>       of what has changed between each re-send.
>>>
>>> ___ You have failed to adequately and individually address all of the
>>>       comments and change requests that were proposed in the initial
>>> review.
>>>
>>> ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)
>>>
>>> ___ Your computer has a badly configured date and time; confusing
>>>       the threaded patch review.
>>>
>>> ___ Your changes affect IPC mechanism, and you don't present any results
>>>       for in-service upgradability test.
>>>
>>> ___ Your changes affect user manual and documentation, your patch series
>>>       do not contain the patch that updates the Doxygen manual.
>>>
>>>
>

