Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]
Dear Mahesh,

ACK all three patches, tested, found no problem.

Sincerely,
Hoang

-----Original Message-----
From: mahesh.va...@oracle.com [mailto:mahesh.va...@oracle.com]
Sent: Tuesday, November 29, 2016 5:37 PM
To: hoang.m...@dektech.com.au; ramesh.bet...@oracle.com
Cc: opensaf-devel@lists.sourceforge.net
Subject: [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]

Summary: leap : now leap library ensure shm availability before writing [#2202]
Review request for Trac Ticket(s): #2202
Peer Reviewer(s): Hoang / Ramesh
Pull request to: <>
Affected branch(es): <>
Development branch: <>

Impacted area        Impact y/n
-------------------------------
 Docs                    n
 Build system            n
 RPM/packaging           n
 Configuration files     n
 Startup scripts         n
 SAF services            n
 OpenSAF services        y
 Core libraries          y
 Samples                 n
 Tests                   n
 Other                   n

Comments (indicate scope for each "y" above):
---------------------------------------------

changeset 7b53e1b3754622fe90c22c801adeb7df6d808c30
Author: A V Mahesh
Date:   Tue, 29 Nov 2016 15:59:21 +0530

	leap : now leap library ensure shm availability before writing [#2202]

	Issue :
	If OSAF_CKPT_SHM_ALLOC_GUARANTEE is NOT set and SHM is 100% used in system, pnd Segmentation fault (core dumped) at LEAP memcpy().

	Fix :
	Now LEAP library ensures shm free space before writing. This may degrade some performance of cpsv; if OSAF_CKPT_SHM_ALLOC_GUARANTEE is set, cpsv give natural performance.

changeset 083114e13c00c9c4267ffe65a86c1a97a951b876
Author: A V Mahesh
Date:   Tue, 29 Nov 2016 16:02:06 +0530

	cpsv : update cpsv error handing based on leap changes [#2202]

changeset fb509abb1d1583315f585663fd75bf73e35211a6
Author: A V Mahesh
Date:   Tue, 29 Nov 2016 16:02:58 +0530

	mqsv : update mqsv error handing based on leap changes [#2202]

Complete diffstat:
------------------
 osaf/libs/common/cpsv/include/cpnd_cb.h   |   4 ++--
 osaf/libs/common/cpsv/include/cpnd_init.h |   8
 osaf/libs/common/cpsv/include/cpnd_sec.h  |   2 +-
 osaf/libs/core/include/ncs_osprm.h        |   2 +-
 osaf/libs/core/leap/os_defs.c             |  20 ++--
 osaf/services/saf/cpsv/cpnd/cpnd_db.c     |  12 ++--
 osaf/services/saf/cpsv/cpnd/cpnd_evt.c    |  82 +-----
 osaf/services/saf/cpsv/cpnd/cpnd_proc.c   |  31 ++-
 osaf/services/saf/cpsv/cpnd/cpnd_res.c    |  24
 osaf/services/saf/cpsv/cpnd/cpnd_sec.cc   |  12 ++--
 osaf/services/saf/glsv/glnd/glnd_shm.c    |   2 +-
 osaf/services/saf/mqsv/mqnd/mqnd_shm.c    |   2 +-
 12 files changed, 123 insertions(+), 78 deletions(-)

Testing Commands:
-----------------
Create situation that node SHM reaches 100% usage and then perform any CPSV operation which writes to SHM

Testing, Expected Results:
--------------------------
<>

Conditions of Submission:
-------------------------
<>

Arch        Built   Started   Linux distro
------------------------------------------
mips          n        n
mips64        n        n
x86           n        n
x86_64        y        y
powerpc       n        n
powerpc64     n        n

Reviewer Checklist:
-------------------
[Submitters: make sure that your review doesn't trigger any checkmarks!]

Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
    that need proper data filled in.
___ You have failed to nominate the proper persons for review and push.
___ Your patches do not have proper short+long header
___ You have grammar/spelling in your header that is unacceptable.
___ You have exceeded a sensible line length in your headers/comments/text.
___ You have failed to put in a proper Trac Ticket # into your commits.
___ You have incorrectly put/left internal data in your comments/files
    (i.e. internal bug tracking tool IDs, product names etc)
___ You have not given any evidence of testing beyond basic build tests.
    Demonstrate some level of runtime or other sanity testing.
___ You have ^M present in some of your files. These have to be removed.
___ You have needlessly changed whitespace or added whitespace crimes
    like trailing spaces, or spaces before tabs.
___ You have mixed real technical changes with whitespace and other
    cosmetic code cleanup changes. These have to be separate commits.
___ You need to refactor your submission into logical chunks; there is
    too much content into a single commit.
___ You have extraneous garbage in your review (merge commits etc)
___ You have giant attachments which should never have been sent;
    Instead you should place your content in a pu
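Note for readers of this thread: changeset 7b53e1b37546 is described as making the LEAP library check shm free space before writing. A minimal sketch of that kind of check, built on POSIX `statvfs()`, could look like the block below. The helper name `shm_has_room()` and its signature are assumptions made for illustration; this is not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical helper (not the LEAP implementation): returns true when
 * the filesystem holding `path` reports at least `write_size` free bytes. */
bool shm_has_room(const char *path, size_t write_size)
{
	struct statvfs st;

	assert(path != NULL);
	if (statvfs(path, &st) != 0)
		return false;	/* cannot tell, so refuse the write */

	/* f_bfree counts free blocks in units of f_frsize bytes. */
	uint64_t free_bytes = (uint64_t)st.f_bfree * (uint64_t)st.f_frsize;
	return (uint64_t)write_size <= free_bytes;
}
```

A caller would return an error instead of calling memcpy() when the helper reports false. A check of this kind is inherently racy, because another process can consume the space between the check and the write, so it narrows the window rather than closing it.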
Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]
Hi Hoang,

We managed to simulate your test case. The detailed steps we used to reproduce it are below; this test generates the same core dump as yours, and the provided patch resolved the issue. Could you please test it yourself and share your observations? If your test is different, please let us know.

Attached: 035_cpsv_2202_V2_debug.patch & 036_cpsv_2207_debug.patch
Test application: cpsv_shm_2202.c

==
1) Stop OpenSAF:
   # /etc/init.d/opensafd stop
2) Change the default /dev/shm size to 3 MB by editing /etc/fstab and adding the following line:
   # vi /etc/fstab
   `tmpfs /dev/shm tmpfs defaults,size=3m 0 0`
3) Remount /dev/shm:
   # mount -o remount /dev/shm
4) Check that /dev/shm reflects the new size:
   # df -k /dev/shm/
   Filesystem 1K-blocks Used Available Use% Mounted on
   tmpfs           3072    0      3072   0% /dev/shm
5) Set the core file size limit to unlimited:
   # ulimit -c unlimited
6) Start OpenSAF:
   # /etc/init.d/opensafd start
7) Compile and run the attached test application (cpsv_shm_2202.c):
   # gcc cpsv_shm_2202.c -o ckpt_shm -lSaCkpt
   # ./ckpt_shm
8) Once /dev/shm reaches 100% use, you will see the same core dump as yours:
   # df -k /dev/shm/
9) We then applied the patches (`cpsv_2202_V2_debug.patch` & `cpsv_2207_debug.patch`) and ran the test again; there was no core dump:
   saCkptSectionCreate 1 returned 18. ( no core dump )
==

-AVM

On 11/30/2016 11:37 AM, Vo Minh Hoang wrote:

Dear Mahesh,

Unfortunately, I have just received information that the same core dump still occurs after applying the patch.

Here is the dump information in short; please tell me if I can do anything in support:

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x7fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6 Missing separate debuginfos, use: zypper install opensaf-ckpt-nodedirector-debuginfo-5.1.0-.0.4997518.sle12.x86_64 (gdb) where #0 0x7fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6 #1 0x7fe315c26082 in memcpy (__len=, __src=, __dest=) at /usr/include/x86_64-linux-gnu/bits/string3.h:51 #2 ncs_os_posix_shm (req=req@entry=0x7ffecb80adb0) at os_defs.c:874 #3 0x00415a80 in cpnd_sec_hdr_update (cb=cb@entry=0x9e57f0, sec_info=sec_info@entry=0xb8ff60, cp_node=cp_node@entry=0xb8e8c0) at cpnd_proc.c:1880 #4 0x00406047 in cpnd_ckpt_sec_add (cb=cb@entry=0x9e57f0, cp_node=0xb8e8c0, id=0x7fe30c002390, exp_time=1480480471343486000, gen_flag=gen_flag@entry=0) at cpnd_db.c:457 #5 0x0040d17c in cpnd_evt_proc_ckpt_sect_create (cb=cb@entry=0x9e57f0, evt=evt@entry=0x7fe30c01e1d0, sinfo=sinfo@entry=0x7fe30c01e828) at cpnd_evt.c:2267 #6 0x0040eaf4 in cpnd_process_evt (evt=0x7fe30c01e1c0) at cpnd_evt.c:227 #7 0x004106cd in cpnd_main_process (cb=cb@entry=0x9e57f0) at cpnd_init.c:579 #8 0x00405383 in main (argc=, argv=) at cpnd_main.c:79 Sincerely, Hoang -Original Message- From: mahesh.va...@oracle.com [mailto:mahesh.va...@oracle.com] Sent: Tuesday, November 29, 2016 5:37 PM To: hoang.m...@dektech.com.au; ramesh.bet...@oracle.com Cc: opensaf-devel@lists.sourceforge.net Subject: [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202] Summary:leap : now leap library ensure shm availability before writing [#2202] Review request for Trac Ticket(s): #2202 Peer Reviewer(s): Hoang / Ramesh Pull request to: <> Affected branch(es): <> Development branch: <> Impacted area Impact y/n Docsn Build systemn RPM/packaging n Configuration files n Startup scripts n SAF servicesn OpenSAF servicesy Core libraries y Samples n Tests n Other n Comments (indicate scope for each "y" above): - changeset 7b53e1b3754622fe90c22c801adeb7df6d808c30 Author: A V Mahesh Date: Tue, 29 Nov 
2016 15:59:21 +0530 leap : now leap library ensure shm availability before writing [#2202] Issue : If OSAF_CKPT_SHM_ALLOC_GUARANTEE is NOT set and SHM is 100% used in system , pnd Segmentation fault (core dumped) at LEAP memcpy(). Fix : Now LEAP library ensures shm free space before writing This may degrade some performance of cpsv , if OSAF_CKPT_SHM_ALLOC_GUARANTEE is set, cpsv give natural performance. changeset 083114e13c00c9c4267ffe65a86c1a97a951b876 Author: A V Mahesh Date: Tue, 29 Nov 2016 16:02:06 +0530 cpsv : update cpsv error handing based on leap changes [#2202] changeset fb509abb1d1583315f585663fd75bf73e35211a6 Author: A V Mahesh Date: Tue, 29 Nov 2016 16:02:58 +0530 mqsv : update mqsv error ha
Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]
Hi Hoang, Can you please provide the snapshot of "df -h /dev/shm" when the issue occurs. Also.. can you please provide the corresponding core dump file and "osafckptnd" process file (process file not required if this image is build for SLES x86_64) Thanks, Ramesh. On 11/30/2016 12:10 PM, Vo Minh Hoang wrote: > Dear Mahesh,o > > I will do following your request. > Please note that test case executing time is very long so result might not > be available in today. > > Thank you and best regards, > Hoang > > -Original Message- > From: A V Mahesh [mailto:mahesh.va...@oracle.com] > Sent: Wednesday, November 30, 2016 1:35 PM > To: Vo Minh Hoang > Cc: opensaf-devel@lists.sourceforge.net; ramesh.bet...@oracle.com > Subject: Re: [PATCH 0 of 3] Review Request for leap : now leap library > ensure shm availability before writing [#2202] > > Hi Hoang, > > Apply these change test and provide sys log at the time issue occurred. > > > > > diff --git a/osaf/libs/core/leap/os_defs.c b/osaf/libs/core/leap/os_defs.c > --- a/osaf/libs/core/leap/os_defs.c > +++ b/osaf/libs/core/leap/os_defs.c > @@ -865,11 +865,15 @@ uint32_t ncs_os_posix_shm(NCS_OS_POSIX_S > } > > /* Checking whether sufficient shared memory is > available to write the data, to be safer side ffree reduced to 1 block size > */ > - if (req->info.write.i_write_size > > (statsvfs.f_bfree * statsvfs.f_frsize)) { > + if (req->info.write.i_write_size > > ((statsvfs.f_bfree - 1) * statsvfs.f_frsize)) { > syslog(LOG_ERR, "Insufficient shared memory > space (%ld) available to write the data of size: %ld \n", > (statsvfs.f_bfree * > statsvfs.f_frsize), req->info.write.i_write_size); > return NCSCC_RC_FAILURE; > + } else { > + syslog(LOG_ERR, "Sufficient shared > memory space (%ld) available to write the data of size: %ld \n", > +(statsvfs.f_bfree * > statsvfs.f_frsize), req->info.write.i_write_size); > } > + > } > memcpy((void *)((char *)req->info.write.i_addr + > req->info.write.i_offset), req->info.write.i_from_buff, > 
req->info.write.i_write_size); > > > > > -AVM > > On 11/30/2016 11:56 AM, A V Mahesh wrote: >> Hi Hoang, >> >> Thansk for the test . >> >> Then it looks issue is not related to SHM deficiency ( 100 % used by >> other application ) can you please re-test with below changes and >> that will confirm us it is completely not related to SHM free size. >> >> replacing: >> >> `if (req->info.write.i_write_size > (statsvfs.f_bfree * >> statsvfs.f_frsize)) {` >> >> with below, to be safer side ffree reduced to 1 block size : >> >> `if (req->info.write.i_write_size > ((statsvfs.f_bfree - 1) * >> statsvfs.f_frsize)) { >> >> >> -AVM >> >> >> On 11/30/2016 11:37 AM, Vo Minh Hoang wrote: >>> Dear Mahesh, >>> >>> Unfortunately, I have just receive information that the same core >>> dump still occur after applying patch. >>> >>> Here is dump information in short, please tell me if I can do >>> anything in >>> support: >>> >>> Program terminated with signal SIGSEGV, Segmentation fault. >>> #0 0x7fe314aa0109 in __memcpy_sse2_unaligned () from >>> /lib64/libc.so.6 >>> Missing separate debuginfos, use: zypper install >>> opensaf-ckpt-nodedirector-debuginfo-5.1.0-.0.4997518.sle12.x86_64 >>> (gdb) where >>> #0 0x7fe314aa0109 in __memcpy_sse2_unaligned () from >>> /lib64/libc.so.6 >>> #1 0x7fe315c26082 in memcpy (__len=, >>> __src=>> out>, __dest=) >>> at /usr/include/x86_64-linux-gnu/bits/string3.h:51 >>> #2 ncs_os_posix_shm (req=req@entry=0x7ffecb80adb0) at os_defs.c:874 >>> #3 0x00415a80 in cpnd_sec_hdr_update (cb=cb@entry=0x9e57f0, >>> sec_info=sec_info@entry=0xb8ff60, >>> cp_node=cp_node@entry=0xb8e8c0) at cpnd_proc.c:1880 >>> #4 0x00406047 in cpnd_ckpt_sec_add (cb=cb@entry=0x9e57f0, >>> cp_node=0xb8e8c0, id=0x7fe30c002390, >>> exp_time=1480480471343486000, gen_flag=gen_flag@entry=0) at >>> cpnd_db.c:457 >>> #5 0x0040d17c in cpnd_evt_proc_ckpt_sect_create >>> (cb=cb@entry=0x9e57f0, >>> evt=evt@entry=0x7fe30c01e1d0, sinfo=sinfo@entry=0x7fe30c01e828) >>> at >>> cpnd_evt.c:2267 >>> #6 
0x0040eaf4 in cpnd_process_evt (evt=0x7fe30c01e1c0) at >>> cpnd_evt.c:227 >>> #7 0x004106cd in cpnd_main_process (cb=cb@entry=0x9e57f0) at >>> cpnd_init.c:579 >>> #8 0x00405383 in main (argc=, argv=>> out>) >>> at cpnd_main.c:79 >>> >>> Sincerely, >>> Hoang >>> >>> -Original Message- >>> From: mahesh.va
Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]
Hi Hoang,

We have no other option, because the steps to reproduce are not easy. Please find the additional patch `2202_addtional.patch`, to be applied on top of the published patch; this should confirm for us that the issue is not related to SHM free size.

-AVM

On 11/30/2016 12:10 PM, Vo Minh Hoang wrote:

Dear Mahesh,

I will do as you request. Please note that the test case takes a very long time to execute, so the result might not be available today.

Thank you and best regards,
Hoang

-----Original Message-----
From: A V Mahesh [mailto:mahesh.va...@oracle.com]
Sent: Wednesday, November 30, 2016 1:35 PM
To: Vo Minh Hoang
Cc: opensaf-devel@lists.sourceforge.net; ramesh.bet...@oracle.com
Subject: Re: [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]

Hi Hoang,

Please apply this change, run the test, and provide the syslog from the time the issue occurs.

diff --git a/osaf/libs/core/leap/os_defs.c b/osaf/libs/core/leap/os_defs.c
--- a/osaf/libs/core/leap/os_defs.c
+++ b/osaf/libs/core/leap/os_defs.c
@@ -865,11 +865,15 @@ uint32_t ncs_os_posix_shm(NCS_OS_POSIX_S
 			}
 
 			/* Checking whether sufficient shared memory is available to write the data, to be safer side ffree reduced to 1 block size */
-			if (req->info.write.i_write_size > (statsvfs.f_bfree * statsvfs.f_frsize)) {
+			if (req->info.write.i_write_size > ((statsvfs.f_bfree - 1) * statsvfs.f_frsize)) {
 				syslog(LOG_ERR, "Insufficient shared memory space (%ld) available to write the data of size: %ld \n",
 						(statsvfs.f_bfree * statsvfs.f_frsize), req->info.write.i_write_size);
 				return NCSCC_RC_FAILURE;
+			} else {
+				syslog(LOG_ERR, "Sufficient shared memory space (%ld) available to write the data of size: %ld \n",
+						(statsvfs.f_bfree * statsvfs.f_frsize), req->info.write.i_write_size);
 			}
+
 		}
 		memcpy((void *)((char *)req->info.write.i_addr + req->info.write.i_offset), req->info.write.i_from_buff,
 		       req->info.write.i_write_size);

-AVM

On 11/30/2016 11:56 AM, A V Mahesh wrote:

Hi Hoang,

Thanks for the test.
Then it looks issue is not related to SHM deficiency ( 100 % used by other application ) can you please re-test with below changes and that will confirm us it is completely not related to SHM free size. replacing: `if (req->info.write.i_write_size > (statsvfs.f_bfree * statsvfs.f_frsize)) {` with below, to be safer side ffree reduced to 1 block size : `if (req->info.write.i_write_size > ((statsvfs.f_bfree - 1) * statsvfs.f_frsize)) { -AVM On 11/30/2016 11:37 AM, Vo Minh Hoang wrote: Dear Mahesh, Unfortunately, I have just receive information that the same core dump still occur after applying patch. Here is dump information in short, please tell me if I can do anything in support: Program terminated with signal SIGSEGV, Segmentation fault. #0 0x7fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6 Missing separate debuginfos, use: zypper install opensaf-ckpt-nodedirector-debuginfo-5.1.0-.0.4997518.sle12.x86_64 (gdb) where #0 0x7fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6 #1 0x7fe315c26082 in memcpy (__len=, __src=, __dest=) at /usr/include/x86_64-linux-gnu/bits/string3.h:51 #2 ncs_os_posix_shm (req=req@entry=0x7ffecb80adb0) at os_defs.c:874 #3 0x00415a80 in cpnd_sec_hdr_update (cb=cb@entry=0x9e57f0, sec_info=sec_info@entry=0xb8ff60, cp_node=cp_node@entry=0xb8e8c0) at cpnd_proc.c:1880 #4 0x00406047 in cpnd_ckpt_sec_add (cb=cb@entry=0x9e57f0, cp_node=0xb8e8c0, id=0x7fe30c002390, exp_time=1480480471343486000, gen_flag=gen_flag@entry=0) at cpnd_db.c:457 #5 0x0040d17c in cpnd_evt_proc_ckpt_sect_create (cb=cb@entry=0x9e57f0, evt=evt@entry=0x7fe30c01e1d0, sinfo=sinfo@entry=0x7fe30c01e828) at cpnd_evt.c:2267 #6 0x0040eaf4 in cpnd_process_evt (evt=0x7fe30c01e1c0) at cpnd_evt.c:227 #7 0x004106cd in cpnd_main_process (cb=cb@entry=0x9e57f0) at cpnd_init.c:579 #8 0x00405383 in main (argc=, argv=) at cpnd_main.c:79 Sincerely, Hoang -Original Message- From: mahesh.va...@oracle.com [mailto:mahesh.va...@oracle.com] Sent: Tuesday, November 29, 2016 5:37 
PM To: hoang.m...@dektech.com.au; ramesh.bet...@oracle.com Cc: opensaf-devel@lists.sourceforge.net Subject: [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202] Summary:leap : now leap library ensure shm availability before writing [#2202] Rev
Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]
Dear Mahesh, I will do following your request. Please note that test case executing time is very long so result might not be available in today. Thank you and best regards, Hoang -Original Message- From: A V Mahesh [mailto:mahesh.va...@oracle.com] Sent: Wednesday, November 30, 2016 1:35 PM To: Vo Minh Hoang Cc: opensaf-devel@lists.sourceforge.net; ramesh.bet...@oracle.com Subject: Re: [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202] Hi Hoang, Apply these change test and provide sys log at the time issue occurred. diff --git a/osaf/libs/core/leap/os_defs.c b/osaf/libs/core/leap/os_defs.c --- a/osaf/libs/core/leap/os_defs.c +++ b/osaf/libs/core/leap/os_defs.c @@ -865,11 +865,15 @@ uint32_t ncs_os_posix_shm(NCS_OS_POSIX_S } /* Checking whether sufficient shared memory is available to write the data, to be safer side ffree reduced to 1 block size */ - if (req->info.write.i_write_size > (statsvfs.f_bfree * statsvfs.f_frsize)) { + if (req->info.write.i_write_size > ((statsvfs.f_bfree - 1) * statsvfs.f_frsize)) { syslog(LOG_ERR, "Insufficient shared memory space (%ld) available to write the data of size: %ld \n", (statsvfs.f_bfree * statsvfs.f_frsize), req->info.write.i_write_size); return NCSCC_RC_FAILURE; + } else { + syslog(LOG_ERR, "Sufficient shared memory space (%ld) available to write the data of size: %ld \n", +(statsvfs.f_bfree * statsvfs.f_frsize), req->info.write.i_write_size); } + } memcpy((void *)((char *)req->info.write.i_addr + req->info.write.i_offset), req->info.write.i_from_buff, req->info.write.i_write_size); -AVM On 11/30/2016 11:56 AM, A V Mahesh wrote: > Hi Hoang, > > Thansk for the test . > > Then it looks issue is not related to SHM deficiency ( 100 % used by > other application ) can you please re-test with below changes and > that will confirm us it is completely not related to SHM free size. 
> > replacing: > > `if (req->info.write.i_write_size > (statsvfs.f_bfree * > statsvfs.f_frsize)) {` > > with below, to be safer side ffree reduced to 1 block size : > > `if (req->info.write.i_write_size > ((statsvfs.f_bfree - 1) * > statsvfs.f_frsize)) { > > > -AVM > > > On 11/30/2016 11:37 AM, Vo Minh Hoang wrote: >> Dear Mahesh, >> >> Unfortunately, I have just receive information that the same core >> dump still occur after applying patch. >> >> Here is dump information in short, please tell me if I can do >> anything in >> support: >> >> Program terminated with signal SIGSEGV, Segmentation fault. >> #0 0x7fe314aa0109 in __memcpy_sse2_unaligned () from >> /lib64/libc.so.6 >> Missing separate debuginfos, use: zypper install >> opensaf-ckpt-nodedirector-debuginfo-5.1.0-.0.4997518.sle12.x86_64 >> (gdb) where >> #0 0x7fe314aa0109 in __memcpy_sse2_unaligned () from >> /lib64/libc.so.6 >> #1 0x7fe315c26082 in memcpy (__len=, >> __src=> out>, __dest=) >> at /usr/include/x86_64-linux-gnu/bits/string3.h:51 >> #2 ncs_os_posix_shm (req=req@entry=0x7ffecb80adb0) at os_defs.c:874 >> #3 0x00415a80 in cpnd_sec_hdr_update (cb=cb@entry=0x9e57f0, >> sec_info=sec_info@entry=0xb8ff60, >> cp_node=cp_node@entry=0xb8e8c0) at cpnd_proc.c:1880 >> #4 0x00406047 in cpnd_ckpt_sec_add (cb=cb@entry=0x9e57f0, >> cp_node=0xb8e8c0, id=0x7fe30c002390, >> exp_time=1480480471343486000, gen_flag=gen_flag@entry=0) at >> cpnd_db.c:457 >> #5 0x0040d17c in cpnd_evt_proc_ckpt_sect_create >> (cb=cb@entry=0x9e57f0, >> evt=evt@entry=0x7fe30c01e1d0, sinfo=sinfo@entry=0x7fe30c01e828) >> at >> cpnd_evt.c:2267 >> #6 0x0040eaf4 in cpnd_process_evt (evt=0x7fe30c01e1c0) at >> cpnd_evt.c:227 >> #7 0x004106cd in cpnd_main_process (cb=cb@entry=0x9e57f0) at >> cpnd_init.c:579 >> #8 0x00405383 in main (argc=, argv=> out>) >> at cpnd_main.c:79 >> >> Sincerely, >> Hoang >> >> -Original Message- >> From: mahesh.va...@oracle.com [mailto:mahesh.va...@oracle.com] >> Sent: Tuesday, November 29, 2016 5:37 PM >> To: 
hoang.m...@dektech.com.au; ramesh.bet...@oracle.com >> Cc: opensaf-devel@lists.sourceforge.net >> Subject: [PATCH 0 of 3] Review Request for leap : now leap library >> ensure shm availability before writing [#2202] >> >> Summary:leap : now leap library ensure shm availability before >> writing [#2202] Review request for Trac Ticket(s): #2202 Peer Reviewer(s): >> Hoang / >> Ramesh Pull request t
Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]
Hi Hoang,

Please apply this change, run the test, and provide the syslog from the time the issue occurs.

diff --git a/osaf/libs/core/leap/os_defs.c b/osaf/libs/core/leap/os_defs.c
--- a/osaf/libs/core/leap/os_defs.c
+++ b/osaf/libs/core/leap/os_defs.c
@@ -865,11 +865,15 @@ uint32_t ncs_os_posix_shm(NCS_OS_POSIX_S
 			}
 
 			/* Checking whether sufficient shared memory is available to write the data, to be safer side ffree reduced to 1 block size */
-			if (req->info.write.i_write_size > (statsvfs.f_bfree * statsvfs.f_frsize)) {
+			if (req->info.write.i_write_size > ((statsvfs.f_bfree - 1) * statsvfs.f_frsize)) {
 				syslog(LOG_ERR, "Insufficient shared memory space (%ld) available to write the data of size: %ld \n",
 						(statsvfs.f_bfree * statsvfs.f_frsize), req->info.write.i_write_size);
 				return NCSCC_RC_FAILURE;
+			} else {
+				syslog(LOG_ERR, "Sufficient shared memory space (%ld) available to write the data of size: %ld \n",
+						(statsvfs.f_bfree * statsvfs.f_frsize), req->info.write.i_write_size);
 			}
+
 		}
 		memcpy((void *)((char *)req->info.write.i_addr + req->info.write.i_offset), req->info.write.i_from_buff,
 		       req->info.write.i_write_size);

-AVM

On 11/30/2016 11:56 AM, A V Mahesh wrote:
> Hi Hoang,
>
> Thanks for the test.
>
> It then looks like the issue is not related to an SHM deficiency (100% used
> by another application). Could you please re-test with the change below?
> That will confirm for us that it is completely unrelated to SHM free size.
>
> Replace:
>
> `if (req->info.write.i_write_size > (statsvfs.f_bfree * statsvfs.f_frsize)) {`
>
> with the line below, which reduces the free block count by one block to be
> on the safe side:
>
> `if (req->info.write.i_write_size > ((statsvfs.f_bfree - 1) * statsvfs.f_frsize)) {`
>
>
> -AVM
>
>
> On 11/30/2016 11:37 AM, Vo Minh Hoang wrote:
>> Dear Mahesh,
>>
>> Unfortunately, I have just received information that the same core
>> dump still occurs after applying the patch.
>> >> Here is dump information in short, please tell me if I can do >> anything in >> support: >> >> Program terminated with signal SIGSEGV, Segmentation fault. >> #0 0x7fe314aa0109 in __memcpy_sse2_unaligned () from >> /lib64/libc.so.6 >> Missing separate debuginfos, use: zypper install >> opensaf-ckpt-nodedirector-debuginfo-5.1.0-.0.4997518.sle12.x86_64 >> (gdb) where >> #0 0x7fe314aa0109 in __memcpy_sse2_unaligned () from >> /lib64/libc.so.6 >> #1 0x7fe315c26082 in memcpy (__len=, >> __src=> out>, __dest=) >> at /usr/include/x86_64-linux-gnu/bits/string3.h:51 >> #2 ncs_os_posix_shm (req=req@entry=0x7ffecb80adb0) at os_defs.c:874 >> #3 0x00415a80 in cpnd_sec_hdr_update (cb=cb@entry=0x9e57f0, >> sec_info=sec_info@entry=0xb8ff60, >> cp_node=cp_node@entry=0xb8e8c0) at cpnd_proc.c:1880 >> #4 0x00406047 in cpnd_ckpt_sec_add (cb=cb@entry=0x9e57f0, >> cp_node=0xb8e8c0, id=0x7fe30c002390, >> exp_time=1480480471343486000, gen_flag=gen_flag@entry=0) at >> cpnd_db.c:457 >> #5 0x0040d17c in cpnd_evt_proc_ckpt_sect_create >> (cb=cb@entry=0x9e57f0, >> evt=evt@entry=0x7fe30c01e1d0, sinfo=sinfo@entry=0x7fe30c01e828) at >> cpnd_evt.c:2267 >> #6 0x0040eaf4 in cpnd_process_evt (evt=0x7fe30c01e1c0) at >> cpnd_evt.c:227 >> #7 0x004106cd in cpnd_main_process (cb=cb@entry=0x9e57f0) at >> cpnd_init.c:579 >> #8 0x00405383 in main (argc=, argv=> out>) >> at cpnd_main.c:79 >> >> Sincerely, >> Hoang >> >> -Original Message- >> From: mahesh.va...@oracle.com [mailto:mahesh.va...@oracle.com] >> Sent: Tuesday, November 29, 2016 5:37 PM >> To: hoang.m...@dektech.com.au; ramesh.bet...@oracle.com >> Cc: opensaf-devel@lists.sourceforge.net >> Subject: [PATCH 0 of 3] Review Request for leap : now leap library >> ensure >> shm availability before writing [#2202] >> >> Summary:leap : now leap library ensure shm availability before writing >> [#2202] Review request for Trac Ticket(s): #2202 Peer Reviewer(s): >> Hoang / >> Ramesh Pull request to: <> >> Affected >> branch(es): <> Development branch: <> 
ANY >> GIVE THE REPO URL>> >> >> >> Impacted area Impact y/n >> >> Docsn >> Build systemn >> RPM/packaging n >> Configuration files n >> Startup scripts n >> SAF servicesn >> OpenSAF servicesy >> Core libraries y >> Samples n >> Tests
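Note for readers of this thread: one point that may be worth checking in the debug change quoted above is the expression `(statsvfs.f_bfree - 1)`. POSIX defines `fsblkcnt_t` as an unsigned integer type, so when `f_bfree` is 0 (the 100% full case this ticket is about) the subtraction wraps to a very large value and the "insufficient space" branch would not be taken. The sketch below shows a guarded version of the arithmetic; `usable_bytes()` is a hypothetical helper written for illustration, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes usable for a write when `margin` blocks are
 * held back. Returns 0 instead of wrapping when free_blocks <= margin. */
uint64_t usable_bytes(uint64_t free_blocks, uint64_t frsize, uint64_t margin)
{
	assert(frsize > 0);
	if (free_blocks <= margin)
		return 0;

	uint64_t blocks = free_blocks - margin;
	if (blocks > UINT64_MAX / frsize)
		return UINT64_MAX;	/* saturate rather than overflow */
	return blocks * frsize;
}
```

With a helper like this, the comparison would read `write_size > usable_bytes(f_bfree, f_frsize, 1)`, and a full filesystem yields 0 usable bytes, so any non-empty write is rejected.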
Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]
Hi Hoang,

Thanks for the test.

It then looks like the issue is not related to an SHM deficiency (100% used by another application). Could you please re-test with the change below? That will confirm for us that it is completely unrelated to SHM free size.

Replace:

`if (req->info.write.i_write_size > (statsvfs.f_bfree * statsvfs.f_frsize)) {`

with the line below, which reduces the free block count by one block to be on the safe side:

`if (req->info.write.i_write_size > ((statsvfs.f_bfree - 1) * statsvfs.f_frsize)) {`

-AVM

On 11/30/2016 11:37 AM, Vo Minh Hoang wrote:
> Dear Mahesh,
>
> Unfortunately, I have just received information that the same core dump still
> occurs after applying the patch.
>
> Here is the dump information in short; please tell me if I can do anything in
> support:
>
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x7fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
> Missing separate debuginfos, use: zypper install
> opensaf-ckpt-nodedirector-debuginfo-5.1.0-.0.4997518.sle12.x86_64
> (gdb) where
> #0  0x7fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
> #1  0x7fe315c26082 in memcpy (__len=<optimized out>, __src=<optimized out>, __dest=<optimized out>)
>     at /usr/include/x86_64-linux-gnu/bits/string3.h:51
> #2  ncs_os_posix_shm (req=req@entry=0x7ffecb80adb0) at os_defs.c:874
> #3  0x00415a80 in cpnd_sec_hdr_update (cb=cb@entry=0x9e57f0,
>     sec_info=sec_info@entry=0xb8ff60,
>     cp_node=cp_node@entry=0xb8e8c0) at cpnd_proc.c:1880
> #4  0x00406047 in cpnd_ckpt_sec_add (cb=cb@entry=0x9e57f0,
>     cp_node=0xb8e8c0, id=0x7fe30c002390,
>     exp_time=1480480471343486000, gen_flag=gen_flag@entry=0) at cpnd_db.c:457
> #5  0x0040d17c in cpnd_evt_proc_ckpt_sect_create (cb=cb@entry=0x9e57f0,
>     evt=evt@entry=0x7fe30c01e1d0, sinfo=sinfo@entry=0x7fe30c01e828) at cpnd_evt.c:2267
> #6  0x0040eaf4 in cpnd_process_evt (evt=0x7fe30c01e1c0) at cpnd_evt.c:227
> #7  0x004106cd in cpnd_main_process (cb=cb@entry=0x9e57f0) at cpnd_init.c:579
> #8  0x00405383 in main (argc=<optimized out>, argv=<optimized out>)
>     at cpnd_main.c:79
>
> Sincerely,
> Hoang
>
>
-Original Message- > From: mahesh.va...@oracle.com [mailto:mahesh.va...@oracle.com] > Sent: Tuesday, November 29, 2016 5:37 PM > To: hoang.m...@dektech.com.au; ramesh.bet...@oracle.com > Cc: opensaf-devel@lists.sourceforge.net > Subject: [PATCH 0 of 3] Review Request for leap : now leap library ensure > shm availability before writing [#2202] > > Summary:leap : now leap library ensure shm availability before writing > [#2202] Review request for Trac Ticket(s): #2202 Peer Reviewer(s): Hoang / > Ramesh Pull request to: <> Affected > branch(es): <> Development branch: < GIVE THE REPO URL>> > > > Impacted area Impact y/n > > Docsn > Build systemn > RPM/packaging n > Configuration files n > Startup scripts n > SAF servicesn > OpenSAF servicesy > Core libraries y > Samples n > Tests n > Other n > > > Comments (indicate scope for each "y" above): > - > > changeset 7b53e1b3754622fe90c22c801adeb7df6d808c30 > Author: A V Mahesh > Date: Tue, 29 Nov 2016 15:59:21 +0530 > > leap : now leap library ensure shm availability before writing > [#2202] > Issue : > > If OSAF_CKPT_SHM_ALLOC_GUARANTEE is NOT set and SHM is 100% used in > system > , pnd Segmentation fault (core dumped) at LEAP memcpy(). > > Fix : > > Now LEAP library ensures shm free space before writing This may > degrade > some performance of cpsv , if OSAF_CKPT_SHM_ALLOC_GUARANTEE is set, > cpsv > give natural performance. 
> > changeset 083114e13c00c9c4267ffe65a86c1a97a951b876 > Author: A V Mahesh > Date: Tue, 29 Nov 2016 16:02:06 +0530 > > cpsv : update cpsv error handing based on leap changes [#2202] > > changeset fb509abb1d1583315f585663fd75bf73e35211a6 > Author: A V Mahesh > Date: Tue, 29 Nov 2016 16:02:58 +0530 > > mqsv : update mqsv error handing based on leap changes [#2202] > > > Complete diffstat: > -- > osaf/libs/common/cpsv/include/cpnd_cb.h | 4 ++-- > osaf/libs/common/cpsv/include/cpnd_init.h | 8 > osaf/libs/common/cpsv/include/cpnd_sec.h | 2 +- > osaf/libs/core/include/ncs_osprm.h| 2 +- > osaf/libs/core/leap/os_defs.c | 20 ++-- > osaf/services/saf/cpsv/cpnd/cpnd_db.c | 12 ++-- > osaf/services/saf/cpsv/cpnd/cpnd_evt.c| 82 > +--- > -- > osaf/services/saf/cpsv/cpnd/cpnd_proc.c | 31 > ++- > osaf/services/saf/cpsv/cpnd/cpnd_res.c| 24 > osaf/services/saf/cpsv/cpnd/cpnd_sec.cc | 12 ++---
Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]
Dear Mahesh,

Unfortunately, I have just received information that the same core dump still occurs after applying the patch.

Here is the dump information in short; please tell me if I can do anything in support:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
Missing separate debuginfos, use: zypper install opensaf-ckpt-nodedirector-debuginfo-5.1.0-.0.4997518.sle12.x86_64
(gdb) where
#0  0x7fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
#1  0x7fe315c26082 in memcpy (__len=<optimized out>, __src=<optimized out>, __dest=<optimized out>)
    at /usr/include/x86_64-linux-gnu/bits/string3.h:51
#2  ncs_os_posix_shm (req=req@entry=0x7ffecb80adb0) at os_defs.c:874
#3  0x00415a80 in cpnd_sec_hdr_update (cb=cb@entry=0x9e57f0, sec_info=sec_info@entry=0xb8ff60,
    cp_node=cp_node@entry=0xb8e8c0) at cpnd_proc.c:1880
#4  0x00406047 in cpnd_ckpt_sec_add (cb=cb@entry=0x9e57f0, cp_node=0xb8e8c0, id=0x7fe30c002390,
    exp_time=1480480471343486000, gen_flag=gen_flag@entry=0) at cpnd_db.c:457
#5  0x0040d17c in cpnd_evt_proc_ckpt_sect_create (cb=cb@entry=0x9e57f0,
    evt=evt@entry=0x7fe30c01e1d0, sinfo=sinfo@entry=0x7fe30c01e828) at cpnd_evt.c:2267
#6  0x0040eaf4 in cpnd_process_evt (evt=0x7fe30c01e1c0) at cpnd_evt.c:227
#7  0x004106cd in cpnd_main_process (cb=cb@entry=0x9e57f0) at cpnd_init.c:579
#8  0x00405383 in main (argc=<optimized out>, argv=<optimized out>) at cpnd_main.c:79

Sincerely,
Hoang

-----Original Message-----
From: mahesh.va...@oracle.com [mailto:mahesh.va...@oracle.com]
Sent: Tuesday, November 29, 2016 5:37 PM
To: hoang.m...@dektech.com.au; ramesh.bet...@oracle.com
Cc: opensaf-devel@lists.sourceforge.net
Subject: [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]

Summary: leap : now leap library ensure shm availability before writing [#2202]
Review request for Trac Ticket(s): #2202
Peer Reviewer(s): Hoang / Ramesh
Pull request to: <>
Affected branch(es): <>
Development branch: <>

Impacted area Impact y/n Docsn Build systemn
RPM/packaging n Configuration files n Startup scripts n SAF servicesn OpenSAF servicesy Core libraries y Samples n Tests n Other n Comments (indicate scope for each "y" above): - changeset 7b53e1b3754622fe90c22c801adeb7df6d808c30 Author: A V Mahesh Date: Tue, 29 Nov 2016 15:59:21 +0530 leap : now leap library ensure shm availability before writing [#2202] Issue : If OSAF_CKPT_SHM_ALLOC_GUARANTEE is NOT set and SHM is 100% used in system , pnd Segmentation fault (core dumped) at LEAP memcpy(). Fix : Now LEAP library ensures shm free space before writing This may degrade some performance of cpsv , if OSAF_CKPT_SHM_ALLOC_GUARANTEE is set, cpsv give natural performance. changeset 083114e13c00c9c4267ffe65a86c1a97a951b876 Author: A V Mahesh Date: Tue, 29 Nov 2016 16:02:06 +0530 cpsv : update cpsv error handing based on leap changes [#2202] changeset fb509abb1d1583315f585663fd75bf73e35211a6 Author: A V Mahesh Date: Tue, 29 Nov 2016 16:02:58 +0530 mqsv : update mqsv error handing based on leap changes [#2202] Complete diffstat: -- osaf/libs/common/cpsv/include/cpnd_cb.h | 4 ++-- osaf/libs/common/cpsv/include/cpnd_init.h | 8 osaf/libs/common/cpsv/include/cpnd_sec.h | 2 +- osaf/libs/core/include/ncs_osprm.h| 2 +- osaf/libs/core/leap/os_defs.c | 20 ++-- osaf/services/saf/cpsv/cpnd/cpnd_db.c | 12 ++-- osaf/services/saf/cpsv/cpnd/cpnd_evt.c| 82 +--- -- osaf/services/saf/cpsv/cpnd/cpnd_proc.c | 31 ++- osaf/services/saf/cpsv/cpnd/cpnd_res.c| 24 osaf/services/saf/cpsv/cpnd/cpnd_sec.cc | 12 ++-- osaf/services/saf/glsv/glnd/glnd_shm.c| 2 +- osaf/services/saf/mqsv/mqnd/mqnd_shm.c| 2 +- 12 files changed, 123 insertions(+), 78 deletions(-) Testing Commands: - Create situation that node SHM reaches 100% usage and then perform any CPSV operation which writes to SHM Testing, Expected Results: -- <> Conditions of Submission: - <> Arch Built StartedLinux distro --- mipsn n mips64 n n x86 n n x86_64 y y powerpc n n powerpc64 n n Reviewer Checklist: --- [Submitters: make sure that your review 
doesn't trigger any checkmarks!] Your checkin has not pa