Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]

2016-11-30 Thread Vo Minh Hoang
Dear Mahesh,

ACK all three patches, tested, found no problem.

Sincerely,
Hoang

-----Original Message-----
From: mahesh.va...@oracle.com [mailto:mahesh.va...@oracle.com] 
Sent: Tuesday, November 29, 2016 5:37 PM
To: hoang.m...@dektech.com.au; ramesh.bet...@oracle.com
Cc: opensaf-devel@lists.sourceforge.net
Subject: [PATCH 0 of 3] Review Request for leap : now leap library ensure
shm availability before writing [#2202]

Summary: leap : now leap library ensure shm availability before writing [#2202]
Review request for Trac Ticket(s): #2202
Peer Reviewer(s): Hoang / Ramesh
Pull request to: <>
Affected branch(es): <>
Development branch: <>


Impacted area           Impact y/n

 Docs                    n
 Build system            n
 RPM/packaging           n
 Configuration files     n
 Startup scripts         n
 SAF services            n
 OpenSAF services        y
 Core libraries          y
 Samples                 n
 Tests                   n
 Other                   n


Comments (indicate scope for each "y" above):
-

changeset 7b53e1b3754622fe90c22c801adeb7df6d808c30
Author: A V Mahesh 
Date:   Tue, 29 Nov 2016 15:59:21 +0530

leap : now leap library ensure shm availability before writing [#2202]

Issue :

If OSAF_CKPT_SHM_ALLOC_GUARANTEE is NOT set and SHM is 100% used in system,
pnd Segmentation fault (core dumped) at LEAP memcpy().

Fix :

Now LEAP library ensures shm free space before writing. This may degrade
some performance of cpsv; if OSAF_CKPT_SHM_ALLOC_GUARANTEE is set, cpsv
give natural performance.

changeset 083114e13c00c9c4267ffe65a86c1a97a951b876
Author: A V Mahesh 
Date:   Tue, 29 Nov 2016 16:02:06 +0530

cpsv : update cpsv error handing based on leap changes [#2202]

changeset fb509abb1d1583315f585663fd75bf73e35211a6
Author: A V Mahesh 
Date:   Tue, 29 Nov 2016 16:02:58 +0530

mqsv : update mqsv error handing based on leap changes [#2202]
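
For context on the OSAF_CKPT_SHM_ALLOC_GUARANTEE trade-off described in the first changeset: one general way to make sure a shared-memory segment has backing pages before anything is copied into it is to reserve them when the segment is created, for example with posix_fallocate(). The sketch below only illustrates that idea. It is not taken from the patches, and it is an assumption that a guarantee mode would work this way. The helper name create_reserved_mapping is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not an OpenSAF or LEAP API): create a new file of
 * `size` bytes at `path`, reserve its blocks up front, and map it shared.
 * Returns the mapped address, or NULL with errno set on failure. */
void *create_reserved_mapping(const char *path, size_t size)
{
    assert(path != NULL && size > 0);

    /* O_EXCL: only ever unlink a file this call created itself. */
    int fd = open(path, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return NULL;

    /* posix_fallocate() returns the error number instead of setting errno.
     * On a full filesystem it fails here, before any memcpy() is attempted. */
    int rc = posix_fallocate(fd, 0, (off_t)size);
    if (rc != 0) {
        close(fd);
        unlink(path);
        errno = rc;
        return NULL;
    }

    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int saved = errno;
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (addr == MAP_FAILED) {
        unlink(path);
        errno = saved;
        return NULL;
    }
    return addr;
}
```

With a path under /dev/shm, a NULL return would report the shortage once, at creation time, so later writes into the mapping would not need a free-space check on every call.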


Complete diffstat:
--
 osaf/libs/common/cpsv/include/cpnd_cb.h   |   4 ++--
 osaf/libs/common/cpsv/include/cpnd_init.h |   8
 osaf/libs/common/cpsv/include/cpnd_sec.h  |   2 +-
 osaf/libs/core/include/ncs_osprm.h        |   2 +-
 osaf/libs/core/leap/os_defs.c             |  20 ++--
 osaf/services/saf/cpsv/cpnd/cpnd_db.c     |  12 ++--
 osaf/services/saf/cpsv/cpnd/cpnd_evt.c    |  82 +-----
 osaf/services/saf/cpsv/cpnd/cpnd_proc.c   |  31 ++-
 osaf/services/saf/cpsv/cpnd/cpnd_res.c    |  24
 osaf/services/saf/cpsv/cpnd/cpnd_sec.cc   |  12 ++--
 osaf/services/saf/glsv/glnd/glnd_shm.c    |   2 +-
 osaf/services/saf/mqsv/mqnd/mqnd_shm.c    |   2 +-
 12 files changed, 123 insertions(+), 78 deletions(-)


Testing Commands:
-
Create situation that node SHM  reaches 100% usage and then perform any CPSV
operation which writes to SHM
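
One possible way to create the 100% usage condition is to write a filler file into /dev/shm until the filesystem reports ENOSPC. The sketch below is an assumption about how such a filler could look, not the test actually used for this ticket. The helper name fill_file and the max_bytes cap are hypothetical, and the cap is there so the helper can also be tried safely on an ordinary directory.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write up to `max_bytes` of filler into `path`.
 * Stops early when the filesystem is full (ENOSPC).
 * Returns the number of bytes written, or -1 on any other error. */
long long fill_file(const char *path, long long max_bytes)
{
    char block[4096];
    long long total = 0;

    assert(path != NULL && max_bytes >= 0);
    memset(block, 0xA5, sizeof block);

    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    while (total < max_bytes) {
        size_t want = sizeof block;
        if ((long long)want > max_bytes - total)
            want = (size_t)(max_bytes - total);

        ssize_t n = write(fd, block, want);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted, retry */
            if (errno == ENOSPC)
                break;      /* filesystem is full: the condition we want */
            close(fd);
            return -1;
        }
        if (n == 0)
            break;          /* no progress, avoid spinning */
        total += n;
    }

    close(fd);
    return total;
}
```

Calling it on a file under /dev/shm with a cap larger than the tmpfs size should leave the mount full until the filler file is deleted, after which any CPSV write to SHM can be exercised.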

Testing, Expected Results:
--
 <>


Conditions of Submission:
-
 <>


Arch        Built   Started   Linux distro
-------------------------------------------
mips        n       n
mips64      n       n
x86         n       n
x86_64      y       y
powerpc     n       n
powerpc64   n       n

Reviewer Checklist:
---
[Submitters: make sure that your review doesn't trigger any checkmarks!]


Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
(i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
too much content into a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
Instead you should place your content in a pu

Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]

2016-11-30 Thread A V Mahesh

Hi Hoang,

We were somehow able to simulate your test case.
The detailed steps we used to reproduce it are below;
this test generates the same core dump as yours.

With the provided patch applied the issue was resolved. Can you please test it yourself
and provide your observations? If your test is different, please let us know.

Attached: 035_cpsv_2202_V2_debug.patch & 036_cpsv_2207_debug.patch

Test application: cpsv_shm_2202.c

==

1)  /etc/init.d/opensafd stop

2)  Change the default /dev/shm size to 3 MB

    # vi /etc/fstab

    and add the following line:

    `tmpfs   /dev/shm   tmpfs   defaults,size=3m   0 0`

3)  Remount /dev/shm

    # mount -o remount /dev/shm

4)  Check that /dev/shm reflects the new size

    # df -k /dev/shm/

    Filesystem 1K-blocks  Used Available Use% Mounted on
    tmpfs           3072     0      3072   0% /dev/shm

5)  Set the core file size limit to unlimited

    # ulimit -c unlimited

6)  # /etc/init.d/opensafd start

7)  Compile and run the attached test application (cpsv_shm_2202.c)

    # gcc cpsv_shm_2202.c -o ckpt_shm -lSaCkpt

    # ./ckpt_shm

8)  Once /dev/shm/ reaches 100% use you will see the same core dump as yours

    # df -k /dev/shm/

9)  Then we applied the patches (`cpsv_2202_V2_debug.patch` & `cpsv_2207_debug.patch`)
    and tested again: no core dump

    saCkptSectionCreate 1 returned 18 (no core dump)
==

-AVM



On 11/30/2016 11:37 AM, Vo Minh Hoang wrote:

Dear Mahesh,

Unfortunately, I have just received information that the same core dump still
occurs after applying the patch.

Here is the dump information in short; please tell me if I can do anything to
support:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
Missing separate debuginfos, use: zypper install opensaf-ckpt-nodedirector-debuginfo-5.1.0-.0.4997518.sle12.x86_64
(gdb) where
#0  0x7fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
#1  0x7fe315c26082 in memcpy (__len=<optimized out>, __src=<optimized out>, __dest=<optimized out>)
    at /usr/include/x86_64-linux-gnu/bits/string3.h:51
#2  ncs_os_posix_shm (req=req@entry=0x7ffecb80adb0) at os_defs.c:874
#3  0x00415a80 in cpnd_sec_hdr_update (cb=cb@entry=0x9e57f0, sec_info=sec_info@entry=0xb8ff60,
    cp_node=cp_node@entry=0xb8e8c0) at cpnd_proc.c:1880
#4  0x00406047 in cpnd_ckpt_sec_add (cb=cb@entry=0x9e57f0, cp_node=0xb8e8c0, id=0x7fe30c002390,
    exp_time=1480480471343486000, gen_flag=gen_flag@entry=0) at cpnd_db.c:457
#5  0x0040d17c in cpnd_evt_proc_ckpt_sect_create (cb=cb@entry=0x9e57f0,
    evt=evt@entry=0x7fe30c01e1d0, sinfo=sinfo@entry=0x7fe30c01e828) at cpnd_evt.c:2267
#6  0x0040eaf4 in cpnd_process_evt (evt=0x7fe30c01e1c0) at cpnd_evt.c:227
#7  0x004106cd in cpnd_main_process (cb=cb@entry=0x9e57f0) at cpnd_init.c:579
#8  0x00405383 in main (argc=<optimized out>, argv=<optimized out>) at cpnd_main.c:79

Sincerely,
Hoang


Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]

2016-11-30 Thread ramesh betham
Hi Hoang,

Can you please provide the output of "df -h /dev/shm" when the issue
occurs?

Also, can you please provide the corresponding core dump file and the
"osafckptnd" process file (the process file is not required if this image is
built for SLES x86_64).

Thanks,
Ramesh.


Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]

2016-11-29 Thread A V Mahesh

Hi Hoang,

We have no other option, because the issue is not easy to reproduce.

Please find the additional patch `2202_addtional.patch`, to be applied on top
of the published patch.


This will confirm for us whether the issue is completely unrelated to SHM free size.

-AVM


Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]

2016-11-29 Thread Vo Minh Hoang
Dear Mahesh,

I will do as you request.
Please note that the test case takes a very long time to execute, so the result
might not be available today.

Thank you and best regards,
Hoang


Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]

2016-11-29 Thread A V Mahesh
Hi Hoang,

Please apply these changes, test, and provide the syslog from the time the issue occurred.



diff --git a/osaf/libs/core/leap/os_defs.c b/osaf/libs/core/leap/os_defs.c
--- a/osaf/libs/core/leap/os_defs.c
+++ b/osaf/libs/core/leap/os_defs.c
@@ -865,11 +865,15 @@ uint32_t ncs_os_posix_shm(NCS_OS_POSIX_S
                 }

                 /* Checking whether sufficient shared memory is available to write the data, to be safer side ffree reduced to 1 block size */
-                if (req->info.write.i_write_size > (statsvfs.f_bfree * statsvfs.f_frsize)) {
+                if (req->info.write.i_write_size > ((statsvfs.f_bfree - 1) * statsvfs.f_frsize)) {
                         syslog(LOG_ERR, "Insufficient shared memory space (%ld) available to write the data of size: %ld \n",
                                         (statsvfs.f_bfree * statsvfs.f_frsize), req->info.write.i_write_size);
                         return NCSCC_RC_FAILURE;
+                } else {
+                        syslog(LOG_ERR, "Sufficient shared memory space (%ld) available to write the data of size: %ld \n",
+                                        (statsvfs.f_bfree * statsvfs.f_frsize), req->info.write.i_write_size);
                 }
+
         }
         memcpy((void *)((char *)req->info.write.i_addr + req->info.write.i_offset), req->info.write.i_from_buff,
                         req->info.write.i_write_size);



-AVM
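
A note on the check in the diff above: f_bfree is an unsigned count, so `(statsvfs.f_bfree - 1)` wraps around to a very large value when f_bfree is 0, and the "insufficient space" branch would then not be taken on a completely full filesystem. The sketch below shows one way the same check could be written with the subtraction guarded. It is an illustration only, not the code in the patch; the helper name shm_has_room and its reserve_blocks parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical helper (not part of LEAP): returns 1 if the filesystem at
 * `path` reports at least `need` free bytes after holding back
 * `reserve_blocks` blocks, 0 if it does not, and -1 if statvfs() fails. */
int shm_has_room(const char *path, size_t need, unsigned long reserve_blocks)
{
    struct statvfs vfs;

    assert(path != NULL);
    if (statvfs(path, &vfs) != 0)
        return -1;

    /* Guard the subtraction so it cannot wrap when few blocks are free. */
    if (vfs.f_bfree <= reserve_blocks)
        return 0;

    /* f_bfree is counted in units of f_frsize. */
    uint64_t free_bytes =
        (uint64_t)(vfs.f_bfree - reserve_blocks) * (uint64_t)vfs.f_frsize;

    return (uint64_t)need <= free_bytes ? 1 : 0;
}
```

A caller would do the memcpy() only when shm_has_room("/dev/shm", write_size, 1) returns 1. The statvfs() result is a snapshot, so another process can still consume the space between the check and the copy; the check narrows the window but does not close it.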


Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]

2016-11-29 Thread A V Mahesh
Hi Hoang,

Thanks for the test.

Then it looks like the issue is not related to SHM deficiency (100% used by
other applications).
Can you please re-test with the change below? That will confirm for us whether
it is completely unrelated to SHM free size.

Replace:

`if (req->info.write.i_write_size > (statsvfs.f_bfree * statsvfs.f_frsize)) {`

with the line below, which, to be on the safer side, reduces the free block
count (f_bfree) by one block:

`if (req->info.write.i_write_size > ((statsvfs.f_bfree - 1) * statsvfs.f_frsize)) {`


-AVM



Re: [devel] [PATCH 0 of 3] Review Request for leap : now leap library ensure shm availability before writing [#2202]

2016-11-29 Thread Vo Minh Hoang
Dear Mahesh,

Unfortunately, I have just received information that the same core dump still
occurs after applying the patch.

Here is the dump information in short; please tell me if I can do anything to
support:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
Missing separate debuginfos, use: zypper install opensaf-ckpt-nodedirector-debuginfo-5.1.0-.0.4997518.sle12.x86_64
(gdb) where
#0  0x7fe314aa0109 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
#1  0x7fe315c26082 in memcpy (__len=<optimized out>, __src=<optimized out>, __dest=<optimized out>)
    at /usr/include/x86_64-linux-gnu/bits/string3.h:51
#2  ncs_os_posix_shm (req=req@entry=0x7ffecb80adb0) at os_defs.c:874
#3  0x00415a80 in cpnd_sec_hdr_update (cb=cb@entry=0x9e57f0, sec_info=sec_info@entry=0xb8ff60,
    cp_node=cp_node@entry=0xb8e8c0) at cpnd_proc.c:1880
#4  0x00406047 in cpnd_ckpt_sec_add (cb=cb@entry=0x9e57f0, cp_node=0xb8e8c0, id=0x7fe30c002390,
    exp_time=1480480471343486000, gen_flag=gen_flag@entry=0) at cpnd_db.c:457
#5  0x0040d17c in cpnd_evt_proc_ckpt_sect_create (cb=cb@entry=0x9e57f0,
    evt=evt@entry=0x7fe30c01e1d0, sinfo=sinfo@entry=0x7fe30c01e828) at cpnd_evt.c:2267
#6  0x0040eaf4 in cpnd_process_evt (evt=0x7fe30c01e1c0) at cpnd_evt.c:227
#7  0x004106cd in cpnd_main_process (cb=cb@entry=0x9e57f0) at cpnd_init.c:579
#8  0x00405383 in main (argc=<optimized out>, argv=<optimized out>) at cpnd_main.c:79

Sincerely,
Hoang
