If you are not using the latest release of valgrind, you might try again with
the latest release.

It would also be worth checking whether the problem happens with other tools
(e.g. --tool=none).


Otherwise, you could debug your application while it runs under valgrind, at
the point where it encounters the problem.
E.g. use the arguments --vgdb=full --vgdb-error=1 --vgdb-stop-at=exit,valgrindabexit
(assuming the error below is the first one you encounter. If not, you should
first fix your code to resolve the errors valgrind reported earlier).


You could also compare the valgrind trace of a successful run with that of an
unsuccessful run, e.g. using the valgrind debug switches
-v -v -v -d -d -d --trace-signals=yes
and see if you can spot a difference between the two runs.
Note that with the above switches, you should see some debug log of the signal
handling and of the stack extension mechanism.
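
Comparing two such traces by hand is tedious, because every line carries the
process id and many lines carry addresses that change from run to run. Below is
a minimal sketch of a helper that hides those differences before diffing. It is
an illustration, not something shipped with valgrind. It assumes each run was
captured to a file (for example with --log-file=), that the names good.log and
bad.log are placeholders, and that valgrind's own lines start with ==PID== or
--PID. The function names are made up for this example.

```python
import difflib
import re
import sys

# Assumption: valgrind's own lines start with ==PID== (messages) or
# --PID (verbose/debug output). Only that leading pid is rewritten.
_PID_PREFIX = re.compile(r"^(==|--)\d+")
_HEX_ADDR = re.compile(r"0x[0-9A-Fa-f]+")


def normalize(line, mask_addresses=False):
    """Replace the pid prefix (and optionally hex addresses) with placeholders."""
    line = _PID_PREFIX.sub(r"\1PID", line.rstrip("\n"))
    if mask_addresses:
        line = _HEX_ADDR.sub("0xADDR", line)
    return line


def diff_logs(good_lines, bad_lines, mask_addresses=False):
    """Return a unified diff (list of strings) of the two normalized logs."""
    good = [normalize(l, mask_addresses) for l in good_lines]
    bad = [normalize(l, mask_addresses) for l in bad_lines]
    return list(difflib.unified_diff(good, bad, "good", "bad", lineterm=""))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python3 difflogs.py good.log bad.log
    with open(sys.argv[1]) as g, open(sys.argv[2]) as b:
        print("\n".join(diff_logs(g, b, mask_addresses=True)))
```

Masking addresses makes the diff much shorter, but it can also hide the stack
addresses you are interested in here, so it may be worth looking at both
variants.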

Hope this helps

Philippe


On Mon, 2022-03-14 at 11:30 -0400, Narayanan Iyer via Valgrind-users wrote:
> One correction (not sure it matters). I believe the application uses 1.25Mb 
> of stack space at the time of the failure (not .25 as I had originally 
> mentioned).
> 
> Narayanan.
> 
> -----Original Message-----
> From: Narayanan Iyer [mailto:n...@yottadb.com] 
> Sent: Monday, March 14, 2022 11:27 AM
> To: valgrind-users@lists.sourceforge.net
> Cc: 'Narayanan Iyer' <n...@yottadb.com>
> Subject: Can't extend stack during signal delivery : too small or bad 
> protection modes
> 
> Hi,
> 
> While running the automated test suite (which has hundreds of tests) for my 
> application with valgrind, I occasionally see failures like the following in 
> some of the tests.
> 
> ==29753== Can't extend stack to 0x1ffeec7948 during signal delivery for thread 1:
> ==29753==   too small or bad protection modes
> ==29753== 
> ==29753== Process terminating with default action of signal 11 (SIGSEGV): dumping core
> ==29753==  Access not within mapped region at address 0x1FFEEC7948
> ==29753==    at 0x4849FD8: strncpy (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==29753==    by 0x489AE7C: cli_get_sub_quals (sr_unix/cli_parse.c:593)
> ==29753==    by 0x489ABC3: parse_arg (sr_unix/cli_parse.c:0)
> ==29753==    by 0x489BD6E: parse_triggerfile_cmd (sr_unix/cli_parse.c:1128)
> ==29753==    by 0x4BD2377: trigger_parse (sr_unix/trigger_parse.c:1416)
> ==29753==    by 0x4B12152: trigger_update_rec (sr_unix/trigger_update.c:1386)
> ==29753==    by 0x4B16171: trigger_update_rec_helper (sr_unix/trigger_update.c:2171)
> ==29753==    by 0x4B163B9: trigger_update (sr_unix/trigger_update.c:2224)
> ==29753==    by 0x4B86385: op_fnztrigger (sr_port/op_fnztrigger.c:248)
> ==29753==    by 0x5ABA384: _ydboctoplanhelpers (in YDBOcto/build/src/_ydbocto.so)
> ==29753==    by 0x1774F1EF: ???
> ==29753==    by 0xAAAAAAAAAAAAAAA9: ???
> ==29753==  If you believe this happened as a result of a stack
> ==29753==  overflow in your program's main thread (unlikely but
> ==29753==  possible), you can try to increase the size of the
> ==29753==  main thread stack using the --main-stacksize= flag.
> ==29753==  The main thread stack size used in this run was 268435456.
> ==29753== Invalid write of size 8
> ==29753==    at 0x483A124: _vgnU_freeres (in /usr/libexec/valgrind/vgpreload_core-amd64-linux.so)
> ==29753==  Address 0x1ffeec8808 is on thread 1's stack
> 
> If I rerun just the failing test, it passes fine. Every time the list of 
> tests that fail keeps changing. If I run the test without valgrind, it passes 
> all the time.
> 
> Originally I got a failure with the --main-stacksize set to 16Mb so I bumped 
> it to 256Mb. And I still keep getting this failure at different tests. I also 
> set the ulimit for stacksize to 256Mb just in case and I still see the 
> failures.
> 
> The application is a single-threaded application and I know for sure it does 
> not use anywhere near 256Mb of stack space. The stack trace shown above keeps 
> changing across the many random failures but in all of those stack traces, I 
> believe only around .25Mb of stack space would be used at the most. 
> 
> In this application, a SIGALRM signal would happen every 1 second or so. The 
> application does not set up any alternate stack (i.e. no sigaltstack() call). 
> Not sure if that can be related to the random failure or not.
> 
> This is on a Ubuntu 20.04 system. And my application was compiled with gcc.
> 
> Not sure how to debug this further. Any help in this regard is appreciated.
> 
> Thanks,
> Narayanan.
> 
> 
> 
> 
> _______________________________________________
> Valgrind-users mailing list
> Valgrind-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/valgrind-users




