Re: Segmentation fault in pipecb

2019-10-30 Thread Marc Lehmann
On Wed, Oct 30, 2019 at 01:22:22AM +, Calum McPherson wrote:
> As noted, this was the first usage after a restart amid otherwise stable,
> heavy production use. I have scanned the mailing list from release 4.25
> through 4.27 and didn't see this issue raised therein.

Likely because the issue is not in libev but in the rest of the program,
which is somehow causing memory corruption - for example, by freeing memory
that is still in use after forgetting to stop an active watcher, or by
making thread-unsafe calls, or...

libev has a "verification" mode that *might* be able to diagnose the problem
a bit earlier by running many more, and more frequent, checks - look for
EV_VERIFY in the documentation - you need to recompile/relink libev/your
program for this. I'd try that first if the effort isn't prohibitive.
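
As a rough sketch of what that recompile could look like when ev.c is
embedded in the program: the wrapper file name below is hypothetical, and
the level descriptions are paraphrased from the libev documentation, so
check the copy shipped with your version. If libev is built separately,
passing -DEV_VERIFY=3 when compiling ev.c should have the same effect.

```c
/* ev_wrapper.c (hypothetical file name): embeds libev with verification on.
 *
 * EV_VERIFY levels, as described in the libev documentation:
 *   0  no verification code is compiled in
 *   1  compiled in, but only run when ev_verify () is called explicitly
 *   2  run once per loop iteration
 *   3  run very frequently; slows libev down considerably
 *
 * Any other compile-time options the program already sets for libev
 * should be kept unchanged next to this define.
 */
#define EV_VERIFY 3
#include "ev.c"
```

When a check fails, ev_verify () prints a message to standard error and
calls abort (), so the resulting core dump should be closer to the point
where the loop's data structures were damaged.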

valgrind would also be a very good thing to use: with luck, it can
immediately pinpoint the bug.

Otherwise, when running out of magic tools, you need to debug your program
and find the issue, just like in the old days.

-- 
The choice of a   Deliantra, the free code+content MORPG
  -==- _GNU_  http://www.deliantra.net
  ==-- _   generation
  ---==---(_)__  __   __  Marc Lehmann
  --==---/ / _ \/ // /\ \/ /  schm...@schmorp.de
  -=/_/_//_/\_,_/ /_/\_\

___
libev mailing list
libev@lists.schmorp.de
http://lists.schmorp.de/mailman/listinfo/libev


Segmentation fault in pipecb

2019-10-30 Thread Calum McPherson
Hi,

We have a server with libev-4.25 statically linked in and the first service it 
provided after a recent restart produced the following core dump:

Core was generated by `/var/lib/comcent/dbin/comcentd --log'.
Program terminated with signal 11, Segmentation fault.
#0  0x00741b04 in pipecb (loop=0xd5b5b0, iow=, 
revents=) at ev.c:2552
2552	ev.c: No such file or directory.
Missing separate debuginfos, use: debuginfo-install 
cyrus-sasl-lib-2.1.26-23.el7.x86_64 glibc-2.17-260.el7.x86_64 
keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-34.el7.x86_64 
libcom_err-1.42.9-13.el7.x86_64 libgcc-4.8.5-36.el7.x86_64 
libpqxx-4.0.1-1.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 
libstdc++-4.8.5-36.el7.x86_64 nspr-4.19.0-1.el7_5.x86_64 
nss-3.36.0-7.el7_5.x86_64 nss-softokn-freebl-3.36.0-5.el7_5.x86_64 
nss-util-3.36.0-1.el7_5.x86_64 openldap-2.4.44-20.el7.x86_64 
openssl-libs-1.0.2k-16.el7.x86_64 pcre-8.32-17.el7.x86_64 
postgresql-libs-9.2.24-1.el7_5.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) where
#0  0x00741b04 in pipecb (loop=0xd5b5b0, iow=, 
revents=) at ev.c:2552
#1  0x00740c95 in ev_invoke_pending (loop=0xd5b5b0) at ev.c:3322
#2  0x007444d5 in ev_run (loop=0xd5b5b0, flags=) at 
ev.c:3726
#3  0x00637521 in ChannelWorker::main_ (arg=0x0) at 
ChannelWorker.cpp:198
#4  0x7fb225f74dd5 in start_thread () from /lib64/libpthread.so.0
#5  0x7fb224855ead in clone () from /lib64/libc.so.6
(gdb) quit

Unfortunately, the stack isn't examinable due to optimization, but source
line 2552 maps to this snippet in pipecb() in ev.c:

2544 #if EV_ASYNC_ENABLE
2545   if (async_pending)
2546     {
2547       async_pending = 0;
2548
2549       ECB_MEMORY_FENCE;
2550
2551       for (i = asynccnt; i--; )
2552         if (asyncs [i]->sent)
2553           {
2554             asyncs [i]->sent = 0;
2555             ECB_MEMORY_FENCE_RELEASE;
2556             ev_feed_event (EV_A_ asyncs [i], EV_ASYNC);
2557           }
2558     }
2559 #endif


In case it is not obvious from the above, the source line in question appears to be:

2552 if (asyncs [i]->sent)


As noted, this was the first usage after a restart amid otherwise stable,
heavy production use. I have scanned the mailing list from release 4.25
through 4.27 and didn't see this issue raised therein.

Regards,

Cal McPherson