Hello, libev team, sorry to bother you.

We are a small development team; about 20 users connect to our ocserv server
through Cisco AnyConnect. I don't know how to reproduce the problem manually,
but in our setup the ocserv-main process exits with a segmentation fault, and
all users are dropped from the VPN at the same time. This happens once or
twice a day. It is not a fault at startup; the failure occurs after the server
has been running for a while.


Based on the core dumps, I think this is a problem in libev.



The dmesg -T log further down lists the faults from the most recent week.
This problem is very frustrating for us because it interrupts our work.
The most recent fault, from today, is shown first.



[admin@vpn ~]$ uname -a
Linux vpn.kofo.io 3.10.0-957.5.1.el7.x86_64 #1 SMP Fri Feb 1 14:54:57 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[admin@vpn ~]$ cat /etc/redhat-release 
CentOS Linux release 7.6.1810 (Core) 
[admin@vpn ~]$ rpm -qa | grep ocserv
ocserv-0.12.3-1.el7.x86_64
ocserv-debuginfo-0.12.3-1.el7.x86_64




I tested two versions of libev:



========== with libev-4.15-7.el7.x86_64.rpm ==========
[admin@vpn ~]$ gdb /usr/sbin/ocserv /tmp/core-ocserv-main-sig11-user0-group0-pid1778-time1561007979
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/ocserv...Reading symbols from 
/usr/lib/debug/usr/sbin/ocserv.debug...done.
done.


warning: core file may not match specified executable file.
[New LWP 1778]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `ocserv-main                                              
                     '.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000000000 in ?? ()
(gdb) where
#0  0x0000000000000000 in ?? ()
#1  0x00007f305fc553d5 in ev_invoke_pending (loop=0x7f305fe5ea40 <default_loop_struct>) at ev.c:3322
#2  0x00007f305fc585b5 in ev_run (loop=0x7f305fe5ea40 <default_loop_struct>, flags=flags@entry=0) at ev.c:3726
#3  0x000055d4d269d7da in main (argc=<optimized out>, argv=<optimized out>) at main.c:1440
(gdb) l
1222    static void syserr_cb (const char *msg)
1223    {
1224            main_server_st *s = ev_userdata(loop);
1225    
1226            mslog(s, NULL, LOG_ERR, "libev fatal error: %s", msg);
1227            abort();
1228    }
1229    
1230    int main(int argc, char** argv)
1231    {
(gdb) quit


[admin@vpn ~]$ dmesg -T | tail
[Mon Jun 17 20:07:38 2019] traps: ocserv-main[4398] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:12:10 2019] traps: ocserv-main[4708] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:12:28 2019] traps: ocserv-main[4743] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:12:56 2019] traps: ocserv-main[4767] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:13:34 2019] traps: ocserv-main[4819] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:16:38 2019] ocserv-main[14426]: segfault at 55686ec0add8 ip 00007fb9b653aabc sp 00007ffcfdb6cec0 error 6 in libev.so.4.0.0[7fb9b6536000+d000]
[Tue Jun 18 03:30:28 2019] traps: ocserv-main[5392] general protection ip:7f5e1e8bcc48 sp:7ffc2e520af0 error:0 in libev.so.4.0.0[7f5e1e8b8000+d000]
[Tue Jun 18 12:47:01 2019] ocserv-main[6841]: segfault at 0 ip           (null) sp 00007fffd1ace668 error 14 in ocserv[558a01f44000+5c000]
[Tue Jun 18 20:20:22 2019] traps: ocserv-main[25818] general protection ip:7f49ca78cc48 sp:7ffe69d8fc50 error:0 in libev.so.4.0.0[7f49ca788000+d000]
[Thu Jun 20 13:18:57 2019] ocserv-main[1778]: segfault at 0 ip           (null) sp 00007ffe0e0a4858 error 14 in ocserv[55d4d2691000+5c000]
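
To compare the crashes in the log above, it can help to turn each ip value
into an offset inside libev.so.4.0.0 by subtracting the mapping base that the
kernel prints in brackets. The following is a minimal C sketch of that
subtraction. The function name fault_offset is made up for illustration, and
the values passed to it are copied from the log above.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the faulting instruction inside the shared object:
 * the instruction pointer minus the mapping base from the
 * "[base+size]" part of the dmesg line. */
static uint64_t fault_offset(uint64_t ip, uint64_t base)
{
    assert(ip >= base); /* the ip must lie inside the mapping */
    return ip - base;
}
```

For the Jun 17 "general protection" entries, fault_offset(0x7fb9b653e35c,
0x7fb9b6536000) gives 0x835c. The Jun 18 03:30:28 and 20:20:22 entries both
give 0x4c48, so they share the same offset even though the load addresses
differ. Such an offset could then be looked up with a tool like addr2line
against the matching libev build, assuming its debug symbols are installed.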





========== with libev 4.25, manually compiled and installed ==========


dmesg -T:
[Tue Jun 18 20:20:21 2019] traps: ocserv-main[25818] general protection ip:7f49ca78cc48 sp:7ffe69d8fc50 error:0 in libev.so.4.0.0[7f49ca788000+d000]


[admin@vpn tmp]$ sudo file /tmp/core-ocserv-main-sig11-user0-group0-pid25818-time1560860462
/tmp/core-ocserv-main-sig11-user0-group0-pid25818-time1560860462: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'ocserv-main', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: '/usr/sbin/ocserv', platform: 'x86_64'


Unix time: 1560860462 = 2019/6/18 20:21:02 CST
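
The conversion above can be checked with a few lines of C. This is only a
sketch: it assumes that CST here means UTC+8 (China Standard Time), and the
helper name wall_clock_at_offset is invented for this example.

```c
#include <assert.h>
#include <time.h>

/* Break a Unix timestamp down into wall-clock fields at a fixed
 * offset from UTC, given in whole hours. The timestamp is shifted
 * by the offset and then decomposed as if it were UTC. */
static struct tm wall_clock_at_offset(time_t ts, int utc_offset_hours)
{
    time_t shifted = ts + (time_t)utc_offset_hours * 3600;
    struct tm *utc = gmtime(&shifted);
    assert(utc != NULL); /* gmtime returns NULL if the value cannot be represented */
    return *utc;         /* copy out of gmtime's static buffer */
}
```

Calling wall_clock_at_offset(1560860462, 8) yields 2019-06-18 20:21:02 (with
tm_year counted from 1900 and tm_mon from 0), which is within a minute of the
20:20:21 timestamp that dmesg shows for pid 25818.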


[admin@vpn tmp]$ sudo chmod +r core-ocserv-main-sig11-user0-group0-pid25818-time1560860462


[admin@vpn ~]$ gdb /usr/sbin/ocserv /tmp/core-ocserv-main-sig11-user0-group0-pid25818-time1560860462
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/ocserv...Reading symbols from 
/usr/lib/debug/usr/sbin/ocserv.debug...done.
done.


warning: core file may not match specified executable file.
[New LWP 25818]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `ocserv-main                                              
                     '.
Program terminated with signal 11, Segmentation fault.
#0  child_reap (status=0, pid=1619, chain=1619, loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:2658
2658          if ((w->pid == pid || !w->pid)
(gdb) where
#0  child_reap (status=0, pid=1619, chain=1619, loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:2658
#1  childcb (loop=0x7f49ca995a40 <default_loop_struct>, sw=<optimized out>, revents=<optimized out>) at ev.c:2690
#2  0x00007f49ca78c3d5 in ev_invoke_pending (loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:3322
#3  0x00007f49ca78f5b5 in ev_run (loop=0x7f49ca995a40 <default_loop_struct>, flags=flags@entry=0) at ev.c:3726
#4  0x0000559f444867da in main (argc=<optimized out>, argv=<optimized out>) at main.c:1440
(gdb) l
2653      ev_child *w;
2654      int traced = WIFSTOPPED (status) || WIFCONTINUED (status);
2655    
2656      for (w = (ev_child *)childs [chain & ((EV_PID_HASHSIZE) - 1)]; w; w = (ev_child *)((WL)w)->next)
2657        {
2658          if ((w->pid == pid || !w->pid)
2659              && (!traced || (w->flags & 1)))
2660            {
2661              ev_set_priority (w, EV_MAXPRI); /* need to do it *now*, this *must* be the same prio as the signal watcher itself */
2662              w->rpid    = pid;
(gdb) p w
$2 = (ev_child *) 0x2d3832312d534541
(gdb) p w->pid
Cannot access memory at address 0x2d3832312d53456d


w is a wild pointer, so w->pid cannot be read.
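
One detail that may help narrow this down: every byte of the value that gdb
printed for w is printable ASCII. The C sketch below reads the eight bytes in
x86-64 memory order (least significant byte first). The helper name
pointer_bytes_as_text is invented for this example, and the reading of the
result is an interpretation of the gdb output, not a confirmed diagnosis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the 8 bytes of a 64-bit value into out, least significant
 * byte first (the in-memory order on x86-64), as a NUL-terminated
 * string. Bytes outside printable ASCII are shown as '.'. */
static void pointer_bytes_as_text(uint64_t value, char out[9])
{
    assert(out != NULL);
    for (size_t i = 0; i < 8; i++) {
        unsigned char byte = (unsigned char)(value >> (8 * i));
        out[i] = (byte >= 0x20 && byte < 0x7f) ? (char)byte : '.';
    }
    out[8] = '\0';
}
```

For 0x2d3832312d534541 this produces the text "AES-128-". That suggests, but
does not prove, that the child watcher list was overwritten with string data
such as a cipher name, rather than holding a stale but valid pointer.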
_______________________________________________
libev mailing list
libev@lists.schmorp.de
http://lists.schmorp.de/mailman/listinfo/libev