I guess James missed the email :-) In January he posted some builds for
further investigation:

https://www.mail-archive.com/discuss@openvswitch.org/msg12049.html

It sounds like it's the exact same code, but built with fewer compiler
optimizations, to see whether that fixes the issue. Trying those builds
might provide a bit more information.

On 31 March 2015 at 14:43, Marco Kuendig <ma...@nuvula.ch> wrote:

> ok, I have tested it and I can reproduce it.
>
> For testing to reproduce that core:
>
> I have a very small VM on KVM, I start and shutdown that VM every 30
> seconds. With that I get:
>
> Mar 31 16:10:13 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00010|daemon(monitor)|ERR|9 crashes: pid 10918 died, killed (Segmentation fault), core dumped, restarting
> Mar 31 16:29:20 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00011|daemon(monitor)|ERR|10 crashes: pid 22038 died, killed (Segmentation fault), core dumped, restarting
> Mar 31 23:33:04 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00012|daemon(monitor)|ERR|11 crashes: pid 25067 died, killed (Segmentation fault), core dumped, restarting
> Mar 31 23:34:43 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00013|daemon(monitor)|ERR|12 crashes: pid 30254 died, killed (Segmentation fault), core dumped, restarting
> Mar 31 23:38:49 nuv-vir-kvm-server-1 ovs-vswitchd: ovs|00014|daemon(monitor)|ERR|13 crashes: pid 31612 died, killed (Segmentation fault), core dumped, restarting
>
> The ovs-vswitchd does not restart every cycle but it cores quite often as
> you can see in the log above.
>
>
> [image: Nuvula AG] <http://www.nuvula.ch/>
>
> Marco Kuendig / CEO / Founder
> ma...@nuvula.ch / +41 78 751 99 71
>
> Marco's Google Hangout
> <https://plus.google.com/hangouts/_/nuvula.ch/marco>
>
> Nuvula AG - Hybrid Clouds
> Weierbachstrasse 7b 8193 Eglisau Switzerland
> http://www.nuvula.ch
>
> On 31 Mar 2015, at 23:15, Marco Kuendig <ma...@nuvula.ch> wrote:
>
> I think I can reproduce that bug. Not 100% sure but today I think I had it
> several times.
>
> Interesting is that it happens when a VM on kvm boots.
> Certainly strange is that several hosts crash simultaneously.
>
> My setup is a lab setup and can be accessed if it helps in troubleshooting.
>
>
> [image: Nuvula AG] <http://www.nuvula.ch/>
>
> Marco Kuendig / CEO / Founder
> ma...@nuvula.ch / +41 78 751 99 71
>
> Marco's Google Hangout
> <https://plus.google.com/hangouts/_/nuvula.ch/marco>
>
> Nuvula AG - Hybrid Clouds
> Weierbachstrasse 7b 8193 Eglisau Switzerland
> http://www.nuvula.ch
>
> On 31 Mar 2015, at 23:12, Joe Stringer <joestrin...@nicira.com> wrote:
>
> James, I believe you were involved last time this bug came up, I wonder if
> you ever got to the bottom of this?
>
> ---
>
> This looks the same as a bug reported in October:
>
> http://openvswitch.org/pipermail/discuss/2014-October/015429.html
>
> Ben's assessment was that there is no logical issue in the code, so
> perhaps there was weird code generation caused by GCC.
>
>
> On 31 March 2015 at 13:05, Marco Kuendig <ma...@nuvula.ch> wrote:
>
>> Reading symbols from /usr/sbin/ovs-vswitchd...Reading symbols from /usr/lib/debug//usr/sbin/ovs-vswitchd...done.
>> done.
>> [New LWP 32725]
>> [New LWP 32732]
>> [New LWP 32726]
>> [New LWP 32730]
>> [New LWP 32727]
>> [New LWP 32728]
>> [New LWP 32729]
>> [New LWP 32731]
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>> Core was generated by `ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfi'.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0 nl_attr_get_size (nla=nla@entry=0x0) at ../lib/netlink.c:506
>> 506 ../lib/netlink.c: No such file or directory.
>> (gdb) bt
>> #0 nl_attr_get_size (nla=nla@entry=0x0) at ../lib/netlink.c:506
>> #1 0x0000000000460473 in format_generic_odp_key (a=a@entry=0x0, ds=ds@entry=0x7fff0408f3b0) at ../lib/odp-util.c:767
>> #2 0x0000000000460cd2 in format_odp_key_attr (a=a@entry=0xc485a4, ma=ma@entry=0x0, ds=ds@entry=0x7fff0408f3b0, verbose=verbose@entry=true) at ../lib/odp-util.c:1332
>> #3 0x00000000004609d7 in odp_flow_format (key=<optimized out>, key_len=40, mask=0x0, mask_len=0, ds=0x7fff0408f3b0, verbose=true) at ../lib/odp-util.c:1402
>> #4 0x0000000000460fc4 in format_odp_key_attr (a=a@entry=0xc48580, ma=ma@entry=0x0, ds=ds@entry=0x7fff0408f3b0, verbose=verbose@entry=true) at ../lib/odp-util.c:987
>> #5 0x00000000004609d7 in odp_flow_format (key=key@entry=0xc48520, key_len=key_len@entry=140, mask=mask@entry=0x0, mask_len=mask_len@entry=0, ds=ds@entry=0x7fff0408f3b0, verbose=verbose@entry=true) at ../lib/odp-util.c:1402
>> #6 0x00000000004450f3 in log_flow_message (error=error@entry=2, operation=operation@entry=0x4d0e73 "flow_del", key=0xc48520, key_len=140, mask=mask@entry=0x0, mask_len=mask_len@entry=0, stats=0x0, actions=actions@entry=0x0, actions_len=actions_len@entry=0, dpif=<optimized out>) at ../lib/dpif.c:1354
>> #7 0x00000000004453c9 in log_flow_del_message (dpif=dpif@entry=0xc489c0, del=del@entry=0x7fff0408f460, error=error@entry=2) at ../lib/dpif.c:1397
>> #8 0x0000000000445433 in log_flow_del_message (error=2, del=0x7fff0408f460, dpif=0xc489c0) at ../lib/dpif.c:1396
>> #9 dpif_flow_del__ (dpif=0xc489c0, del=del@entry=0x7fff0408f460) at ../lib/dpif.c:945
>> #10 0x00000000004455ca in dpif_flow_del (dpif=<optimized out>, key=<optimized out>, key_len=<optimized out>, stats=stats@entry=0x7fff0408f490) at ../lib/dpif.c:965
>> #11 0x000000000041b423 in subfacet_uninstall (subfacet=0xbd76a0) at ../ofproto/ofproto-dpif.c:4686
>> #12 0x0000000000420f18 in facet_remove (facet=facet@entry=0xbd72a0) at ../ofproto/ofproto-dpif.c:4014
>> #13 0x0000000000422f52 in facet_revalidate (facet=facet@entry=0xbd72a0) at ../ofproto/ofproto-dpif.c:4321
>> #14 0x0000000000424b5a in facet_lookup_valid (flow=0x7f3e700020a8, ofproto=0xc52600) at ../ofproto/ofproto-dpif.c:4203
>> #15 handle_flow_miss (n_ops=<synthetic pointer>, ops=0x7fff0408fb60, miss=0x7f3e70002090) at ../ofproto/ofproto-dpif.c:3339
>> #16 handle_flow_misses (fmb=fmb@entry=0x7f3e700008e0, backer=<optimized out>) at ../ofproto/ofproto-dpif.c:3410
>> #17 0x0000000000425196 in handle_upcalls (backer=<optimized out>) at ../ofproto/ofproto-dpif.c:3565
>> #18 dpif_backer_run_fast (backer=<optimized out>) at ../ofproto/ofproto-dpif.c:1007
>> #19 type_run_fast (type=<optimized out>) at ../ofproto/ofproto-dpif.c:1024
>> #20 0x00000000004122cf in ofproto_type_run_fast (datapath_type=<optimized out>, datapath_type@entry=0xc4ef20 "system") at ../ofproto/ofproto.c:1326
>> #21 0x00000000004081a5 in bridge_run_fast () at ../vswitchd/bridge.c:2318
>> #22 0x00000000004059c5 in main (argc=<optimized out>, argv=<optimized out>) at ../vswitchd/ovs-vswitchd.c:119
>> (gdb)
>>
>>
>> [image: Nuvula AG] <http://www.nuvula.ch/>
>>
>> Marco Kuendig / CEO / Founder
>> ma...@nuvula.ch / +41 78 751 99 71
>>
>> Marco's Google Hangout
>> <https://plus.google.com/hangouts/_/nuvula.ch/marco>
>>
>> Nuvula AG - Hybrid Clouds
>> Weierbachstrasse 7b 8193 Eglisau Switzerland
>> http://www.nuvula.ch
>>
>> On 31 Mar 2015, at 22:04, Joe Stringer <joestrin...@nicira.com> wrote:
>>
>> Great, we're moving. Looks like the gdb version of this is working below.
>> Do you get the gdb prompt from there? the command 'bt' should provide the
>> backtrace we're after.
>>
>> On 31 March 2015 at 12:52, Marco Kuendig <ma...@nuvula.ch> wrote:
>>
>>> that brought us a step forward. thank Sab.
>>>
>>> Important to know is:
>>>
>>> I got 4 kvm servers, meshed with openvswitch. I use vxlan for tunnelling.
>>> >>> Sometimes when I restart a domain in kvm, 3 or 4 hosts crash at the same >>> time. >>> >>> I have STP enabled to avoid loops. >>> >>> >>> this is the output now: >>> >>> root@nuv-vir-kvm-server-1 ~ # gdb /usr/sbin/ovs-vswitchd >>> /var/crash/ovs/CoreDump >>> GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1 >>> Copyright (C) 2014 Free Software Foundation, Inc. >>> License GPLv3+: GNU GPL version 3 or later < >>> http://gnu.org/licenses/gpl.html> >>> This is free software: you are free to change and redistribute it. >>> There is NO WARRANTY, to the extent permitted by law. Type "show >>> copying" >>> and "show warranty" for details. >>> This GDB was configured as "x86_64-linux-gnu". >>> Type "show configuration" for configuration details. >>> For bug reporting instructions, please see: >>> <http://www.gnu.org/software/gdb/bugs/>. >>> Find the GDB manual and other documentation resources online at: >>> <http://www.gnu.org/software/gdb/documentation/>. >>> For help, type "help". >>> Type "apropos word" to search for commands related to "word"... >>> Reading symbols from /usr/sbin/ovs-vswitchd...Reading symbols from >>> /usr/lib/debug//usr/sbin/ovs-vswitchd...done. >>> done. >>> [New LWP 32725] >>> [New LWP 32732] >>> [New LWP 32726] >>> [New LWP 32730] >>> [New LWP 32727] >>> [New LWP 32728] >>> [New LWP 32729] >>> [New LWP 32731] >>> [Thread debugging using libthread_db enabled] >>> Using host libthread_db library >>> "/lib/x86_64-linux-gnu/libthread_db.so.1". >>> Core was generated by `ovs-vswitchd unix:/var/run/openvswitch/db.sock >>> -vconsole:emer -vsyslog:err -vfi'. >>> Program terminated with signal SIGSEGV, Segmentation fault. >>> #0 nl_attr_get_size (nla=nla@entry=0x0) at ../lib/netlink.c:506 >>> 506 ../lib/netlink.c: No such file or directory. >>> >>> >>> root@nuv-vir-kvm-server-1 ~ # crash /usr/sbin/ovs-vswitchd >>> /var/crash/ovs/CoreDump >>> >>> crash 7.0.3 >>> Copyright (C) 2002-2013 Red Hat, Inc. 
>>> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation >>> Copyright (C) 1999-2006 Hewlett-Packard Co >>> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited >>> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K. >>> Copyright (C) 2005, 2011 NEC Corporation >>> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc. >>> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc. >>> This program is free software, covered by the GNU General Public License, >>> and you are welcome to change it and/or distribute copies of it under >>> certain conditions. Enter "help copying" to see the conditions. >>> This program has absolutely no warranty. Enter "help warranty" for >>> details. >>> >>> >>> crash: /usr/sbin/ovs-vswitchd: no debugging data available >>> >>> root@nuv-vir-kvm-server-1 ~ # ll /var/crash/ovs/ >>> Architecture Date ExecutableTimestamp ProcCwd >>> ProcStatus UserGroups >>> CoreDump DistroRelease ProblemType >>> ProcEnviron Signal >>> CrashCounter ExecutablePath ProcCmdline ProcMaps >>> Uname >>> >>> >>> [image: Nuvula AG] <http://www.nuvula.ch/> >>> >>> Marco Kuendig / CEO / Founder >>> ma...@nuvula.ch / +41 78 751 99 71 >>> >>> Marco's Google Hangout >>> <https://plus.google.com/hangouts/_/nuvula.ch/marco> >>> >>> Nuvula AG - Hybrid Clouds >>> Weierbachstrasse 7b 8193 Eglisau Switzerland >>> http://www.nuvula.ch >>> >>> On 31 Mar 2015, at 21:45, Sabyasachi Sengupta < >>> sabyasachi.sengu...@alcatel-lucent.com> wrote: >>> >>> >>> Typically Ubuntu does not unpack the crashes. Can you try apport-unpack? >>> # apport-unpack /var/crash/<name> <crash-dir> >>> >>> On Tue, 31 Mar 2015, Marco Kuendig wrote: >>> >>> thanks Joe and Ben >>> have done: >>> 1. installed dgb symbols for kernel....doesn't help >>> 2. installed debug symbols for openvswitch >>> no change, gdb and crash still don't work for me. I'm not a dev, need >>> more help to get that backtrace done. 
>>> here some output: >>> root@nuv-vir-kvm-server-1 ~ # crash >>> /usr/lib/debug/boot/vmlinux-3.13.0-48-generic >>> /var/crash/_usr_sbin_ovs-vswitchd.0.crash >>> crash 7.0.3 >>> Copyright (C) 2002-2013 Red Hat, Inc. >>> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation >>> Copyright (C) 1999-2006 Hewlett-Packard Co >>> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited >>> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K. >>> Copyright (C) 2005, 2011 NEC Corporation >>> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc. >>> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc. >>> This program is free software, covered by the GNU General Public License, >>> and you are welcome to change it and/or distribute copies of it under >>> certain conditions. Enter "help copying" to see the conditions. >>> This program has absolutely no warranty. Enter "help warranty" for >>> details. >>> crash: /var/crash/_usr_sbin_ovs-vswitchd.0.crash: not a supported file >>> format >>> Usage: >>> >>> crash [OPTION]... NAMELIST MEMORY-IMAGE (dumpfile form) >>> crash [OPTION]... [NAMELIST] (live system form) >>> Enter "crash -h" for details. >>> root@nuv-vir-kvm-server-1 ~ # gdb /usr/sbin/ovs-vswitchd >>> /var/crash/_usr_sbin_ovs-vswitchd.0.crash >>> GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1 >>> Copyright (C) 2014 Free Software Foundation, Inc. >>> License GPLv3+: GNU GPL version 3 or later >>> <http://gnu.org/licenses/gpl.html> >>> This is free software: you are free to change and redistribute it. >>> There is NO WARRANTY, to the extent permitted by law. Type "show >>> copying" >>> and "show warranty" for details. >>> This GDB was configured as "x86_64-linux-gnu". >>> Type "show configuration" for configuration details. >>> For bug reporting instructions, please see: >>> <http://www.gnu.org/software/gdb/bugs/>. >>> Find the GDB manual and other documentation resources online at: >>> <http://www.gnu.org/software/gdb/documentation/>. >>> For help, type "help". 
>>> Type "apropos word" to search for commands related to "word"... >>> Reading symbols from /usr/sbin/ovs-vswitchd...Reading symbols from >>> /usr/lib/debug//usr/sbin/ovs-vswitchd...done. >>> done. >>> "/var/crash/_usr_sbin_ovs-vswitchd.0.crash" is not a core dump: File >>> format not recognized >>> (gdb) q >>> root@nuv-vir-kvm-server-1 ~ # >>> Nuvula AG >>> Marco Kuendig / CEO / Founder ma...@nuvula.ch / +41 78 751 99 71 >>> Marco's Google Hangout >>> Nuvula AG - Hybrid Clouds Weierbachstrasse 7b 8193 Eglisau Switzerland >>> http://www.nuvula.ch >>> >>> On 31 Mar 2015, at 19:00, Joe Stringer >>> <joestrin...@nicira.com> wrote: >>> For the 'File format not recognized' problem, you might have better >>> luck with the 'crash' utility. >>> $ crash <binary> <crashdump> >>> On 31 March 2015 at 08:16, Marco Kuendig <ma...@nuvula.ch> wrote: >>> Have tried this: >>> http://openvswitch.org/pipermail/discuss/2015-February/016582.html >>> this is the output, so doesn't seem to be correct: >>> root@nuv-vir-kvm-server-2 ~ # gdb /usr/sbin/ovs-vswitchd >>> /var/crash/_usr_sbin_ovs-vswitchd.0.crash >>> GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1 >>> Copyright (C) 2014 Free Software Foundation, Inc. >>> License GPLv3+: GNU GPL version 3 or later >>> <http://gnu.org/licenses/gpl.html> >>> This is free software: you are free to change and >>> redistribute it. >>> There is NO WARRANTY, to the extent permitted by law. Type >>> "show copying" >>> and "show warranty" for details. >>> This GDB was configured as "x86_64-linux-gnu". >>> Type "show configuration" for configuration details. >>> For bug reporting instructions, please see: >>> <http://www.gnu.org/software/gdb/bugs/>. >>> Find the GDB manual and other documentation resources online >>> at: >>> <http://www.gnu.org/software/gdb/documentation/>. >>> For help, type "help". >>> Type "apropos word" to search for commands related to >>> "word"... 
>>> Reading symbols from /usr/sbin/ovs-vswitchd...(no debugging >>> symbols found)...done. >>> "/var/crash/_usr_sbin_ovs-vswitchd.0.crash" is not a core >>> dump: File format not recognized >>> (gdb) bt >>> No stack. >>> (gdb) quit >>> any more hints please ? >>> thanks >>> marco >>> Nuvula AG >>> Marco Kuendig / CEO / Founder ma...@nuvula.ch / +41 78 751 99 71 >>> Marco's Google Hangout >>> Nuvula AG - Hybrid Clouds Weierbachstrasse 7b 8193 Eglisau Switzerland >>> http://www.nuvula.ch >>> >>> On 31 Mar 2015, at 17:00, Ben Pfaff >>> <b...@nicira.com> wrote: >>> Can you get a backtrace for these? >>> On Tue, Mar 31, 2015 at 7:09 AM, Marco Kuendig >>> <ma...@nuvula.ch> wrote: >>> Folks, >>> any chance of having somebody look at these crash >>> files ? >>> I have several servers that are loosing network >>> connectivity because of this. >>> Downloads: >>> >>> https://drive.google.com/file/d/0Bx_w1Tf2B5VSRU9yUmRpTDJLVEU/view?usp=sharing >>> Thanks for any hint or fix >>> marco >>> Nuvula AG >>> Marco Kuendig / CEO / Founder ma...@nuvula.ch / +41 78 751 99 71 >>> Marco's Google Hangout >>> Nuvula AG - Hybrid Clouds Weierbachstrasse 7b 8193 Eglisau Switzerland >>> http://www.nuvula.ch >>> _______________________________________________ >>> discuss mailing list >>> discuss@openvswitch.org >>> http://openvswitch.org/mailman/listinfo/discuss >>> -- >>> "I don't normally do acked-by's. I think it's my way >>> of avoiding >>> getting blamed when it all blows up." Andrew Morton >>> _______________________________________________ >>> discuss mailing list >>> discuss@openvswitch.org >>> http://openvswitch.org/mailman/listinfo/discuss >>> >>> >>> >> >> > > >
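For reference, the steps that eventually produced the backtrace in the quoted thread (apport-unpack on the Ubuntu .crash report, then gdb on the extracted CoreDump) can be sketched as a small shell helper. This is only a rough sketch: backtrace_from_apport, run and DRY_RUN are names made up for illustration, the paths in the comments are the ones from this thread, and it assumes apport-unpack, gdb and the openvswitch debug symbols are installed.

```shell
#!/bin/sh
# Sketch only: unpack an Ubuntu apport .crash report and print a gdb backtrace.
# backtrace_from_apport, run and DRY_RUN are hypothetical names for illustration.

# Run a command, or just print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

backtrace_from_apport() {
    if [ "$#" -ne 3 ]; then
        echo "usage: backtrace_from_apport CRASH_FILE OUT_DIR BINARY" >&2
        return 2
    fi
    crash_file=$1   # e.g. /var/crash/_usr_sbin_ovs-vswitchd.0.crash
    out_dir=$2      # e.g. /var/crash/ovs
    binary=$3       # e.g. /usr/sbin/ovs-vswitchd

    # apport-unpack splits the report into separate files, including CoreDump.
    run apport-unpack "$crash_file" "$out_dir" || return 1

    # -batch makes gdb run the -ex commands and exit; 'bt' prints the backtrace.
    run gdb -batch -ex bt "$binary" "$out_dir/CoreDump"
}
```

Setting DRY_RUN=1 before calling the function only prints the two commands, which is handy for checking the paths before touching a real crash report.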