On Tue, Dec 13, 2016 at 9:52 AM, Felix Fietkau <n...@nbd.name> wrote: > On 2016-12-13 08:41, Alexandru Ardelean wrote: >> Every once in a while, we'll get stacktrace: >> ``` >> (gdb) bt >> \#0 0xb7bc4668 in _list_del (entry=0x10015688 <conn+116>) at >> /home/sandu/work/Wrt/openwrt/build_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/libubox-2015-06-29/list.h:83 >> \#1 list_del (entry=0x10015688 <conn+116>) at >> /home/sandu/work/Wrt/openwrt/build_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/libubox-2015-06-29/list.h:90 >> \#2 uloop_timeout_cancel (timeout=timeout@entry=0x10015688 <conn+116>) at >> /home/sandu/work/Wrt/openwrt/build_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/libubox-2015-06-29/uloop.c:474 >> \#3 0x10003794 in ubus_auto_shutdown (conn=0x10015614 <conn>) at >> /home/sandu/work/Wrt/openwrt/staging_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/usr/include/libubus.h:249 >> \#4 module_ubus_fini () at ubus.c:64 >> \#5 0x100013fc in main (argc=<optimized out>, argv=<optimized out>) at >> module.c:128 >> ``` >> >> In our code, `ubus_auto_connect()` is called, then due to some logic, >> the module has to quickly stop, calling ubus_auto_shutdown(). >> >> It seems that there is case in where the `timer` timeout, is not yet >> registered with the internal list of timeouts from uloop, which >> ends up trying to delete an invalid list. >> >> So, one solution is to init the list element of the `timer` of >> the `ubus_auto_conn` struct. >> >> Signed-off-by: Alexandru Ardelean <ardeleana...@gmail.com> > I think this patch is a workaround that simply hides the real issue. > uloop_timeout_cancel will only call list_del if the timer 'pending' flag > is set. > > It seems to me that you might be using ubus_auto_connect on > uninitialized memory. If that's the case, you might run into issues in > other places as well, and you should just add a memset to your application. > > - Felix
Hmmm, you're right. It is only one module I'm seeing this in. Maybe it's some weird memory corruption, or race ; since that ubus_auto_conn struct is init-ed to 0 [since it's static]. Will dig deeper into that. Sorry for the noise. _______________________________________________ Lede-dev mailing list Lede-dev@lists.infradead.org http://lists.infradead.org/mailman/listinfo/lede-dev