Re: [LEDE-DEV] [PATCH] libubus: initialize list member for ubus_auto_conn's timer
On Tue, Dec 13, 2016 at 9:52 AM, Felix Fietkauwrote: > On 2016-12-13 08:41, Alexandru Ardelean wrote: >> Every once in a while, we'll get stacktrace: >> ``` >> (gdb) bt >> \#0 0xb7bc4668 in _list_del (entry=0x10015688 ) at >> /home/sandu/work/Wrt/openwrt/build_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/libubox-2015-06-29/list.h:83 >> \#1 list_del (entry=0x10015688 ) at >> /home/sandu/work/Wrt/openwrt/build_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/libubox-2015-06-29/list.h:90 >> \#2 uloop_timeout_cancel (timeout=timeout@entry=0x10015688 ) at >> /home/sandu/work/Wrt/openwrt/build_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/libubox-2015-06-29/uloop.c:474 >> \#3 0x10003794 in ubus_auto_shutdown (conn=0x10015614 ) at >> /home/sandu/work/Wrt/openwrt/staging_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/usr/include/libubus.h:249 >> \#4 module_ubus_fini () at ubus.c:64 >> \#5 0x100013fc in main (argc=, argv=) at >> module.c:128 >> ``` >> >> In our code, `ubus_auto_connect()` is called, then due to some logic, >> the module has to quickly stop, calling ubus_auto_shutdown(). >> >> It seems that there is case in where the `timer` timeout, is not yet >> registered with the internal list of timeouts from uloop, which >> ends up trying to delete an invalid list. >> >> So, one solution is to init the list element of the `timer` of >> the `ubus_auto_conn` struct. >> >> Signed-off-by: Alexandru Ardelean > I think this patch is a workaround that simply hides the real issue. > uloop_timeout_cancel will only call list_del if the timer 'pending' flag > is set. > > It seems to me that you might be using ubus_auto_connect on > uninitialized memory. If that's the case, you might run into issues in > other places as well, and you should just add a memset to your application. > > - Felix Hmmm, you're right. It is only one module I'm seeing this in. Maybe it's some weird memory corruption, or race ; since that ubus_auto_conn struct is init-ed to 0 [since it's static]. Will dig deeper into that. Sorry for the noise. ___ Lede-dev mailing list Lede-dev@lists.infradead.org http://lists.infradead.org/mailman/listinfo/lede-dev
Re: [LEDE-DEV] [PATCH] libubus: initialize list member for ubus_auto_conn's timer
On 2016-12-13 08:41, Alexandru Ardelean wrote: > Every once in a while, we'll get stacktrace: > ``` > (gdb) bt > \#0 0xb7bc4668 in _list_del (entry=0x10015688) at > /home/sandu/work/Wrt/openwrt/build_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/libubox-2015-06-29/list.h:83 > \#1 list_del (entry=0x10015688 ) at > /home/sandu/work/Wrt/openwrt/build_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/libubox-2015-06-29/list.h:90 > \#2 uloop_timeout_cancel (timeout=timeout@entry=0x10015688 ) at > /home/sandu/work/Wrt/openwrt/build_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/libubox-2015-06-29/uloop.c:474 > \#3 0x10003794 in ubus_auto_shutdown (conn=0x10015614 ) at > /home/sandu/work/Wrt/openwrt/staging_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/usr/include/libubus.h:249 > \#4 module_ubus_fini () at ubus.c:64 > \#5 0x100013fc in main (argc=, argv=) at > module.c:128 > ``` > > In our code, `ubus_auto_connect()` is called, then due to some logic, > the module has to quickly stop, calling ubus_auto_shutdown(). > > It seems that there is case in where the `timer` timeout, is not yet > registered with the internal list of timeouts from uloop, which > ends up trying to delete an invalid list. > > So, one solution is to init the list element of the `timer` of > the `ubus_auto_conn` struct. > > Signed-off-by: Alexandru Ardelean I think this patch is a workaround that simply hides the real issue. uloop_timeout_cancel will only call list_del if the timer 'pending' flag is set. It seems to me that you might be using ubus_auto_connect on uninitialized memory. If that's the case, you might run into issues in other places as well, and you should just add a memset to your application. - Felix ___ Lede-dev mailing list Lede-dev@lists.infradead.org http://lists.infradead.org/mailman/listinfo/lede-dev
[LEDE-DEV] [PATCH] libubus: initialize list member for ubus_auto_conn's timer
Every once in a while, we'll get stacktrace: ``` (gdb) bt \#0 0xb7bc4668 in _list_del (entry=0x10015688) at /home/sandu/work/Wrt/openwrt/build_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/libubox-2015-06-29/list.h:83 \#1 list_del (entry=0x10015688 ) at /home/sandu/work/Wrt/openwrt/build_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/libubox-2015-06-29/list.h:90 \#2 uloop_timeout_cancel (timeout=timeout@entry=0x10015688 ) at /home/sandu/work/Wrt/openwrt/build_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/libubox-2015-06-29/uloop.c:474 \#3 0x10003794 in ubus_auto_shutdown (conn=0x10015614 ) at /home/sandu/work/Wrt/openwrt/staging_dir/target-powerpc_8540_uClibc-0.9.33.2_ewok/usr/include/libubus.h:249 \#4 module_ubus_fini () at ubus.c:64 \#5 0x100013fc in main (argc=, argv=) at module.c:128 ``` In our code, `ubus_auto_connect()` is called, then due to some logic, the module has to quickly stop, calling ubus_auto_shutdown(). It seems that there is case in where the `timer` timeout, is not yet registered with the internal list of timeouts from uloop, which ends up trying to delete an invalid list. So, one solution is to init the list element of the `timer` of the `ubus_auto_conn` struct. Signed-off-by: Alexandru Ardelean --- libubus.c | 1 + 1 file changed, 1 insertion(+) diff --git a/libubus.c b/libubus.c index 8163ff7..faa30d3 100644 --- a/libubus.c +++ b/libubus.c @@ -337,6 +337,7 @@ static void ubus_auto_connect_cb(struct uloop_timeout *timeout) void ubus_auto_connect(struct ubus_auto_conn *conn) { conn->timer.cb = ubus_auto_connect_cb; + INIT_LIST_HEAD(>timer.list); ubus_auto_connect_cb(>timer); } -- 2.6.6 ___ Lede-dev mailing list Lede-dev@lists.infradead.org http://lists.infradead.org/mailman/listinfo/lede-dev