On Tue, Jun 02, 2015 at 02:40:01PM +0200, Pavel Březina wrote: > On 06/02/2015 10:39 AM, Pavel Březina wrote: > >On 05/27/2015 10:34 AM, Lukas Slebodnik wrote: > >>On (26/05/15 11:04), Pavel Březina wrote: > >>>On 05/26/2015 10:57 AM, Lukas Slebodnik wrote: > >>>>On (26/05/15 10:44), Pavel Březina wrote: > >>>>>On 05/25/2015 02:58 PM, Lukas Slebodnik wrote: > >>>>>>On (22/05/15 11:02), Pavel Březina wrote: > >>>>>>>On 05/20/2015 02:20 PM, Lukas Slebodnik wrote: > >>>>>>>>ehlo, > >>>>>>>> > >>>>>>>>attached patch fixes a crash if return code of proxy_child is > >>>>>>>>not 0. > >>>>>>>>You can reproduce it with sssd in non-root mode or > >>>>>>>>you can replace proxy_child with bash script. > >>>>>>>> > >>>>>>>>I'm not sure about the name of tevent_immediate callback > >>>>>>>>so feel free to propose better name. > >>>>>>>> > >>>>>>>>LS > >>>>>>> > >>>>>>>Hi, > >>>>>>>the whole proxy auth code does not really follow tevent style but > >>>>>>>I think > >>>>>>>your name is good enough. > >>>>>>>However, I don't see why it should crash, can you > >>>>>>>explain it in more detail please? > >>>>>> > >>>>>>a) proxy_child_init_send is called > >>>>>> -- request with state "struct pc_init_ctx" is created > >>>>>> -- process is forked > >>>>>> -- parent register signal SIGCHLD with clallback > >>>>>>pc_init_sig_handler > >>>>>>b) child return non-zero > >>>>>> -- parent recieve pc_init_sig_handler > >>>>>>c) pc_init_sig_handler is executed > >>>>>> -- unused argument "__siginfo" contains NON null data > >>>>>> -- tevent_req_error(req, EIO) is called > >>>>>> -- proxy_child_init_done callback is called > >>>>>> -- request is released (including "struct tevent_signal" and > >>>>>>all related data) > >>>>>> -- unused argument "__siginfo" is NULL now > >>>>>>d) function pc_init_sig_handler finished > >>>>>> -- we are back in tevent code from singnal handler > >>>>>> -- tevent wants to do some clean-up and dereference NULL in > >>>>>>memset > >>>>>> > >>>>>>It is not clear from back trace but the 1st argument of memset is > >>>>>>NULL. > >>>>>> > >>>>>>(gdb) bt > >>>>>>#0 0x00007fd7ba400505 in memset (__len=<optimized out>, > >>>>>>__ch=<optimized out>, __dest=<optimized out>) at > >>>>>>/usr/include/bits/string3.h:84 > >>>>>>#1 tevent_common_check_signal (ev=0x7fd7bfddf670) at > >>>>>>../tevent_signal.c:459 > >>>>>>#2 0x00007fd7ba40228c in epoll_event_loop (tvalp=0x7fff85536430, > >>>>>>epoll_ev=0x7fd7bfddf8b0) at ../tevent_epoll.c:647 > >>>>>>#3 epoll_event_loop_once (ev=<optimized out>, location=<optimized > >>>>>>out>) at ../tevent_epoll.c:926 > >>>>>>#4 0x00007fd7ba4007d7 in std_event_loop_once (ev=0x7fd7bfddf670, > >>>>>>location=0x7fd7bdb417c3 "src/util/server.c:668") at > >>>>>>../tevent_standard.c:114 > >>>>>>#5 0x00007fd7ba3fcfbd in _tevent_loop_once > >>>>>>(ev=ev@entry=0x7fd7bfddf670, > >>>>>>location=location@entry=0x7fd7bdb417c3 "src/util/server.c:668") at > >>>>>>../tevent.c:530 > >>>>>>#6 0x00007fd7ba3fd15b in tevent_common_loop_wait > >>>>>>(ev=0x7fd7bfddf670, location=0x7fd7bdb417c3 > >>>>>>"src/util/server.c:668") at ../tevent.c:634 > >>>>>>#7 0x00007fd7ba400777 in std_event_loop_wait (ev=0x7fd7bfddf670, > >>>>>>location=0x7fd7bdb417c3 "src/util/server.c:668") at > >>>>>>../tevent_standard.c:140 > >>>>>>#8 0x00007fd7bdb29343 in server_loop (main_ctx=0x7fd7bfde0ac0) at > >>>>>>src/util/server.c:668 > >>>>>>#9 0x00007fd7be39ca42 in main (argc=8, argv=<optimized out>) at > >>>>>>src/providers/data_provider_be.c:2909 > >>>>>> > >>>>>> > >>>>>>BTW it's easily reproducible with proxy provider. I can give you > >>>>>>an access > >>>>>>to such machine. > >>>>>> > >>>>>>I hope it is clear now. > >>>>>> > >>>>>>LS > >>>>> > >>>>>Hi, > >>>>>I'm sorry, but I don't think this is the problem. I'm not saying > >>>>>the solution > >>>>>may not be right, but the cause is different. > >>>>> > >>>>>1) siginfo is not attached to tevent_signal context, so freeing > >>>>>tevent_signal > >>>>>does not do anything with siginfo > >>>>> > >>>>>>static struct tevent_sig_state { > >>>>>> struct tevent_common_signal_list > >>>>>>*sig_handlers[TEVENT_NUM_SIGNALS+1]; > >>>>>> struct sigaction *oldact[TEVENT_NUM_SIGNALS+1]; > >>>>>> struct tevent_sigcounter signal_count[TEVENT_NUM_SIGNALS+1]; > >>>>>> struct tevent_sigcounter got_signal; > >>>>>>#ifdef SA_SIGINFO > >>>>>> /* with SA_SIGINFO we get quite a lot of info per signal */ > >>>>>> siginfo_t *sig_info[TEVENT_NUM_SIGNALS+1]; > >>>>>> struct tevent_sigcounter sig_blocked[TEVENT_NUM_SIGNALS+1]; > >>>>>>#endif > >>>>>>} *sig_state; > >>>>> > >>>>>2) it is a ring buffer, that is the reason why it is cleared in > >>>>>tevent after > >>>>>sighandler is finished > >>>>> > >>>>Yes, it is pointer to ringbuffer. > >>>>But argument "__siginfo" was valid pointer to ringbuffer before calling > >>>>tevent_req_error and after it was NULL. That's the reason why it > >>>>crashed. > >>>>Dereference of null pointer (in memset) > >>> > >>>Yes. But the reason why it is NULL should not be that we freed > >>>tevent_signal > >>>from the handler (unless it is a bug in tevent). > >>> > >>It was cleared by tevent_signal_destructor > >>so it is not bug in tevent. it is by design. > >> > >>IMHO it make sense. If you release tevent_signal > >>You do not care anymore about sig_info > >> > >>Maybe it could be documented in tevent that you should not release > >>tevent_signal within handler if you created tevent signal with > >>SA_SIGINFO. > >> > >> > >> > >>(gdb) bt > >>#0 0x00007f5d4d86cc74 in tevent_signal_destructor (se=0x7f5d5370f920) > >>at ../tevent_signal.c:213 > >>#1 0x00007f5d4d65f233 in _talloc_free_internal () from > >>/lib64/libtalloc.so.2 > >>#2 0x00007f5d4d6593a3 in _talloc_free () from /lib64/libtalloc.so.2 > >>#3 0x00007f5d4342f3d4 in proxy_child_init_done > >>(subreq=0x7f5d5370f600) at src/providers/proxy/proxy_auth.c:436 > >>#4 0x00007f5d4d86b0c2 in _tevent_req_error > >>(req=req@entry=0x7f5d5370f600, error=error@entry=5, > >>location=location@entry=0x7f5d43433010 > >>"src/providers/proxy/proxy_auth.c:356") > >> at ../tevent_req.c:167 > >>#5 0x00007f5d4342ef5e in pc_init_sig_handler (ev=<optimized out>, > >>sige=<optimized out>, signum=<optimized out>, count=<optimized out>, > >>__siginfo=<optimized out>, pvt=<optimized out>) > >> at src/providers/proxy/proxy_auth.c:356 > >>#6 0x00007f5d4d86d48c in tevent_common_check_signal > >>(ev=0x7f5d536de670) at ../tevent_signal.c:428 > >>#7 0x00007f5d4d86f28c in epoll_event_loop (tvalp=0x7fff7b568490, > >>epoll_ev=0x7f5d536de8b0) at ../tevent_epoll.c:647 > >>#8 epoll_event_loop_once (ev=<optimized out>, location=<optimized > >>out>) at ../tevent_epoll.c:926 > >>#9 0x00007f5d4d86d7d7 in std_event_loop_once (ev=0x7f5d536de670, > >>location=0x7f5d50faedc3 "src/util/server.c:668") at > >>../tevent_standard.c:114 > >>#10 0x00007f5d4d869fbd in _tevent_loop_once > >>(ev=ev@entry=0x7f5d536de670, location=location@entry=0x7f5d50faedc3 > >>"src/util/server.c:668") at ../tevent.c:530 > >>#11 0x00007f5d4d86a15b in tevent_common_loop_wait (ev=0x7f5d536de670, > >>location=0x7f5d50faedc3 "src/util/server.c:668") at ../tevent.c:634 > >>#12 0x00007f5d4d86d777 in std_event_loop_wait (ev=0x7f5d536de670, > >>location=0x7f5d50faedc3 "src/util/server.c:668") at > >>../tevent_standard.c:140 > >>#13 0x00007f5d50f96863 in server_loop (main_ctx=0x7f5d536dfac0) at > >>src/util/server.c:668 > >>#14 0x00007f5d5180aa42 in main (argc=8, argv=<optimized out>) at > >>src/providers/data_provider_be.c:2909 > > > >Then it a bug in tevent imho (access after free). We should bring this > >to samba list. > > > >Freeing tevent_signal in a handler is a perfectly valid operation, it is > >said so even in the code. > > Turned out to be a tevent bug, but it is still a good think to not use > siginfo when not needed. > > https://lists.samba.org/archive/samba-technical/2015-June/107628.html
Thanks for the follow up and the idea of getting rid if siginfo! _______________________________________________ sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-devel