On 06/02/2015 10:39 AM, Pavel Březina wrote:
On 05/27/2015 10:34 AM, Lukas Slebodnik wrote:
On (26/05/15 11:04), Pavel Březina wrote:
On 05/26/2015 10:57 AM, Lukas Slebodnik wrote:
On (26/05/15 10:44), Pavel Březina wrote:
On 05/25/2015 02:58 PM, Lukas Slebodnik wrote:
On (22/05/15 11:02), Pavel Březina wrote:
On 05/20/2015 02:20 PM, Lukas Slebodnik wrote:
ehlo,
attached patch fixes a crash if return code of proxy_child is
not 0.
You can reproduce it with sssd in non-root mode or
you can replace proxy_child with bash script.
I'm not sure about the name of tevent_immediate callback
so feel free to propose better name.
LS
Hi,
the whole proxy auth code does not really follow tevent style but
I think
your name is good enough.
However, I don't see why it should crash, can you
explain it in more detail please?
a) proxy_child_init_send is called
-- request with state "struct pc_init_ctx" is created
-- process is forked
-- parent register signal SIGCHLD with clallback
pc_init_sig_handler
b) child return non-zero
-- parent recieve pc_init_sig_handler
c) pc_init_sig_handler is executed
-- unused argument "__siginfo" contains NON null data
-- tevent_req_error(req, EIO) is called
-- proxy_child_init_done callback is called
-- request is released (including "struct tevent_signal" and
all related data)
-- unused argument "__siginfo" is NULL now
d) function pc_init_sig_handler finished
-- we are back in tevent code from singnal handler
-- tevent wants to do some clean-up and dereference NULL in
memset
It is not clear from back trace but the 1st argument of memset is
NULL.
(gdb) bt
#0 0x00007fd7ba400505 in memset (__len=<optimized out>,
__ch=<optimized out>, __dest=<optimized out>) at
/usr/include/bits/string3.h:84
#1 tevent_common_check_signal (ev=0x7fd7bfddf670) at
../tevent_signal.c:459
#2 0x00007fd7ba40228c in epoll_event_loop (tvalp=0x7fff85536430,
epoll_ev=0x7fd7bfddf8b0) at ../tevent_epoll.c:647
#3 epoll_event_loop_once (ev=<optimized out>, location=<optimized
out>) at ../tevent_epoll.c:926
#4 0x00007fd7ba4007d7 in std_event_loop_once (ev=0x7fd7bfddf670,
location=0x7fd7bdb417c3 "src/util/server.c:668") at
../tevent_standard.c:114
#5 0x00007fd7ba3fcfbd in _tevent_loop_once
(ev=ev@entry=0x7fd7bfddf670,
location=location@entry=0x7fd7bdb417c3 "src/util/server.c:668") at
../tevent.c:530
#6 0x00007fd7ba3fd15b in tevent_common_loop_wait
(ev=0x7fd7bfddf670, location=0x7fd7bdb417c3
"src/util/server.c:668") at ../tevent.c:634
#7 0x00007fd7ba400777 in std_event_loop_wait (ev=0x7fd7bfddf670,
location=0x7fd7bdb417c3 "src/util/server.c:668") at
../tevent_standard.c:140
#8 0x00007fd7bdb29343 in server_loop (main_ctx=0x7fd7bfde0ac0) at
src/util/server.c:668
#9 0x00007fd7be39ca42 in main (argc=8, argv=<optimized out>) at
src/providers/data_provider_be.c:2909
BTW it's easily reproducible with proxy provider. I can give you
an access
to such machine.
I hope it is clear now.
LS
Hi,
I'm sorry, but I don't think this is the problem. I'm not saying
the solution
may not be right, but the cause is different.
1) siginfo is not attached to tevent_signal context, so freeing
tevent_signal
does not do anything with siginfo
static struct tevent_sig_state {
struct tevent_common_signal_list
*sig_handlers[TEVENT_NUM_SIGNALS+1];
struct sigaction *oldact[TEVENT_NUM_SIGNALS+1];
struct tevent_sigcounter signal_count[TEVENT_NUM_SIGNALS+1];
struct tevent_sigcounter got_signal;
#ifdef SA_SIGINFO
/* with SA_SIGINFO we get quite a lot of info per signal */
siginfo_t *sig_info[TEVENT_NUM_SIGNALS+1];
struct tevent_sigcounter sig_blocked[TEVENT_NUM_SIGNALS+1];
#endif
} *sig_state;
2) it is a ring buffer, that is the reason why it is cleared in
tevent after
sighandler is finished
Yes, it is pointer to ringbuffer.
But argument "__siginfo" was valid pointer to ringbuffer before calling
tevent_req_error and after it was NULL. That's the reason why it
crashed.
Dereference of null pointer (in memset)
Yes. But the reason why it is NULL should not be that we freed
tevent_signal
from the handler (unless it is a bug in tevent).
It was cleared by tevent_signal_destructor
so it is not bug in tevent. it is by design.
IMHO it make sense. If you release tevent_signal
You do not care anymore about sig_info
Maybe it could be documented in tevent that you should not release
tevent_signal within handler if you created tevent signal with
SA_SIGINFO.
(gdb) bt
#0 0x00007f5d4d86cc74 in tevent_signal_destructor (se=0x7f5d5370f920)
at ../tevent_signal.c:213
#1 0x00007f5d4d65f233 in _talloc_free_internal () from
/lib64/libtalloc.so.2
#2 0x00007f5d4d6593a3 in _talloc_free () from /lib64/libtalloc.so.2
#3 0x00007f5d4342f3d4 in proxy_child_init_done
(subreq=0x7f5d5370f600) at src/providers/proxy/proxy_auth.c:436
#4 0x00007f5d4d86b0c2 in _tevent_req_error
(req=req@entry=0x7f5d5370f600, error=error@entry=5,
location=location@entry=0x7f5d43433010
"src/providers/proxy/proxy_auth.c:356")
at ../tevent_req.c:167
#5 0x00007f5d4342ef5e in pc_init_sig_handler (ev=<optimized out>,
sige=<optimized out>, signum=<optimized out>, count=<optimized out>,
__siginfo=<optimized out>, pvt=<optimized out>)
at src/providers/proxy/proxy_auth.c:356
#6 0x00007f5d4d86d48c in tevent_common_check_signal
(ev=0x7f5d536de670) at ../tevent_signal.c:428
#7 0x00007f5d4d86f28c in epoll_event_loop (tvalp=0x7fff7b568490,
epoll_ev=0x7f5d536de8b0) at ../tevent_epoll.c:647
#8 epoll_event_loop_once (ev=<optimized out>, location=<optimized
out>) at ../tevent_epoll.c:926
#9 0x00007f5d4d86d7d7 in std_event_loop_once (ev=0x7f5d536de670,
location=0x7f5d50faedc3 "src/util/server.c:668") at
../tevent_standard.c:114
#10 0x00007f5d4d869fbd in _tevent_loop_once
(ev=ev@entry=0x7f5d536de670, location=location@entry=0x7f5d50faedc3
"src/util/server.c:668") at ../tevent.c:530
#11 0x00007f5d4d86a15b in tevent_common_loop_wait (ev=0x7f5d536de670,
location=0x7f5d50faedc3 "src/util/server.c:668") at ../tevent.c:634
#12 0x00007f5d4d86d777 in std_event_loop_wait (ev=0x7f5d536de670,
location=0x7f5d50faedc3 "src/util/server.c:668") at
../tevent_standard.c:140
#13 0x00007f5d50f96863 in server_loop (main_ctx=0x7f5d536dfac0) at
src/util/server.c:668
#14 0x00007f5d5180aa42 in main (argc=8, argv=<optimized out>) at
src/providers/data_provider_be.c:2909
Then it a bug in tevent imho (access after free). We should bring this
to samba list.
Freeing tevent_signal in a handler is a perfectly valid operation, it is
said so even in the code.
Turned out to be a tevent bug, but it is still a good think to not use
siginfo when not needed.
https://lists.samba.org/archive/samba-technical/2015-June/107628.html
5) I'm not sure why it crashes, however we can stop using flag
SA_SIGINFO
since it is unused anyway, it should do the trick
That might be a solution too.
I tried it and it works.
6) freeing tevent_signal causes to run destructor of pc_init_ctx
which sends
kill(SIGKILL). Maybe it triggers some tevent bug?
Do you want to access to a machine to confirm wheter it's tevent bug
or no?
I do not want to dig in tevent :-)
Sure.
will do in private IRC message when you will be online :-)
It's better to double check.
LS
_______________________________________________
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
_______________________________________________
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
_______________________________________________
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/sssd-devel