OK, I checked that... The error returned by SO_ERROR is always 0.
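
For reference, the SO_ERROR check discussed here can be sketched roughly as follows. This is only a minimal sketch: the helper name pending_socket_error is hypothetical and comes neither from the thread nor from Libevent. Note that reading SO_ERROR also clears the pending error on the socket.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of Libevent): fetch the pending error
 * on a socket. Reading SO_ERROR also clears it.
 * Returns the pending errno value (0 if none), or -1 if getsockopt()
 * itself fails, in which case errno describes that failure. */
int pending_socket_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}
```

Called from the read handler before the recvfrom, a return value of 0 corresponds to the "always 0" result reported above.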
The socket is actually "alive": it would accept messages if they were sent to it. I tried changing the call to recvmsg. No change... This is what I see from strace:

..................................
[pid 24100] recvmsg(8, 0x7fff5e4803a0, MSG_PEEK) = -1 EAGAIN (Resource temporarily unavailable)
[pid 24100] epoll_wait(4, {{EPOLLERR, {u32=8, u64=8}}}, 32, 314) = 1
[pid 24100] clock_gettime(CLOCK_MONOTONIC, {254928, 717051324}) = 0
[pid 24100] gettimeofday({1374289798, 822877}, NULL) = 0
[pid 24100] recvmsg(8, 0x7fff5e4803a0, MSG_PEEK) = -1 EAGAIN (Resource temporarily unavailable)
[pid 24100] epoll_wait(4, {{EPOLLERR, {u32=8, u64=8}}}, 32, 313) = 1
[pid 24100] clock_gettime(CLOCK_MONOTONIC, {254928, 718103692}) = 0
[pid 24100] gettimeofday({1374289798, 823914}, NULL) = 0
....................................

Basically, the socket goes into a "gray" state: not dead, but not fully alive either.

I wonder whether I am seeing the results of the "new" weird Linux UDP behavior (RFC 1122) that many are complaining about, for example:

http://web.mit.edu/Ghudson/info/linux.icmp

I do not see anything like that on non-Linux *NIXes. Does that make any sense? I am trying to figure out whether it can be fixed at all.

Thanks
Oleg

On Fri, Jul 19, 2013 at 8:28 AM, Nick Mathewson <ni...@freehaven.net> wrote:

> On Fri, Jul 19, 2013 at 9:31 AM, Oleg Moskalenko <mom040...@gmail.com> wrote:
> > Thank you Azat for the suggestion. It seems to me that UDP sockets are
> > offenders, somehow it happens only in Linux (I know Linux has some weird UDP
> > behavior):
> >
> > Process 20828 attached with 5 threads - interrupt to quit
> > [pid 20831] clock_gettime(CLOCK_MONOTONIC, <unfinished ...>
> > [pid 20832] clock_gettime(CLOCK_MONOTONIC, <unfinished ...>
> > [pid 20831] <... clock_gettime resumed> {205614, 271115090}) = 0
> > [pid 20831] gettimeofday( <unfinished ...>
> > [pid 20832] <... clock_gettime resumed> {205614, 271926086}) = 0
> > [pid 20831] <... gettimeofday resumed> {1374240484, 377784}, NULL) = 0
> > [pid 20832] gettimeofday( <unfinished ...>
> > [pid 20831] epoll_wait(20, <unfinished ...>
> > [pid 20829] clock_gettime(CLOCK_MONOTONIC, <unfinished ...>
> > [pid 20830] clock_gettime(CLOCK_MONOTONIC, <unfinished ...>
> > [pid 20832] <... gettimeofday resumed> {1374240484, 378418}, NULL) = 0
> > [pid 20832] epoll_wait(16, <unfinished ...>
> > [pid 20830] <... clock_gettime resumed> {205614, 273231001}) = 0
> > [pid 20829] <... clock_gettime resumed> {205614, 272801617}) = 0
> > [pid 20829] gettimeofday( <unfinished ...>
> > [pid 20830] gettimeofday( <unfinished ...>
> > [pid 20829] <... gettimeofday resumed> {1374240484, 379094}, NULL) = 0
> > [pid 20829] epoll_wait(28, <unfinished ...>
> > [pid 20830] <... gettimeofday resumed> {1374240484, 379317}, NULL) = 0
> > [pid 20830] epoll_wait(24, <unfinished ...>
> > [pid 20828] recvfrom(8, 0x7fff61df20c0, 4, 2, 0xa9bc20, 0x7fff61df20bc) = -1 EAGAIN (Resource temporarily unavailable)
> > [pid 20828] epoll_wait(4, {{EPOLLERR, {u32=8, u64=8}}}, 32, 19) = 1
> > [pid 20828] clock_gettime(CLOCK_MONOTONIC, {205614, 277088474}) = 0
> > [pid 20828] gettimeofday({1374240484, 386338}, NULL) = 0
> > [pid 20828] recvfrom(8, 0x7fff61df20c0, 4, 2, 0xa9bc20, 0x7fff61df20bc) = -1 EAGAIN (Resource temporarily unavailable)
> > [pid 20828] epoll_wait(4, {{EPOLLERR, {u32=8, u64=8}}}, 32, 12) = 1
> > [pid 20828] clock_gettime(CLOCK_MONOTONIC, {205614, 286419826}) = 0
> > [pid 20828] gettimeofday({1374240484, 392232}, NULL) = 0
> > [pid 20828] recvfrom(8, 0x7fff61df20c0, 4, 2, 0xa9bc20, 0x7fff61df20bc) = -1
>
> Hm. So, epoll_wait is reporting EPOLLERR on fd 8. The Libevent
> epoll.c code treats EPOLLERR as (EV_READ|EV_WRITE). But when you
> recvfrom on the socket, it only says EAGAIN.
>
> So your program sensibly decides to keep listening for events on fd 8,
> and epoll keeps telling you that there was an error.
>
> Assuming that this recvfrom is in your code, I'll echo Vsevolod's
> question: what happens when you call getsockopt(...SO_ERROR...) on
> the socket in the event handler that calls the recvfrom, to see what
> the queued error is?
>
> --
> Nick
> ***********************************************************************
> To unsubscribe, send an e-mail to majord...@freehaven.net with
> unsubscribe libevent-users in the body.
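
One thing that may be worth trying for the EPOLLERR loop shown in the traces above: on Linux, when IP_RECVERR (or IPV6_RECVERR) is enabled on a UDP socket, ICMP errors are placed on a per-socket error queue that is read with recvmsg() and the MSG_ERRQUEUE flag, not with a normal recvfrom(). Whether that option is set on this socket is an assumption; nothing in the thread confirms it. The sketch below simply discards whatever is queued. The function name drain_error_queue and the buffer sizes are hypothetical choices, not Libevent API.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper (not part of Libevent): discard every entry on the
 * Linux per-socket error queue. Assumes a Linux system; MSG_ERRQUEUE is
 * Linux-specific.
 * Returns the number of queued errors discarded, or -1 on an unexpected
 * recvmsg() failure (errno is left set by recvmsg). */
int drain_error_queue(int fd)
{
    int drained = 0;

    for (;;) {
        char payload[512];   /* receives the offending datagram's payload */
        char control[512];   /* receives the ancillary error description */
        struct iovec iov;
        struct msghdr msg;

        iov.iov_base = payload;
        iov.iov_len = sizeof(payload);

        memset(&msg, 0, sizeof(msg));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = control;
        msg.msg_controllen = sizeof(control);

        if (recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT) < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return drained;      /* error queue is empty */
            if (errno == EINTR)
                continue;            /* interrupted, retry */
            return -1;               /* unexpected failure */
        }
        drained++;
    }
}
```

If this returns a non-zero count from the handler that currently sees EAGAIN, the error queue is a plausible source of the repeated EPOLLERR; if it always returns 0, the cause lies elsewhere.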