On Fri, Jan 13, 2012 at 11:46 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> Adding some assertions in event_add_internal might track this down.
>> Trivially, you could do
>>    if (tv && !tv_is_absolute) {
>>       /* waiting one billion seconds should be enough for anyone */
>>       EVUTIL_ASSERT(tv->tv_sec < 1000000000);
>>    }
>>
>> to try to detect 1 and 2.
>
> Interesting. The above code never tripped, so I dug a little further and
> found that event_add_internal is never being called with a tv value that is
> large. I did find it to be a race condition - sometimes the code completes
> and exits before I get the error condition report.
>
> The timeout value clearly isn't a garbage value - I dumped the values out,
> compared to current time as of the error:
>
> [warn] select: Invalid argument
> TV OUT OF SPEC AT CNT 2: value 1326472513:976848 curtime 1326472513:977043
> Ralph
> [warn] select: Invalid argument
> TV OUT OF SPEC AT CNT 3: value 1326472513:977327 curtime 1326472513:977413
>
> So the value is getting updated and appears valid. What's strange is why
> libevent is passing an absolute time to select as it is supposed to be a
> relative value per the man page:

Right.

[...]
> Any easy way I can output an identifier that would tell us something about
> which event is involved? I see that I'm not getting output from the
> event_debug calls in the code, even though I've configured with debug enabled
> and called:
>
> event_enable_debug_mode();
> event_set_debug_output(1);
>
> Anything else required to get that output? Would it help?

Maybe. I'd look at adding debugging logs or printfs to event_base_loop
and timeout_next() and timeout_correct(): Those are the ones that
determine the value of the timeout to be passed to evsel->dispatch.

The event that's getting the weird value is set by
    ev = min_heap_top(&base->timeheap);
in timeout_next.

--
Nick
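
For reference, here is a rough standalone sketch of the relative-versus-absolute
distinction discussed above. It is not libevent code: the helper names and the
MAX_SANE_TIMEOUT_SEC bound are made up for illustration (the bound is just the
one-billion-second figure from the quoted assertion). It only shows how a
printf-style check could flag a timeval that looks like wall-clock time, and
what a deadline converted to a relative timeout would look like.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical bound, taken from the assertion quoted in the thread. */
#define MAX_SANE_TIMEOUT_SEC 1000000000L

/* Returns 1 if tv is plausible as a relative timeout for select(),
 * 0 if it looks like an absolute time or is malformed.
 * NULL means "block indefinitely" and is accepted. */
static int timeout_looks_relative(const struct timeval *tv)
{
    if (tv == NULL)
        return 1;
    if (tv->tv_sec < 0 || tv->tv_usec < 0 || tv->tv_usec >= 1000000)
        return 0;
    return tv->tv_sec < MAX_SANE_TIMEOUT_SEC;
}

/* Debug-build check in the spirit of the EVUTIL_ASSERT above. */
static void check_timeout(const struct timeval *tv)
{
    assert(timeout_looks_relative(tv));
}

/* Converts an absolute deadline into a relative timeout measured
 * from now; a deadline already in the past yields zero. */
static struct timeval relative_from_deadline(const struct timeval *deadline,
                                             const struct timeval *now)
{
    struct timeval rel;

    rel.tv_sec = deadline->tv_sec - now->tv_sec;
    rel.tv_usec = deadline->tv_usec - now->tv_usec;
    if (rel.tv_usec < 0) {
        /* Borrow one second to keep tv_usec in [0, 1000000). */
        rel.tv_sec -= 1;
        rel.tv_usec += 1000000;
    }
    if (rel.tv_sec < 0) {
        rel.tv_sec = 0;
        rel.tv_usec = 0;
    }
    return rel;
}
```

Fed the first pair from the dump (value 1326472513:976848, curtime
1326472513:977043), timeout_looks_relative() rejects the raw value, while
relative_from_deadline() clamps it to zero because the deadline is already
195 microseconds in the past.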