> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote: > > tags/11.7.0/main/stdtime/localtime.c, lines 395-403 > > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line395> > > > > Using a global static struct without locking is not thread safe. More > > on this below.
This should be guarded by the zonelist lock, like the list itsel and all struct state. It is only touched in kqueue_daemon(), where in the next patch all handling of sp and psx_sp is with the lock, and in add_notify() to which all code paths should be entered only with the lock held. More below. > On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote: > > tags/11.7.0/main/stdtime/localtime.c, lines 287-302 > > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line287> > > > > Having the various blocks of SP_INIT_EX/SP_FREE_EX defined around the > > code is actually a bit confusing, particularly since the structure they > > initialize/destroy already has various #defines for the various system > > options it has to support. How about having this defined immediately after > > struct state, and handling the "flavors" there: > > > > struct state { > > ... > > }; > > > > #if defined(HAVE_INOTIFY) > > #define SP_INIT_EX ... > > #define SP_FREE_EX ... > > #elif defined(HAVE_KQUEUE) > > #define SP_INIT_EX ... > > #define SP_FREE_EX ... > > #else > > #define SP_INIT_EX ... > > #define SP_FREE_EX ... > > #endif > > > > That way it's clear why this is getting defined multiple times, and the > > declaration of the macro immediately following the definition of the > > structure makes it easy to see why the various variants exist. Done. Also, names changed to SP_HEAP_{INIT,FREE}, and an additional macro SP_STACK_INIT for one instance of a temporary struct state as an automatic -- a bug I hadn't spotted before. > On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote: > > tags/11.7.0/main/stdtime/localtime.c, line 408 > > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line408> > > > > If you move kqueue_daemon_freestate(struct state *sp) above > > kqueue_daemon, then you won't need this forward declaration. I've removed the prototype; also, references to the proc are now only in the SP_HEAP_FREE macro which is only expanded below the definition. > On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote: > > tags/11.7.0/main/stdtime/localtime.c, lines 482-484 > > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line482> > > > > I'm not sure why we are printing out to stderr here, but generally > > Asterisk does not do that unless (a) the logger couldn't be created and (b) > > things have gone horribly wrong. That doesn't seem like that would > > necessarily be the case here. > > Ed Hynan wrote: > I agree, no need for a message there. Re. stderr, it's used several other > places in original code, so it seemed preferred in this context. BTW, don't the ast log functions use time formatting? Are they suitable within localtime.c? > On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote: > > tags/11.7.0/main/stdtime/localtime.c, lines 765-768 > > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line765> > > > > Generally, we tend to prefer allocations to be written as: > > > > p = ast_calloc(1, sizeof(*p)); > > if (!p) { > > /* Handle memory error */ > > } > > > > It generally prevents paranthesis errors, and newlines are relatively > > inexpensive :-) Done. > On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote: > > tags/11.7.0/main/stdtime/localtime.c, lines 775-778 > > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line775> > > > > Since ast_free is NULL tolerant, I would go ahead and make SP_FREE_EX > > NULL tolerant as well. That makes this: > > > > static void sstate_free(struct state *p) > > { > > SP_FREE_EX(p); > > ast_free(p); > > } Done. > On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote: > > tags/11.7.0/main/stdtime/localtime.c, lines 1609-1610 > > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line1609> > > > > This shouldn't be necessary. Called in a loop, AST_LIST_REMOVE_HEAD > > will: > > > > (1) Remove items until (cur = head->first) == NULL. That implies > > head->first is NULL when complete. > > (2) Will set head->last = NULL when the (cur == head->last), which > > should happen on the last loop iteration. > > > > Since AST_LIST_HEAD_INIT_NOLOCK simply sets both of those to NULL, it > > is redundant with the final loop state. Removed. Had been added with a poor picture of the cause of corruption, and seemed to do the trick, but only due to coincidence with other changes. Works without it. > On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote: > > tags/11.7.0/main/stdtime/localtime.c, lines 566-587 > > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line566> > > > > This is not actually thread safe. You cannot guarantee that two > > threads, at the same time, won't attempt to test and allocate the structure > > and assign it to psx_sp. > > > > What's more, the initialization of localtime happens very early in the > > Asterisk startup, and has some very interesting race conditions with > > forking/threading. I would definitely protect initialization of the psx_sp > > pointer with a lock. > > Ed Hynan wrote: > Good eyes. I proceeded thinking calls to tzload() (which calls > aadd_notify()) were guarded by AST_LIST_LOCK(&zonelist) but another look now > shows a place where tzload()(and tzparse() and gmtload()) are called between > an unlock and another lock (ast_tzset() line 1451 unpatched). Yet other > places make the calls while the lock is held. > > This is an extra complication. There was one 'hole' where code paths to add_notify() were entered without the zonelist lock: ast_tzset() was unlocking before tzparse/tzload/gmtload and re-locking after. That is very likely the source of corruption that had been causing error returns leading to repeated tzload("posixrules"). In the revised patch, the lock is held throughout ast_tzset(). I've studied the code, and checked with GNU cflow, to be sure paths to add_notify() are protected. Also, if you try the revised test program (against 11.7.0 source), have a look at the logs it makes. The test really hammers localtime.c with threads. Build with "make hack", run ./run_test.sh, and after a long run, try grep -E '(BAD|TZSET)' /tmp/testlocaltimed.log (or view logs from menu) -- there should be no "BAD" but *many* "TZSET" showing how much the lock is needed. In addition to ensuring the relevant code paths have the lock, the use of psx_sp within add_notify() is now guarded with AST_LIST_TRYLOCK(&zonelist); which should be redundant, but better safe than sorry. (It was only yesterday I noticed AST mutexes are set RECURSIVE, and that misunderstanding made me check the code even more, which is not a bad thing.) - Ed ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviewboard.asterisk.org/r/4450/#review14553 ----------------------------------------------------------- On March 3, 2015, 2:47 p.m., Ed Hynan wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviewboard.asterisk.org/r/4450/ > ----------------------------------------------------------- > > (Updated March 3, 2015, 2:47 p.m.) > > > Review request for Asterisk Developers. > > > Bugs: ASTERISK-24739 > https://issues.asterisk.org/jira/browse/ASTERISK-24739 > > > Repository: Asterisk > > > Description > ------- > > File descriptor leak on kqueue(2) systems (found on OpenBSD 5.5) on > main/stdtime/localtime.c -- TZ data files change watch code within the time > functions. (Issue at https://issues.asterisk.org/jira/browse/ASTERISK-24739.) > > > Diffs > ----- > > tags/11.7.0/main/stdtime/localtime.c 432443 > > Diff: https://reviewboard.asterisk.org/r/4450/diff/ > > > Testing > ------- > > Patch initially developed against OpenBSD 5.5 ports package sources, and > Asterisk in use. Subsequently developed test program that can run original or > patched 11.7.0 code. > > > File Attachments > ---------------- > > localtime.c (11.7.0) test program > > https://reviewboard.asterisk.org/media/uploaded/files/2015/02/25/f7a19f2a-4a8a-4aff-b7d6-44a4146c358c__patchtest.tgz > > > Thanks, > > Ed Hynan > >
-- _____________________________________________________________________ -- Bandwidth and Colocation Provided by http://www.api-digital.com -- asterisk-dev mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-dev