> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 395-403
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line395>
> >
> >     Using a global static struct without locking is not thread safe. More 
> > on this below.

This should be guarded by the zonelist lock, like the list itsel and all struct 
state. It is only touched in kqueue_daemon(), where in the next patch all 
handling of sp and psx_sp is with the lock, and in add_notify() to which all 
code paths should be entered only with the lock held. More below.


> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 287-302
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line287>
> >
> >     Having the various blocks of SP_INIT_EX/SP_FREE_EX defined around the 
> > code is actually a bit confusing, particularly since the structure they 
> > initialize/destroy already has various #defines for the various system 
> > options it has to support. How about having this defined immediately after 
> > struct state, and handling the "flavors" there:
> >     
> >     struct state {
> >         ...
> >     };
> >     
> >     #if defined(HAVE_INOTIFY)
> >     #define SP_INIT_EX ...
> >     #define SP_FREE_EX ...
> >     #elif defined(HAVE_KQUEUE)
> >     #define SP_INIT_EX ...
> >     #define SP_FREE_EX ...
> >     #else
> >     #define SP_INIT_EX ...
> >     #define SP_FREE_EX ...
> >     #endif
> >     
> >     That way it's clear why this is getting defined multiple times, and the 
> > declaration of the macro immediately following the definition of the 
> > structure makes it easy to see why the various variants exist.

Done. Also, names changed to SP_HEAP_{INIT,FREE}, and an additional macro 
SP_STACK_INIT for one instance of a temporary struct state as an automatic -- a 
bug I hadn't spotted before.


> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, line 408
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line408>
> >
> >     If you move kqueue_daemon_freestate(struct state *sp) above 
> > kqueue_daemon, then you won't need this forward declaration.

I've removed the prototype; also, references to the proc are now only in the 
SP_HEAP_FREE macro which is only expanded below the definition.


> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 482-484
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line482>
> >
> >     I'm not sure why we are printing out to stderr here, but generally 
> > Asterisk does not do that unless (a) the logger couldn't be created and (b) 
> > things have gone horribly wrong. That doesn't seem like that would 
> > necessarily be the case here.
> 
> Ed Hynan wrote:
>     I agree, no need for a message there. Re. stderr, it's used several other 
> places in original code, so it seemed preferred in this context.

BTW, don't the ast log functions use time formatting? Are they suitable within 
localtime.c?


> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 765-768
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line765>
> >
> >     Generally, we tend to prefer allocations to be written as:
> >     
> >     p = ast_calloc(1, sizeof(*p));
> >     if (!p) {
> >         /* Handle memory error */
> >     }
> >     
> >     It generally prevents paranthesis errors, and newlines are relatively 
> > inexpensive :-)

Done.


> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 775-778
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line775>
> >
> >     Since ast_free is NULL tolerant, I would go ahead and make SP_FREE_EX 
> > NULL tolerant as well. That makes this:
> >     
> >     static void sstate_free(struct state *p)
> >     {
> >         SP_FREE_EX(p);
> >         ast_free(p);
> >     }

Done.


> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 1609-1610
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line1609>
> >
> >     This shouldn't be necessary. Called in a loop, AST_LIST_REMOVE_HEAD 
> > will:
> >     
> >     (1) Remove items until (cur = head->first) == NULL. That implies 
> > head->first is NULL when complete.
> >     (2) Will set head->last = NULL when the (cur == head->last), which 
> > should happen on the last loop iteration.
> >     
> >     Since AST_LIST_HEAD_INIT_NOLOCK simply sets both of those to NULL, it 
> > is redundant with the final loop state.

Removed. Had been added with a poor picture of the cause of corruption, and 
seemed to do the trick, but only due to coincidence with other changes. Works 
without it.


> On Feb. 25, 2015, 6:15 p.m., Matt Jordan wrote:
> > tags/11.7.0/main/stdtime/localtime.c, lines 566-587
> > <https://reviewboard.asterisk.org/r/4450/diff/1/?file=71624#file71624line566>
> >
> >     This is not actually thread safe. You cannot guarantee that two 
> > threads, at the same time, won't attempt to test and allocate the structure 
> > and assign it to psx_sp.
> >     
> >     What's more, the initialization of localtime happens very early in the 
> > Asterisk startup, and has some very interesting race conditions with 
> > forking/threading. I would definitely protect initialization of the psx_sp 
> > pointer with a lock.
> 
> Ed Hynan wrote:
>     Good eyes. I proceeded thinking calls to tzload() (which calls 
> aadd_notify()) were guarded by AST_LIST_LOCK(&zonelist) but another look now 
> shows a place where tzload()(and tzparse() and gmtload()) are called between 
> an unlock and another lock (ast_tzset() line 1451 unpatched).  Yet other 
> places make the calls while the lock is held.
>     
>     This is an extra complication.

There was one 'hole' where code paths to add_notify() were entered without the 
zonelist lock: ast_tzset() was unlocking before tzparse/tzload/gmtload and 
re-locking after. That is very likely the source of corruption that had been 
causing error returns leading to repeated tzload("posixrules").

In the revised patch, the lock is held throughout ast_tzset().  I've studied 
the code, and checked with GNU cflow, to be sure paths to add_notify() are 
protected. Also, if you try the revised test program (against 11.7.0 source), 
have a look at the logs it makes. The test really hammers localtime.c with 
threads. Build with "make hack", run ./run_test.sh, and after a long run, try 
grep -E '(BAD|TZSET)' /tmp/testlocaltimed.log (or view logs from menu) -- there 
should be no "BAD" but *many* "TZSET" showing how much the lock is needed.

In addition to ensuring the relevant code paths have the lock, the use of 
psx_sp within add_notify() is now guarded with AST_LIST_TRYLOCK(&zonelist); 
which should be redundant, but better safe than sorry. (It was only yesterday I 
noticed AST mutexes are set RECURSIVE, and that misunderstanding made me check 
the code even more, which is not a bad thing.)


- Ed


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/4450/#review14553
-----------------------------------------------------------


On March 3, 2015, 2:47 p.m., Ed Hynan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/4450/
> -----------------------------------------------------------
> 
> (Updated March 3, 2015, 2:47 p.m.)
> 
> 
> Review request for Asterisk Developers.
> 
> 
> Bugs: ASTERISK-24739
>     https://issues.asterisk.org/jira/browse/ASTERISK-24739
> 
> 
> Repository: Asterisk
> 
> 
> Description
> -------
> 
> File descriptor leak on kqueue(2) systems (found on OpenBSD 5.5) on 
> main/stdtime/localtime.c -- TZ data files change watch code within the time 
> functions. (Issue at https://issues.asterisk.org/jira/browse/ASTERISK-24739.)
> 
> 
> Diffs
> -----
> 
>   tags/11.7.0/main/stdtime/localtime.c 432443 
> 
> Diff: https://reviewboard.asterisk.org/r/4450/diff/
> 
> 
> Testing
> -------
> 
> Patch initially developed against OpenBSD 5.5 ports package sources, and 
> Asterisk in use. Subsequently developed test program that can run original or 
> patched 11.7.0 code.
> 
> 
> File Attachments
> ----------------
> 
> localtime.c (11.7.0) test program
>   
> https://reviewboard.asterisk.org/media/uploaded/files/2015/02/25/f7a19f2a-4a8a-4aff-b7d6-44a4146c358c__patchtest.tgz
> 
> 
> Thanks,
> 
> Ed Hynan
> 
>

-- 
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --

asterisk-dev mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-dev

Reply via email to