Whilst trying to debug a probable race that causes "module reload app_queue" 
to lose a queue when the reload clashes with offering a call to an agent 
channel, I found a potential race condition that still seems to exist in trunk.  
Unfortunately, this race fails towards not marking the queue dead, rather than 
towards leaving it marked dead when it shouldn't be, so it doesn't explain our 
problem.  Nonetheless, I think it needs recording.

mark_dead_and_unfound executes the following with no lock on the queue:

                q->dead = 1;

The problem with this is that "dead" is a bit field.  In particular it shares a 
byte with "wrapped", which is a bit field that does get updated in normal 
operation.  This means the assignment actually compiles as a non-atomic 
read-modify-write of the shared byte: Load, Or, Store.  If another thread's 
update of "wrapped" spans the Store, the Store can get wiped out, leaving 
"dead" unchanged.  Similarly, this code could undo an update of "wrapped".
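
The interleaving described above can be replayed deterministically in a single 
thread.  This is only a sketch: the bit positions, the names DEAD_BIT and 
WRAPPED_BIT, and both functions are invented for illustration, and the real 
layout of the bit fields is up to the compiler.

```c
/* Assumed bit positions within the shared byte; the real ones are
 * chosen by the compiler and may differ. */
#define DEAD_BIT    0x01u
#define WRAPPED_BIT 0x02u

/* Thread A sets "dead", thread B sets "wrapped".  B's store lands last
 * and is based on a value loaded before A's store, so "dead" is lost. */
unsigned char simulate_lost_dead(void)
{
    unsigned char flags = 0;   /* the byte both bit fields live in */
    unsigned char a, b;        /* per-thread registers */

    a = flags;                 /* A: Load */
    b = flags;                 /* B: Load */
    a |= DEAD_BIT;             /* A: Or */
    b |= WRAPPED_BIT;          /* B: Or */
    flags = a;                 /* A: Store, dead is now set */
    flags = b;                 /* B: Store, wipes out A's update */
    return flags;
}

/* Same steps with the two stores swapped: now the update of "wrapped"
 * is the one that gets lost. */
unsigned char simulate_lost_wrapped(void)
{
    unsigned char flags = 0;
    unsigned char a, b;

    a = flags;                 /* A: Load */
    b = flags;                 /* B: Load */
    a |= DEAD_BIT;             /* A: Or */
    b |= WRAPPED_BIT;          /* B: Or */
    flags = b;                 /* B: Store, wrapped is now set */
    flags = a;                 /* A: Store, wipes out B's update */
    return flags;
}
```

In the first function the final byte has only the "wrapped" bit set; in the 
second, only the "dead" bit.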

Note that this failure mode does not depend on the processor reordering memory 
accesses, so memory fences won't help.  Similarly, "volatile" won't help.
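
What would help is making every read-modify-write of the shared byte mutually 
exclusive.  The following is a minimal sketch of that idea using a plain 
pthread mutex, not a patch: struct locked_queue and its functions are 
hypothetical stand-ins, and in app_queue the queue's own lock would have to 
play this role for all writers of either bit field.

```c
#include <stddef.h>
#include <pthread.h>

/* Hypothetical cut-down queue: one lock plus the two bit fields that
 * share a byte. */
struct locked_queue {
    pthread_mutex_t lock;
    unsigned int dead:1;
    unsigned int wrapped:1;
};

void locked_queue_init(struct locked_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->dead = 0;
    q->wrapped = 0;
}

/* The Load, Or, Store sequence now runs entirely inside the lock, so no
 * other locked update can be interleaved with it. */
void locked_queue_mark_dead(struct locked_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->dead = 1;
    pthread_mutex_unlock(&q->lock);
}

/* Updates of the neighbouring bit field must take the same lock,
 * otherwise the race is still there. */
void locked_queue_set_wrapped(struct locked_queue *q, int value)
{
    pthread_mutex_lock(&q->lock);
    q->wrapped = value ? 1 : 0;
    pthread_mutex_unlock(&q->lock);
}
```

An alternative, under the same caveats, would be to give "dead" a storage unit 
of its own so that it no longer shares a byte with anything that is written 
concurrently.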

The exact bit allocations will vary between our version and the trunk one.  The 
former definitely shares a byte.  Based on bit counting, I would expect the 
same for trunk.

-- 
David Woolley
BTS Holdings Plc
Tel: +44 (0)20 8401 9000 Fax: +44 (0)20 8401 9100
http://www.bts.co.uk 

BTS Holdings PLC - Registered office: BTS House, Manor Road, Wallington, SM6 
0DD - Registered in England: 1517630
