Hi Martin,

Many thanks for your explanation. Now I understand why logging must be
protected by the same mutex that is used by the other listener events in
bus.c. It ensures that while linked_list_t is enumerating the list and
invoking the log_cb callback, no other thread can remove nodes from the
linked list.

You suggested disabling the updown plugin or replacing the mutex in the
kernel interface. But the way I see it, this deadlock has a much wider
scope. E.g. if the kernel interface allows multiple readers, that will
fix this particular incident, but deadlock may still occur in other
places.

There are 15 mutexes in strongSwan 4.3.2. The moment one of these
listener events in bus.c locks bus_t.mutex, other threads may be holding
one of the other mutexes. These other threads cannot finish their tasks
and release the mutexes they are holding, because every step of the code
involves logging, and logging needs bus_t.mutex.

Now if the listeners try to acquire any of those other mutexes, we have
a deadlock.
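
To make the pattern concrete, here is a minimal sketch of the inversion,
assuming plain POSIX threads. It is not strongSwan code: bus_mutex and
kernel_mutex are hypothetical stand-ins for private_bus_t->mutex and
private_kernel_netlink_net_t->mutex, and it uses trylock instead of lock
so that the demo reports the conflict rather than hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for private_bus_t->mutex and
 * private_kernel_netlink_net_t->mutex; this is not strongSwan code. */
static pthread_mutex_t bus_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t kernel_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t step;   /* keeps the two threads in lockstep */
static int worker_rc;            /* what "thread 4" got for bus_mutex */

/* Plays thread 4: owns the kernel mutex, then wants to log. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&kernel_mutex);
    pthread_barrier_wait(&step);            /* both mutexes are now held */
    worker_rc = pthread_mutex_trylock(&bus_mutex);
    pthread_barrier_wait(&step);            /* both attempts are done */
    if (worker_rc == 0)
        pthread_mutex_unlock(&bus_mutex);
    pthread_mutex_unlock(&kernel_mutex);
    return NULL;
}

/* Plays thread 9: owns the bus mutex, then calls into the kernel
 * interface. Returns 1 if both threads were refused, i.e. a blocking
 * lock() at those two points would never return. */
int demo_lock_inversion(void)
{
    pthread_t t;
    int listener_rc, rc;

    pthread_barrier_init(&step, NULL, 2);
    pthread_mutex_lock(&bus_mutex);
    rc = pthread_create(&t, NULL, worker, NULL);
    assert(rc == 0);
    pthread_barrier_wait(&step);            /* worker holds kernel_mutex */
    listener_rc = pthread_mutex_trylock(&kernel_mutex);
    pthread_barrier_wait(&step);
    if (listener_rc == 0)
        pthread_mutex_unlock(&kernel_mutex);
    pthread_mutex_unlock(&bus_mutex);
    pthread_join(t, NULL);
    pthread_barrier_destroy(&step);
    return listener_rc == EBUSY && worker_rc == EBUSY;
}
```

Calling demo_lock_inversion() should return 1: each thread holds the
mutex the other one needs. The same shape can arise with any of the
other mutexes, which is why I think the problem is wider than the kernel
interface.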

Perhaps the loggers should be kept in their own linked list, separate
from the dynamic listeners?
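
Roughly what I have in mind, as a sketch only and under my own
assumptions: sketch_bus_t, add_logger, bus_log and the fixed-size array
are hypothetical names for illustration, not the existing bus_t API. The
loggers get their own lock, so a thread that only wants to log never
waits on the mutex that protects the dynamic listeners.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>

#define MAX_LOGGERS 4

typedef void (*log_cb_t)(const char *msg);

/* Hypothetical bus layout: loggers are split from dynamic listeners. */
typedef struct {
    pthread_mutex_t listener_mutex;  /* guards dynamic listeners (not shown) */
    pthread_rwlock_t logger_lock;    /* guards only the logger array */
    log_cb_t loggers[MAX_LOGGERS];
    size_t logger_count;
} sketch_bus_t;

static sketch_bus_t bus = {
    .listener_mutex = PTHREAD_MUTEX_INITIALIZER,
    .logger_lock = PTHREAD_RWLOCK_INITIALIZER,
};

static int logged;  /* only one thread logs in this demo */
static void counting_logger(const char *msg) { (void)msg; logged++; }

/* Register a logger; returns 1 on success, 0 if the array is full. */
static int add_logger(log_cb_t cb)
{
    int ok = 0;
    pthread_rwlock_wrlock(&bus.logger_lock);
    if (bus.logger_count < MAX_LOGGERS) {
        bus.loggers[bus.logger_count++] = cb;
        ok = 1;
    }
    pthread_rwlock_unlock(&bus.logger_lock);
    return ok;
}

/* Logging takes only the logger lock, never listener_mutex. */
static void bus_log(const char *msg)
{
    size_t i;
    pthread_rwlock_rdlock(&bus.logger_lock);
    for (i = 0; i < bus.logger_count; i++)
        bus.loggers[i](msg);
    pthread_rwlock_unlock(&bus.logger_lock);
}

static void *logging_thread(void *arg)
{
    bus_log((const char *)arg);
    return NULL;
}

/* Holds listener_mutex while another thread logs; returns the number
 * of messages that reached the logger. */
int demo_log_while_listeners_locked(void)
{
    pthread_t t;
    add_logger(counting_logger);
    pthread_mutex_lock(&bus.listener_mutex);
    if (pthread_create(&t, NULL, logging_thread, "added payload") == 0)
        pthread_join(t, NULL);
    pthread_mutex_unlock(&bus.listener_mutex);
    return logged;
}
```

In this sketch the logging thread finishes even though listener_mutex is
held the whole time. It only helps as long as the logger callbacks
themselves do not take other mutexes, which I am assuming here.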

Thanks again for your help.
Simon



________________________________
 From: Martin Willi <[email protected]>
To: Simon Chan <[email protected]> 
Cc: "[email protected]" <[email protected]> 
Sent: Tuesday, December 6, 2011 12:45:05 AM
Subject: Re: [strongSwan] what could cause strongswan 4.3.2 to freeze up
 
Hi,

> If the above is all the mutex is trying to protect, then it would make
> my change simpler.

No, the important part is to protect the list of registered listeners.
There are not only loggers, but other dynamically registered listeners
that use this interface. Invoking a listener function while it gets
unregistered is problematic.

> The backtrace indicates the following events:
> - thread 9 passed through child_state_change() in bus.c:406 and owned
>   private_bus_t->mutex
> - thread 4 passed through build_address_list() in ike_mobike.c, called
>   kernel_interface->create_address_enumerator and owned
>   private_kernel_netlink_net_t->mutex
> - thread 9 invoked kernel_interface->get_interface and has to wait for
>   the private_kernel_netlink_net_t->mutex
> - thread 4 called DBG2(DBG_ENC, "added payload of type %N to message",...)
>   and has to wait for private_bus_t->mutex

I see. One option you have is to disable the updown plugin if you don't
need it. It is probably the only listener that locks the kernel
interface, disabling it will probably solve your issue.

A second option is to replace the kernel-interface mutex with a
read/write-lock, allowing multiple readers to get the interface list and
name. Shouldn't be too hard.
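
For illustration only, the second option could look roughly like this
with POSIX read/write locks. This is a sketch under assumptions:
iface_lock and demo_shared_readers are hypothetical names, and
strongSwan's own locking wrappers are not used here.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical lock guarding the interface list and names. */
static pthread_rwlock_t iface_lock = PTHREAD_RWLOCK_INITIALIZER;
static int second_reader_rc;

/* A second thread that only wants to read the interface list. */
static void *second_reader(void *arg)
{
    (void)arg;
    second_reader_rc = pthread_rwlock_tryrdlock(&iface_lock);
    if (second_reader_rc == 0)
        pthread_rwlock_unlock(&iface_lock);
    return NULL;
}

/* Returns 1 if a second reader got in while the first reader still
 * held the lock, and a writer was refused during that time. */
int demo_shared_readers(void)
{
    pthread_t t;
    int writer_rc;

    pthread_rwlock_rdlock(&iface_lock);      /* first reader */
    second_reader_rc = -1;
    if (pthread_create(&t, NULL, second_reader, NULL) == 0)
        pthread_join(t, NULL);
    writer_rc = pthread_rwlock_trywrlock(&iface_lock);
    if (writer_rc == 0)
        pthread_rwlock_unlock(&iface_lock);  /* not expected to happen */
    pthread_rwlock_unlock(&iface_lock);      /* release the read lock */
    return second_reader_rc == 0 && writer_rc == EBUSY;
}
```

Readers no longer exclude each other, so a listener that only reads the
interface list is not blocked by another reader; writers still get
exclusive access.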

The third option, making bus_t invoke listener functions in parallel,
would be preferable, but it is very difficult to implement properly.
That's what we should target for the long term.

Regards
Martin
_______________________________________________
Users mailing list
[email protected]
https://lists.strongswan.org/mailman/listinfo/users
