On Mon, Aug 4, 2014 at 8:21 AM, Mindaugas Rasiukevicius <rm...@netbsd.org> wrote:
> Ryota Ozaki <ozak...@netbsd.org> wrote:
>> Hi,
>>
>> This is another work toward MPSAFE networking.
>> sys/net/if.c contains several global variables
>> (e.g., ifnet_list) that should be protected
>> from parallel accesses somehow once we get rid
>> of the big kernel lock.
>>
>> Currently there is a mutex for serializing
>> accesses to index_gen, however, that's not
>> enough; we have to protect the other variables
>> too.
>>
>> The global variables are read-mostly, so I
>> replace the mutex with a rwlock and use it
>> for all. Unfortunately, ifnet_list may be
>> accessed from interrupt context (only read
>> though) so that I add a spin mutex for it;
>> we hold the mutex when we modify ifnet_list
>> as well as the rwlock.
>>
>> <...>
>
> I generally agree with Dennis that is not the way we want to take in
> the long-term. The cost of read-write lock is very high. The plan
> is to use passive serialisation to protect the interfaces and their
> addresses. Also, the ultimate goal would also be to use a better
> data structure (linked lists are not really efficient) and change the
> way interfaces are referenced i.e. instead of referencing ifnet_t,
> the network stack should use a unique ID.

I have no objection to the direction. My concern is the intermediate
solution.

> Note that the code paths
> looking up the interface or its address(es) should not block (if they
> do, the code can be rearranged).

Some code under sys/compat can block while iterating over the list,
for example linux_getifconf [1], which may block in copyout.

[1] http://nxr.netbsd.org/xref/src/sys/compat/linux/common/linux_socket.c#1134

> Also, in the long run, ifnet list
> should not be accessed from the hard interrupt context -- all users
> ought to be running in the softintr(9) context.

The ifnet list is accessed in m_reclaim, which may be called from
hardware interrupt context via, say, MCL_GET.

>
> We may need to take an intermediate solution, but I think we can
> already switch to pserialize(9) + reference counting on ifnet_t for
> the ip_input/ip_output() paths. I need to resume my work on the
> routing subsystem patch-up, though.

I think we need to get rid of the blocking operations mentioned above.

Thanks,
  ozaki-r

>
> --
> Mindaugas
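
P.S. One possible way to rearrange a path like linux_getifconf so that
it does not block during the iteration is to snapshot the entries into
a temporary buffer while the list is protected, and do the copyout only
after the protection is dropped. Below is a minimal userspace sketch of
that ordering, not kernel code. All sk_* names are hypothetical, the
lock is modelled as a plain counter (single-threaded, it only checks
the ordering), and memcpy stands in for copyout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_NAMSIZ 16    /* hypothetical name length, not the kernel IFNAMSIZ */
#define SK_MAXSNAP 8    /* hypothetical cap on the temporary snapshot */

/* Hypothetical stand-in for struct ifnet: a name and a list link. */
struct sk_if {
	char name[SK_NAMSIZ];
	struct sk_if *next;
};

static struct sk_if *sk_if_list;   /* stand-in for ifnet_list */
static int sk_lock_depth;          /* models "list lock is held" */

void sk_lock(void)   { sk_lock_depth++; }
void sk_unlock(void) { assert(sk_lock_depth > 0); sk_lock_depth--; }

/* Stand-in for copyout(): may sleep, so it must run with no lock held. */
void sk_copyout(const void *src, void *dst, size_t len)
{
	assert(sk_lock_depth == 0);
	memcpy(dst, src, len);
}

/* Insert at the head of the list while holding the lock. */
void sk_if_attach(struct sk_if *ifp, const char *name)
{
	strncpy(ifp->name, name, SK_NAMSIZ - 1);
	ifp->name[SK_NAMSIZ - 1] = '\0';
	sk_lock();
	ifp->next = sk_if_list;
	sk_if_list = ifp;
	sk_unlock();
}

/*
 * Phase 1: copy names into a local buffer under the lock (nothing in
 * this loop may sleep). Phase 2: copy out after the lock is dropped.
 * Returns the number of names written to ubuf.
 */
size_t sk_getifconf(char (*ubuf)[SK_NAMSIZ], size_t cap)
{
	char tmp[SK_MAXSNAP][SK_NAMSIZ];
	size_t n = 0;

	if (cap > SK_MAXSNAP)
		cap = SK_MAXSNAP;

	sk_lock();
	for (struct sk_if *ifp = sk_if_list; ifp != NULL && n < cap;
	    ifp = ifp->next) {
		memcpy(tmp[n], ifp->name, SK_NAMSIZ);
		n++;
	}
	sk_unlock();

	sk_copyout(tmp, ubuf, n * SK_NAMSIZ);
	return n;
}
```

The cost is an extra copy and a bounded temporary buffer, and the
caller may see a list that is already stale by the time it is copied
out. That is probably acceptable for an ioctl that only reports
interface names.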