On Fri, Apr 28, 2023 at 02:51:23PM +0300, Vitaliy Makkoveev wrote: > > On 28 Apr 2023, at 14:03, Alexander Bluhm <alexander.bl...@gmx.net> wrote: > > > > That basically means we must never call one of the pool get or put > > functions with kernel lock. This may be the case currently. But > > at this stage while we are pushing locks around the network code, > > I would like to keep it as it is. > > > > Allowing net interrupts during route pool get mutex gains nearly > > nothing. It makes placing a kernel lock in our transition process > > harder. > > > > bluhm > > This is not true. The IPL is not kerne lock or MP related. Each pool(9) > uses its own mutex(9) or rwlock(9) for protection.
mpi@ told me the following background of IPL_MPFLOOR: The reasoning is lock ordering with mutex and kernel lock. If you take mutex sometimes with and sometimes without kernel lock the following can happen. - cpu 1 takes kernel lock - cpu 1 takes mutex - cpu 2 takes mutex without kernel lock - cpu 2 is interrupted - cpu 2 interrupt handler takes kernel lock And here you have a deadlock. There are three solutions 1. always take mutex with kernel lock 2. always take mutex without kernel lock 3. set spl in mutex to IPL_MPFLOOR 3 works as all interrupt handlers with higher priority than IPL_MPFLOOR do not take kernel lock. All other interrupts are blocked during the critical section of the mutex. When we totally unlock all IPL_SOFTTTY and IPL_NET interrupt handlers IPL_MPFLOOR can be moved to lower priority. Fo pool_init the same rules as for mtx_enter() apply as pool_get() and pool_put() take the pool mutex internally. As we were in a mixed kernel lock environment I have chosen solution 3 and added the IPL_MPFLOOR in rev 1.404 a year ago. bluhm > This depends on > PR_RWLOCK flag passed to pool_init(9). The spl*(9) only disables > interrupts with level less or equal priority on this CPU. Non system > wide. On the uniprocessor machine you can???t spin in mutex(9) because it > is the deadlock, so you only disable interrupts within mutex(9) protected > critical section. But SMP machine is the different case. In the ancient > MP times interrupts were enabled on boot cpu only, so spl*(9) really > provided protection, but since interrupts were enabled on application > CPUs too this stopped to work, because the interrupts you disabled on > this cpu could happen on any other. IPL_MPFLOOR has very high level and > I guess it was chosen mostly for insurance. > > We do rt allocation and release in the sockets, ioctl and softnet > context, so I see no reason to disable interrupts with priority greater > than IPL_SOFTNET. > > > > > > On Thu, Apr 27, 2023 at 02:18:11AM +0300, Vitaliy Makkoveev wrote: > >> Index: sys/net/route.c > >> =================================================================== > >> RCS file: /cvs/src/sys/net/route.c,v > >> retrieving revision 1.418 > >> diff -u -p -r1.418 route.c > >> --- sys/net/route.c 26 Apr 2023 16:09:44 -0000 1.418 > >> +++ sys/net/route.c 26 Apr 2023 23:00:02 -0000 > >> @@ -176,7 +176,7 @@ route_init(void) > >> { > >> rtcounters = counters_alloc(rts_ncounters); > >> > >> - pool_init(&rtentry_pool, sizeof(struct rtentry), 0, IPL_MPFLOOR, 0, > >> + pool_init(&rtentry_pool, sizeof(struct rtentry), 0, IPL_SOFTNET, 0, > >> "rtentry", NULL); > >> > >> while (rt_hashjitter == 0) > > >