On Fri, Apr 28, 2023 at 02:51:23PM +0300, Vitaliy Makkoveev wrote:
> > On 28 Apr 2023, at 14:03, Alexander Bluhm <alexander.bl...@gmx.net> wrote:
> > 
> > That basically means we must never call one of the pool get or put
> > functions with kernel lock.  This may be the case currently.  But
> > at this stage while we are pushing locks around the network code,
> > I would like to keep it as it is.
> > 
> > Allowing net interrupts during route pool get mutex gains nearly
> > nothing.  It makes placing a kernel lock in our transition process
> > harder.
> > 
> > bluhm
> 
> This is not true. The IPL is not kerne lock or MP related. Each pool(9)
> uses its own mutex(9) or rwlock(9) for protection.

mpi@ told me the following background of IPL_MPFLOOR:

The reasoning is lock ordering with mutex and kernel lock.  If you
take mutex sometimes with and sometimes without kernel lock the
following can happen.

- cpu 1 takes kernel lock
- cpu 1 takes mutex

- cpu 2 takes mutex without kernel lock
- cpu 2 is interrupted
- cpu 2 interrupt handler takes kernel lock

And here you have a deadlock.  There are three solutions

1. always take mutex with kernel lock
2. always take mutex without kernel lock
3. set spl in mutex to IPL_MPFLOOR

3 works as all interrupt handlers with higher priority than IPL_MPFLOOR
do not take kernel lock.  All other interrupts are blocked during
the critical section of the mutex.

When we totally unlock all IPL_SOFTTTY and IPL_NET interrupt handlers
IPL_MPFLOOR can be moved to lower priority.

Fo pool_init the same rules as for mtx_enter() apply as pool_get()
and pool_put() take the pool mutex internally.

As we were in a mixed kernel lock environment I have chosen solution
3 and added the IPL_MPFLOOR in rev 1.404 a year ago.

bluhm

> This depends on
> PR_RWLOCK flag passed to pool_init(9). The spl*(9) only disables
> interrupts with level less or equal priority on this CPU. Non system
> wide. On the uniprocessor machine you can???t spin in mutex(9) because it
> is the deadlock, so you only disable interrupts within mutex(9) protected
> critical section. But SMP machine is the different case. In the ancient
> MP times interrupts were enabled on boot cpu only, so spl*(9) really
> provided protection, but since interrupts were enabled on application
> CPUs too this stopped to work, because the interrupts you disabled on
> this cpu could happen on any other. IPL_MPFLOOR has very high level and
> I guess it was chosen mostly for insurance.
> 
> We do rt allocation and release in the sockets, ioctl and softnet
> context, so I see no reason to disable interrupts with priority greater
> than IPL_SOFTNET.
> 
> 
> > 
> > On Thu, Apr 27, 2023 at 02:18:11AM +0300, Vitaliy Makkoveev wrote:
> >> Index: sys/net/route.c
> >> ===================================================================
> >> RCS file: /cvs/src/sys/net/route.c,v
> >> retrieving revision 1.418
> >> diff -u -p -r1.418 route.c
> >> --- sys/net/route.c        26 Apr 2023 16:09:44 -0000      1.418
> >> +++ sys/net/route.c        26 Apr 2023 23:00:02 -0000
> >> @@ -176,7 +176,7 @@ route_init(void)
> >> {
> >>    rtcounters = counters_alloc(rts_ncounters);
> >> 
> >> -  pool_init(&rtentry_pool, sizeof(struct rtentry), 0, IPL_MPFLOOR, 0,
> >> +  pool_init(&rtentry_pool, sizeof(struct rtentry), 0, IPL_SOFTNET, 0,
> >>        "rtentry", NULL);
> >> 
> >>    while (rt_hashjitter == 0)
> > 
> 

Reply via email to