On 07/09/21(Tue) 21:47, Patrick Wildt wrote:
> Am Tue, Sep 07, 2021 at 02:43:22PM +0200 schrieb Patrick Wildt:
> > Am Mon, Sep 06, 2021 at 09:43:29PM +0200 schrieb Patrick Wildt:
> > > Am Fri, Jul 30, 2021 at 07:55:29PM +0200 schrieb Alexander Bluhm:
> > > > On Mon, Jul 26, 2021 at 08:12:39AM -0500, Scott Cheloha wrote:
> > > > > On Fri, Jun 25, 2021 at 06:09:27PM -0500, Scott Cheloha wrote:
> > > > >
> > > > > 1 month bump.  I really appreciate the tests I've gotten so far,
> > > > > thank you.
> > > >
> > > > On my Xeon machine it works and all regress tests pass.
> > > >
> > > > But it fails on my old Opteron machine.  It hangs after attaching
> > > > cpu1.
> > >
> > > This seems to be caused by contention on the mutex in i8254's
> > > gettick().
> > >
> > > With Scott's diff, delay_func is i8254_delay() on that old AMD
> > > machine.  Its gettick() implementation uses a mutex to protect I/O
> > > access to the i8254.
> > >
> > > When secondary CPUs come up, they wait for CPU0 to let them boot
> > > further by checking for a flag:
> > >
> > >         /*
> > >          * We need to wait until we can identify, otherwise dmesg
> > >          * output will be messy.
> > >          */
> > >         while ((ci->ci_flags & CPUF_IDENTIFY) == 0)
> > >                 delay(10);
> > >
> > > Now that machine has 3 secondary cores spinning like that.  At the
> > > same time CPU0 waits for the core to come up:
> > >
> > >         /* wait for it to identify */
> > >         for (i = 2000000; (ci->ci_flags & CPUF_IDENTIFY) && i > 0; i--)
> > >                 delay(10);
> > >
> > > That means we have 3-4 cores spinning just to be able to delay().
> > > Our mutex implementation isn't fair, which means whoever manages to
> > > claim the free mutex wins.  So if CPU2 and CPU3 are spinning all the
> > > time, CPU1 identifies and needs delay(), and CPU0 waits for CPU1,
> > > the one that needs to make progress may never get the mutex.
> > >
> > > I changed those delay(10) calls in cpu_hatch() to CPU_BUSY_CYCLE()
> > > and it got a bit further instead of hanging forever.
> > >
> > > Then I remembered an idea from years ago: fair kernel mutexes, so
> > > basically mutexes implemented as a ticket lock, like our kernel
> > > lock.
> > >
> > > I did a quick diff, which probably contains a million bugs, but with
> > > it bluhm's machine boots up well.
> > >
> > > I'm not saying this is the solution, but it might be.
> > >
> > > Patrick
> >
> > Cleaned the diff up a little; changes since last time:
> >
> > * Rename the struct members to be the same as mplock.
> > * Change the code to use ticket/user numbers like mplock.  This has
> >   one obvious downside: if a mutex is not initialized, trying to take
> >   it will result in a hang.  At least that already let me find some
> >   uninitialized mutexes.
> > * More consistent use of the 'ci' variable.
> > * Definitely compiles with and without DIAGNOSTIC.
> > * Made sure mtx_enter() still has the membar.
> > * No need for READ_ONCE() when the members are volatile.
> >
> > Apart from being fair, this diff also changes the behaviour while
> > spinning for a lock.  Previously mtx_enter() called mtx_enter_try()
> > in a loop until it got the lock, and mtx_enter_try() does splraise,
> > try lock, splx.  This diff currently spins with the SPL raised, so
> > that's a change in behaviour.  I'm sure I can change the diff to do
> > splraise/splx while looping, if we prefer that behaviour.
> >
> > Patrick
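To make the ticket-lock idea concrete, here is a minimal userland sketch
of a fair mutex built as a ticket lock.  The names (struct fair_mtx,
mtx_next_ticket, mtx_now_serving) are made up for illustration; the
actual diff reuses struct mutex with mplock-style members, raises the
SPL and records the owning CPU, none of which is shown here.  It also
uses acquire/release orderings where the kernel code has its membar.

/*
 * Illustrative ticket-lock mutex, C11 atomics.  Not the proposed diff.
 */
#include <stdatomic.h>

struct fair_mtx {
        atomic_uint     mtx_next_ticket;        /* next ticket to hand out */
        atomic_uint     mtx_now_serving;        /* ticket that owns the lock */
};

void
fair_mtx_init(struct fair_mtx *mtx)
{
        atomic_init(&mtx->mtx_next_ticket, 0);
        atomic_init(&mtx->mtx_now_serving, 0);
}

void
fair_mtx_enter(struct fair_mtx *mtx)
{
        /* Draw a ticket; waiters are served strictly in draw order. */
        unsigned int me = atomic_fetch_add_explicit(&mtx->mtx_next_ticket, 1,
            memory_order_relaxed);

        /*
         * Spin until our number comes up.  This is what makes the lock
         * fair: a late arrival can never overtake a CPU that has been
         * waiting longer, unlike a plain test-and-set mutex.
         */
        while (atomic_load_explicit(&mtx->mtx_now_serving,
            memory_order_acquire) != me)
                ;       /* the kernel would use CPU_BUSY_CYCLE() here */
}

void
fair_mtx_leave(struct fair_mtx *mtx)
{
        /* Hand the lock to the next ticket in line. */
        atomic_store_explicit(&mtx->mtx_now_serving,
            atomic_load_explicit(&mtx->mtx_now_serving,
            memory_order_relaxed) + 1, memory_order_release);
}

The hang on uninitialized mutexes that Patrick mentions falls out of
this scheme naturally: with garbage counters the drawn ticket and the
"now serving" value typically never line up, so the first enter spins
forever, which at least makes such bugs easy to spot.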
This change makes sense on its own, as the contention is shifting away
from KERNEL_LOCK() to mutexes.

Note that hppa has its own mutex implementation, in case somebody wants
to keep it in sync.

> make -j17 seems to have used less system time, so that seems to have
> made the machine slightly faster:
>
> old: make -j17 1160.01s user 3244.58s system 1288% cpu 5:41.96 total
> new: make -j17 1171.80s user 3059.67s system 1295% cpu 5:26.65 total

Is this with -current or with the UVM unlocking diff that puts more
pressure on mutexes?

> I'll change the diff to do splraise/splx while looping, to make the
> behaviour more similar to before, and then re-do my testing.

That would be nice.  You could also start a new thread to get the
attention of more people; maybe dlg@, visa@ or kettenis@ have an
opinion on this.
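For reference, the splraise/splx-while-looping behaviour being restored
is roughly the pre-diff pattern sketched below.  This is a simplified
paraphrase, not the actual diff: diagnostics, WITNESS hooks and the
IPL_NONE shortcut are left out.

/*
 * Sketch of the old spinning behaviour: the SPL is only held across
 * each individual try, so the CPU does not sit at a raised IPL for the
 * whole time it waits for the lock.
 */
#include <sys/param.h>
#include <sys/mutex.h>
#include <sys/atomic.h>

void
mtx_enter(struct mutex *mtx)
{
        while (mtx_enter_try(mtx) == 0)
                CPU_BUSY_CYCLE();       /* spin at the caller's IPL */
}

int
mtx_enter_try(struct mutex *mtx)
{
        struct cpu_info *ci = curcpu();
        int s;

        s = splraise(mtx->mtx_wantipl);         /* raise before trying */
        if (atomic_cas_ptr(&mtx->mtx_owner, NULL, ci) == NULL) {
                membar_enter_after_atomic();
                mtx->mtx_oldipl = s;            /* restored by mtx_leave() */
                return (1);
        }
        splx(s);                                /* lost the race: drop the SPL */
        return (0);
}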