On Sat, Sep 01, 2018 at 11:51:26AM +0930, Kevin Shanahan wrote:
> On Thu, Aug 30, 2018 at 03:04:39PM +0200, Peter Zijlstra wrote:
> > On Thu, Aug 30, 2018 at 12:55:30PM +0200, Siegfried Metz wrote:
> > > Dear kernel developers,
> > >
> > > since mainline kernel 4.18 (up to the latest mainline kernel 4.18.5)
> > > Intel Core 2 Duo processors are affected by boot stalling early in the
> > > boot process. As it is so early there is no dmesg output (or any log).
> > >
> > > A few users in the Arch Linux community used git bisect and tracked the
> > > issue down to this the bad commit:
> > > 7197e77abcb65a71d0b21d67beb24f153a96055e clocksource: Remove kthread
> >
> > I just dug out my core2duo laptop (Lenovo T500) and build a tip/master
> > kernel for it (x86_64 debian distro .config).
> >
> > Seems to boot just fine.. 3/3 so far.
> >
> > Any other clues?
>
> One additional data point, my affected system is a Dell Latitude E6400
> laptop which has a P8400 CPU:
>
> vendor_id : GenuineIntel
> cpu family : 6
> model : 23
> model name : Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz
> stepping : 6
> microcode : 0x610
>
> Judging from what is being discussed in the Arch forums, it does seem
> to related to the CPU having unstable TSC and transitioning to another
> clock source.

Yes; Core2 doesn't have stable TSC.

> Workarounds that seem to be reliable are either booting
> with clocksource=<something_not_tsc> or with nosmp.

nosmp is weird; because even on UP TSC should stop in C state.

processor_idle (acpi_idle) should mark the TSC as unstable on Core2 when
it loads (does so on my T500).

> One person did point out that the commit that introduced the kthread
> did so to remove a deadlock - is the circular locking dependency
> mentioned in that commit still relevant?
>
> commit 01548f4d3e8e94caf323a4f664eb347fd34a34ab
> Author: Martin Schwidefsky <schwidef...@de.ibm.com>
> Date: Tue Aug 18 17:09:42 2009 +0200
>
>     clocksource: Avoid clocksource watchdog circular locking dependency
>
>     stop_machine from a multithreaded workqueue is not allowed because
>     of a circular locking dependency between cpu_down and the workqueue
>     execution. Use a kernel thread to do the clocksource downgrade.

I cannot find stop_machine usage there; either it went away or I need to
like wake up.
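
For anyone comparing machines in this thread, it may help to report which
clocksource a booted system actually ended up on, for example after booting
with one of the workarounds or on a box that boots fine. A minimal sketch,
assuming the standard sysfs clocksource attributes are present; the helper
names read_clocksources and tsc_in_use are made up for illustration and are
not part of any kernel tooling:

```python
from pathlib import Path

# Standard sysfs location of the clocksource attributes on Linux.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=SYSFS_DIR):
    """Return (current, available) clocksource names.

    Returns (None, []) if the sysfs files cannot be read.
    The 'base' parameter is an illustrative hook so the function can be
    pointed at a different directory.
    """
    base = Path(base)
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        return None, []
    return current, available


def tsc_in_use(base=SYSFS_DIR):
    """True if the kernel is currently using the TSC as its clocksource."""
    current, _ = read_clocksources(base)
    return current == "tsc"


if __name__ == "__main__":
    current, available = read_clocksources()
    print("current:  ", current)
    print("available:", " ".join(available))
```

If the current clocksource is something other than tsc on a Core2 system,
that would be consistent with the TSC having been marked unstable, but it
does not by itself say which code path did the marking.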