The following reply was made to PR kern/145385; it has been noted by GNATS.
From: Garrett Cooper <[email protected]>
To: Jeff Roberson <[email protected]>
Cc: [email protected], [email protected], Attilio Rao <[email protected]>, [email protected]
Subject: Re: kern/145385: [cpu] Logical processor cannot be disabled for some SMT-enabled Intel procs
Date: Wed, 25 Aug 2010 21:08:32 -0700

On Tue, Aug 24, 2010 at 9:53 PM, Jeff Roberson <[email protected]> wrote:
> On Tue, 24 Aug 2010, Garrett Cooper wrote:
>
>> On Tue, Aug 24, 2010 at 3:45 PM, Garrett Cooper <[email protected]>
>> wrote:
>>>
>>> On Tue, Aug 24, 2010 at 2:51 PM, Garrett Cooper <[email protected]>
>>> wrote:
>>>>
>>>> On Aug 24, 2010, at 2:03 PM, Jeff Roberson wrote:
>>>>
>>>> On Tue, 24 Aug 2010, Garrett Cooper wrote:
>>>>
>>>> On Tue, Aug 24, 2010 at 12:22 PM, Jeff Roberson <[email protected]>
>>>> wrote:
>>>>
>>>> On Tue, 24 Aug 2010, Garrett Cooper wrote:
>>>>
>>>> On Mon, Aug 23, 2010 at 6:33 AM, John Baldwin <[email protected]> wrote:
>>>>
>>>> On Sunday, August 22, 2010 4:17:37 am Garrett Cooper wrote:
>>>>
>>>> The following trivial patch fixes the issue on my W3520 processor;
>>>> AFAICS it's what should be done after reading several of the specs
>>>> because the logical count that's tracked with ebx is exactly what is
>>>> needed for logical_cpus (it's an absolute quantity). I need to verify
>>>> it with a multi-cpu topology at work (the two r710s I was testing with
>>>> E-series Xeons on aren't available remotely right now).
>>>>
>>>> Thanks!
>>>> -Garrett
>>>>
>>>> Jung-uk Kim and Attilio Rao have both been looking at this code
>>>> recently and are in a better position to review the patch in the PR.
>>>>
>>>> (Moving jhb@ to BCC, adding jeff@ for possible input on ULE)
>>>>
>>>> The patch works as expected (it now properly detects the SMT CPUs as
>>>> logical CPUs), but setting machdep.hlt_logical_cpus=1 causes other
>>>> problems with scheduling tasks because certain kernel threads get
>>>> stuck at boot when netbooting (in particular I've seen problems with
>>>> usbhub* and a few other bits), so in order for
>>>> machdep.hlt_logical_cpus to be fixed on SMT processors, it might
>>>> require some changes to the ULE scheduler to shuffle around the
>>>> threads to available cores/processors?
>>>>
>>>> hlt_logical_cpus should be rewritten to use cpusets to change the
>>>> default system set rather than specifically halting those cpus. There
>>>> are a number of loops in the kernel that iterate over all cpus and
>>>> attempt to bind and perform some task. I think there are a number of
>>>> other reasons to prefer a less aggressive approach to avoiding the
>>>> logical cpus as well. Simply preventing user thread scheduling will
>>>> achieve the intent of the sysctl in any event.
>>>>
>>>> Ok... in that event then the bug is ok, but maybe I should add
>>>> some code to the patch to warn the user about functional issues
>>>> associated with halting logical CPUs?
>>>>
>>>> I don't think the bug is ok. We probably shouldn't have sysctls which
>>>> readily break the kernel. As I said we should instead have the sysctl
>>>> backend to cpuset. It shouldn't take more than an hour to code and
>>>> test.
>>>
>>> Ok.. I'll look at this once I have my other system back online so
>>> I can actively break something until I get it to work.
>>
>> BTW... there's a lot of code in machdep.c that does the same thing
>> to idle the CPU, for instance, cpu_idle_hlt, cpu_idle_acpi,
>> cpu_idle_amdc1e (on amd64).
>> What should be done about those cases (same thing, or different)?
>
> Those are the actual idle functions that the scheduler uses. Those are
> safe.

I'll look into running this on a Nehalem processor machine, but this
appears to work as expected on my Penryn processor test machine with
machdep.hlt_cpus = { 110, 101, 11, 0 } and with machdep.idle=acpi; I'm
not sure if the loop is supposed to be there still, but it wouldn't
make sense because the CPU would be spinning in the kernel.

Thanks,
-Garrett
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "[email protected]"
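
For readers following the thread, the approach Jeff suggests (leave the
logical CPUs running, but remove them from the default set that user
threads may use) can be sketched roughly as below. This is only an
illustration under stated assumptions, not the FreeBSD implementation:
the type cpumask_sketch_t and the functions user_default_set() and
user_may_run_on() are hypothetical names made up for this example, and a
plain 32-bit mask stands in for the kernel's real cpuset machinery.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for this sketch: bit n set means CPU n is in the set. */
typedef uint32_t cpumask_sketch_t;

/*
 * Compute the CPUs that user threads may run on: every CPU except the
 * ones the administrator asked to avoid.  The avoided CPUs are not
 * halted, so kernel loops that bind to each CPU in turn can still
 * complete on them.
 */
static cpumask_sketch_t
user_default_set(cpumask_sketch_t all_cpus, cpumask_sketch_t avoid_cpus)
{
	cpumask_sketch_t allowed;

	assert(all_cpus != 0);	/* a system always has at least one CPU */
	allowed = all_cpus & ~avoid_cpus;
	/* Refuse to produce an empty set; fall back to all CPUs. */
	return (allowed != 0 ? allowed : all_cpus);
}

/* Return 1 if a user thread may be scheduled on the given CPU, else 0. */
static int
user_may_run_on(cpumask_sketch_t user_set, unsigned int cpu)
{
	return (cpu < 32 && ((user_set >> cpu) & 1u) != 0);
}
```

With four CPUs (mask 0xF) and the assumption that CPUs 1 and 3 are the
logical siblings to avoid (mask 0xA), user_default_set(0xF, 0xA) leaves
CPUs 0 and 2 for user threads, while all four CPUs stay available to
kernel threads that need to bind to them.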
