On 2017/12/29 at 8:48 PM, Ingo Molnar wrote:
> 
> * Jia Zhang <qianyue...@alibaba-inc.com> wrote:
> 
>>
>>
>> On 2017/12/28 at 8:24 PM, Ingo Molnar wrote:
>>>
>>> * Jia Zhang <qianyue...@alibaba-inc.com> wrote:
>>>
>>>> Instead of blacklisting all types of Broadwell processor when running
>>>> a late loading, only BDW-EP (signature 0x406f1, aka family 6, model 79,
>>>> stepping 1) with the microcode version less than 0x0b000021 needs to
>>>> be blacklisted.
>>>>
>>>> The erratum is documented in the public documentation #334165 (See
>>>> the item BDF90 for details).
>>>>
>>>> Signed-off-by: Jia Zhang <qianyue...@alibaba-inc.com>
>>>> ---
>>>>  arch/x86/kernel/cpu/microcode/intel.c | 12 ++++++++++--
>>>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
>>>> index 8ccdca6..79cad85 100644
>>>> --- a/arch/x86/kernel/cpu/microcode/intel.c
>>>> +++ b/arch/x86/kernel/cpu/microcode/intel.c
>>>> @@ -910,8 +910,16 @@ static bool is_blacklisted(unsigned int cpu)
>>>>  {
>>>>    struct cpuinfo_x86 *c = &cpu_data(cpu);
>>>>  
>>>> -  if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X) {
>>>> -          pr_err_once("late loading on model 79 is disabled.\n");
>>>> +  /*
>>>> +   * The Broadwell-EP processor with the microcode version less
>>>> +   * than 0x0b000021 may result in system hang when running a late
>>>> +   * loading. This behavior is documented in item BDF90, #334165
>>>> +   * (Intel Xeon Processor E7-8800/4800 v4 Product Family).
>>>> +   */
>>>> +  if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X &&
>>>> +      c->x86_mask == 0x01 && c->microcode < 0x0b000021) {
>>>> +          pr_err_once("late loading on cpu (sig 0x406f1) is disabled "
>>>> +                      "due to erratum causing system hang.\n");
>>>
>>> Please never break user-readable messages mid-sentence!
>>>
>>> This should be something like:
>>>
>>>             pr_err_once("Late loading of the CPU microcode (sig 0x406f1) is disabled due to Intel erratum BDF90 causing system hangs.\n");
>>>
>>> (note the spelling and readability improvements as well)
>>>
>>> Btw., what does 'sig 0x406f1' refer to?
>>
>> It is the so-called processor signature, which uniquely identifies a
>> model of x86 processor. It is the value returned in EAX by the cpuid
>> instruction with leaf 1 (eax == 1).
> 
> Ah, indeed, the (somewhat weird) encoding described in arch/x86/lib/cpu.c, which
> is essentially family+model+stepping encoded into a single integer, right?

Totally correct.
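
To make the encoding concrete, here is a minimal userspace sketch of how the CPUID leaf 1 EAX value splits into family, model and stepping. The struct and function names are hypothetical and only for illustration; this is not the kernel's own helper code.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded fields of the CPUID leaf 1 EAX value (the "processor signature"). */
struct cpu_sig {
	unsigned int family;
	unsigned int model;
	unsigned int stepping;
};

/* Hypothetical helper: split a raw signature into family/model/stepping. */
static inline struct cpu_sig decode_cpu_sig(uint32_t eax)
{
	struct cpu_sig s;
	unsigned int base_family = (eax >> 8) & 0xf;   /* bits 11:8  */
	unsigned int base_model  = (eax >> 4) & 0xf;   /* bits 7:4   */
	unsigned int ext_model   = (eax >> 16) & 0xf;  /* bits 19:16 */
	unsigned int ext_family  = (eax >> 20) & 0xff; /* bits 27:20 */

	s.stepping = eax & 0xf;                        /* bits 3:0   */

	/* The extended family is only added when the base family is 0xf. */
	s.family = base_family;
	if (base_family == 0xf)
		s.family += ext_family;

	/* The extended model supplies the high nibble for family 6 and 0xf. */
	s.model = base_model;
	if (base_family == 0x6 || base_family == 0xf)
		s.model |= ext_model << 4;

	return s;
}

/* Self-check using the signature discussed in this thread. */
static inline void decode_cpu_sig_selftest(void)
{
	struct cpu_sig s = decode_cpu_sig(0x406f1);

	assert(s.family == 6);
	assert(s.model == 79);   /* 0x4f */
	assert(s.stepping == 1);
}
```

With 0x406f1 this gives family 6, model 79 (0x4f) and stepping 1, matching the description in the commit message.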

> 
> That whole area needs a good cleanup to be less confusing (we refer to the CPU
> stepping as x86_stepping(), but the field is called ->x86_mask?), but in the

Yes. This is a confusing name. I will send another patch to clean it up.

> meanwhile, let's please make it more obvious in user facing message what's 
> happening.
> 
> Instead of using the microcode signature of the CPU model, please write out what's
> going on:
> 
>       pr_err_once("Not loading old microcode version: erratum BDF90 on Intel Broadwell-EP stepping 1 CPUs may cause system hangs.\n");
> 
> ... and please also tell the user what to do about it:
> 
>       pr_err_once("Please update your microcode files.\n");

Let me give the full background so that we can arrive at the best
description of this erratum.

If the current processor signature matches the problematic Broadwell-EP
model (0x406f1) *AND* the current microcode version is lower than
0x0b000021, a microcode update at Linux runtime (so-called late loading)
must be prohibited in order to prevent a system hang caused by the
erratum. In other words, the end user has to update the BIOS to uprev
the microcode. The microcode update loader in the BIOS can safely issue
a microcode update without being affected by this erratum. This is the
so-called early loading.
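
The condition described above can be sketched as a small standalone check. The macro and function names are hypothetical, and the sketch compares the raw signature rather than the separate family/model/stepping fields that the patch tests; only the two constants come from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Signature of the affected part: family 6, model 79, stepping 1. */
#define BDW_EP_SIG       0x406f1u
/* Lowest microcode revision that is not blacklisted, per the patch. */
#define BDW_EP_SAFE_REV  0x0b000021u

/*
 * Hypothetical helper: returns true when a late (runtime) microcode load
 * must be refused because of erratum BDF90.
 */
static inline bool late_load_blocked(uint32_t sig, uint32_t ucode_rev)
{
	return sig == BDW_EP_SIG && ucode_rev < BDW_EP_SAFE_REV;
}

/* Self-check covering both sides of the revision boundary. */
static inline void late_load_blocked_selftest(void)
{
	assert(late_load_blocked(0x406f1, 0x0b000020));   /* old revision: blocked */
	assert(!late_load_blocked(0x406f1, 0x0b000021));  /* fixed revision: allowed */
	assert(!late_load_blocked(0x406f2, 0x0b000010));  /* other signature: allowed */
}
```

Both conditions have to hold at the same time: a different signature, or a revision of 0x0b000021 or higher on the same signature, leaves late loading allowed.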

Thanks,
Jia

> 
> !!
> 
> Agreed?
> 
> Thanks,
> 
>       Ingo
> 
