Re: [PATCH v3 1/4] x86/mce: Add Zhaoxin MCE support

2019-09-13 Thread Borislav Petkov
On Fri, Sep 13, 2019 at 11:10:31AM -0700, Luck, Tony wrote:
> Is it time to have a big cleanup on how we handle similarities
> and oddities in the MCE subsystem?  We've been adding ad-hoc
> tests like this in random places ... and it all looks very
> messy.

Hohum, it has been bothering me for a while now too. ;-\

> Or should we make a big table of CPU vendors/families/models and use
> x86_match_cpu() to pick out what are running on and set some bits/flags
> (like X86_FEATURE/X86_BUG) which we can use in the code to do the
> right thing in each place?

Yes, that. And I have started doing something along those lines, see
struct mce_vendor_flags.

If we did the X86_FEATURE/BUG things, we would still end up using those
new definitions in the MCA code only so I think having our own bits in a
bitfield would be cleaner/nicer.

Anyway, detection can be all done in __mcheck_cpu_init_early() or
somewhere similar, all matching flags/bits set and then the rest of the
code would query only them.

We can also merge mce_vendor_flags into mca_cfg as that thing is used
everywhere.

Another advantage of having our own flags is that we can define them as
we like and stick them all in internal.h so no exposure to the outside.

And so on.

> E.g. default for Intel and Zhaoxin vendors would be to set MCE_INTEL_LIKE.
> 
> Thoughts?

Yah, I think that's a good idea and I think we should do it. Not
immediately but work towards it.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v3 1/4] x86/mce: Add Zhaoxin MCE support

2019-09-13 Thread Luck, Tony
On Wed, Sep 11, 2019 at 12:01:42PM +, Tony W Wang-oc wrote:
> + /* Checks after this one are Intel/Zhaoxin-specific: */
> + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL &&
> + boot_cpu_data.x86_vendor != X86_VENDOR_ZHAOXIN)


Is it time to have a big cleanup on how we handle similarities
and oddities in the MCE subsystem?  We've been adding ad-hoc
tests like this in random places ... and it all looks very
messy.  Lines that mention x86_vendor|x86|x86_model below
arch/x86/kernel/cpu/mce/ currently look like this:

arch/x86/kernel/cpu/mce/amd.c: (c->x86_model >= 0x10 && 
c->x86_model <= 0x2F)) {
arch/x86/kernel/cpu/mce/amd.c:  c->x86_model >= 0x10 && c->x86_model <= 
0x2F &&
arch/x86/kernel/cpu/mce/amd.c:  } else if (c->x86 == 0x17 &&
arch/x86/kernel/cpu/mce/amd.c:  if (c->x86 == 0x15 && bank == 4) {
arch/x86/kernel/cpu/mce/amd.c:  if (c->x86 == 0x17 &&
arch/x86/kernel/cpu/mce/core.c: boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
arch/x86/kernel/cpu/mce/core.c: boot_cpu_data.x86_vendor == 
X86_VENDOR_HYGON ||
arch/x86/kernel/cpu/mce/core.c:  c->x86 > 6) {
arch/x86/kernel/cpu/mce/core.c: if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
arch/x86/kernel/cpu/mce/core.c: if (boot_cpu_data.x86_vendor != 
X86_VENDOR_INTEL)
arch/x86/kernel/cpu/mce/core.c: if (boot_cpu_data.x86_vendor == 
X86_VENDOR_INTEL ||
arch/x86/kernel/cpu/mce/core.c: if (c->x86 < 0x11 && cfg->bootlog < 0) {
arch/x86/kernel/cpu/mce/core.c: if (c->x86 == 0x15 && c->x86_model <= 
0xf)
arch/x86/kernel/cpu/mce/core.c: if (c->x86 == 15 && 
this_cpu_read(mce_num_banks) > 4) {
arch/x86/kernel/cpu/mce/core.c: if (c->x86 != 5)
arch/x86/kernel/cpu/mce/core.c: if ((c->x86 > 6 || (c->x86 == 6 && 
c->x86_model >= 0xe)) &&
arch/x86/kernel/cpu/mce/core.c: if (c->x86 == 6 && c->x86_model < 0x1A 
&& this_cpu_read(mce_num_banks) > 0)
arch/x86/kernel/cpu/mce/core.c: if ((c->x86 == 6 && c->x86_model == 0xf && 
c->x86_stepping >= 0xe) ||
arch/x86/kernel/cpu/mce/core.c: if (c->x86 == 6 && c->x86_model <= 13 
&& cfg->bootlog < 0)
arch/x86/kernel/cpu/mce/core.c: if (c->x86 == 6 && c->x86_model == 45)
arch/x86/kernel/cpu/mce/core.c: if (c->x86 == 6 && 
this_cpu_read(mce_num_banks) > 0)
arch/x86/kernel/cpu/mce/core.c: if (c->x86_vendor == X86_VENDOR_AMD) {
arch/x86/kernel/cpu/mce/core.c: if (c->x86_vendor == X86_VENDOR_AMD || 
c->x86_vendor == X86_VENDOR_HYGON) {
arch/x86/kernel/cpu/mce/core.c: if (c->x86_vendor == X86_VENDOR_INTEL) {
arch/x86/kernel/cpu/mce/core.c: if (c->x86_vendor == X86_VENDOR_UNKNOWN) {
arch/x86/kernel/cpu/mce/core.c: m->cpuvendor = boot_cpu_data.x86_vendor;
arch/x86/kernel/cpu/mce/core.c: switch (c->x86_vendor) {
arch/x86/kernel/cpu/mce/inject.c:   boot_cpu_data.x86 < 0x17) {
arch/x86/kernel/cpu/mce/inject.c:   m->cpuvendor = boot_cpu_data.x86_vendor;
arch/x86/kernel/cpu/mce/intel.c:if (boot_cpu_data.x86_vendor != 
X86_VENDOR_INTEL)
arch/x86/kernel/cpu/mce/intel.c:switch (c->x86_model) {
arch/x86/kernel/cpu/mce/severity.c: boot_cpu_data.x86_vendor == 
X86_VENDOR_HYGON)
arch/x86/kernel/cpu/mce/severity.c: if (boot_cpu_data.x86_vendor == 
X86_VENDOR_AMD ||
arch/x86/kernel/cpu/mce/therm_throt.c:  if (c->x86 == 6 && 
(c->x86_model == 9 || c->x86_model == 13)) {

Maybe we can X86_VENDOR_ZHAOXIN to this jumble with the excuse that
it is already so ugly that this patch series only makes things 5% worse?

Or should we make a big table of CPU vendors/families/models and use
x86_match_cpu() to pick out what are running on and set some bits/flags
(like X86_FEATURE/X86_BUG) which we can use in the code to do the
right thing in each place?

E.g. default for Intel and Zhaoxin vendors would be to set MCE_INTEL_LIKE.

Thoughts?

-Tony


Re: [PATCH v3 1/4] x86/mce: Add Zhaoxin MCE support

2019-09-11 Thread Borislav Petkov
On Wed, Sep 11, 2019 at 12:01:42PM +, Tony W Wang-oc wrote:
> All Zhaoxin newer CPUs support MCE that compatible with Intel's
> "Machine-Check Architecture", so add support for Zhaoxin MCE in
> mce/core.c.
> 
> Signed-off-by: Tony W Wang-oc 
> ---
> v2->v3:
>  - Make ifelse-case to switch-case
>  - Simplify Zhaoxin CPU FMS checking

Btw, for future submissions - you don't have to do it now - search the
web for threaded mails, look at git-send-email's manpage and especially
the --thread and --chain-reply-to options. Also, look at lkml for
examples.

IOW, patchsets should have a 0/N message and all the others should
be sent as a reply to that message, i.e., shallow threading, as the
git-send-email manpage calls it.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette