Re: [PATCH v3 1/4] x86/mce: Add Zhaoxin MCE support
On Fri, Sep 13, 2019 at 11:10:31AM -0700, Luck, Tony wrote: > Is it time to have a big cleanup on how we handle similarities > and oddities in the MCE subsystem? We've been adding ad-hoc > tests like this in random places ... and it all looks very > messy. Hohum, it has been bothering me for a while now too. ;-\ > Or should we make a big table of CPU vendors/families/models and use > x86_match_cpu() to pick out what are running on and set some bits/flags > (like X86_FEATURE/X86_BUG) which we can use in the code to do the > right thing in each place? Yes, that. And I have started doing something along those lines, see struct mce_vendor_flags. If we did the X86_FEATURE/BUG things, we would still end up using those new definitions in the MCA code only so I think having our own bits in a bitfield would be cleaner/nicer. Anyway, detection can be all done in __mcheck_cpu_init_early() or somewhere similar, all matching flags/bits set and then the rest of the code would query only them. We can also merge mce_vendor_flags into mca_cfg as that thing is used everywhere. Another advantage of having our own flags is that we can define them as we like and stick them all in internal.h so no exposure to the outside. And so on. > E.g. default for Intel and Zhaoxin vendors would be to set MCE_INTEL_LIKE. > > Thoughts? Yah, I think that's a good idea and I think we should do it. Not immediately but work towards it. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v3 1/4] x86/mce: Add Zhaoxin MCE support
On Wed, Sep 11, 2019 at 12:01:42PM +, Tony W Wang-oc wrote: > + /* Checks after this one are Intel/Zhaoxin-specific: */ > + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL && > + boot_cpu_data.x86_vendor != X86_VENDOR_ZHAOXIN) Is it time to have a big cleanup on how we handle similarities and oddities in the MCE subsystem? We've been adding ad-hoc tests like this in random places ... and it all looks very messy. Lines that mention x86_vendor|x86|x86_model below arch/x86/kernel/cpu/mce/ currently look like this: arch/x86/kernel/cpu/mce/amd.c: (c->x86_model >= 0x10 && c->x86_model <= 0x2F)) { arch/x86/kernel/cpu/mce/amd.c: c->x86_model >= 0x10 && c->x86_model <= 0x2F && arch/x86/kernel/cpu/mce/amd.c: } else if (c->x86 == 0x17 && arch/x86/kernel/cpu/mce/amd.c: if (c->x86 == 0x15 && bank == 4) { arch/x86/kernel/cpu/mce/amd.c: if (c->x86 == 0x17 && arch/x86/kernel/cpu/mce/core.c: boot_cpu_data.x86_vendor == X86_VENDOR_AMD) arch/x86/kernel/cpu/mce/core.c: boot_cpu_data.x86_vendor == X86_VENDOR_HYGON || arch/x86/kernel/cpu/mce/core.c: c->x86 > 6) { arch/x86/kernel/cpu/mce/core.c: if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) arch/x86/kernel/cpu/mce/core.c: if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) arch/x86/kernel/cpu/mce/core.c: if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL || arch/x86/kernel/cpu/mce/core.c: if (c->x86 < 0x11 && cfg->bootlog < 0) { arch/x86/kernel/cpu/mce/core.c: if (c->x86 == 0x15 && c->x86_model <= 0xf) arch/x86/kernel/cpu/mce/core.c: if (c->x86 == 15 && this_cpu_read(mce_num_banks) > 4) { arch/x86/kernel/cpu/mce/core.c: if (c->x86 != 5) arch/x86/kernel/cpu/mce/core.c: if ((c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xe)) && arch/x86/kernel/cpu/mce/core.c: if (c->x86 == 6 && c->x86_model < 0x1A && this_cpu_read(mce_num_banks) > 0) arch/x86/kernel/cpu/mce/core.c: if ((c->x86 == 6 && c->x86_model == 0xf && c->x86_stepping >= 0xe) || arch/x86/kernel/cpu/mce/core.c: if (c->x86 == 6 && c->x86_model <= 13 && cfg->bootlog < 0) arch/x86/kernel/cpu/mce/core.c: if (c->x86 == 6 && c->x86_model == 45) arch/x86/kernel/cpu/mce/core.c: if (c->x86 == 6 && this_cpu_read(mce_num_banks) > 0) arch/x86/kernel/cpu/mce/core.c: if (c->x86_vendor == X86_VENDOR_AMD) { arch/x86/kernel/cpu/mce/core.c: if (c->x86_vendor == X86_VENDOR_AMD || c->x86_vendor == X86_VENDOR_HYGON) { arch/x86/kernel/cpu/mce/core.c: if (c->x86_vendor == X86_VENDOR_INTEL) { arch/x86/kernel/cpu/mce/core.c: if (c->x86_vendor == X86_VENDOR_UNKNOWN) { arch/x86/kernel/cpu/mce/core.c: m->cpuvendor = boot_cpu_data.x86_vendor; arch/x86/kernel/cpu/mce/core.c: switch (c->x86_vendor) { arch/x86/kernel/cpu/mce/inject.c: boot_cpu_data.x86 < 0x17) { arch/x86/kernel/cpu/mce/inject.c: m->cpuvendor = boot_cpu_data.x86_vendor; arch/x86/kernel/cpu/mce/intel.c:if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) arch/x86/kernel/cpu/mce/intel.c:switch (c->x86_model) { arch/x86/kernel/cpu/mce/severity.c: boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) arch/x86/kernel/cpu/mce/severity.c: if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD || arch/x86/kernel/cpu/mce/therm_throt.c: if (c->x86 == 6 && (c->x86_model == 9 || c->x86_model == 13)) { Maybe we can X86_VENDOR_ZHAOXIN to this jumble with the excuse that it is already so ugly that this patch series only makes things 5% worse? Or should we make a big table of CPU vendors/families/models and use x86_match_cpu() to pick out what are running on and set some bits/flags (like X86_FEATURE/X86_BUG) which we can use in the code to do the right thing in each place? E.g. default for Intel and Zhaoxin vendors would be to set MCE_INTEL_LIKE. Thoughts? -Tony
Re: [PATCH v3 1/4] x86/mce: Add Zhaoxin MCE support
On Wed, Sep 11, 2019 at 12:01:42PM +, Tony W Wang-oc wrote: > All Zhaoxin newer CPUs support MCE that compatible with Intel's > "Machine-Check Architecture", so add support for Zhaoxin MCE in > mce/core.c. > > Signed-off-by: Tony W Wang-oc > --- > v2->v3: > - Make ifelse-case to switch-case > - Simplify Zhaoxin CPU FMS checking Btw, for future submissions - you don't have to do it now - search the web for threaded mails, look at git-send-email's manpage and especially the --thread and --chain-reply-to options. Also, look at lkml for examples. IOW, patchsets should have a 0/N message and all the others should be sent as a reply to that message, i.e., shallow threading, as the git-send-email manpage calls it. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette