Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-27 Thread Borislav Petkov
On Fri, Jan 26, 2024 at 10:01:29PM +0000, Luck, Tony wrote: > PPIN: Nice to have. People that send stuff to me are terrible about > providing surrounding details. The record already includes > CPUID(1).EAX ... so I can at least skip the step of asking them which > CPU family/model/stepping they

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Luck, Tony
> But no, that's not the right question to ask. > > It is rather: which bits of information are very relevant to an error > record and which are transient enough so that they cannot be gathered > from a system by other means or only gathered in a difficult way, and > should be part of that record.

Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Borislav Petkov
On Fri, Jan 26, 2024 at 08:49:03PM +0000, Luck, Tony wrote: > Every patch that adds new code or data structures adds to the kernel > memory footprint. Each should be considered on its merits. The basic > question being: > >"Is the new functionality worth the cost?" > > Where does it end? It

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Luck, Tony
> > Is it so very different to add this to a trace record so that rasdaemon > > can have feature parity with mcelog(8)? > > I knew you were gonna say that. When someone decides that it is > a splendid idea to add more stuff to struct mce then said someone would > want it in the tracepoint too. > >

Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Borislav Petkov
On Fri, Jan 26, 2024 at 07:15:50PM +0000, Luck, Tony wrote: > If deployment of a microcode update across a fleet always went > flawlessly, life would be simpler. But things can fail. And maybe the > failure wasn't noticed. Perhaps a node was rebooting when the sysadmin > pushed the update to the

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Luck, Tony
> > You've spent enough time with Ashok and Thomas tweaking the Linux > > microcode driver to know that going back to the machine the next day > > to ask about microcode version has a bunch of ways to get a wrong > > answer. > > Huh, what does that have to do with this? If deployment of a

Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Borislav Petkov
On Fri, Jan 26, 2024 at 05:10:20PM +0000, Luck, Tony wrote: > 12 extra bytes divided by (say) 64GB (a very small server these days, my > laptop has that much) > = 0.00000001746% > > We will need 57000 changes like this one before we get to 0.001% :-) You're forgetting that those 12 bytes

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Luck, Tony
> > 8 bytes for PPIN, 4 more for microcode. > > I know, nothing leads to bloat like 0.01% here, 0.001% there... 12 extra bytes divided by (say) 64GB (a very small server these days, my laptop has that much) = 0.00000001746% We will need 57000 changes like this one before we get to 0.001%
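
The footprint arithmetic in the message above can be checked with a short calculation. This is only a sketch of that one estimate, assuming "64GB" means 64 GiB (64 * 2**30 bytes) and that the 12 bytes are counted once against total memory, as the message does. The function and constant names are hypothetical and do not come from the patch or the kernel.

```python
import math

# Assumption: "64GB" in the message is read as 64 GiB.
TOTAL_BYTES = 64 * 2**30
# From the message: 8 bytes for PPIN plus 4 bytes for the microcode revision.
EXTRA_BYTES = 8 + 4


def footprint_percent(extra_bytes: int, total_bytes: int) -> float:
    """Return extra_bytes as a percentage of total_bytes."""
    return extra_bytes / total_bytes * 100.0


def changes_to_reach(target_percent: float, extra_bytes: int, total_bytes: int) -> int:
    """Return how many changes of extra_bytes each are needed to reach target_percent."""
    return math.ceil(target_percent / footprint_percent(extra_bytes, total_bytes))


if __name__ == "__main__":
    pct = footprint_percent(EXTRA_BYTES, TOTAL_BYTES)
    print(f"one change: {pct:.11f}% of memory")
    print(f"changes needed for 0.001%: {changes_to_reach(0.001, EXTRA_BYTES, TOTAL_BYTES)}")
```

Under these assumptions one such change comes to roughly 1.746e-8 percent of memory, which is consistent with the "57000 changes" figure quoted for reaching 0.001%.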

Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Borislav Petkov
On Thu, Jan 25, 2024 at 07:19:22PM +0000, Luck, Tony wrote: > 8 bytes for PPIN, 4 more for microcode. I know, nothing leads to bloat like 0.01% here, 0.001% there... > Number of recoverable machine checks per system I hope the > monthly rate should be countable on my fingers... That's not

Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-25 Thread Naik, Avadhut
Hi, On 1/25/2024 1:19 PM, Luck, Tony wrote: >>> The first patch adds PPIN (Protected Processor Inventory Number) field to >>> the tracepoint. >>> >>> The second patch adds the microcode field (Microcode Revision) to the >>> tracepoint. >> >> This is a lot of static information to add to *every*

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-25 Thread Luck, Tony
> > The first patch adds PPIN (Protected Processor Inventory Number) field to > > the tracepoint. > > > > The second patch adds the microcode field (Microcode Revision) to the > > tracepoint. > > This is a lot of static information to add to *every* MCE. 8 bytes for PPIN, 4 more for microcode.

Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-25 Thread Borislav Petkov
On Thu, Jan 25, 2024 at 12:48:55PM -0600, Avadhut Naik wrote: > This patchset updates the mce_record tracepoint so that the recently added > fields of struct mce are exported through it to userspace. > > The first patch adds PPIN (Protected Processor Inventory Number) field to > the tracepoint. >