Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-25 Thread Kani, Toshimitsu
On Mon, 2017-07-24 at 20:30 +0200, Borislav Petkov wrote: : > > So I don't want to break existing users and thus make only explicitly > known platforms load ghes_edac. In the current case, the HPE > machines. All the rest will simply use the platform drivers and > nothing will change for them.

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Borislav Petkov
(Sending to your other mail address because there's some temporary resolution issue: msmtp: recipient address mche...@s-opensource.com not accepted by the server msmtp: server message: 451 4.3.0 : Temporary lookup failure msmtp: could not send mail (account alien8.de from /home/boris/.msmtprc)

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Borislav Petkov
On Mon, Jul 24, 2017 at 05:54:52PM +, Kani, Toshimitsu wrote: > Umm... I was under impression that we are adding the OSC bit check in > addition to the current GHES filtering. Read the parallel subthread again. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. --

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Kani, Toshimitsu
On Mon, 2017-07-24 at 14:56 -0300, Mauro Carvalho Chehab wrote: > Em Mon, 24 Jul 2017 15:56:27 + : > That's probably too late for me as I received a new HP machine > we bought just last week, but for the next time I would need to > get a new hardware, what would be the non-RAS equivalent to >

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Mauro Carvalho Chehab
Em Mon, 24 Jul 2017 18:44:00 +0200 Borislav Petkov escreveu: > On Mon, Jul 24, 2017 at 01:04:13PM -0300, Mauro Carvalho Chehab wrote: > > If the Kernel force those users to use ghes_edac by default, > > they they won't see the error counts anymore, but, instead, > > hardware reports that the

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Mauro Carvalho Chehab
Em Mon, 24 Jul 2017 15:56:27 + "Kani, Toshimitsu" escreveu: > On Mon, 2017-07-24 at 17:37 +0200, Borislav Petkov wrote: > > On Mon, Jul 24, 2017 at 03:25:34PM +, Kani, Toshimitsu wrote: > : > > > > > We've been providing this model for many years now. > > > > Dude, relax, I'm

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Kani, Toshimitsu
On Mon, 2017-07-24 at 20:50 +0300, Boris Petkov wrote: > On July 24, 2017 8:44:03 PM GMT+03:00, "Kani, Toshimitsu" @hpe.com> wrote: > > I assumed our platforms w/o build-in RAS do not implement GHES, > > If we make it a normal module, it will be decoupled from GHES and it > will rely only on the

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Boris Petkov
On July 24, 2017 8:44:03 PM GMT+03:00, "Kani, Toshimitsu" wrote: >I assumed our platforms w/o build-in RAS do not implement GHES, If we make it a normal module, it will be decoupled from GHES and it will rely only on the whitelist to load. -- Sent from a small device: formatting sux and

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Kani, Toshimitsu
On Mon, 2017-07-24 at 18:37 +0200, Borislav Petkov wrote: > On Mon, Jul 24, 2017 at 03:56:27PM +, Kani, Toshimitsu wrote: > > Yes, Mauro has already pointed this out.  As I replied to him, we > > do have a separate series of platforms that do not have built-in > > RAS, and > > So this

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Borislav Petkov
On Mon, Jul 24, 2017 at 01:04:13PM -0300, Mauro Carvalho Chehab wrote: > If the Kernel force those users to use ghes_edac by default, > they they won't see the error counts anymore, but, instead, > hardware reports that the memories need to be replaced. This is exactly why I'm trying to load

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Borislav Petkov
On Mon, Jul 24, 2017 at 03:56:27PM +, Kani, Toshimitsu wrote: > Yes, Mauro has already pointed this out. As I replied to him, we do > have a separate series of platforms that do not have built-in RAS, and So this whitelist entry +static struct acpi_oemlist oemlist[] = { + {"HPE ",

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Mauro Carvalho Chehab
Em Mon, 24 Jul 2017 17:37:16 +0200 Borislav Petkov escreveu: > > Customers do not see error counts.  I do not think it's bogus. > > I am just trying to enable OS error reporting with ghes_edac. > > I know, you don't have to state the obvious constantly. The problem I see is that, currently,

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Kani, Toshimitsu
On Mon, 2017-07-24 at 17:37 +0200, Borislav Petkov wrote: > On Mon, Jul 24, 2017 at 03:25:34PM +, Kani, Toshimitsu wrote: : > > > We've been providing this model for many years now. > > Dude, relax, I'm only trying to point out to you that there are > customers who want to see *every* error

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Borislav Petkov
On Mon, Jul 24, 2017 at 03:25:34PM +, Kani, Toshimitsu wrote: > Customers do not see error counts.  I do not think it's bogus. Not showing the real error error counts but something contrived is the definition of bogus numbers. But you're not showing anything - only when some thresholds are

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Kani, Toshimitsu
On Mon, 2017-07-24 at 17:04 +0200, Borislav Petkov wrote: > On Mon, Jul 24, 2017 at 02:49:30PM +, Kani, Toshimitsu wrote: > > We do not tell the error counts to customers. > > Please read what I said: do you tell your customers that the error > counts they're seeing (or are *not* seeing) is

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Borislav Petkov
On Mon, Jul 24, 2017 at 02:49:30PM +, Kani, Toshimitsu wrote: > We do not tell the error counts to customers. Please read what I said: do you tell your customers that the error counts they're seeing (or are *not* seeing) is bogus because the BIOS is hiding them? Not the *actual* numbers! >

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Kani, Toshimitsu
On Sat, 2017-07-22 at 08:28 +0200, Borislav Petkov wrote: > On Fri, Jul 21, 2017 at 06:38:52PM +, Kani, Toshimitsu wrote: > > Enterprise platforms have very different model (I do not say it's > > better for everyone from the cost perspective).  Typically, such > > But you do tell your

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-22 Thread Borislav Petkov
On Fri, Jul 21, 2017 at 06:38:52PM +, Kani, Toshimitsu wrote: > Enterprise platforms have very different model (I do not say it's > better for everyone from the cost perspective). Typically, such But you do tell your customers that the error counts they see are not really what *actually*

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Kani, Toshimitsu
On Fri, 2017-07-21 at 19:23 +0200, Borislav Petkov wrote: : > Not only that: thresholds depend on the DIMM types which means, BIOS > must know what DIMM types are in there which I doubt. BIOS knows DIMM model from the SPD data. > So exposing that to configuration instead of "deciding" for

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Borislav Petkov
On Fri, Jul 21, 2017 at 02:01:31PM -0300, Mauro Carvalho Chehab wrote: > I see the value of having a threshold in BIOS, provided that it is > well documented, and whose value can be adjusted, if needed. > > One of the things I wanted to implement in ras-daemon were an > algorithm that would be

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Kani, Toshimitsu
On Fri, 2017-07-21 at 14:01 -0300, Mauro Carvalho Chehab wrote: > Em Fri, 21 Jul 2017 16:40:20 + > "Kani, Toshimitsu" escreveu: > > > On Fri, 2017-07-21 at 12:44 -0300, Mauro Carvalho Chehab wrote: > > > Em Fri, 21 Jul 2017 15:34:50 + > > > "Kani, Toshimitsu" escreveu: > > >    > > > >

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Mauro Carvalho Chehab
Em Fri, 21 Jul 2017 16:40:20 + "Kani, Toshimitsu" escreveu: > On Fri, 2017-07-21 at 12:44 -0300, Mauro Carvalho Chehab wrote: > > Em Fri, 21 Jul 2017 15:34:50 + > > "Kani, Toshimitsu" escreveu: > > > > > On Fri, 2017-07-21 at 17:13 +0200, Borislav Petkov wrote: > > > > On Fri, Jul

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Kani, Toshimitsu
On Fri, 2017-07-21 at 12:44 -0300, Mauro Carvalho Chehab wrote: > Em Fri, 21 Jul 2017 15:34:50 + > "Kani, Toshimitsu" escreveu: > > > On Fri, 2017-07-21 at 17:13 +0200, Borislav Petkov wrote: > > > On Fri, Jul 21, 2017 at 03:08:41PM +, Kani, Toshimitsu > > > wrote:   > > > > Yes, that is

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Kani, Toshimitsu
On Fri, 2017-07-21 at 17:53 +0200, Borislav Petkov wrote: > On Fri, Jul 21, 2017 at 03:34:50PM +, Kani, Toshimitsu wrote: > > I suppose it'd depend on vendors, but I do not think users can do > > it properly unless they have depth knowledge about the hardware. > > I'm talking about a menu in

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Borislav Petkov
On Fri, Jul 21, 2017 at 03:34:50PM +, Kani, Toshimitsu wrote: > I suppose it'd depend on vendors, but I do not think users can do it > properly unless they have depth knowledge about the hardware. I'm talking about a menu in the BIOS where you can set the thresholding levels on the system.

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Mauro Carvalho Chehab
Em Fri, 21 Jul 2017 15:34:50 + "Kani, Toshimitsu" escreveu: > On Fri, 2017-07-21 at 17:13 +0200, Borislav Petkov wrote: > > On Fri, Jul 21, 2017 at 03:08:41PM +, Kani, Toshimitsu wrote: > > > Yes, that is correct.  Corrected errors are reported to the OS when > > > they exceeded the

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Kani, Toshimitsu
On Fri, 2017-07-21 at 17:13 +0200, Borislav Petkov wrote: > On Fri, Jul 21, 2017 at 03:08:41PM +, Kani, Toshimitsu wrote: > > Yes, that is correct.  Corrected errors are reported to the OS when > > they exceeded the platform's threshold. > > Are those thresholds user-configurable? I suppose

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Borislav Petkov
On Fri, Jul 21, 2017 at 03:08:41PM +, Kani, Toshimitsu wrote: > Yes, that is correct. Corrected errors are reported to the OS when > they exceeded the platform's threshold. Are those thresholds user-configurable? If not, what are you telling users who want to see *every* corrected error for

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Kani, Toshimitsu
On Fri, 2017-07-21 at 15:47 +0200, Borislav Petkov wrote: > On Fri, Jul 21, 2017 at 10:40:01AM -0300, Mauro Carvalho Chehab > wrote: > > What happens when the error can be corrected? Does it still report > > it to userspace, or just silently hide the error? > > > > If I remember well about a past

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Borislav Petkov
On Fri, Jul 21, 2017 at 10:40:01AM -0300, Mauro Carvalho Chehab wrote: > What happens when the error can be corrected? Does it still report it to > userspace, or just silently hide the error? > > If I remember well about a past discussion with some vendor, I was told > that the firmware can hide

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Mauro Carvalho Chehab
Em Fri, 21 Jul 2017 15:34:41 +0200 Borislav Petkov escreveu: > On Thu, Jul 20, 2017 at 07:50:03PM +, Kani, Toshimitsu wrote: > > GHES / firmware-first still requires OS recovery actions when an error > > cannot be corrected by the platform. They are handled by ghes_proc(), > > and ghes_edac

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Borislav Petkov
On Thu, Jul 20, 2017 at 07:50:03PM +, Kani, Toshimitsu wrote: > GHES / firmware-first still requires OS recovery actions when an error > cannot be corrected by the platform. They are handled by ghes_proc(), > and ghes_edac remains its error-reporting wrapper. I mean all the recovery actions

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Kani, Toshimitsu
On Thu, 2017-07-20 at 17:15 -0300, Mauro Carvalho Chehab wrote: > Em Thu, 20 Jul 2017 19:50:03 + > "Kani, Toshimitsu" escreveu: : > > Firmware has better knowledge about the platform and can provide > > better RAS when implemented properly.  I agree that user > > experiences may vary on

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Mauro Carvalho Chehab
Em Thu, 20 Jul 2017 19:50:03 + "Kani, Toshimitsu" escreveu: > On Thu, 2017-07-20 at 06:33 +0200, Borislav Petkov wrote: > > On Wed, Jul 19, 2017 at 04:40:25PM +, Kani, Toshimitsu wrote: > > >  ghes_edac allows to report errors to OS management tools like > > > rasdaemon in addition to

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Kani, Toshimitsu
On Thu, 2017-07-20 at 06:33 +0200, Borislav Petkov wrote: > On Wed, Jul 19, 2017 at 04:40:25PM +, Kani, Toshimitsu wrote: > >  ghes_edac allows to report errors to OS management tools like > > rasdaemon in addition to platform- specific managements. > > So ghes_edac *is* a poor man's driver

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Mauro Carvalho Chehab
Em Thu, 20 Jul 2017 19:05:04 +0200 Borislav Petkov escreveu: > On Thu, Jul 20, 2017 at 04:55:59PM +, Luck, Tony wrote: > > Add a module parameter to those edac drivers that can override the check > > and let them load anyway. I'm not paranoid, I just assume that there is a > > BIOS > > out

RE: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Luck, Tony
> Or add that parameter to edac_core.ko and let it control which EDAC > driver gets loaded? Something like > > edac=ignore_ghes > > or so. And then the other EDAC drivers query it. Sure ... one central place is better than adding code to each driver. -Tony

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Borislav Petkov
On Thu, Jul 20, 2017 at 04:55:59PM +, Luck, Tony wrote: > Add a module parameter to those edac drivers that can override the check > and let them load anyway. I'm not paranoid, I just assume that there is a > BIOS > out there that sets the OSC/WHEA bits, but isn't generating useful GHES

RE: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Luck, Tony
>> Yes, the following message is shown on HP systems. Please note that >> WHEA is a Windows-defined interface. > > Ok, so let's couple ghes_edac loading to that and see how far we could > go. I guess we should add checks for that to the major x86 EDAC drivers > to not load and this way ghes_edac

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Borislav Petkov
On Thu, Jul 20, 2017 at 02:42:25PM +, Kani, Toshimitsu wrote: > Yes, the following message is shown on HP systems. Please note that > WHEA is a Windows-defined interface. Ok, so let's couple ghes_edac loading to that and see how far we could go. I guess we should add checks for that to the

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Kani, Toshimitsu
On Thu, 2017-07-20 at 06:16 +0200, Borislav Petkov wrote: > On Wed, Jul 19, 2017 at 04:56:17PM +, Kani, Toshimitsu wrote: > > Since ghes_edac has not been used for a long time, I have a feeling > > that not so many vendors want to use it.  In the case of HPE, we do > > not need to update with

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Borislav Petkov
On Wed, Jul 19, 2017 at 04:40:25PM +, Kani, Toshimitsu wrote: > ghes_edac allows to report errors to OS management tools like > rasdaemon in addition to platform- specific managements. So ghes_edac *is* a poor man's driver in the sense that it doesn't do anything fancy but repeat like a

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Borislav Petkov
On Wed, Jul 19, 2017 at 02:55:08PM -0400, Aristeu Rozanski wrote: > That would also need to keep an eye on versions. A newer version of BIOS > on a whitelisted platform might be broken. Yeah, that would be a nasty, back-stabbing SNAFU. So I'm thinking of adding a bunch of FW_ERR sanity checks to

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Borislav Petkov
On Wed, Jul 19, 2017 at 04:56:17PM +, Kani, Toshimitsu wrote: > Since ghes_edac has not been used for a long time, I have a feeling > that not so many vendors want to use it. In the case of HPE, we do not > need to update with each platform since "HPE" "Server" will cover all > platforms we

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Kani, Toshimitsu
On Wed, 2017-07-19 at 14:55 -0400, Aristeu Rozanski wrote: > On Wed, Jul 19, 2017 at 06:22:04PM +0200, Borislav Petkov wrote: > > On Wed, Jul 19, 2017 at 04:10:07PM +, Kani, Toshimitsu wrote: > > > I do prefer to avoid any white / black listing.  But I do not see > > > how > > > it solves the

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Aristeu Rozanski
On Wed, Jul 19, 2017 at 06:22:04PM +0200, Borislav Petkov wrote: > On Wed, Jul 19, 2017 at 04:10:07PM +, Kani, Toshimitsu wrote: > > I do prefer to avoid any white / black listing. But I do not see how > > it solves the buggy DMI/SMBIOS info as an example of firmware bugs we > > may have to

RE: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Luck, Tony
>> Later when GHES gives you a NODE/CARD/MODULE) in an error record. You need >> to match these up. But SMBIOS only gave you two strings "Locator" and "Bank >> Locator" which have no defined syntax. You are at the mercy of the BIOS >> writer >> to put in something parseable. > > Well, at some

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Kani, Toshimitsu
On Wed, 2017-07-19 at 18:22 +0200, Borislav Petkov wrote: > On Wed, Jul 19, 2017 at 04:10:07PM +, Kani, Toshimitsu wrote: > > I do prefer to avoid any white / black listing.  But I do not see > > how it solves the buggy DMI/SMBIOS info as an example of firmware > > bugs we may have to deal

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Kani, Toshimitsu
On Tue, 2017-07-18 at 18:15 -0300, Mauro Carvalho Chehab wrote: > Em Tue, 18 Jul 2017 19:58:54 + : > We had a similar discussion several years ago when I wrote this > driver. On that time, I talked with Red Hat, HP, Dell, Intel people > and with some customers with large clusters. > > The

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Borislav Petkov
On Wed, Jul 19, 2017 at 04:10:07PM +, Kani, Toshimitsu wrote: > I do prefer to avoid any white / black listing. But I do not see how > it solves the buggy DMI/SMBIOS info as an example of firmware bugs we > may have to deal with. So how do you want to deal with this? Maintain an evergrowing
