Hi Shiju,
On 06/10/2020 17:13, Shiju Jose wrote:
[...]
> Please find the following pseudo code we added for the kernel side to make sure
> we correctly understand your suggestions.
>
> 1. Create edac device and edac device sysfs entries for the online CPU caches.
> /drivers/edac/edac_device.c
>
>Subject: Re: [RFC PATCH 0/7] RAS/CEC: Extend CEC for errors count check on
>short time period
>
>Hi Shiju,
>
>On 02/10/2020 16:38, Shiju Jose wrote:
>>> -----Original Message-----
On Fri, Oct 02, 2020 at 06:33:17PM +0100, James Morse wrote:
> > I think adding the CPU error collection to the kernel
> > has the following advantages:
> > 1. The CPU error collection and isolation would not be active if
> > rasdaemon stopped running or was not running on a machine.
>> On Fri, Oct 02, 2020 at 01:22:28PM +0100, Shiju
> Because, from my limited experience with x86 CPUs, the cache arrays are mostly
> fine and errors reported there are not something that happens very
> frequently, so we don't even need to collect and count those.
On Intel x86 we leave the counting and threshold decisions about cache
health to the
>On Fri, Oct 02, 2020 at 01:22:28PM +0100, Shiju Jose wrote:
>> Open Questions based on the feedback from Boris, 1. ARM process
On Fri, Oct 02, 2020 at 01:22:28PM +0100, Shiju Jose wrote:
> Open Questions based on the feedback from Boris,
> 1. ARM processor error types are cache/TLB/bus errors.
> [Reference: N2.4.4.1 ARM Processor Error Information, UEFI Spec v2.8]
> Any of the above error types should not be considered for