On 7/14/2022 5:58 PM, Dan Williams wrote:
[..]
>>>
>>>>> However, the ARS engine likely can return the precise error ranges so I
>>>>> think the fix is to just use the address range indicated by 1UL <<
>>>>> MCI_MISC_ADDR_LSB(mce->misc) to filter the results from a short ARS
>>>>> scrub request to ask the device for the precise error list.
>>>>
>>>> You mean for nfit_handle_mce() callback to issue a short ARS per each
>>>> poison report over a 4K range
>>>
>>> Over a L1_CACHE_BYTES range...
>>>
[..]
>>>
>>> For the badrange tracking, no. So this would just be a check to say
>>> "Yes, CPU I see you think the whole 4K is gone, but lets double check
>>> with more precise information for what gets placed in the badrange
>>> tracking".
>>
>> Okay, process-wise, this is what I am seeing -
>>
>> - for each poison, nfit_handle_mce() issues a short ARS given (addr,
>> 64bytes)
> 
> Why would the short-ARS be performed over a 64-byte span when the MCE
> reported a 4K aligned event?

Cuz you said so, see above. :)  Yes, 4K range as reported by the MCE 
makes sense.

> 
>> - and short ARS returns to say that's actually (addr, 256bytes),
>> - and then nvdimm_bus_add_badrange() logs the poison in (addr, 512bytes)
>> anyway.
> 
> Right, I am reacting to the fact that the patch is picking 512 as an
> arbtitrary blast radius. It's ok to expand the blast radius from
> hardware when, for example, recording a 64-byte MCE in badrange which
> only understands 512 byte records, but it's not ok to take a 4K MCE and
> trim it to 512 bytes without asking hardware for a more precise report.

Agreed.

> 
> Recall that the NFIT driver supports platforms that may not offer ARS.
> In that case the 4K MCE from the CPU is all that the driver gets and
> there is no data source for a more precise answer.
> 
> So the ask is to avoid trimming the blast radius of MCE reports unless
> and until a short-ARS says otherwise.
> 

What happens to short ARS on a platform that doesn't support ARS?
-EOPNOTSUPPORTED ?

>> The precise badrange from short ARS is lost in the process, given the
>> time spent visiting the BIOS, what's the gain?
> 
> Generic support for not under-recording poison on platforms that do not
> support ARS.
> 
>> Could we defer the precise badrange until there is consumer of the
>> information?
> 
> Ideally the consumer is immediate and this precise information can make
> it to the filesystem which might be able to make a better decision about
> what data got clobbered.
> 
> See dax_notify_failure() infrastructure currently in linux-next that can
> convey poison events to filesystems. That might be a path to start
> tracking and reporting precise failure information to address the
> constraints of the badrange implementation.

Yes, I'm aware of dax_notify_failure(), but would appreciate if you 
don't mind to elaborate on how the code path could be leveraged for 
precise badrange implementation.
My understanding is that dax_notify_failure() is in the path of 
synchronous fault accompanied by SIGBUS with BUS_MCEERR_AR.
But badrange could be recorded without poison being consumed, even 
without DAX filesystem in the picture.

thanks,
-jane

Reply via email to