On Tue, 11 Nov 2025 14:08:13 +1000 Gavin Shan <[email protected]> wrote:
> Hi Igor and Jonathan, > > On 11/11/25 12:49 AM, Igor Mammedov wrote: > > On Thu, 6 Nov 2025 13:15:52 +1000 > > Gavin Shan <[email protected]> wrote: > >> On 11/6/25 12:14 AM, Jonathan Cameron wrote: > >>> On Wed, 5 Nov 2025 21:44:49 +1000 > >>> Gavin Shan <[email protected]> wrote: > >>> > >>>> In the situation where host and guest has 64KiB and 4KiB page sizes, > >>>> one problematic host page affects 16 guest pages. we need to send 16 > >>>> consective errors in this specific case. > >>>> > >>>> Extend acpi_ghes_memory_errors() to support multiple CPERs after the > >>>> hunk of code to generate the GHES error status is pulled out from > >>>> ghes_gen_err_data_uncorrectable_recoverable(). The status field of > >>>> generic error status block is also updated accordingly if multiple > >>>> error data entries are contained in the generic error status block. > >>>> > >>>> Signed-off-by: Gavin Shan <[email protected]> > >>> Hi Gavin, > >>> > >>> Mostly fine, but a few comments on the defines added and a > >>> question on what the multiple things are meant to mean? > >>> > >> > >> Thanks for your review and comments, replies as below. > >> > >>>> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c > >>>> index a9c08e73c0..527b85c8d8 100644 > >>>> --- a/hw/acpi/ghes.c > >>>> +++ b/hw/acpi/ghes.c > >>>> @@ -57,8 +57,12 @@ > >>>> /* The memory section CPER size, UEFI 2.6: N.2.5 Memory Error Section > >>>> */ > >>>> #define ACPI_GHES_MEM_CPER_LENGTH 80 > >>>> > >>>> -/* Masks for block_status flags */ > >>>> -#define ACPI_GEBS_UNCORRECTABLE 1 > >>>> +/* Bits for block_status flags */ > >>>> +#define ACPI_GEBS_UNCORRECTABLE 0 > >>>> +#define ACPI_GEBS_CORRECTABLE 1 > >>>> +#define ACPI_GEBS_MULTIPLE_UNCORRECTABLE 2 > >>>> +#define ACPI_GEBS_MULTIPLE_CORRECTABLE 3 > >>> > >>> So this maps to the bits in block status. > >>> > >>> I'm not actually sure what these multiple variants are meant to tell us. > >>> The multiple error blocks example referred to by the spec is a way to > >>> represent > >>> the same error applying to multiple places. So that's one error, many > >>> blocks. > >>> I have no idea if we set these bits in that case. > >>> > >>> Based on a quick look I don't think linux even takes any notice. THere > >>> are defines in actbl1.h but I'm not seeing any use made of them. > >>> > >> > >> I hope Igor can confirm since it was suggested by him. > >> > >> It's hard to understand how exactly these multiple variants are used from > >> the > >> spec. In ACPI 6.5 Table 18.11, it's explained as below. > >> > >> Bit [2] - Multiple Uncorrectable Errors: If set to one, indicates that more > >> than one uncorrectable errors have been detected. > >> > >> I don't see those multiple variants have been used by Linux. So I think > >> it's > >> safe to drop them. > > > > even though example describes 'same' error at different components, > > the bit fields descriptions doesn't set any limits on what 'more than one' > > means. > > > > Also from guest POV it's multiple different pages that we are reporting here > > as multiple CPERs. > > It seems to me that setting *_MULTIPLE_* here is correct thing to do. > > > > I don't have strong opinions. Lets keep to set _MULTIPLE_ flag if Jonathan > is fine. Again, this field isn't used by Linux guest. I don't care strongly. Maybe we should ask for a spec clarification as I doubt implementations will be consistent on this given the vague description and that Linux ignores it today. > > >>>> +#define ACPI_GEBS_ERROR_DATA_ENTRIES 4 > >>> > >>> This is bits 4-13 and the define isn't used. I'd drop it. > >>> > >> > >> The definition is used in acpi_ghes_memory_errors() of this patch. However, > >> I don't see it has been used by Linux. This field isn't used by Linux to > >> determine > >> the total number of error entries. So I think I can drop it either if Igor > >> is ok. > >> > > Lets keep this field either in next revision if Jonathan is fine. I'm fine with the field, but not the value. As far as I can tell form the spec, it should be a mask, not a single bit. > > Thanks, > Gavin > > >
