Re: Phantom PMEM poison issue

2022-01-21 Thread Jane Chu
On 1/21/2022 5:51 PM, Tsaur, Erwin wrote:
> Hi Jane,
> 
> Is a phantom error a poison that was injected and then cleared, but somehow
> shows up again?
> How does "daxfs takes action and clears the poison" work: by mailbox or by writes?
> Also, how are you doing ARS?

The phantom shows up as soon as this console message from 'ghes' appears:
[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1

The poisons were cleared via pmem_clear_poison().

ARS was run as
   "ndctl start-scrub; ndctl wait-scrub -p 30"

thanks,
-jane


> 
> Erwin
> 
> -----Original Message-----
> From: Luck, Tony 
> Sent: Friday, January 21, 2022 5:27 PM
> To: chu, jane 
> Cc: Williams, Dan J ; b...@alien8.de >> Borislav 
> Petkov ; djw...@kernel.org; wi...@infradead.org; 
> nvd...@lists.linux.dev; linux-kernel@vger.kernel.org
> Subject: Re: Phantom PMEM poison issue
> 
> On Sat, Jan 22, 2022 at 12:40:18AM +, Jane Chu wrote:
>> On 1/21/2022 4:31 PM, Jane Chu wrote:
>>> On baremetal Intel platform with DCPMEM installed and configured to
>>> provision daxfs, say a poison was consumed by a load from a user
>>> thread, and then daxfs takes action and clears the poison, confirmed
>>> by "ndctl -NM".
>>>
>>> Now, depends on the luck, after sometime(from a few seconds to 5+
>>> hours) the ghost of the previous poison will surface, and it takes
>>> unload/reload the libnvdimm drivers in order to drive the phantom
>>> poison away, confirmed by ARS.
>>>
>>> Turns out, the issue is quite reproducible with the latest stable Linux.
>>>
>>> Here is the relevant console message after injected 8 poisons in one
>>> page via
>>>  # ndctl inject-error namespace0.0 -n 2 -B 8210
>>
>> There is a cut-n-paste error, the above line should be
>> "# ndctl inject-error namespace0.0 -n 8 -B 8210"
> 
> You say "in one page" here. What is the page size?
>>
>> -jane
>>
>>> then, cleared them all, and wait for 5+ hours, notice the time stamp.
>>> BTW, the system is idle otherwise.
>>>
>>> [ 2439.742296] mce: Uncorrected hardware memory error in user-access
>>> at
>>> 1850602400
>>> [ 2439.742420] Memory failure: 0x1850602: Sending SIGBUS to
>>> fsdax_poison_v1:8457 due to hardware memory corruption [
>>> 2439.761866] Memory failure: 0x1850602: recovery action for dax page:
>>> Recovered
>>> [ 2439.769949] mce: [Hardware Error]: Machine check events logged
>>> -1850603000 uncached-minus<->write-back [ 2439.769984] x86/PAT:
>>> memtype_reserve failed [mem 0x1850602000-0x1850602fff], track
>>> uncached-minus, req uncached-minus [ 2439.769985] Could not
>>> invalidate pfn=0x1850602 from 1:1 map [ 2440.856351] x86/PAT:
>>> fsdax_poison_v1:8457 freeing invalid memtype [mem
>>> 0x1850602000-0x1850602fff]
> 
> This error is reported in PFN=1850602 (at offset 0x400 = 1K)
> 
>>>
>>> At this point,
>>> # ndctl list -NMu -r 0
>>> {
>>>  "dev":"namespace0.0",
>>>  "mode":"fsdax",
>>>  "map":"dev",
>>>  "size":"15.75 GiB (16.91 GB)",
>>>  "uuid":"2ccc540a-3c7b-4b91-b87b-9e897ad0b9bb",
>>>  "sector_size":4096,
>>>  "align":2097152,
>>>  "blockdev":"pmem0"
>>> }
>>>
>>> [21351.992296] {2}[Hardware Error]: Hardware error from APEI Generic
>>> Hardware Error Source: 1 [21352.001528] {2}[Hardware Error]: event
>>> severity: recoverable [21352.007838] {2}[Hardware Error]:  Error 0,
>>> type: recoverable
>>> [21352.014156] {2}[Hardware Error]:   section_type: memory error
>>> [21352.020572] {2}[Hardware Error]:   physical_address: 0x001850603200
> 
> This error is in the following page: PFN=1850603 (at offset 0x200 = 512b)
> 
> Is that what you mean by "phantom error" ... from a different address from 
> those that were injected?
> 
> -Tony
> 



Re: Phantom PMEM poison issue

2022-01-21 Thread Jane Chu
On 1/21/2022 5:27 PM, Luck, Tony wrote:
> On Sat, Jan 22, 2022 at 12:40:18AM +, Jane Chu wrote:
>> On 1/21/2022 4:31 PM, Jane Chu wrote:
>>> On baremetal Intel platform with DCPMEM installed and configured to
>>> provision daxfs, say a poison was consumed by a load from a user thread,
>>> and then daxfs takes action and clears the poison, confirmed by "ndctl
>>> -NM".
>>>
>>> Now, depends on the luck, after sometime(from a few seconds to 5+ hours)
>>> the ghost of the previous poison will surface, and it takes
>>> unload/reload the libnvdimm drivers in order to drive the phantom poison
>>> away, confirmed by ARS.
>>>
>>> Turns out, the issue is quite reproducible with the latest stable Linux.
>>>
>>> Here is the relevant console message after injected 8 poisons in one
>>> page via
>>>  # ndctl inject-error namespace0.0 -n 2 -B 8210
>>
>> There is a cut-n-paste error, the above line should be
>> "# ndctl inject-error namespace0.0 -n 8 -B 8210"
> 
> You say "in one page" here. What is the page size?

The page size is 4K, the size of a base page on x86.
I said "one page" because 8 (poisons) * 256B = 2KiB, only half a page.

>>
>> -jane
>>
>>> then, cleared them all, and wait for 5+ hours, notice the time stamp.
>>> BTW, the system is idle otherwise.
>>>
>>> [ 2439.742296] mce: Uncorrected hardware memory error in user-access at
>>> 1850602400
>>> [ 2439.742420] Memory failure: 0x1850602: Sending SIGBUS to
>>> fsdax_poison_v1:8457 due to hardware memory corruption
>>> [ 2439.761866] Memory failure: 0x1850602: recovery action for dax page:
>>> Recovered
>>> [ 2439.769949] mce: [Hardware Error]: Machine check events logged
>>> -1850603000 uncached-minus<->write-back
>>> [ 2439.769984] x86/PAT: memtype_reserve failed [mem
>>> 0x1850602000-0x1850602fff], track uncached-minus, req uncached-minus
>>> [ 2439.769985] Could not invalidate pfn=0x1850602 from 1:1 map
>>> [ 2440.856351] x86/PAT: fsdax_poison_v1:8457 freeing invalid memtype
>>> [mem 0x1850602000-0x1850602fff]
> 
> This error is reported in PFN=1850602 (at offset 0x400 = 1K)

yes.
> 
>>>
>>> At this point,
>>> # ndctl list -NMu -r 0
>>> {
>>>  "dev":"namespace0.0",
>>>  "mode":"fsdax",
>>>  "map":"dev",
>>>  "size":"15.75 GiB (16.91 GB)",
>>>  "uuid":"2ccc540a-3c7b-4b91-b87b-9e897ad0b9bb",
>>>  "sector_size":4096,
>>>  "align":2097152,
>>>  "blockdev":"pmem0"
>>> }
>>>
>>> [21351.992296] {2}[Hardware Error]: Hardware error from APEI Generic
>>> Hardware Error Source: 1
>>> [21352.001528] {2}[Hardware Error]: event severity: recoverable
>>> [21352.007838] {2}[Hardware Error]:  Error 0, type: recoverable
>>> [21352.014156] {2}[Hardware Error]:   section_type: memory error
>>> [21352.020572] {2}[Hardware Error]:   physical_address: 0x001850603200
> 
> This error is in the following page: PFN=1850603 (at offset 0x200 = 512b)
> 

I see, this is the next page... the issue is reproducible with
a single poison injection.
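
As a quick check of the page arithmetic discussed here, a minimal Python
sketch (assuming 4 KiB base pages, as stated in this thread; the helper name
split_phys_addr is made up for illustration) splits the two physical
addresses from the logs into PFN and in-page offset:

```python
PAGE_SHIFT = 12  # assumption: 4 KiB base pages, as stated in this thread


def split_phys_addr(addr):
    """Return (pfn, offset_in_page) for a physical address."""
    return addr >> PAGE_SHIFT, addr & ((1 << PAGE_SHIFT) - 1)


# Address from the "mce: Uncorrected hardware memory error" line.
consumed = split_phys_addr(0x1850602400)
# Address from the later GHES "physical_address" line.
reported = split_phys_addr(0x1850603200)

print([hex(v) for v in consumed])   # ['0x1850602', '0x400']
print([hex(v) for v in reported])   # ['0x1850603', '0x200']
print(reported[0] - consumed[0])    # 1, i.e. the following page
```

The two PFNs differ by one, which matches the "next page" observation.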

> Is that what you mean by "phantom error" ... from a different
> address from those that were injected?

All 8 poisons were cleared by the driver via DSM, and that was verified
by "ndctl -NMu -r 0", which covers every page in the 16GiB /dev/pmem.

It's phantom because unloading and reloading libnvdimm, followed by a
full ARS scan, confirms the poison isn't there.

thanks,
-jane

> 
> -Tony



RE: Phantom PMEM poison issue

2022-01-21 Thread Tsaur, Erwin
Hi Jane,

Is a phantom error a poison that was injected and then cleared, but somehow
shows up again?
How does "daxfs takes action and clears the poison" work: by mailbox or by writes?
Also, how are you doing ARS?

Erwin

-----Original Message-----
From: Luck, Tony  
Sent: Friday, January 21, 2022 5:27 PM
To: chu, jane 
Cc: Williams, Dan J ; b...@alien8.de >> Borislav 
Petkov ; djw...@kernel.org; wi...@infradead.org; 
nvd...@lists.linux.dev; linux-kernel@vger.kernel.org
Subject: Re: Phantom PMEM poison issue

On Sat, Jan 22, 2022 at 12:40:18AM +, Jane Chu wrote:
> On 1/21/2022 4:31 PM, Jane Chu wrote:
> > On baremetal Intel platform with DCPMEM installed and configured to 
> > provision daxfs, say a poison was consumed by a load from a user 
> > thread, and then daxfs takes action and clears the poison, confirmed 
> > by "ndctl -NM".
> > 
> > Now, depends on the luck, after sometime(from a few seconds to 5+ 
> > hours) the ghost of the previous poison will surface, and it takes 
> > unload/reload the libnvdimm drivers in order to drive the phantom 
> > poison away, confirmed by ARS.
> > 
> > Turns out, the issue is quite reproducible with the latest stable Linux.
> > 
> > Here is the relevant console message after injected 8 poisons in one 
> > page via
> > # ndctl inject-error namespace0.0 -n 2 -B 8210
> 
> There is a cut-n-paste error, the above line should be
>"# ndctl inject-error namespace0.0 -n 8 -B 8210"

You say "in one page" here. What is the page size? 
> 
> -jane
> 
> > then, cleared them all, and wait for 5+ hours, notice the time stamp.
> > BTW, the system is idle otherwise.
> > 
> > [ 2439.742296] mce: Uncorrected hardware memory error in user-access 
> > at
> > 1850602400
> > [ 2439.742420] Memory failure: 0x1850602: Sending SIGBUS to
> > fsdax_poison_v1:8457 due to hardware memory corruption [ 
> > 2439.761866] Memory failure: 0x1850602: recovery action for dax page:
> > Recovered
> > [ 2439.769949] mce: [Hardware Error]: Machine check events logged
> > -1850603000 uncached-minus<->write-back [ 2439.769984] x86/PAT: 
> > memtype_reserve failed [mem 0x1850602000-0x1850602fff], track 
> > uncached-minus, req uncached-minus [ 2439.769985] Could not 
> > invalidate pfn=0x1850602 from 1:1 map [ 2440.856351] x86/PAT: 
> > fsdax_poison_v1:8457 freeing invalid memtype [mem 
> > 0x1850602000-0x1850602fff]

This error is reported in PFN=1850602 (at offset 0x400 = 1K)

> > 
> > At this point,
> > # ndctl list -NMu -r 0
> > {
> > "dev":"namespace0.0",
> > "mode":"fsdax",
> > "map":"dev",
> > "size":"15.75 GiB (16.91 GB)",
> > "uuid":"2ccc540a-3c7b-4b91-b87b-9e897ad0b9bb",
> > "sector_size":4096,
> > "align":2097152,
> > "blockdev":"pmem0"
> > }
> > 
> > [21351.992296] {2}[Hardware Error]: Hardware error from APEI Generic 
> > Hardware Error Source: 1 [21352.001528] {2}[Hardware Error]: event 
> > severity: recoverable [21352.007838] {2}[Hardware Error]:  Error 0, 
> > type: recoverable
> > [21352.014156] {2}[Hardware Error]:   section_type: memory error
> > [21352.020572] {2}[Hardware Error]:   physical_address: 0x001850603200

This error is in the following page: PFN=1850603 (at offset 0x200 = 512b)

Is that what you mean by "phantom error" ... from a different address from 
those that were injected?

-Tony




Re: Phantom PMEM poison issue

2022-01-21 Thread Luck, Tony
On Sat, Jan 22, 2022 at 12:40:18AM +, Jane Chu wrote:
> On 1/21/2022 4:31 PM, Jane Chu wrote:
> > On baremetal Intel platform with DCPMEM installed and configured to
> > provision daxfs, say a poison was consumed by a load from a user thread,
> > and then daxfs takes action and clears the poison, confirmed by "ndctl
> > -NM".
> > 
> > Now, depends on the luck, after sometime(from a few seconds to 5+ hours)
> > the ghost of the previous poison will surface, and it takes
> > unload/reload the libnvdimm drivers in order to drive the phantom poison
> > away, confirmed by ARS.
> > 
> > Turns out, the issue is quite reproducible with the latest stable Linux.
> > 
> > Here is the relevant console message after injected 8 poisons in one
> > page via
> > # ndctl inject-error namespace0.0 -n 2 -B 8210
> 
> There is a cut-n-paste error, the above line should be
>"# ndctl inject-error namespace0.0 -n 8 -B 8210"

You say "in one page" here. What is the page size? 
> 
> -jane
> 
> > then, cleared them all, and wait for 5+ hours, notice the time stamp.
> > BTW, the system is idle otherwise.
> > 
> > [ 2439.742296] mce: Uncorrected hardware memory error in user-access at
> > 1850602400
> > [ 2439.742420] Memory failure: 0x1850602: Sending SIGBUS to
> > fsdax_poison_v1:8457 due to hardware memory corruption
> > [ 2439.761866] Memory failure: 0x1850602: recovery action for dax page:
> > Recovered
> > [ 2439.769949] mce: [Hardware Error]: Machine check events logged
> > -1850603000 uncached-minus<->write-back
> > [ 2439.769984] x86/PAT: memtype_reserve failed [mem
> > 0x1850602000-0x1850602fff], track uncached-minus, req uncached-minus
> > [ 2439.769985] Could not invalidate pfn=0x1850602 from 1:1 map
> > [ 2440.856351] x86/PAT: fsdax_poison_v1:8457 freeing invalid memtype
> > [mem 0x1850602000-0x1850602fff]

This error is reported in PFN=1850602 (at offset 0x400 = 1K)

> > 
> > At this point,
> > # ndctl list -NMu -r 0
> > {
> > "dev":"namespace0.0",
> > "mode":"fsdax",
> > "map":"dev",
> > "size":"15.75 GiB (16.91 GB)",
> > "uuid":"2ccc540a-3c7b-4b91-b87b-9e897ad0b9bb",
> > "sector_size":4096,
> > "align":2097152,
> > "blockdev":"pmem0"
> > }
> > 
> > [21351.992296] {2}[Hardware Error]: Hardware error from APEI Generic
> > Hardware Error Source: 1
> > [21352.001528] {2}[Hardware Error]: event severity: recoverable
> > [21352.007838] {2}[Hardware Error]:  Error 0, type: recoverable
> > [21352.014156] {2}[Hardware Error]:   section_type: memory error
> > [21352.020572] {2}[Hardware Error]:   physical_address: 0x001850603200

This error is in the following page: PFN=1850603 (at offset 0x200 = 512b)

Is that what you mean by "phantom error" ... from a different
address from those that were injected?

-Tony



Re: Phantom PMEM poison issue

2022-01-21 Thread Jane Chu
On 1/21/2022 4:31 PM, Jane Chu wrote:
> On baremetal Intel platform with DCPMEM installed and configured to
> provision daxfs, say a poison was consumed by a load from a user thread,
> and then daxfs takes action and clears the poison, confirmed by "ndctl
> -NM".
> 
> Now, depends on the luck, after sometime(from a few seconds to 5+ hours)
> the ghost of the previous poison will surface, and it takes
> unload/reload the libnvdimm drivers in order to drive the phantom poison
> away, confirmed by ARS.
> 
> Turns out, the issue is quite reproducible with the latest stable Linux.
> 
> Here is the relevant console message after injected 8 poisons in one
> page via
> # ndctl inject-error namespace0.0 -n 2 -B 8210

There is a cut-and-paste error; the above line should be
   "# ndctl inject-error namespace0.0 -n 8 -B 8210"

-jane

> then, cleared them all, and wait for 5+ hours, notice the time stamp.
> BTW, the system is idle otherwise.
> 
> [ 2439.742296] mce: Uncorrected hardware memory error in user-access at
> 1850602400
> [ 2439.742420] Memory failure: 0x1850602: Sending SIGBUS to
> fsdax_poison_v1:8457 due to hardware memory corruption
> [ 2439.761866] Memory failure: 0x1850602: recovery action for dax page:
> Recovered
> [ 2439.769949] mce: [Hardware Error]: Machine check events logged
> -1850603000 uncached-minus<->write-back
> [ 2439.769984] x86/PAT: memtype_reserve failed [mem
> 0x1850602000-0x1850602fff], track uncached-minus, req uncached-minus
> [ 2439.769985] Could not invalidate pfn=0x1850602 from 1:1 map
> [ 2440.856351] x86/PAT: fsdax_poison_v1:8457 freeing invalid memtype
> [mem 0x1850602000-0x1850602fff]
> 
> At this point,
> # ndctl list -NMu -r 0
> {
> "dev":"namespace0.0",
> "mode":"fsdax",
> "map":"dev",
> "size":"15.75 GiB (16.91 GB)",
> "uuid":"2ccc540a-3c7b-4b91-b87b-9e897ad0b9bb",
> "sector_size":4096,
> "align":2097152,
> "blockdev":"pmem0"
> }
> 
> [21351.992296] {2}[Hardware Error]: Hardware error from APEI Generic
> Hardware Error Source: 1
> [21352.001528] {2}[Hardware Error]: event severity: recoverable
> [21352.007838] {2}[Hardware Error]:  Error 0, type: recoverable
> [21352.014156] {2}[Hardware Error]:   section_type: memory error
> [21352.020572] {2}[Hardware Error]:   physical_address: 0x001850603200
> [21352.027958] {2}[Hardware Error]:   physical_address_mask:
> 0xff00
> [21352.035827] {2}[Hardware Error]:   node: 0 module: 1
> [21352.041466] {2}[Hardware Error]:   DIMM location: /SYS/MB/P0 D6
> [21352.048277] Memory failure: 0x1850603: recovery action for dax page:
> Recovered
> [21352.056346] mce: [Hardware Error]: Machine check events logged
> [21352.056890] EDAC skx MC0: HANDLING MCE MEMORY ERROR
> [21352.056892] EDAC skx MC0: CPU 0: Machine Check Event: 0x0 Bank 255:
> 0xbc9f
> [21352.056894] EDAC skx MC0: TSC 0x0
> [21352.056895] EDAC skx MC0: ADDR 0x1850603200
> [21352.056897] EDAC skx MC0: MISC 0x8c
> [21352.056898] EDAC skx MC0: PROCESSOR 0:0x50656 TIME 1642758243 SOCKET
> 0 APIC 0x0
> [21352.056909] EDAC MC0: 1 UE memory read error on
> CPU_SrcID#0_MC#0_Chan#0_DIMM#1 (channel:0 slot:1 page:0x1850603
> offset:0x200 grain:32 -  err_code:0x:0x009f [..]
> 
> And now,
> 
> # ndctl list -NMu -r 0
> {
> "dev":"namespace0.0",
> "mode":"fsdax",
> "map":"dev",
> "size":"15.75 GiB (16.91 GB)",
> "uuid":"2ccc540a-3c7b-4b91-b87b-9e897ad0b9bb",
> "sector_size":4096,
> "align":2097152,
> "blockdev":"pmem0",
> "badblock_count":1,
> "badblocks":[
>   {
> "offset":8217,
> "length":1,
> "dimms":[
>   "nmem0"
> ]
>   }
> ]
> }
> 
> According to my limited research, when ghes_proc_in_irq() is fired to
> report a delayed UE and it calls memory_failure() to take the page out
> and causes driver to record a badblock record, and that's how the
> phantom poison appeared.
> 
> Note, 1 phantom poison for 8 injected poisons, so, not an accurate
> phantom representation.
> 
> But that aside, it seems that the GHES mechanism and the synchronous MCE
> handling is totally at odds with each other, and that cannot be correct.
> 
> What is the right thing to do to fix the issue? Should memory_failure
> handler second-guess the GHES report?  Should the synchronous MCE
> handling mechanism manage to tell the firmware that so-and-so memory UE
> has been cleared and hence clear the record in firmware?  Other ideas?
> 
> 
> Thanks!
> -jane



Phantom PMEM poison issue

2022-01-21 Thread Jane Chu
On a bare-metal Intel platform with DCPMEM installed and configured to
provision daxfs, say a poison is consumed by a load from a user thread,
and then daxfs takes action and clears the poison, as confirmed by "ndctl
-NM".

Now, depending on luck, after some time (from a few seconds to 5+ hours)
the ghost of the previous poison will surface, and it takes unloading and
reloading the libnvdimm drivers to drive the phantom poison away, as
confirmed by ARS.

Turns out, the issue is quite reproducible with the latest stable Linux.

Here are the relevant console messages after injecting 8 poisons in one
page via
   # ndctl inject-error namespace0.0 -n 2 -B 8210
then clearing them all and waiting for 5+ hours; note the timestamps.
BTW, the system is otherwise idle.

[ 2439.742296] mce: Uncorrected hardware memory error in user-access at 
1850602400
[ 2439.742420] Memory failure: 0x1850602: Sending SIGBUS to 
fsdax_poison_v1:8457 due to hardware memory corruption
[ 2439.761866] Memory failure: 0x1850602: recovery action for dax page: 
Recovered
[ 2439.769949] mce: [Hardware Error]: Machine check events logged
-1850603000 uncached-minus<->write-back
[ 2439.769984] x86/PAT: memtype_reserve failed [mem 
0x1850602000-0x1850602fff], track uncached-minus, req uncached-minus
[ 2439.769985] Could not invalidate pfn=0x1850602 from 1:1 map
[ 2440.856351] x86/PAT: fsdax_poison_v1:8457 freeing invalid memtype 
[mem 0x1850602000-0x1850602fff]

At this point,
# ndctl list -NMu -r 0
{
   "dev":"namespace0.0",
   "mode":"fsdax",
   "map":"dev",
   "size":"15.75 GiB (16.91 GB)",
   "uuid":"2ccc540a-3c7b-4b91-b87b-9e897ad0b9bb",
   "sector_size":4096,
   "align":2097152,
   "blockdev":"pmem0"
}

[21351.992296] {2}[Hardware Error]: Hardware error from APEI Generic 
Hardware Error Source: 1
[21352.001528] {2}[Hardware Error]: event severity: recoverable
[21352.007838] {2}[Hardware Error]:  Error 0, type: recoverable
[21352.014156] {2}[Hardware Error]:   section_type: memory error
[21352.020572] {2}[Hardware Error]:   physical_address: 0x001850603200
[21352.027958] {2}[Hardware Error]:   physical_address_mask: 
0xff00
[21352.035827] {2}[Hardware Error]:   node: 0 module: 1
[21352.041466] {2}[Hardware Error]:   DIMM location: /SYS/MB/P0 D6
[21352.048277] Memory failure: 0x1850603: recovery action for dax page: 
Recovered
[21352.056346] mce: [Hardware Error]: Machine check events logged
[21352.056890] EDAC skx MC0: HANDLING MCE MEMORY ERROR
[21352.056892] EDAC skx MC0: CPU 0: Machine Check Event: 0x0 Bank 255: 
0xbc9f
[21352.056894] EDAC skx MC0: TSC 0x0
[21352.056895] EDAC skx MC0: ADDR 0x1850603200
[21352.056897] EDAC skx MC0: MISC 0x8c
[21352.056898] EDAC skx MC0: PROCESSOR 0:0x50656 TIME 1642758243 SOCKET 
0 APIC 0x0
[21352.056909] EDAC MC0: 1 UE memory read error on 
CPU_SrcID#0_MC#0_Chan#0_DIMM#1 (channel:0 slot:1 page:0x1850603 
offset:0x200 grain:32 -  err_code:0x:0x009f [..]

And now,

# ndctl list -NMu -r 0
{
   "dev":"namespace0.0",
   "mode":"fsdax",
   "map":"dev",
   "size":"15.75 GiB (16.91 GB)",
   "uuid":"2ccc540a-3c7b-4b91-b87b-9e897ad0b9bb",
   "sector_size":4096,
   "align":2097152,
   "blockdev":"pmem0",
   "badblock_count":1,
   "badblocks":[
 {
   "offset":8217,
   "length":1,
   "dimms":[
 "nmem0"
   ]
 }
   ]
}

According to my limited research, ghes_proc_in_irq() fires to report a
delayed UE and calls memory_failure() to take the page out, which causes
the driver to record a badblock; that is how the phantom poison
appeared.

Note, 1 phantom poison for 8 injected poisons, so, not an accurate 
phantom representation.
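
For reference, the block numbers can be cross-checked with a small Python
sketch. It rests on assumptions that are not confirmed in this thread: that
"-B" and "-n" count 512-byte blocks, that the corrected "-n 8" injection
covers consecutive blocks starting at 8210, that pages are 4 KiB, and that
block 0 of the namespace is page-aligned. The helper names are made up for
illustration.

```python
SECTOR = 512   # assumption: ndctl block offsets are in 512-byte units
PAGE = 4096    # assumption: 4 KiB base pages


def injected_blocks(start, count):
    """Blocks covered by an injection of `count` blocks at `start`."""
    return range(start, start + count)


def page_and_offset(block):
    """Namespace-relative page index and in-page byte offset of a block."""
    byte = block * SECTOR
    return byte // PAGE, byte % PAGE


blocks = injected_blocks(8210, 8)   # the corrected "-n 8 -B 8210"
phantom = 8217                      # "offset" in the badblocks list

print(phantom in blocks)            # True
print(page_and_offset(8210))        # (1026, 1024)
print(page_and_offset(8217))        # (1027, 512)
```

Under those assumptions the in-page offsets come out as 0x400 and 0x200,
the same offsets as in the two logged addresses, and the two blocks fall in
adjacent pages.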

But that aside, it seems that the GHES mechanism and the synchronous MCE
handling are totally at odds with each other, and that cannot be correct.

What is the right thing to do to fix the issue? Should the memory_failure
handler second-guess the GHES report?  Should the synchronous MCE
handling mechanism manage to tell the firmware that so-and-so memory UE
has been cleared and hence clear the record in firmware?  Other ideas?


Thanks!
-jane


Re: [PATCH 1/5] mm: rmap: fix cache flush on THP pages

2022-01-21 Thread Yang Shi
On Thu, Jan 20, 2022 at 11:56 PM Muchun Song  wrote:
>
> The flush_cache_page() only remove a PAGE_SIZE sized range from the cache.
> However, it does not cover the full pages in a THP except a head page.
> Replace it with flush_cache_range() to fix this issue. At least, no
> problems were found due to this. Maybe because the architectures that
> have virtual indexed caches is less.

Yeah, actually flush_cache_page()/flush_cache_range() are no-ops on
most architectures that support THP, e.g. x86, aarch64,
powerpc, etc.

And currently only tmpfs and read-only files support PMD-mapped THP,
but neither has to do writeback. And it seems DAX doesn't do
writeback either; it uses __set_page_dirty_no_writeback() for
set_page_dirty. So this code should never be called, IIUC.

But anyway your fix looks correct to me. Reviewed-by: Yang Shi
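
To illustrate the size difference the patch addresses, here is a minimal
Python sketch. It only models the address ranges; it does not model the
kernel functions themselves. The sizes (4 KiB PAGE_SIZE, 2 MiB
HPAGE_PMD_SIZE) and the example address are assumptions typical of x86-64,
and the helper names are made up.

```python
PAGE_SIZE = 4096                     # assumption: 4 KiB base page
HPAGE_PMD_SIZE = 2 * 1024 * 1024     # assumption: 2 MiB PMD-mapped THP


def range_flush_cache_page(address):
    """Range covered by a single-page flush starting at `address`."""
    return address, address + PAGE_SIZE


def range_flush_cache_range(address):
    """Range covered by flushing [address, address + HPAGE_PMD_SIZE)."""
    return address, address + HPAGE_PMD_SIZE


def base_pages(rng):
    """Number of base pages inside a (start, end) range."""
    start, end = rng
    return (end - start) // PAGE_SIZE


addr = 0x7F0000200000  # hypothetical 2 MiB-aligned user address

print(base_pages(range_flush_cache_page(addr)))    # 1
print(base_pages(range_flush_cache_range(addr)))   # 512
```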


>
> Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use 
> page_vma_mapped_walk()")
> Signed-off-by: Muchun Song 
> ---
>  mm/rmap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index b0fd9dc19eba..65670cb805d6 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -974,7 +974,7 @@ static bool page_mkclean_one(struct page *page, struct 
> vm_area_struct *vma,
> if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
> continue;
>
> -   flush_cache_page(vma, address, page_to_pfn(page));
> +   flush_cache_range(vma, address, address + 
> HPAGE_PMD_SIZE);
> entry = pmdp_invalidate(vma, address, pmd);
> entry = pmd_wrprotect(entry);
> entry = pmd_mkclean(entry);
> --
> 2.11.0
>



Re: [PATCH v9 10/10] fsdax: set a CoW flag when associate reflink mappings

2022-01-21 Thread Shiyang Ruan




On 2022/1/21 15:16, Christoph Hellwig wrote:
> On Fri, Jan 21, 2022 at 10:33:58AM +0800, Shiyang Ruan wrote:
>>> But different question, how does this not conflict with:
>>>
>>> #define PAGE_MAPPING_ANON   0x1
>>>
>>> in page-flags.h?
>>
>> Now we are treating dax pages, so I think its flags should be different from
>> normal pages.  In other words, PAGE_MAPPING_ANON is a flag of the rmap
>> mechanism for normal pages; it doesn't work for dax pages.  And now, we have
>> dax rmap for dax pages.  So, I think these two kinds of flags are supposed to
>> be used in different mechanisms and won't conflict.
>
> It just needs someone to use folio_test_anon in a place where a DAX
> folio can be passed.  This probably should not happen, but we need to
> clearly document that.
>
>>> Either way I think this flag should move to page-flags.h and be
>>> integrated with the PAGE_MAPPING_FLAGS infrastructure.
>>
>> And that's why I keep them in this dax.c file.
>
> But that does not integrate it with the infrastructure.  For people
> to debug things it needs to be next to PAGE_MAPPING_ANON and have
> documentation explaining why they are exclusive.

Ok, understood.


--
Thanks,
Ruan.