Re: [PATCH] memory_failure: remove redundant check for the PG_HWPoison flag of 'hpage'

2015-07-31 Thread Naoya Horiguchi
On Thu, Jul 30, 2015 at 10:52:46AM +0800, Wang Xiaoqiang wrote:
...
> In your example, the 100th subage enter the memory
> error handler firstly, and then it uses the 
> set_page_hwpoison_huge_page to set all subpages
> with PG_HWPoison flag, the 50th page handler waits
> for grab the lock_page(hpage) now. 
> 
> When the 100th page handler unlock the 'hpage', 
> the 50th grab it, and now the 'hapge' has been 
> set with PG_HWPosison. So PageHWPoison micro 
> will return true, and the following code will
> be executed:
> 
> if (PageHWPoison(hpage)) {
> if ((hwpoison_filter(p) && TestClearPageHWPoison(p))
> || (p != hpage && TestSetPageHWPoison(hpage))) {
> atomic_long_sub(nr_pages, _poisoned_pages);
> unlock_page(hpage);
> return 0;
> }   
> }
> 
> Now 'p' is 50th subpage, it doesn't equal the 
> 'hpage' obviously, so if we don't have TestSetPageHWPoison
> here, it still will ignore the 50th error.

Ah, you're right, thanks for the explanation, Xiaoqiang!

Acked-by: Naoya Horiguchi --
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] memory_failure: remove redundant check for the PG_HWPoison flag of 'hpage'

2015-07-31 Thread Naoya Horiguchi
On Thu, Jul 30, 2015 at 10:52:46AM +0800, Wang Xiaoqiang wrote:
...
 In your example, the 100th subage enter the memory
 error handler firstly, and then it uses the 
 set_page_hwpoison_huge_page to set all subpages
 with PG_HWPoison flag, the 50th page handler waits
 for grab the lock_page(hpage) now. 
 
 When the 100th page handler unlock the 'hpage', 
 the 50th grab it, and now the 'hapge' has been 
 set with PG_HWPosison. So PageHWPoison micro 
 will return true, and the following code will
 be executed:
 
 if (PageHWPoison(hpage)) {
 if ((hwpoison_filter(p)  TestClearPageHWPoison(p))
 || (p != hpage  TestSetPageHWPoison(hpage))) {
 atomic_long_sub(nr_pages, num_poisoned_pages);
 unlock_page(hpage);
 return 0;
 }   
 }
 
 Now 'p' is 50th subpage, it doesn't equal the 
 'hpage' obviously, so if we don't have TestSetPageHWPoison
 here, it still will ignore the 50th error.

Ah, you're right, thanks for the explanation, Xiaoqiang!

Acked-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] memory_failure: remove redundant check for the PG_HWPoison flag of 'hpage'

2015-07-29 Thread Wang Xiaoqiang
On Wed, 29 Jul 2015 09:17:32 +
Naoya Horiguchi  wrote:

> # CC:ed linux-mm
> 
> Hi Xiaoqiang,
> 
> On Wed, Jul 29, 2015 at 03:52:46PM +0800, Wang Xiaoqiang wrote:
> > Hi,
> > 
> > I find a little problem in the memory_failure function in
> > mm/memory-failure.c . Please check it.
> > 
> > memory_failure: remove redundant check for the PG_HWPoison flag of
> > `hpage'.
> > 
> > Since we have check the PG_HWPoison flag by `PageHWPoison' before,
> > so the later check by `TestSetPageHWPoison' must return true, there
> > is no need to check again!
> 
> I'm afraid that this TestSetPageHWPoison is not redundant, because
> this code serializes the concurrent memory error events over the same
> hugetlb page (, where 'p' indicates the 4kB error page and 'hpage'
> indicates the head page.)
> 
> When an error hits a hugetlb page, set_page_hwpoison_huge_page() sets
> PageHWPoison flags over all subpages of the hugetlb page in the
> ascending order of pfn. So if we don't have this TestSet, memory
> error handler can run more than once on concurrent errors when the
> 1st memory error hits (for example) the 100th subpage and the 2nd
> memory error hits (for example) the 50th subpage.

In your example, the 100th subage enter the memory
error handler firstly, and then it uses the 
set_page_hwpoison_huge_page to set all subpages
with PG_HWPoison flag, the 50th page handler waits
for grab the lock_page(hpage) now. 

When the 100th page handler unlock the 'hpage', 
the 50th grab it, and now the 'hapge' has been 
set with PG_HWPosison. So PageHWPoison micro 
will return true, and the following code will
be executed:

if (PageHWPoison(hpage)) {
if ((hwpoison_filter(p) && TestClearPageHWPoison(p))
|| (p != hpage && TestSetPageHWPoison(hpage))) {
atomic_long_sub(nr_pages, _poisoned_pages);
unlock_page(hpage);
return 0;
}   
}

Now 'p' is 50th subpage, it doesn't equal the 
'hpage' obviously, so if we don't have TestSetPageHWPoison
here, it still will ignore the 50th error.
Why the memory error handler can run more than once?
Hope to receive from you!

thx,
Wang Xiaoqiang


> 
> Thanks,
> Naoya Horiguchi
> 
> > Signed-off-by: Wang Xiaoqiang 
> > ---
> >  mm/memory-failure.c |2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 1cf7f29..7794fd8 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1115,7 +1115,7 @@ int memory_failure(unsigned long pfn, int
> > trapno, int flags) lock_page(hpage);
> > if (PageHWPoison(hpage)) {
> > if ((hwpoison_filter(p) &&
> > TestClearPageHWPoison(p))
> > -   || (p != hpage &&
> > TestSetPageHWPoison(hpage))) {
> > +   || p != hpage) {
> > atomic_long_sub(nr_pages,
> > _poisoned_pages); unlock_page(hpage);
> > return 0;
> > -- 
> > 1.7.10.4
> > 
> > 
> > 
> > --
> > thx!
> > Wang Xiaoqiang
> > 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] memory_failure: remove redundant check for the PG_HWPoison flag of 'hpage'

2015-07-29 Thread Naoya Horiguchi
# CC:ed linux-mm

Hi Xiaoqiang,

On Wed, Jul 29, 2015 at 03:52:46PM +0800, Wang Xiaoqiang wrote:
> Hi,
> 
> I find a little problem in the memory_failure function in
> mm/memory-failure.c . Please check it.
> 
> memory_failure: remove redundant check for the PG_HWPoison flag of
> `hpage'.
> 
> Since we have check the PG_HWPoison flag by `PageHWPoison' before,
> so the later check by `TestSetPageHWPoison' must return true, there
> is no need to check again!

I'm afraid that this TestSetPageHWPoison is not redundant, because this code
serializes the concurrent memory error events over the same hugetlb page
(, where 'p' indicates the 4kB error page and 'hpage' indicates the head page.)

When an error hits a hugetlb page, set_page_hwpoison_huge_page() sets
PageHWPoison flags over all subpages of the hugetlb page in the ascending
order of pfn. So if we don't have this TestSet, memory error handler can
run more than once on concurrent errors when the 1st memory error hits
(for example) the 100th subpage and the 2nd memory error hits (for example)
the 50th subpage.

Thanks,
Naoya Horiguchi

> Signed-off-by: Wang Xiaoqiang 
> ---
>  mm/memory-failure.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 1cf7f29..7794fd8 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1115,7 +1115,7 @@ int memory_failure(unsigned long pfn, int trapno, int 
> flags)
>   lock_page(hpage);
>   if (PageHWPoison(hpage)) {
>   if ((hwpoison_filter(p) && 
> TestClearPageHWPoison(p))
> - || (p != hpage && 
> TestSetPageHWPoison(hpage))) {
> + || p != hpage) {
>   atomic_long_sub(nr_pages, 
> _poisoned_pages);
>   unlock_page(hpage);
>   return 0;
> -- 
> 1.7.10.4
> 
> 
> 
> --
> thx!
> Wang Xiaoqiang
> --
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] memory_failure: remove redundant check for the PG_HWPoison flag of 'hpage'

2015-07-29 Thread Wang Xiaoqiang
Hi,

I find a little problem in the memory_failure function in
mm/memory-failure.c . Please check it.

memory_failure: remove redundant check for the PG_HWPoison flag of
`hpage'.

Since we have check the PG_HWPoison flag by `PageHWPoison' before,
so the later check by `TestSetPageHWPoison' must return true, there
is no need to check again!

Signed-off-by: Wang Xiaoqiang 
---
 mm/memory-failure.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 1cf7f29..7794fd8 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1115,7 +1115,7 @@ int memory_failure(unsigned long pfn, int trapno, int 
flags)
lock_page(hpage);
if (PageHWPoison(hpage)) {
if ((hwpoison_filter(p) && 
TestClearPageHWPoison(p))
-   || (p != hpage && 
TestSetPageHWPoison(hpage))) {
+   || p != hpage) {
atomic_long_sub(nr_pages, 
_poisoned_pages);
unlock_page(hpage);
return 0;
-- 
1.7.10.4



--
thx!
Wang Xiaoqiang

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] memory_failure: remove redundant check for the PG_HWPoison flag of 'hpage'

2015-07-29 Thread Wang Xiaoqiang
On Wed, 29 Jul 2015 09:17:32 +
Naoya Horiguchi n-horigu...@ah.jp.nec.com wrote:

 # CC:ed linux-mm
 
 Hi Xiaoqiang,
 
 On Wed, Jul 29, 2015 at 03:52:46PM +0800, Wang Xiaoqiang wrote:
  Hi,
  
  I find a little problem in the memory_failure function in
  mm/memory-failure.c . Please check it.
  
  memory_failure: remove redundant check for the PG_HWPoison flag of
  `hpage'.
  
  Since we have check the PG_HWPoison flag by `PageHWPoison' before,
  so the later check by `TestSetPageHWPoison' must return true, there
  is no need to check again!
 
 I'm afraid that this TestSetPageHWPoison is not redundant, because
 this code serializes the concurrent memory error events over the same
 hugetlb page (, where 'p' indicates the 4kB error page and 'hpage'
 indicates the head page.)
 
 When an error hits a hugetlb page, set_page_hwpoison_huge_page() sets
 PageHWPoison flags over all subpages of the hugetlb page in the
 ascending order of pfn. So if we don't have this TestSet, memory
 error handler can run more than once on concurrent errors when the
 1st memory error hits (for example) the 100th subpage and the 2nd
 memory error hits (for example) the 50th subpage.

In your example, the 100th subage enter the memory
error handler firstly, and then it uses the 
set_page_hwpoison_huge_page to set all subpages
with PG_HWPoison flag, the 50th page handler waits
for grab the lock_page(hpage) now. 

When the 100th page handler unlock the 'hpage', 
the 50th grab it, and now the 'hapge' has been 
set with PG_HWPosison. So PageHWPoison micro 
will return true, and the following code will
be executed:

if (PageHWPoison(hpage)) {
if ((hwpoison_filter(p)  TestClearPageHWPoison(p))
|| (p != hpage  TestSetPageHWPoison(hpage))) {
atomic_long_sub(nr_pages, num_poisoned_pages);
unlock_page(hpage);
return 0;
}   
}

Now 'p' is 50th subpage, it doesn't equal the 
'hpage' obviously, so if we don't have TestSetPageHWPoison
here, it still will ignore the 50th error.
Why the memory error handler can run more than once?
Hope to receive from you!

thx,
Wang Xiaoqiang


 
 Thanks,
 Naoya Horiguchi
 
  Signed-off-by: Wang Xiaoqiang wangx...@lzu.edu.cn
  ---
   mm/memory-failure.c |2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
  
  diff --git a/mm/memory-failure.c b/mm/memory-failure.c
  index 1cf7f29..7794fd8 100644
  --- a/mm/memory-failure.c
  +++ b/mm/memory-failure.c
  @@ -1115,7 +1115,7 @@ int memory_failure(unsigned long pfn, int
  trapno, int flags) lock_page(hpage);
  if (PageHWPoison(hpage)) {
  if ((hwpoison_filter(p) 
  TestClearPageHWPoison(p))
  -   || (p != hpage 
  TestSetPageHWPoison(hpage))) {
  +   || p != hpage) {
  atomic_long_sub(nr_pages,
  num_poisoned_pages); unlock_page(hpage);
  return 0;
  -- 
  1.7.10.4
  
  
  
  --
  thx!
  Wang Xiaoqiang
  

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] memory_failure: remove redundant check for the PG_HWPoison flag of 'hpage'

2015-07-29 Thread Wang Xiaoqiang
Hi,

I find a little problem in the memory_failure function in
mm/memory-failure.c . Please check it.

memory_failure: remove redundant check for the PG_HWPoison flag of
`hpage'.

Since we have check the PG_HWPoison flag by `PageHWPoison' before,
so the later check by `TestSetPageHWPoison' must return true, there
is no need to check again!

Signed-off-by: Wang Xiaoqiang wangx...@lzu.edu.cn
---
 mm/memory-failure.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 1cf7f29..7794fd8 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1115,7 +1115,7 @@ int memory_failure(unsigned long pfn, int trapno, int 
flags)
lock_page(hpage);
if (PageHWPoison(hpage)) {
if ((hwpoison_filter(p)  
TestClearPageHWPoison(p))
-   || (p != hpage  
TestSetPageHWPoison(hpage))) {
+   || p != hpage) {
atomic_long_sub(nr_pages, 
num_poisoned_pages);
unlock_page(hpage);
return 0;
-- 
1.7.10.4



--
thx!
Wang Xiaoqiang

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] memory_failure: remove redundant check for the PG_HWPoison flag of 'hpage'

2015-07-29 Thread Naoya Horiguchi
# CC:ed linux-mm

Hi Xiaoqiang,

On Wed, Jul 29, 2015 at 03:52:46PM +0800, Wang Xiaoqiang wrote:
 Hi,
 
 I find a little problem in the memory_failure function in
 mm/memory-failure.c . Please check it.
 
 memory_failure: remove redundant check for the PG_HWPoison flag of
 `hpage'.
 
 Since we have check the PG_HWPoison flag by `PageHWPoison' before,
 so the later check by `TestSetPageHWPoison' must return true, there
 is no need to check again!

I'm afraid that this TestSetPageHWPoison is not redundant, because this code
serializes the concurrent memory error events over the same hugetlb page
(, where 'p' indicates the 4kB error page and 'hpage' indicates the head page.)

When an error hits a hugetlb page, set_page_hwpoison_huge_page() sets
PageHWPoison flags over all subpages of the hugetlb page in the ascending
order of pfn. So if we don't have this TestSet, memory error handler can
run more than once on concurrent errors when the 1st memory error hits
(for example) the 100th subpage and the 2nd memory error hits (for example)
the 50th subpage.

Thanks,
Naoya Horiguchi

 Signed-off-by: Wang Xiaoqiang wangx...@lzu.edu.cn
 ---
  mm/memory-failure.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/mm/memory-failure.c b/mm/memory-failure.c
 index 1cf7f29..7794fd8 100644
 --- a/mm/memory-failure.c
 +++ b/mm/memory-failure.c
 @@ -1115,7 +1115,7 @@ int memory_failure(unsigned long pfn, int trapno, int 
 flags)
   lock_page(hpage);
   if (PageHWPoison(hpage)) {
   if ((hwpoison_filter(p)  
 TestClearPageHWPoison(p))
 - || (p != hpage  
 TestSetPageHWPoison(hpage))) {
 + || p != hpage) {
   atomic_long_sub(nr_pages, 
 num_poisoned_pages);
   unlock_page(hpage);
   return 0;
 -- 
 1.7.10.4
 
 
 
 --
 thx!
 Wang Xiaoqiang
 --
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/