Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache

2012-08-13 Thread Jun'ichi Nomura
On 08/11/12 08:09, Andi Kleen wrote:
> Naoya Horiguchi  writes:
> 
>> Current memory error handling on dirty pagecache has a bug that user
>> processes who use corrupted pages via read() or write() can't be aware
>> of the memory error and result in discarding dirty data silently.
>>
>> The following patch is to improve handling/reporting memory errors on
>> this case, but as a short term solution I suggest that we should undo
>> the present error handling code and just leave errors for such cases
>> (which expect the 2nd MCE to panic the system) to ensure data consistency.
> 
> Not sure that's the right approach. It's not worse than any other IO 
> errors isn't it? 

IMO, it's worse in certain cases.  For example, producer-consumer type
program which uses file as a temporary storage.
Current memory-failure.c drops produced data from dirty pagecache
and allows reader to consume old or empty data from disk (silently!),
that's what I think HWPOISON should prevent.

Similar thing could happen theoretically with disk I/O errors,
though, practically those errors are often persistent and reader will
likely get errors again instead of bad data.
Also, ext3/ext4 has an option to panic when an error is detected,
for people who want to avoid corruption on intermittent errors.

-- 
Jun'ichi Nomura, NEC Corporation

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache

2012-08-13 Thread Jun'ichi Nomura
On 08/11/12 08:09, Andi Kleen wrote:
 Naoya Horiguchi n-horigu...@ah.jp.nec.com writes:
 
 Current memory error handling on dirty pagecache has a bug that user
 processes who use corrupted pages via read() or write() can't be aware
 of the memory error and result in discarding dirty data silently.

 The following patch is to improve handling/reporting memory errors on
 this case, but as a short term solution I suggest that we should undo
 the present error handling code and just leave errors for such cases
 (which expect the 2nd MCE to panic the system) to ensure data consistency.
 
 Not sure that's the right approach. It's not worse than any other IO 
 errors isn't it? 

IMO, it's worse in certain cases.  For example, producer-consumer type
program which uses file as a temporary storage.
Current memory-failure.c drops produced data from dirty pagecache
and allows reader to consume old or empty data from disk (silently!),
that's what I think HWPOISON should prevent.

Similar thing could happen theoretically with disk I/O errors,
though, practically those errors are often persistent and reader will
likely get errors again instead of bad data.
Also, ext3/ext4 has an option to panic when an error is detected,
for people who want to avoid corruption on intermittent errors.

-- 
Jun'ichi Nomura, NEC Corporation

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache

2012-08-10 Thread Naoya Horiguchi
Hi Andi,

On Fri, Aug 10, 2012 at 04:09:48PM -0700, Andi Kleen wrote:
> Naoya Horiguchi  writes:
> 
> > Current memory error handling on dirty pagecache has a bug that user
> > processes who use corrupted pages via read() or write() can't be aware
> > of the memory error and result in discarding dirty data silently.
> >
> > The following patch is to improve handling/reporting memory errors on
> > this case, but as a short term solution I suggest that we should undo
> > the present error handling code and just leave errors for such cases
> > (which expect the 2nd MCE to panic the system) to ensure data consistency.
> 
> Not sure that's the right approach. It's not worse than any other IO 
> errors isn't it? 

Right, in current situation both memory errors and other IO errors have
the possibility of data lost in the same manner.
I thought that in real mission critical system (for which I think
HWPOISON feature is targeted) closing dangerous path is better than
keeping waiting for someone to solve the problem in more generic manner.

But if we start with Fengguang's approach at first as you replied to
patch 3, this patch is not necessary.

Naoya
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache

2012-08-10 Thread Andi Kleen
Naoya Horiguchi  writes:

> Current memory error handling on dirty pagecache has a bug that user
> processes who use corrupted pages via read() or write() can't be aware
> of the memory error and result in discarding dirty data silently.
>
> The following patch is to improve handling/reporting memory errors on
> this case, but as a short term solution I suggest that we should undo
> the present error handling code and just leave errors for such cases
> (which expect the 2nd MCE to panic the system) to ensure data consistency.

Not sure that's the right approach. It's not worse than any other IO 
errors isn't it? 

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache

2012-08-10 Thread Naoya Horiguchi
Current memory error handling on dirty pagecache has a bug that user
processes who use corrupted pages via read() or write() can't be aware
of the memory error and result in discarding dirty data silently.

The following patch is to improve handling/reporting memory errors on
this case, but as a short term solution I suggest that we should undo
the present error handling code and just leave errors for such cases
(which expect the 2nd MCE to panic the system) to ensure data consistency.

Signed-off-by: Naoya Horiguchi 
Cc: sta...@vger.kernel.org
---
 mm/memory-failure.c | 54 +++--
 1 file changed, 11 insertions(+), 43 deletions(-)

diff --git v3.6-rc1.orig/mm/memory-failure.c v3.6-rc1/mm/memory-failure.c
index 79dfb2f..7e62797 100644
--- v3.6-rc1.orig/mm/memory-failure.c
+++ v3.6-rc1/mm/memory-failure.c
@@ -613,49 +613,17 @@ static int me_pagecache_clean(struct page *p, unsigned 
long pfn)
  */
 static int me_pagecache_dirty(struct page *p, unsigned long pfn)
 {
-   struct address_space *mapping = page_mapping(p);
-
-   SetPageError(p);
-   /* TBD: print more information about the file. */
-   if (mapping) {
-   /*
-* IO error will be reported by write(), fsync(), etc.
-* who check the mapping.
-* This way the application knows that something went
-* wrong with its dirty file data.
-*
-* There's one open issue:
-*
-* The EIO will be only reported on the next IO
-* operation and then cleared through the IO map.
-* Normally Linux has two mechanisms to pass IO error
-* first through the AS_EIO flag in the address space
-* and then through the PageError flag in the page.
-* Since we drop pages on memory failure handling the
-* only mechanism open to use is through AS_AIO.
-*
-* This has the disadvantage that it gets cleared on
-* the first operation that returns an error, while
-* the PageError bit is more sticky and only cleared
-* when the page is reread or dropped.  If an
-* application assumes it will always get error on
-* fsync, but does other operations on the fd before
-* and the page is dropped between then the error
-* will not be properly reported.
-*
-* This can already happen even without hwpoisoned
-* pages: first on metadata IO errors (which only
-* report through AS_EIO) or when the page is dropped
-* at the wrong time.
-*
-* So right now we assume that the application DTRT on
-* the first EIO, but we're not worse than other parts
-* of the kernel.
-*/
-   mapping_set_error(mapping, EIO);
-   }
-
-   return me_pagecache_clean(p, pfn);
+   /*
+* The original memory error handling on dirty pagecache has
+* a bug that user processes who use corrupted pages via read()
+* or write() can't be aware of the memory error and result
+* in throwing out dirty data silently.
+*
+* Until we solve the problem, let's close the path of memory
+* error handling for dirty pagecache. We just leave errors
+* for the 2nd MCE to trigger panics.
+*/
+   return IGNORED;
 }
 
 /*
-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache

2012-08-10 Thread Naoya Horiguchi
Current memory error handling on dirty pagecache has a bug that user
processes who use corrupted pages via read() or write() can't be aware
of the memory error and result in discarding dirty data silently.

The following patch is to improve handling/reporting memory errors on
this case, but as a short term solution I suggest that we should undo
the present error handling code and just leave errors for such cases
(which expect the 2nd MCE to panic the system) to ensure data consistency.

Signed-off-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com
Cc: sta...@vger.kernel.org
---
 mm/memory-failure.c | 54 +++--
 1 file changed, 11 insertions(+), 43 deletions(-)

diff --git v3.6-rc1.orig/mm/memory-failure.c v3.6-rc1/mm/memory-failure.c
index 79dfb2f..7e62797 100644
--- v3.6-rc1.orig/mm/memory-failure.c
+++ v3.6-rc1/mm/memory-failure.c
@@ -613,49 +613,17 @@ static int me_pagecache_clean(struct page *p, unsigned 
long pfn)
  */
 static int me_pagecache_dirty(struct page *p, unsigned long pfn)
 {
-   struct address_space *mapping = page_mapping(p);
-
-   SetPageError(p);
-   /* TBD: print more information about the file. */
-   if (mapping) {
-   /*
-* IO error will be reported by write(), fsync(), etc.
-* who check the mapping.
-* This way the application knows that something went
-* wrong with its dirty file data.
-*
-* There's one open issue:
-*
-* The EIO will be only reported on the next IO
-* operation and then cleared through the IO map.
-* Normally Linux has two mechanisms to pass IO error
-* first through the AS_EIO flag in the address space
-* and then through the PageError flag in the page.
-* Since we drop pages on memory failure handling the
-* only mechanism open to use is through AS_AIO.
-*
-* This has the disadvantage that it gets cleared on
-* the first operation that returns an error, while
-* the PageError bit is more sticky and only cleared
-* when the page is reread or dropped.  If an
-* application assumes it will always get error on
-* fsync, but does other operations on the fd before
-* and the page is dropped between then the error
-* will not be properly reported.
-*
-* This can already happen even without hwpoisoned
-* pages: first on metadata IO errors (which only
-* report through AS_EIO) or when the page is dropped
-* at the wrong time.
-*
-* So right now we assume that the application DTRT on
-* the first EIO, but we're not worse than other parts
-* of the kernel.
-*/
-   mapping_set_error(mapping, EIO);
-   }
-
-   return me_pagecache_clean(p, pfn);
+   /*
+* The original memory error handling on dirty pagecache has
+* a bug that user processes who use corrupted pages via read()
+* or write() can't be aware of the memory error and result
+* in throwing out dirty data silently.
+*
+* Until we solve the problem, let's close the path of memory
+* error handling for dirty pagecache. We just leave errors
+* for the 2nd MCE to trigger panics.
+*/
+   return IGNORED;
 }
 
 /*
-- 
1.7.11.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache

2012-08-10 Thread Andi Kleen
Naoya Horiguchi n-horigu...@ah.jp.nec.com writes:

 Current memory error handling on dirty pagecache has a bug that user
 processes who use corrupted pages via read() or write() can't be aware
 of the memory error and result in discarding dirty data silently.

 The following patch is to improve handling/reporting memory errors on
 this case, but as a short term solution I suggest that we should undo
 the present error handling code and just leave errors for such cases
 (which expect the 2nd MCE to panic the system) to ensure data consistency.

Not sure that's the right approach. It's not worse than any other IO 
errors isn't it? 

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache

2012-08-10 Thread Naoya Horiguchi
Hi Andi,

On Fri, Aug 10, 2012 at 04:09:48PM -0700, Andi Kleen wrote:
 Naoya Horiguchi n-horigu...@ah.jp.nec.com writes:
 
  Current memory error handling on dirty pagecache has a bug that user
  processes who use corrupted pages via read() or write() can't be aware
  of the memory error and result in discarding dirty data silently.
 
  The following patch is to improve handling/reporting memory errors on
  this case, but as a short term solution I suggest that we should undo
  the present error handling code and just leave errors for such cases
  (which expect the 2nd MCE to panic the system) to ensure data consistency.
 
 Not sure that's the right approach. It's not worse than any other IO 
 errors isn't it? 

Right, in current situation both memory errors and other IO errors have
the possibility of data lost in the same manner.
I thought that in real mission critical system (for which I think
HWPOISON feature is targeted) closing dangerous path is better than
keeping waiting for someone to solve the problem in more generic manner.

But if we start with Fengguang's approach at first as you replied to
patch 3, this patch is not necessary.

Naoya
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/