Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache
On 08/11/12 08:09, Andi Kleen wrote: > Naoya Horiguchi writes: > >> Current memory error handling on dirty pagecache has a bug that user >> processes who use corrupted pages via read() or write() can't be aware >> of the memory error and result in discarding dirty data silently. >> >> The following patch is to improve handling/reporting memory errors on >> this case, but as a short term solution I suggest that we should undo >> the present error handling code and just leave errors for such cases >> (which expect the 2nd MCE to panic the system) to ensure data consistency. > > Not sure that's the right approach. It's not worse than any other IO > errors isn't it? IMO, it's worse in certain cases. For example, producer-consumer type program which uses file as a temporary storage. Current memory-failure.c drops produced data from dirty pagecache and allows reader to consume old or empty data from disk (silently!), that's what I think HWPOISON should prevent. Similar thing could happen theoretically with disk I/O errors, though, practically those errors are often persistent and reader will likely get errors again instead of bad data. Also, ext3/ext4 has an option to panic when an error is detected, for people who want to avoid corruption on intermittent errors. -- Jun'ichi Nomura, NEC Corporation -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache
On 08/11/12 08:09, Andi Kleen wrote: Naoya Horiguchi n-horigu...@ah.jp.nec.com writes: Current memory error handling on dirty pagecache has a bug that user processes who use corrupted pages via read() or write() can't be aware of the memory error and result in discarding dirty data silently. The following patch is to improve handling/reporting memory errors on this case, but as a short term solution I suggest that we should undo the present error handling code and just leave errors for such cases (which expect the 2nd MCE to panic the system) to ensure data consistency. Not sure that's the right approach. It's not worse than any other IO errors isn't it? IMO, it's worse in certain cases. For example, producer-consumer type program which uses file as a temporary storage. Current memory-failure.c drops produced data from dirty pagecache and allows reader to consume old or empty data from disk (silently!), that's what I think HWPOISON should prevent. Similar thing could happen theoretically with disk I/O errors, though, practically those errors are often persistent and reader will likely get errors again instead of bad data. Also, ext3/ext4 has an option to panic when an error is detected, for people who want to avoid corruption on intermittent errors. -- Jun'ichi Nomura, NEC Corporation -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache
Hi Andi, On Fri, Aug 10, 2012 at 04:09:48PM -0700, Andi Kleen wrote: > Naoya Horiguchi writes: > > > Current memory error handling on dirty pagecache has a bug that user > > processes who use corrupted pages via read() or write() can't be aware > > of the memory error and result in discarding dirty data silently. > > > > The following patch is to improve handling/reporting memory errors on > > this case, but as a short term solution I suggest that we should undo > > the present error handling code and just leave errors for such cases > > (which expect the 2nd MCE to panic the system) to ensure data consistency. > > Not sure that's the right approach. It's not worse than any other IO > errors isn't it? Right, in current situation both memory errors and other IO errors have the possibility of data lost in the same manner. I thought that in real mission critical system (for which I think HWPOISON feature is targeted) closing dangerous path is better than keeping waiting for someone to solve the problem in more generic manner. But if we start with Fengguang's approach at first as you replied to patch 3, this patch is not necessary. Naoya -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache
Naoya Horiguchi writes: > Current memory error handling on dirty pagecache has a bug that user > processes who use corrupted pages via read() or write() can't be aware > of the memory error and result in discarding dirty data silently. > > The following patch is to improve handling/reporting memory errors on > this case, but as a short term solution I suggest that we should undo > the present error handling code and just leave errors for such cases > (which expect the 2nd MCE to panic the system) to ensure data consistency. Not sure that's the right approach. It's not worse than any other IO errors isn't it? -Andi -- a...@linux.intel.com -- Speaking for myself only -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache
Current memory error handling on dirty pagecache has a bug that user processes who use corrupted pages via read() or write() can't be aware of the memory error and result in discarding dirty data silently. The following patch is to improve handling/reporting memory errors on this case, but as a short term solution I suggest that we should undo the present error handling code and just leave errors for such cases (which expect the 2nd MCE to panic the system) to ensure data consistency. Signed-off-by: Naoya Horiguchi Cc: sta...@vger.kernel.org --- mm/memory-failure.c | 54 +++-- 1 file changed, 11 insertions(+), 43 deletions(-) diff --git v3.6-rc1.orig/mm/memory-failure.c v3.6-rc1/mm/memory-failure.c index 79dfb2f..7e62797 100644 --- v3.6-rc1.orig/mm/memory-failure.c +++ v3.6-rc1/mm/memory-failure.c @@ -613,49 +613,17 @@ static int me_pagecache_clean(struct page *p, unsigned long pfn) */ static int me_pagecache_dirty(struct page *p, unsigned long pfn) { - struct address_space *mapping = page_mapping(p); - - SetPageError(p); - /* TBD: print more information about the file. */ - if (mapping) { - /* -* IO error will be reported by write(), fsync(), etc. -* who check the mapping. -* This way the application knows that something went -* wrong with its dirty file data. -* -* There's one open issue: -* -* The EIO will be only reported on the next IO -* operation and then cleared through the IO map. -* Normally Linux has two mechanisms to pass IO error -* first through the AS_EIO flag in the address space -* and then through the PageError flag in the page. -* Since we drop pages on memory failure handling the -* only mechanism open to use is through AS_AIO. -* -* This has the disadvantage that it gets cleared on -* the first operation that returns an error, while -* the PageError bit is more sticky and only cleared -* when the page is reread or dropped. If an -* application assumes it will always get error on -* fsync, but does other operations on the fd before -* and the page is dropped between then the error -* will not be properly reported. -* -* This can already happen even without hwpoisoned -* pages: first on metadata IO errors (which only -* report through AS_EIO) or when the page is dropped -* at the wrong time. -* -* So right now we assume that the application DTRT on -* the first EIO, but we're not worse than other parts -* of the kernel. -*/ - mapping_set_error(mapping, EIO); - } - - return me_pagecache_clean(p, pfn); + /* +* The original memory error handling on dirty pagecache has +* a bug that user processes who use corrupted pages via read() +* or write() can't be aware of the memory error and result +* in throwing out dirty data silently. +* +* Until we solve the problem, let's close the path of memory +* error handling for dirty pagecache. We just leave errors +* for the 2nd MCE to trigger panics. +*/ + return IGNORED; } /* -- 1.7.11.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache
Current memory error handling on dirty pagecache has a bug that user processes who use corrupted pages via read() or write() can't be aware of the memory error and result in discarding dirty data silently. The following patch is to improve handling/reporting memory errors on this case, but as a short term solution I suggest that we should undo the present error handling code and just leave errors for such cases (which expect the 2nd MCE to panic the system) to ensure data consistency. Signed-off-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com Cc: sta...@vger.kernel.org --- mm/memory-failure.c | 54 +++-- 1 file changed, 11 insertions(+), 43 deletions(-) diff --git v3.6-rc1.orig/mm/memory-failure.c v3.6-rc1/mm/memory-failure.c index 79dfb2f..7e62797 100644 --- v3.6-rc1.orig/mm/memory-failure.c +++ v3.6-rc1/mm/memory-failure.c @@ -613,49 +613,17 @@ static int me_pagecache_clean(struct page *p, unsigned long pfn) */ static int me_pagecache_dirty(struct page *p, unsigned long pfn) { - struct address_space *mapping = page_mapping(p); - - SetPageError(p); - /* TBD: print more information about the file. */ - if (mapping) { - /* -* IO error will be reported by write(), fsync(), etc. -* who check the mapping. -* This way the application knows that something went -* wrong with its dirty file data. -* -* There's one open issue: -* -* The EIO will be only reported on the next IO -* operation and then cleared through the IO map. -* Normally Linux has two mechanisms to pass IO error -* first through the AS_EIO flag in the address space -* and then through the PageError flag in the page. -* Since we drop pages on memory failure handling the -* only mechanism open to use is through AS_AIO. -* -* This has the disadvantage that it gets cleared on -* the first operation that returns an error, while -* the PageError bit is more sticky and only cleared -* when the page is reread or dropped. If an -* application assumes it will always get error on -* fsync, but does other operations on the fd before -* and the page is dropped between then the error -* will not be properly reported. -* -* This can already happen even without hwpoisoned -* pages: first on metadata IO errors (which only -* report through AS_EIO) or when the page is dropped -* at the wrong time. -* -* So right now we assume that the application DTRT on -* the first EIO, but we're not worse than other parts -* of the kernel. -*/ - mapping_set_error(mapping, EIO); - } - - return me_pagecache_clean(p, pfn); + /* +* The original memory error handling on dirty pagecache has +* a bug that user processes who use corrupted pages via read() +* or write() can't be aware of the memory error and result +* in throwing out dirty data silently. +* +* Until we solve the problem, let's close the path of memory +* error handling for dirty pagecache. We just leave errors +* for the 2nd MCE to trigger panics. +*/ + return IGNORED; } /* -- 1.7.11.2 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache
Naoya Horiguchi n-horigu...@ah.jp.nec.com writes: Current memory error handling on dirty pagecache has a bug that user processes who use corrupted pages via read() or write() can't be aware of the memory error and result in discarding dirty data silently. The following patch is to improve handling/reporting memory errors on this case, but as a short term solution I suggest that we should undo the present error handling code and just leave errors for such cases (which expect the 2nd MCE to panic the system) to ensure data consistency. Not sure that's the right approach. It's not worse than any other IO errors isn't it? -Andi -- a...@linux.intel.com -- Speaking for myself only -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache
Hi Andi, On Fri, Aug 10, 2012 at 04:09:48PM -0700, Andi Kleen wrote: Naoya Horiguchi n-horigu...@ah.jp.nec.com writes: Current memory error handling on dirty pagecache has a bug that user processes who use corrupted pages via read() or write() can't be aware of the memory error and result in discarding dirty data silently. The following patch is to improve handling/reporting memory errors on this case, but as a short term solution I suggest that we should undo the present error handling code and just leave errors for such cases (which expect the 2nd MCE to panic the system) to ensure data consistency. Not sure that's the right approach. It's not worse than any other IO errors isn't it? Right, in current situation both memory errors and other IO errors have the possibility of data lost in the same manner. I thought that in real mission critical system (for which I think HWPOISON feature is targeted) closing dangerous path is better than keeping waiting for someone to solve the problem in more generic manner. But if we start with Fengguang's approach at first as you replied to patch 3, this patch is not necessary. Naoya -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/