Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-30 Thread Dave Chinner
On Mon, Oct 29, 2012 at 06:11:58PM +, Luck, Tony wrote: > > What I would recommend is adding a > > > > #define FS_CORRUPTED_FL 0x0100 /* File is corrupted */ > > > > ... and which could be accessed and cleared via the lsattr and chattr > > programs. > > Good - but we need

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-30 Thread Dave Chinner
On Mon, Oct 29, 2012 at 06:11:58PM +, Luck, Tony wrote: What I would recommend is adding a #define FS_CORRUPTED_FL 0x0100 /* File is corrupted */ ... and which could be accessed and cleared via the lsattr and chattr programs. Good - but we need some space to

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Jun'ichi Nomura
On 10/30/12 04:07, Andi Kleen wrote: > Theodore Ts'o writes: >> Note that the problem that we're dealing with is buffered writes; so >> it's quite possible that the process which wrote the file, thus >> dirtying the page cache, has already exited; so there's no way we can >> guarantee we can

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Naoya Horiguchi
On Mon, Oct 29, 2012 at 12:07:04PM -0700, Andi Kleen wrote: > Theodore Ts'o writes: ... > > Also, if you're going to keep this state in memory, what happens if > > the inode gets pushed out of memory? > > You lose the error, just like you do today with any other IO error. > > We had a lot of

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Luck, Tony
> What I would recommend is adding a > > #define FS_CORRUPTED_FL 0x0100 /* File is corrupted */ > > ... and which could be accessed and cleared via the lsattr and chattr > programs. Good - but we need some space to save the corrupted range information too. These errors should

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Andi Kleen
Theodore Ts'o writes: > > It's actually pretty easy to test this particular one, Note the error can happen at any time. > and certainly > one of the things I'd strongly encourage in this patch series is the > introduction of an interface via madvise It already exists of course. I would

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Jan Kara
On Mon 29-10-12 14:24:56, Ted Tso wrote: > On Mon, Oct 29, 2012 at 03:37:05AM -0700, Andi Kleen wrote: > > > Agreed, if we're going to add an xattr, then we might as well store > > > > I don't think an xattr makes sense for this. It's sufficient to keep > > this state in memory. > > > > In

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Theodore Ts'o
On Mon, Oct 29, 2012 at 03:37:05AM -0700, Andi Kleen wrote: > > Agreed, if we're going to add an xattr, then we might as well store > > I don't think an xattr makes sense for this. It's sufficient to keep > this state in memory. > > In general these error paths are hard to test and it's

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Jun'ichi Nomura
On 10/29/12 19:37, Andi Kleen wrote: > Theodore Ts'o writes: >> On Mon, Oct 29, 2012 at 12:16:32PM +1100, Dave Chinner wrote: >>> Except that there are filesystems that cannot implement such flags, >>> or require on-disk format changes to add more of those flags. This >>> is most definitely not a

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Andi Kleen
Theodore Ts'o writes: > On Mon, Oct 29, 2012 at 12:16:32PM +1100, Dave Chinner wrote: >> >> Except that there are filesystems that cannot implement such flags, >> or require on-disk format changes to add more of those flags. This >> is most definitely not a filesystem specific behaviour, so any

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Naoya Horiguchi
On Mon, Oct 29, 2012 at 12:07:04PM -0700, Andi Kleen wrote: Theodore Ts'o ty...@mit.edu writes: ... Also, if you're going to keep this state in memory, what happens if the inode gets pushed out of memory? You lose the error, just like you do today with any other IO error. We had a lot

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Jun'ichi Nomura
On 10/30/12 04:07, Andi Kleen wrote: Theodore Ts'o ty...@mit.edu writes: Note that the problem that we're dealing with is buffered writes; so it's quite possible that the process which wrote the file, thus dirtying the page cache, has already exited; so there's no way we can guarantee we can

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Andi Kleen
Theodore Ts'o ty...@mit.edu writes: On Mon, Oct 29, 2012 at 12:16:32PM +1100, Dave Chinner wrote: Except that there are filesystems that cannot implement such flags, or require on-disk format changes to add more of those flags. This is most definitely not a filesystem specific behaviour, so

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Jun'ichi Nomura
On 10/29/12 19:37, Andi Kleen wrote: Theodore Ts'o ty...@mit.edu writes: On Mon, Oct 29, 2012 at 12:16:32PM +1100, Dave Chinner wrote: Except that there are filesystems that cannot implement such flags, or require on-disk format changes to add more of those flags. This is most definitely not

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Theodore Ts'o
On Mon, Oct 29, 2012 at 03:37:05AM -0700, Andi Kleen wrote: Agreed, if we're going to add an xattr, then we might as well store I don't think an xattr makes sense for this. It's sufficient to keep this state in memory. In general these error paths are hard to test and it's important to

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Jan Kara
On Mon 29-10-12 14:24:56, Ted Tso wrote: On Mon, Oct 29, 2012 at 03:37:05AM -0700, Andi Kleen wrote: Agreed, if we're going to add an xattr, then we might as well store I don't think an xattr makes sense for this. It's sufficient to keep this state in memory. In general these error

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Andi Kleen
Theodore Ts'o ty...@mit.edu writes: It's actually pretty easy to test this particular one, Note the error can happen at any time. and certainly one of the things I'd strongly encourage in this patch series is the introduction of an interface via madvise It already exists of course. I

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Luck, Tony
What I would recommend is adding a #define FS_CORRUPTED_FL 0x0100 /* File is corrupted */ ... and which could be accessed and cleared via the lsattr and chattr programs. Good - but we need some space to save the corrupted range information too. These errors should be

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-28 Thread Theodore Ts'o
On Mon, Oct 29, 2012 at 12:16:32PM +1100, Dave Chinner wrote: > > Except that there are filesystems that cannot implement such flags, > or require on-disk format changes to add more of those flags. This > is most definitely not a filesystem specific behaviour, so any sort > of VFS level per-file

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-28 Thread Dave Chinner
On Sat, Oct 27, 2012 at 06:16:26PM -0400, Theodore Ts'o wrote: > On Fri, Oct 26, 2012 at 10:24:23PM +, Luck, Tony wrote: > > > Well, we could set a new attribute bit on the file which indicates > > > that the file has been corrupted, and this could cause any attempts to > > > open the file to

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-28 Thread Dave Chinner
On Sat, Oct 27, 2012 at 06:16:26PM -0400, Theodore Ts'o wrote: On Fri, Oct 26, 2012 at 10:24:23PM +, Luck, Tony wrote: Well, we could set a new attribute bit on the file which indicates that the file has been corrupted, and this could cause any attempts to open the file to return

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-28 Thread Theodore Ts'o
On Mon, Oct 29, 2012 at 12:16:32PM +1100, Dave Chinner wrote: Except that there are filesystems that cannot implement such flags, or require on-disk format changes to add more of those flags. This is most definitely not a filesystem specific behaviour, so any sort of VFS level per-file state

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-27 Thread Naoya Horiguchi
Hi Ted, On Sat, Oct 27, 2012 at 06:16:26PM -0400, Theodore Ts'o wrote: > On Fri, Oct 26, 2012 at 10:24:23PM +, Luck, Tony wrote: > > > Well, we could set a new attribute bit on the file which indicates > > > that the file has been corrupted, and this could cause any attempts to > > > open the

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-27 Thread Theodore Ts'o
On Fri, Oct 26, 2012 at 10:24:23PM +, Luck, Tony wrote: > > Well, we could set a new attribute bit on the file which indicates > > that the file has been corrupted, and this could cause any attempts to > > open the file to return some error until the bit has been cleared. > > That sounds a

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-27 Thread Theodore Ts'o
On Fri, Oct 26, 2012 at 10:24:23PM +, Luck, Tony wrote: Well, we could set a new attribute bit on the file which indicates that the file has been corrupted, and this could cause any attempts to open the file to return some error until the bit has been cleared. That sounds a lot better

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-27 Thread Naoya Horiguchi
Hi Ted, On Sat, Oct 27, 2012 at 06:16:26PM -0400, Theodore Ts'o wrote: On Fri, Oct 26, 2012 at 10:24:23PM +, Luck, Tony wrote: Well, we could set a new attribute bit on the file which indicates that the file has been corrupted, and this could cause any attempts to open the file to

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Luck, Tony
> Well, we could set a new attribute bit on the file which indicates > that the file has been corrupted, and this could cause any attempts to > open the file to return some error until the bit has been cleared. That sounds a lot better than renaming/moving the file. > This would persist across

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Naoya Horiguchi
On Fri, Oct 26, 2012 at 02:12:06AM -0400, Theodore Ts'o wrote: > On Thu, Oct 25, 2012 at 11:12:48AM -0400, Naoya Horiguchi wrote: > > + /* Lost data. Handle as critical fs error. */ > > + bh = head = page_buffers(page); > > + do { > > + if (buffer_dirty(bh) && !buffer_delay(bh)) {

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Theodore Ts'o
On Fri, Oct 26, 2012 at 04:55:01PM +, Luck, Tony wrote: > > I think that we know that the file *is* corrupted, not just "potentially". > We probably know the location of the corruption to cache-line granularity. > Perhaps better on systems where we have access to ecc syndrome bits, > perhaps

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Luck, Tony
> If we go back to first principles, what do we want to do? We want the > system administrator to know that a file might be potentially > corrupted. And perhaps, if a program tries to read from that file, it > should get an error. If we have a program that has that file mmap'ed > at the time of

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Theodore Ts'o
On Thu, Oct 25, 2012 at 11:12:48AM -0400, Naoya Horiguchi wrote: > + /* Lost data. Handle as critical fs error. */ > + bh = head = page_buffers(page); > + do { > + if (buffer_dirty(bh) && !buffer_delay(bh)) { > + block = bh->b_blocknr; > +

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Theodore Ts'o
On Thu, Oct 25, 2012 at 11:12:48AM -0400, Naoya Horiguchi wrote: + /* Lost data. Handle as critical fs error. */ + bh = head = page_buffers(page); + do { + if (buffer_dirty(bh) !buffer_delay(bh)) { + block = bh-b_blocknr; +

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Luck, Tony
If we go back to first principles, what do we want to do? We want the system administrator to know that a file might be potentially corrupted. And perhaps, if a program tries to read from that file, it should get an error. If we have a program that has that file mmap'ed at the time of the

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Theodore Ts'o
On Fri, Oct 26, 2012 at 04:55:01PM +, Luck, Tony wrote: I think that we know that the file *is* corrupted, not just potentially. We probably know the location of the corruption to cache-line granularity. Perhaps better on systems where we have access to ecc syndrome bits, perhaps worse

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Naoya Horiguchi
On Fri, Oct 26, 2012 at 02:12:06AM -0400, Theodore Ts'o wrote: On Thu, Oct 25, 2012 at 11:12:48AM -0400, Naoya Horiguchi wrote: + /* Lost data. Handle as critical fs error. */ + bh = head = page_buffers(page); + do { + if (buffer_dirty(bh) !buffer_delay(bh)) { +

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Luck, Tony
Well, we could set a new attribute bit on the file which indicates that the file has been corrupted, and this could cause any attempts to open the file to return some error until the bit has been cleared. That sounds a lot better than renaming/moving the file. This would persist across

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-25 Thread Jan Kara
On Thu 25-10-12 11:12:48, Naoya Horiguchi wrote: > Ext4 has its own configurable error handling policy, so it's helpful > if we can use it also in the context of memory error handling. > With this patch, when we detect a memory error on a dirty pagecache in > ext4 filesystem, we can allow users to

[PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-25 Thread Naoya Horiguchi
Ext4 has its own configurable error handling policy, so it's helpful if we can use it also in the context of memory error handling. With this patch, when we detect a memory error on a dirty pagecache in ext4 filesystem, we can allow users to choose to trigger kernel panic to avoid consuming

[PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-25 Thread Naoya Horiguchi
Ext4 has its own configurable error handling policy, so it's helpful if we can use it also in the context of memory error handling. With this patch, when we detect a memory error on a dirty pagecache in ext4 filesystem, we can allow users to choose to trigger kernel panic to avoid consuming

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-25 Thread Jan Kara
On Thu 25-10-12 11:12:48, Naoya Horiguchi wrote: Ext4 has its own configurable error handling policy, so it's helpful if we can use it also in the context of memory error handling. With this patch, when we detect a memory error on a dirty pagecache in ext4 filesystem, we can allow users to