Re: Fix(es) for ext2 fsync bug

2007-02-20 Thread Valerie Henson
On Thu, Feb 15, 2007 at 09:20:21AM -0500, Theodore Tso wrote: It's actually not the case that fsck will complete the truncate for file A. The problem is that while e2fsck is processing indirect blocks in pass 1, the block which is marked as file A's indirect block (but which actually

Re: Fix(es) for ext2 fsync bug

2007-02-20 Thread Dawson Engler
On Tue, Feb 20, 2007 at 01:30:25PM -0800, Junfeng Yang wrote: On 2/20/07, Valerie Henson [EMAIL PROTECTED] wrote: Google. (GoogleFS runs on top of ext2.) It's surprising to know that... I guess they rely on GoogleFS's own replication and checksumming for consistency. Yep, they

Re: Fix(es) for ext2 fsync bug

2007-02-20 Thread Erez Zadok
In message [EMAIL PROTECTED], Valerie Henson writes: On Thu, Feb 15, 2007 at 09:20:21AM -0500, Theodore Tso wrote: And POSIX also states that sync() is only required to schedule the writes, but may return before the actual writing is done. Looks like One more reason to form a group to

Re: Fix(es) for ext2 fsync bug

2007-02-20 Thread Dave Kleikamp
On Tue, 2007-02-20 at 21:39 +, Valerie Henson wrote: On Tue, Feb 20, 2007 at 01:30:25PM -0800, Junfeng Yang wrote: On 2/20/07, Valerie Henson [EMAIL PROTECTED] wrote: Google. (GoogleFS runs on top of ext2.) It's surprising to know that... I guess they rely on GoogleFS's own

Re: Fix(es) for ext2 fsync bug

2007-02-15 Thread Theodore Tso
On Wed, Feb 14, 2007 at 11:54:54AM -0800, Valerie Henson wrote: Background: The eXplode file system checker found a bug in ext2 fsync behavior. Do the following: truncate file A, create file B which reallocates one of A's old indirect blocks, fsync file B. If you then crash before file A's

Re: Fix(es) for ext2 fsync bug

2007-02-15 Thread Dave Kleikamp
On Thu, 2007-02-15 at 09:20 -0500, Theodore Tso wrote: Another very heavyweight approach would be to simply force a full sync of the filesystem whenever fsync() is called. Not pretty, and without the proper write ordering, the race is still potentially there. I don't think this race is an

Re: Fix(es) for ext2 fsync bug

2007-02-15 Thread sfaibish
On Thu, 15 Feb 2007 10:09:22 -0500, Dave Kleikamp [EMAIL PROTECTED] wrote: On Thu, 2007-02-15 at 09:20 -0500, Theodore Tso wrote: Another very heavyweight approach would be to simply force a full sync of the filesystem whenever fsync() is called. Not pretty, and without the proper write

Re: Fix(es) for ext2 fsync bug

2007-02-15 Thread Dave Kleikamp
On Thu, 2007-02-15 at 10:59 -0500, sfaibish wrote: On Thu, 15 Feb 2007 10:09:22 -0500, Dave Kleikamp [EMAIL PROTECTED] wrote: On Thu, 2007-02-15 at 09:20 -0500, Theodore Tso wrote: Another very heavyweight approach would be to simply force a full sync of the filesystem whenever

Re: Fix(es) for ext2 fsync bug

2007-02-15 Thread Theodore Tso
On Thu, Feb 15, 2007 at 10:39:02AM -0600, Dave Kleikamp wrote: It was my understanding from the presentation of Dawson that ext3 and jfs have the same problem. Hmm. If jfs has the problem, it is a bug. jfs is designed to handle this correctly. I'm pretty sure I've fixed at least one bug

Re: Fix(es) for ext2 fsync bug

2007-02-15 Thread sfaibish
On Thu, 15 Feb 2007 12:15:59 -0500, Theodore Tso [EMAIL PROTECTED] wrote: On Thu, Feb 15, 2007 at 10:39:02AM -0600, Dave Kleikamp wrote: It was my understanding from the presentation of Dawson that ext3 and jfs have the same problem. Hmm. If jfs has the problem, it is a bug. jfs is designed

Re: Fix(es) for ext2 fsync bug

2007-02-15 Thread Dave Kleikamp
On Thu, 2007-02-15 at 11:11 -0800, Junfeng Yang wrote: Hmm. If jfs has the problem, it is a bug. jfs is designed to handle this correctly. I'm pretty sure I've fixed at least one bug that eXplode has uncovered in the past. I'm not sure what was

Re: Fix(es) for ext2 fsync bug

2007-02-15 Thread Dawson Engler
It was my understanding from the presentation of Dawson that ext3 and jfs have the same problem. It is not an ext2 only problem. Also whatever solution we adopt we need to be sure that we test it using the eXplode methodology. apologies for dropping in randomly into the discussion: if this

Re: Fix(es) for ext2 fsync bug

2007-02-15 Thread Theodore Tso
On Thu, Feb 15, 2007 at 11:28:46AM -0800, Junfeng Yang wrote: Actually, we found a crash-during-recovery bug in ext3 too. It's a race between resetting the journal super block and replay of the journal. This bug was fixed by Ted a long time ago (3 years?). That was found in your original

Fix(es) for ext2 fsync bug

2007-02-14 Thread Valerie Henson
Just some quick notes on possible ways to fix the ext2 fsync bug that eXplode found. Whether or not anyone will bother to implement it is another matter. Background: The eXplode file system checker found a bug in ext2 fsync behavior. Do the following: truncate file A, create file B which
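The user-space side of the sequence described in the background (truncate A, create B, fsync B) can be sketched as below. This is only an illustration under stated assumptions, not the eXplode test itself: the function name, file names, and sizes are hypothetical, file A is simply made larger than ext2's 12 direct blocks so that it needs an indirect block, and whether B actually reallocates one of A's old indirect blocks is decided by the block allocator and cannot be forced from user space. The crash itself would have to be injected externally after the function returns.

```python
import os

# Assumption: 64 blocks of 4 KiB exceeds the 12 direct blocks of an ext2
# inode for any common block size, so file A needs an indirect block.
SIZE_A = 64 * 4096
SIZE_B = 64 * 4096


def replay_sequence(directory, size_a=SIZE_A, size_b=SIZE_B):
    """Issue the truncate-A / create-B / fsync-B sequence in `directory`.

    Hypothetical helper for illustration only. It does not crash the
    machine and cannot guarantee that B reuses A's old indirect block.
    """
    path_a = os.path.join(directory, "A")
    path_b = os.path.join(directory, "B")

    # Create file A and push it to disk so its indirect block exists there.
    with open(path_a, "wb") as f:
        f.write(b"a" * size_a)
        f.flush()
        os.fsync(f.fileno())

    # Step 1: truncate file A. Its blocks are freed in memory; nothing
    # here forces A's updated inode to disk.
    os.truncate(path_a, 0)

    # Step 2: create file B, which may reallocate one of A's freed blocks.
    with open(path_b, "wb") as f:
        f.write(b"b" * size_b)
        f.flush()
        # Step 3: fsync file B only.
        os.fsync(f.fileno())

    return path_a, path_b
```

On a scratch ext2 mount, one would call `replay_sequence("/mnt/scratch")` (the mount point is an assumption), cut power or otherwise crash before A's inode is written back, and then inspect both files after fsck.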

Re: Fix(es) for ext2 fsync bug

2007-02-14 Thread David Chinner
On Wed, Feb 14, 2007 at 11:54:54AM -0800, Valerie Henson wrote: Just some quick notes on possible ways to fix the ext2 fsync bug that eXplode found. Whether or not anyone will bother to implement it is another matter. Background: The eXplode file system checker found a bug in ext2 fsync

Re: Fix(es) for ext2 fsync bug

2007-02-14 Thread Dave Kleikamp
On Thu, 2007-02-15 at 07:31 +1100, David Chinner wrote: On Wed, Feb 14, 2007 at 11:54:54AM -0800, Valerie Henson wrote: Just some quick notes on possible ways to fix the ext2 fsync bug that eXplode found. Whether or not anyone will bother to implement it is another matter. Background:

Re: Fix(es) for ext2 fsync bug

2007-02-14 Thread David Chinner
On Wed, Feb 14, 2007 at 03:26:22PM -0600, Dave Kleikamp wrote: On Thu, 2007-02-15 at 07:31 +1100, David Chinner wrote: On Wed, Feb 14, 2007 at 11:54:54AM -0800, Valerie Henson wrote: Just some quick notes on possible ways to fix the ext2 fsync bug that eXplode found. Whether or not

Re: Fix(es) for ext2 fsync bug

2007-02-14 Thread sfaibish
Val, Maybe it is not only our (FS people) problem. We probably need to bring in the kernel people to judge, as ext2 and ext3 are the base Linux filesystems. I am adding the kernel list for opinions. /Sorin On Wed, 14 Feb 2007 14:54:54 -0500, Valerie Henson [EMAIL PROTECTED] wrote: Just some quick notes on