Re: [stable] [PATCH 000 of 2] md: Fixes for md in 2.6.23
On Tuesday November 13, [EMAIL PROTECTED] wrote: > > raid5-fix-unending-write-sequence.patch is in -mm and I believe is > waiting on an Acked-by from Neil? > It seems to have just been sent on to Linus, so it probably will go in without: Acked-By: NeilBrown <[EMAIL PROTECTED]> I'm beginning to think that I really should sit down and make sure I understand exactly how those STRIPE_OP_ flags are uses. They generally make sense but there seem to be a number of corner cases where they aren't quite handled properly.. Maybe they are all found now, or maybe NeilBrown - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [stable] [PATCH 000 of 2] md: Fixes for md in 2.6.23
On Tue, Nov 13, 2007 at 10:36:30PM -0700, Dan Williams wrote: > On Nov 13, 2007 8:43 PM, Greg KH <[EMAIL PROTECTED]> wrote: > > > > > > Careful, it looks like you cherry picked commit 4ae3f847 "md: raid5: > > > fix clearing of biofill operations" which ended up misapplied in > > > Linus' tree, You should either also pick up def6ae26 "md: fix > > > misapplied patch in raid5.c" or I can resend the original "raid5: fix > > > clearing of biofill operations." > > > > > > The other patch for -stable "raid5: fix unending write sequence" is > > > currently in -mm. > > > > Hm, I've attached the two patches that I have right now in the -stable > > tree so far (still have over 100 patches to go, so I might not have > > gotten to them yet if you have sent them). These were sent to me by > > Andrew on their way to Linus. if I should drop either one, or add > > another one, please let me know. > > > > Drop md-raid5-fix-clearing-of-biofill-operations.patch and replace it > with the attached > md-raid5-not-raid6-fix-clearing-of-biofill-operations.patch (the > original sent to Neil). > > The critical difference is that the replacement patch touches > handle_stripe5, not handle_stripe6. Diffing the patches shows the > changes for hunk #3: > > -@@ -2903,6 +2907,13 @@ static void handle_stripe6(struct stripe > +@@ -2630,6 +2634,13 @@ static void handle_stripe5(struct stripe_head *sh) Ah, ok, thanks, will do that. > raid5-fix-unending-write-sequence.patch is in -mm and I believe is > waiting on an Acked-by from Neil? I don't see it in Linus's tree yet, so I can't apply it to -stable... thanks, greg k-h - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [stable] [PATCH 000 of 2] md: Fixes for md in 2.6.23
On Nov 13, 2007 8:43 PM, Greg KH <[EMAIL PROTECTED]> wrote: > > > > Careful, it looks like you cherry picked commit 4ae3f847 "md: raid5: > > fix clearing of biofill operations" which ended up misapplied in > > Linus' tree, You should either also pick up def6ae26 "md: fix > > misapplied patch in raid5.c" or I can resend the original "raid5: fix > > clearing of biofill operations." > > > > The other patch for -stable "raid5: fix unending write sequence" is > > currently in -mm. > > Hm, I've attached the two patches that I have right now in the -stable > tree so far (still have over 100 patches to go, so I might not have > gotten to them yet if you have sent them). These were sent to me by > Andrew on their way to Linus. if I should drop either one, or add > another one, please let me know. > Drop md-raid5-fix-clearing-of-biofill-operations.patch and replace it with the attached md-raid5-not-raid6-fix-clearing-of-biofill-operations.patch (the original sent to Neil). The critical difference is that the replacement patch touches handle_stripe5, not handle_stripe6. Diffing the patches shows the changes for hunk #3: -@@ -2903,6 +2907,13 @@ static void handle_stripe6(struct stripe +@@ -2630,6 +2634,13 @@ static void handle_stripe5(struct stripe_head *sh) raid5-fix-unending-write-sequence.patch is in -mm and I believe is waiting on an Acked-by from Neil? > thanks, > > greg k-h Thanks, Dan raid5: fix clearing of biofill operations From: Dan Williams <[EMAIL PROTECTED]> ops_complete_biofill() runs outside of spin_lock(&sh->lock) and clears the 'pending' and 'ack' bits. Since the test_and_ack_op() macro only checks against 'complete' it can get an inconsistent snapshot of pending work. Move the clearing of these bits to handle_stripe5(), under the lock. Signed-off-by: Dan Williams <[EMAIL PROTECTED]> Tested-by: Joel Bertrand <[EMAIL PROTECTED]> Signed-off-by: Neil Brown <[EMAIL PROTECTED]> --- drivers/md/raid5.c | 17 ++--- 1 files changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index f96dea9..3808f52 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -377,7 +377,12 @@ static unsigned long get_stripe_work(struct stripe_head *sh) ack++; sh->ops.count -= ack; - BUG_ON(sh->ops.count < 0); + if (unlikely(sh->ops.count < 0)) { + printk(KERN_ERR "pending: %#lx ops.pending: %#lx ops.ack: %#lx " + "ops.complete: %#lx\n", pending, sh->ops.pending, + sh->ops.ack, sh->ops.complete); + BUG(); + } return pending; } @@ -551,8 +556,7 @@ static void ops_complete_biofill(void *stripe_head_ref) } } } - clear_bit(STRIPE_OP_BIOFILL, &sh->ops.ack); - clear_bit(STRIPE_OP_BIOFILL, &sh->ops.pending); + set_bit(STRIPE_OP_BIOFILL, &sh->ops.complete); return_io(return_bi); @@ -2630,6 +2634,13 @@ static void handle_stripe5(struct stripe_head *sh) s.expanded = test_bit(STRIPE_EXPAND_READY, &sh->state); /* Now to look around and see what can be done */ + /* clean-up completed biofill operations */ + if (test_bit(STRIPE_OP_BIOFILL, &sh->ops.complete)) { + clear_bit(STRIPE_OP_BIOFILL, &sh->ops.pending); + clear_bit(STRIPE_OP_BIOFILL, &sh->ops.ack); + clear_bit(STRIPE_OP_BIOFILL, &sh->ops.complete); + } + rcu_read_lock(); for (i=disks; i--; ) { mdk_rdev_t *rdev; raid5: fix unending write sequence From: Dan Williams <[EMAIL PROTECTED]> handling stripe 7629696, state=0x14 cnt=1, pd_idx=2 ops=0:0:0 check 5: state 0x6 toread read write f800ffcffcc0 written check 4: state 0x6 toread read write f800fdd4e360 written check 3: state 0x1 toread read write written check 2: state 0x1 toread read write written check 1: state 0x6 toread read write f800ff517e40 written check 0: state 0x6 toread read write f800fd4cae60 written locked=4 uptodate=2 to_read=0 to_write=4 failed=0 failed_num=0 for sector 7629696, rmw=0 rcw=0 These blocks were prepared to be written out, but were never handled in ops_run_biodrain(), so they remain locked forever. The operations flags are all clear which means handle_stripe() thinks nothing else needs to be done. This state suggests that the STRIPE_OP_PREXOR bit was sampled 'set' when it should not have been. This patch cleans up cases where the code looks at sh->ops.pending when it should be looking at the consistent stack-based snapshot of the operations flags. Report from Joel: Resync done. Patch fix this bug. Signed-off-by: Dan Williams <[EMAIL PROTECTED]> Tested-by: Joel Bertrand <[EMAIL PROTECTED]> Cc: [EMAIL PROTECTED] --- drivers/md/raid5.c | 16 +--- 1 files changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/
Re: [stable] [PATCH 000 of 2] md: Fixes for md in 2.6.23
On Tue, Nov 13, 2007 at 08:36:05PM -0700, Dan Williams wrote: > On Nov 13, 2007 5:23 PM, Greg KH <[EMAIL PROTECTED]> wrote: > > On Tue, Nov 13, 2007 at 04:22:14PM -0800, Greg KH wrote: > > > On Mon, Oct 22, 2007 at 05:15:27PM +1000, NeilBrown wrote: > > > > > > > > It appears that a couple of bugs slipped in to md for 2.6.23. > > > > These two patches fix them and are appropriate for 2.6.23.y as well > > > > as 2.6.24-rcX > > > > > > > > Thanks, > > > > NeilBrown > > > > > > > > [PATCH 001 of 2] md: Fix an unsigned compare to allow creation of > > > > bitmaps with v1.0 metadata. > > > > [PATCH 002 of 2] md: raid5: fix clearing of biofill operations > > > > > > I don't see these patches in 2.6.24-rcX, are they there under some other > > > subject? > > > > Oh nevermind, I found them, sorry for the noise... > > > > Careful, it looks like you cherry picked commit 4ae3f847 "md: raid5: > fix clearing of biofill operations" which ended up misapplied in > Linus' tree, You should either also pick up def6ae26 "md: fix > misapplied patch in raid5.c" or I can resend the original "raid5: fix > clearing of biofill operations." > > The other patch for -stable "raid5: fix unending write sequence" is > currently in -mm. Hm, I've attached the two patches that I have right now in the -stable tree so far (still have over 100 patches to go, so I might not have gotten to them yet if you have sent them). These were sent to me by Andrew on their way to Linus. if I should drop either one, or add another one, please let me know. thanks, greg k-h >From [EMAIL PROTECTED] Mon Oct 22 20:45:35 2007 From: [EMAIL PROTECTED] Date: Mon, 22 Oct 2007 20:45:11 -0700 Subject: md: fix an unsigned compare to allow creation of bitmaps with v1.0 metadata To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] Message-ID: <[EMAIL PROTECTED]> From: NeilBrown <[EMAIL PROTECTED]> patch 85bfb4da8cad483a4e550ec89060d05a4daf895b in mainline. As page->index is unsigned, this all becomes an unsigned comparison, which almost always returns an error. Signed-off-by: Neil Brown <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]> Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]> --- drivers/md/bitmap.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/md/bitmap.c +++ b/drivers/md/bitmap.c @@ -274,7 +274,7 @@ static int write_sb_page(struct bitmap * if (bitmap->offset < 0) { /* DATA BITMAP METADATA */ if (bitmap->offset -+ page->index * (PAGE_SIZE/512) ++ (long)(page->index * (PAGE_SIZE/512)) + size/512 > 0) /* bitmap runs in to metadata */ return -EINVAL; >From [EMAIL PROTECTED] Mon Oct 22 20:45:44 2007 From: Dan Williams <[EMAIL PROTECTED]> Date: Mon, 22 Oct 2007 20:45:11 -0700 Subject: md: raid5: fix clearing of biofill operations To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] Message-ID: <[EMAIL PROTECTED]> From: Dan Williams <[EMAIL PROTECTED]> patch 4ae3f847e49e3787eca91bced31f8fd328d50496 in mainline. ops_complete_biofill() runs outside of spin_lock(&sh->lock) and clears the 'pending' and 'ack' bits. Since the test_and_ack_op() macro only checks against 'complete' it can get an inconsistent snapshot of pending work. Move the clearing of these bits to handle_stripe5(), under the lock. Signed-off-by: Dan Williams <[EMAIL PROTECTED]> Tested-by: Joel Bertrand <[EMAIL PROTECTED]> Signed-off-by: Neil Brown <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]> Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]> --- drivers/md/raid5.c | 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -377,7 +377,12 @@ static unsigned long get_stripe_work(str ack++; sh->ops.count -= ack; - BUG_ON(sh->ops.count < 0); + if (unlikely(sh->ops.count < 0)) { + printk(KERN_ERR "pending: %#lx ops.pending: %#lx ops.ack: %#lx " + "ops.complete: %#lx\n", pending, sh->ops.pending, + sh->ops.ack, sh->ops.complete); + BUG(); + } return pending; } @@ -551,8 +556,7 @@ static void ops_complete_biofill(void *s } } } - clear_bit(STRIPE_OP_BIOFILL, &sh->ops.ack); - clear_bit(STRIPE_OP_BIOFILL, &sh->ops.pending); + set_bit(STRIPE_OP_BIOFILL, &sh->ops.complete); return_io(return_bi); @@ -2903,6 +2907,13 @@ static void handle_stripe6(struct stripe s.expanded = test_bit(STRIPE_EXPAND_READY, &sh->state); /* Now to look around and see what can be done */ + /* clean-up completed biofill operations */ + if (test_bit(STRIPE_OP_BIOFILL, &sh->ops.complete)) { + clear_bit(STRIPE_OP_BIOFILL, &sh->ops.pending); + clear_bit(STRIPE_OP_BIOFILL, &sh->ops.ack); + clear_bit(STRIPE_OP_BIOFILL, &sh->ops.complete); + } + rcu_read_lock(); for (i=disks; i--; ) { mdk_r
Re: [stable] [PATCH 000 of 2] md: Fixes for md in 2.6.23
On Nov 13, 2007 5:23 PM, Greg KH <[EMAIL PROTECTED]> wrote: > On Tue, Nov 13, 2007 at 04:22:14PM -0800, Greg KH wrote: > > On Mon, Oct 22, 2007 at 05:15:27PM +1000, NeilBrown wrote: > > > > > > It appears that a couple of bugs slipped in to md for 2.6.23. > > > These two patches fix them and are appropriate for 2.6.23.y as well > > > as 2.6.24-rcX > > > > > > Thanks, > > > NeilBrown > > > > > > [PATCH 001 of 2] md: Fix an unsigned compare to allow creation of > > > bitmaps with v1.0 metadata. > > > [PATCH 002 of 2] md: raid5: fix clearing of biofill operations > > > > I don't see these patches in 2.6.24-rcX, are they there under some other > > subject? > > Oh nevermind, I found them, sorry for the noise... > Careful, it looks like you cherry picked commit 4ae3f847 "md: raid5: fix clearing of biofill operations" which ended up misapplied in Linus' tree, You should either also pick up def6ae26 "md: fix misapplied patch in raid5.c" or I can resend the original "raid5: fix clearing of biofill operations." The other patch for -stable "raid5: fix unending write sequence" is currently in -mm. > greg k-h Regards, Dan - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [stable] [PATCH 000 of 2] md: Fixes for md in 2.6.23
On Tue, Nov 13, 2007 at 04:22:14PM -0800, Greg KH wrote: > On Mon, Oct 22, 2007 at 05:15:27PM +1000, NeilBrown wrote: > > > > It appears that a couple of bugs slipped in to md for 2.6.23. > > These two patches fix them and are appropriate for 2.6.23.y as well > > as 2.6.24-rcX > > > > Thanks, > > NeilBrown > > > > [PATCH 001 of 2] md: Fix an unsigned compare to allow creation of bitmaps > > with v1.0 metadata. > > [PATCH 002 of 2] md: raid5: fix clearing of biofill operations > > I don't see these patches in 2.6.24-rcX, are they there under some other > subject? Oh nevermind, I found them, sorry for the noise... greg k-h - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [stable] [PATCH 000 of 2] md: Fixes for md in 2.6.23
On Mon, Oct 22, 2007 at 05:15:27PM +1000, NeilBrown wrote: > > It appears that a couple of bugs slipped in to md for 2.6.23. > These two patches fix them and are appropriate for 2.6.23.y as well > as 2.6.24-rcX > > Thanks, > NeilBrown > > [PATCH 001 of 2] md: Fix an unsigned compare to allow creation of bitmaps > with v1.0 metadata. > [PATCH 002 of 2] md: raid5: fix clearing of biofill operations I don't see these patches in 2.6.24-rcX, are they there under some other subject? thanks, greg k-h - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html