Re: detecting read errors after RAID1 check operation

2007-08-25 Thread Mike Snitzer
On 8/17/07, Mike Accetta [EMAIL PROTECTED] wrote:

 Neil Brown writes:
  On Wednesday August 15, [EMAIL PROTECTED] wrote:
   Neil Brown writes:
On Wednesday August 15, [EMAIL PROTECTED] wrote:

   ...
   This happens in our old friend sync_request_write()?  I'm dealing with
 
  Yes, that would be the place.
 
   ...
   This fragment
  
   if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
           sbio->bi_end_io = NULL;
           rdev_dec_pending(conf->mirrors[i].rdev, mddev);
   } else {
           /* fixup the bio for reuse */
           ...
   }
  
   looks suspiciously like any correction attempt for 'check' is being
   short-circuited to me, regardless of whether or not there was a read
   error.  Actually, even if the rewrite was not being short-circuited,
   I still don't see the path that would update 'corrected_errors' in this
   case.  There are only two raid1.c sites that touch 'corrected_errors', one
   is in fix_read_errors() and the other is later in sync_request_write().
   With my limited understanding of how this all works, neither of these
   paths would seem to apply here.
 
  hmmm yes
  I guess I was thinking of the RAID5 code rather than the RAID1 code.
  It doesn't do the right thing, does it?
  Maybe this patch is what we need.  I think it is right.
 
  Thanks,
  NeilBrown
 
 
  Signed-off-by: Neil Brown [EMAIL PROTECTED]
 
  ### Diffstat output
   ./drivers/md/raid1.c |3 ++-
   1 file changed, 2 insertions(+), 1 deletion(-)
 
  diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
  --- .prev/drivers/md/raid1.c	2007-08-16 10:29:58.000000000 +1000
  +++ ./drivers/md/raid1.c	2007-08-17 12:07:35.000000000 +1000
  @@ -1260,7 +1260,8 @@ static void sync_request_write(mddev_t *
   					j = 0;
   				if (j >= 0)
   					mddev->resync_mismatches += r1_bio->sectors;
  -				if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
  +				if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
  +					      && text_bit(BIO_UPTODATE, &sbio->bi_flags))) {
   					sbio->bi_end_io = NULL;
   					rdev_dec_pending(conf->mirrors[i].rdev, mddev);
   				} else {

 I tried this (with the typo fixed) and it indeed issues a re-write.
 However, it doesn't seem to do anything with the corrected errors
 count if the rewrite succeeds.  Since end_sync_write() is only used in one
 other place, when !In_sync, I tried the following and it seems to work
 to get the error count updated.  I don't know whether this belongs in
 end_sync_write() but I'd think it needs to come after the write actually
 succeeds so that seems like the earliest it could be done.

Neil,

Any feedback on Mike's patch?

thanks,
Mike


Re: detecting read errors after RAID1 check operation

2007-08-17 Thread Mike Accetta

Neil Brown writes:
 On Wednesday August 15, [EMAIL PROTECTED] wrote:
  Neil Brown writes:
   On Wednesday August 15, [EMAIL PROTECTED] wrote:

  ... 
  This happens in our old friend sync_request_write()?  I'm dealing with
 
 Yes, that would be the place.
 
  ...
  This fragment
  
  if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
          sbio->bi_end_io = NULL;
          rdev_dec_pending(conf->mirrors[i].rdev, mddev);
  } else {
          /* fixup the bio for reuse */
          ...
  }
  
  looks suspiciously like any correction attempt for 'check' is being
  short-circuited to me, regardless of whether or not there was a read
  error.  Actually, even if the rewrite was not being short-circuited,
  I still don't see the path that would update 'corrected_errors' in this
  case.  There are only two raid1.c sites that touch 'corrected_errors', one
  is in fix_read_errors() and the other is later in sync_request_write().
  With my limited understanding of how this all works, neither of these
  paths would seem to apply here.
 
 hmmm yes
 I guess I was thinking of the RAID5 code rather than the RAID1 code.
 It doesn't do the right thing, does it?
 Maybe this patch is what we need.  I think it is right.
 
 Thanks,
 NeilBrown
 
 
 Signed-off-by: Neil Brown [EMAIL PROTECTED]
 
 ### Diffstat output
  ./drivers/md/raid1.c |3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
 --- .prev/drivers/md/raid1.c	2007-08-16 10:29:58.000000000 +1000
 +++ ./drivers/md/raid1.c	2007-08-17 12:07:35.000000000 +1000
 @@ -1260,7 +1260,8 @@ static void sync_request_write(mddev_t *
  					j = 0;
  				if (j >= 0)
  					mddev->resync_mismatches += r1_bio->sectors;
 -				if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
 +				if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
 +					      && text_bit(BIO_UPTODATE, &sbio->bi_flags))) {
  					sbio->bi_end_io = NULL;
  					rdev_dec_pending(conf->mirrors[i].rdev, mddev);
  				} else {

I tried this (with the typo fixed) and it indeed issues a re-write.
However, it doesn't seem to do anything with the corrected errors
count if the rewrite succeeds.  Since end_sync_write() is only used in one
other place, when !In_sync, I tried the following and it seems to work
to get the error count updated.  I don't know whether this belongs in
end_sync_write() but I'd think it needs to come after the write actually
succeeds so that seems like the earliest it could be done.

--- BUILD/kernel/drivers/md/raid1.c	2007-06-04 13:52:42.000000000 -0400
+++ drivers/md/raid1.c	2007-08-17 16:52:14.219364000 -0400
@@ -1166,6 +1166,7 @@ static int end_sync_write(struct bio *bi
 	conf_t *conf = mddev_to_conf(mddev);
 	int i;
 	int mirror=0;
+	mdk_rdev_t *rdev;
 
 	if (bio->bi_size)
 		return 1;
@@ -1175,6 +1176,8 @@ static int end_sync_write(struct bio *bi
 			mirror = i;
 			break;
 		}
+
+	rdev = conf->mirrors[mirror].rdev;
 	if (!uptodate) {
 		int sync_blocks = 0;
 		sector_t s = r1_bio->sector;
@@ -1186,7 +1189,13 @@ static int end_sync_write(struct bio *bi
 			s += sync_blocks;
 			sectors_to_go -= sync_blocks;
 		} while (sectors_to_go > 0);
-		md_error(mddev, conf->mirrors[mirror].rdev);
+		md_error(mddev, rdev);
+	} else if (test_bit(In_sync, &rdev->flags)) {
+		/*
+		 * If we are currently in sync, this was a re-write to
+		 * correct a read error and we should account for it.
+		 */
+		atomic_add(r1_bio->sectors, &rdev->corrected_errors);
 	}
 
 	update_head_pos(mirror, r1_bio);
@@ -1251,7 +1260,8 @@ static void sync_request_write(mddev_t *
 			}
 			if (j >= 0)
 				mddev->resync_mismatches += r1_bio->sectors;
-			if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
+			if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
+				      && test_bit(BIO_UPTODATE, &sbio->bi_flags))) {
 				sbio->bi_end_io = NULL;
 				rdev_dec_pending(conf->mirrors[i].rdev, mddev);
 			} else {
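
For what it's worth, this counter appears to be exactly what the sysfs file
from the original message reports: if I am reading drivers/md/md.c correctly,
the per-device dev-*/errors attribute simply prints corrected_errors, so a
re-write accounted this way should become visible there.  Shown approximately,
from the 2.6.2x source:

	static ssize_t
	errors_show(mdk_rdev_t *rdev, char *page)
	{
		/* the 'errors' file reports the same counter that the
		 * patch above increments via atomic_add()
		 */
		return sprintf(page, "%d\n",
			       atomic_read(&rdev->corrected_errors));
	}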
--
Mike Accetta

ECI Telecom Ltd.
Transport Networking Division, US (previously Laurel Networks)

Re: detecting read errors after RAID1 check operation

2007-08-16 Thread Neil Brown
On Wednesday August 15, [EMAIL PROTECTED] wrote:
 Neil Brown writes:
  On Wednesday August 15, [EMAIL PROTECTED] wrote:
   
   There are already files like /sys/block/md_d0/md/dev-sdb/errors in /sys
   which would be very convenient to consult but according to the kernel
   driver implementation the error counts reported there are apparently
   for corrected errors and not relevant for read errors during a check
   operation.
   
  
  When 'check' hits a read error, an attempt is made to 'correct' it by
  over-writing with correct data.  This will either increase the
  'errors' count or fail the drive completely.
  
 What 'check' doesn't do (and 'repair' does) is react when it finds that
  successful reads of all drives (in a raid1) do not match.
  
  So just use the 'errors' number - it is exactly what you want.
 
 This happens in our old friend sync_request_write()?  I'm dealing with

Yes, that would be the place.

 simulated errors and will dig further to make sure that is not perturbing
 the results but I don't see any 'errors' effect.  This is with our
 patched 2.6.20 raid1.c.  The logic doesn't seem to be any different in
 2.6.22 from what I can tell, though.
 
 This fragment
 
  if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
          sbio->bi_end_io = NULL;
          rdev_dec_pending(conf->mirrors[i].rdev, mddev);
  } else {
          /* fixup the bio for reuse */
          ...
  }
 
 looks suspiciously like any correction attempt for 'check' is being
 short-circuited to me, regardless of whether or not there was a read
 error.  Actually, even if the rewrite was not being short-circuited,
 I still don't see the path that would update 'corrected_errors' in this
 case.  There are only two raid1.c sites that touch 'corrected_errors', one
 is in fix_read_errors() and the other is later in sync_request_write().
 With my limited understanding of how this all works, neither of these
 paths would seem to apply here.

hmmm yes
I guess I was thinking of the RAID5 code rather than the RAID1 code.
It doesn't do the right thing, does it?
Maybe this patch is what we need.  I think it is right.

Thanks,
NeilBrown


Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/raid1.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
--- .prev/drivers/md/raid1.c	2007-08-16 10:29:58.000000000 +1000
+++ ./drivers/md/raid1.c	2007-08-17 12:07:35.000000000 +1000
@@ -1260,7 +1260,8 @@ static void sync_request_write(mddev_t *
 					j = 0;
 				if (j >= 0)
 					mddev->resync_mismatches += r1_bio->sectors;
-				if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
+				if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
+					      && text_bit(BIO_UPTODATE, &sbio->bi_flags))) {
 					sbio->bi_end_io = NULL;
 					rdev_dec_pending(conf->mirrors[i].rdev, mddev);
 				} else {
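
To spell out the intended effect of the change, here is a reduced sketch of
just the condition (with the test_bit typo corrected), not the kernel source
itself:

	/* Before: during a 'check' pass the rewrite branch was always
	 * skipped, even when this mirror's read had actually failed.
	 */
	if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery))
		/* skip the rewrite */;

	/* After: during 'check', the rewrite is skipped only when the
	 * read succeeded (BIO_UPTODATE set); a failed read now falls
	 * through to the rewrite path, as it already did for 'repair'.
	 */
	if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
		      && test_bit(BIO_UPTODATE, &sbio->bi_flags)))
		/* skip the rewrite */;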


detecting read errors after RAID1 check operation

2007-08-15 Thread Mike Accetta

We run a check operation periodically to try and turn up problems with
drives about to go bad before they become too severe.  In particular,
if there were any drive read errors during the check operation I would
like to be able to notice and raise an alarm for human attention so that
the failing drive can be replaced sooner rather than later.  I'm looking
for a programmatic way to detect this reliably without having to grovel
through the log files looking for kernel hard drive error messages that
may have occurred during the check operation.
the log files looking for kernel hard drive error messages that may have
occurred during the check operation.

There are already files like /sys/block/md_d0/md/dev-sdb/errors in /sys
which would be very convenient to consult but according to the kernel
driver implementation the error counts reported there are apparently
for corrected errors and not relevant for read errors during a check
operation.

I am contemplating adding a parallel /sys file that would report
all errors, not just the corrected ones.  Does this seem reasonable?
Are there other alternatives that might make sense here?
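
For concreteness, the sort of consumer I have in mind might look like the
sketch below.  It only illustrates the idea: the sysfs path is the example
from above, a real monitor would walk every dev-* entry of the array, and
error handling is minimal.

	/* Sketch: compare a device's corrected-error count before and
	 * after a check pass and complain if it grew.
	 */
	#include <stdio.h>

	static long read_errors(const char *path)
	{
		FILE *f = fopen(path, "r");
		long n = -1;

		if (f) {
			if (fscanf(f, "%ld", &n) != 1)
				n = -1;
			fclose(f);
		}
		return n;
	}

	int main(void)
	{
		const char *path = "/sys/block/md_d0/md/dev-sdb/errors";
		long before, after;

		before = read_errors(path);
		/* ... start the check here and wait for it to finish ... */
		after = read_errors(path);

		if (before >= 0 && after > before)
			printf("%ld new corrected errors - raise the alarm\n",
			       after - before);
		return 0;
	}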
--
Mike Accetta

ECI Telecom Ltd.
Transport Networking Division, US (previously Laurel Networks)


Re: detecting read errors after RAID1 check operation

2007-08-15 Thread Neil Brown
On Wednesday August 15, [EMAIL PROTECTED] wrote:
 
 There are already files like /sys/block/md_d0/md/dev-sdb/errors in /sys
 which would be very convenient to consult but according to the kernel
 driver implementation the error counts reported there are apparently
 for corrected errors and not relevant for read errors during a check
 operation.
 

When 'check' hits a read error, an attempt is made to 'correct' it by
over-writing with correct data.  This will either increase the
'errors' count or fail the drive completely.

What 'check' doesn't do (and 'repair' does) is react when it finds that
successful reads of all drives (in a raid1) do not match.

So just use the 'errors' number - it is exactly what you want.
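
For completeness: a check pass itself is started by writing to the array's
sync_action file in sysfs.  A minimal sketch, reusing the md_d0 name from the
original message:

	/* Start a scrub: "check" only reads and counts problems, while
	 * "repair" additionally rewrites mismatched blocks.
	 */
	#include <stdio.h>

	static int start_scrub(const char *action)	/* "check" or "repair" */
	{
		FILE *f = fopen("/sys/block/md_d0/md/sync_action", "w");

		if (!f)
			return -1;
		fprintf(f, "%s\n", action);
		return fclose(f);
	}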

NeilBrown


Re: detecting read errors after RAID1 check operation

2007-08-15 Thread Mike Accetta
Neil Brown writes:
 On Wednesday August 15, [EMAIL PROTECTED] wrote:
  
  There are already files like /sys/block/md_d0/md/dev-sdb/errors in /sys
  which would be very convenient to consult but according to the kernel
  driver implementation the error counts reported there are apparently
  for corrected errors and not relevant for read errors during a check
  operation.
  
 
 When 'check' hits a read error, an attempt is made to 'correct' it by
 over-writing with correct data.  This will either increase the
 'errors' count or fail the drive completely.
 
 What 'check' doesn't do (and 'repair' does) is react when it finds that
 successful reads of all drives (in a raid1) do not match.
 
 So just use the 'errors' number - it is exactly what you want.

This happens in our old friend sync_request_write()?  I'm dealing with
simulated errors and will dig further to make sure that is not perturbing
the results but I don't see any 'errors' effect.  This is with our
patched 2.6.20 raid1.c.  The logic doesn't seem to be any different in
2.6.22 from what I can tell, though.

This fragment

if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
        sbio->bi_end_io = NULL;
        rdev_dec_pending(conf->mirrors[i].rdev, mddev);
} else {
        /* fixup the bio for reuse */
        ...
}

looks suspiciously like any correction attempt for 'check' is being
short-circuited to me, regardless of whether or not there was a read
error.  Actually, even if the rewrite was not being short-circuited,
I still don't see the path that would update 'corrected_errors' in this
case.  There are only two raid1.c sites that touch 'corrected_errors', one
is in fix_read_errors() and the other is later in sync_request_write().
With my limited understanding of how this all works, neither of these
paths would seem to apply here.
--
Mike Accetta

ECI Telecom Ltd.
Transport Networking Division, US (previously Laurel Networks)