Re: detecting read errors after RAID1 check operation

2007-08-25 Thread Mike Snitzer
On 8/17/07, Mike Accetta [EMAIL PROTECTED] wrote:

 Neil Brown writes:
  On Wednesday August 15, [EMAIL PROTECTED] wrote:
   Neil Brown writes:
On Wednesday August 15, [EMAIL PROTECTED] wrote:

   ...
   This happens in our old friend sync_request_write()?  I'm dealing with
 
  Yes, that would be the place.
 
   ...
   This fragment
  
   if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
   sbio->bi_end_io = NULL;
   rdev_dec_pending(conf->mirrors[i].rdev, mddev);
   } else {
   /* fixup the bio for reuse */
   ...
   }
  
   looks suspiciously to me like any correction attempt for 'check' is being
   short-circuited, regardless of whether or not there was a read
   error.  Actually, even if the rewrite was not being short-circuited,
   I still don't see the path that would update 'corrected_errors' in this
   case.  There are only two raid1.c sites that touch 'corrected_errors', one
   is in fix_read_errors() and the other is later in sync_request_write().
   With my limited understanding of how this all works, neither of these
   paths would seem to apply here.
 
  hmmm yes
  I guess I was thinking of the RAID5 code rather than the RAID1 code.
  It doesn't do the right thing does it?
  Maybe this patch is what we need.  I think it is right.
 
  Thanks,
  NeilBrown
 
 
  Signed-off-by: Neil Brown [EMAIL PROTECTED]
 
  ### Diffstat output
   ./drivers/md/raid1.c |3 ++-
   1 file changed, 2 insertions(+), 1 deletion(-)
 
  diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
  --- .prev/drivers/md/raid1.c  2007-08-16 10:29:58.0 +1000
  +++ ./drivers/md/raid1.c  2007-08-17 12:07:35.0 +1000
  @@ -1260,7 +1260,8 @@ static void sync_request_write(mddev_t *
  					j = 0;
  			if (j >= 0)
  				mddev->resync_mismatches += r1_bio->sectors;
  -			if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
  +			if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
  +				      && text_bit(BIO_UPTODATE, &sbio->bi_flags))) {
  				sbio->bi_end_io = NULL;
  				rdev_dec_pending(conf->mirrors[i].rdev, mddev);
  			} else {

 I tried this (with the typo fixed) and it indeed issues a re-write.
 However, it doesn't seem to do anything with the corrected errors
 count if the rewrite succeeds.  Since end_sync_write() is only used in one
 other place, when !In_sync, I tried the following and it seems to work
 to get the error count updated.  I don't know whether this belongs in
 end_sync_write() but I'd think it needs to come after the write actually
 succeeds so that seems like the earliest it could be done.

Neil,

Any feedback on Mike's patch?

thanks,
Mike


Re: raid5:md3: kernel BUG , followed by , Silent halt .

2007-08-25 Thread Mr. James W. Laferriere

Hello Dan ,

On Mon, 20 Aug 2007, Dan Williams wrote:

On 8/18/07, Mr. James W. Laferriere [EMAIL PROTECTED] wrote:

Hello All ,  Here we go again .  Again attempting to do bonnie++ testing
on a small array .
Kernel 2.6.22.1
Patches involved ,
IOP1 ,  2.6.22.1-iop1 for improved sequential write performance
(stripe-queue) ,  Dan Williams [EMAIL PROTECTED]


Hello James,

Thanks for the report.

I tried to reproduce this on my system, no luck.

	Possibly because there are significant hardware differences ?
See 'lspci -v' below .sig .


However it looks
like there is a potential race between 'handle_queue' and
'add_queue_bio'.  The attached patch moves these critical sections
under spin_lock(sq->lock), and adds some debugging output if this BUG
triggers.  It also includes a fix for retry_aligned_read which is
unrelated to this debug.
--
Dan
	Applied your patch .  The same 'kernel BUG at drivers/md/raid5.c:3689!' 
messages appear (see attached) .  The system is still responsive with your 
patch ,  the kernel crashed last time .  Tho the bonnie++ run is stuck in 'D' .
And doing a ' /md3/asdf'  stays hung even after passing the parent process a 
'kill -9' .
	Any further info you can think of that I can/should acquire ,  I will 
try to .  But I'll have to repeat these steps to attempt to get the same results . 
I'll be shutting the system down after sending this off .

Fyi ,  the previous 'BUG' without your patch was quite repeatable .
	I might have time over the next couple of weeks to be able to see if it 
is as repeatable as the last one .


Contents of /proc/mdstat for md3 .

md3 : active raid6 sdx1[3] sdw1[2] sdv1[1] sdu1[0] sdt1[7](S) sds1[6] sdr1[5] 
sdq1[4]
  717378560 blocks level 6, 1024k chunk, algorithm 2 [7/7] [UUUUUUU]
  bitmap: 2/137 pages [8KB], 512KB chunk

Commands I ran that lead to the 'BUG' .

bonniemd3() { /root/bonnie++-1.03a/bonnie++  -u0:0  -d /md3  -s 131072  -f; }
bonniemd3 > 131072MB-bonnie++-run-md3-xfs.log-20070825 2>&1 &

 [EMAIL PROTECTED]:~ # top
top - 02:22:09 up  3:39,  2 users,  load average: 3.09, 2.89, 2.48
Tasks: 155 total,   1 running, 154 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8256156k total,  7995044k used,   261112k free, 7480k buffers
Swap:   987896k total, 2784k used,   985112k free,  7787320k cached

  PID USER P  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4073 root 0  18   0 000 D0  0.0  15:10.86 [bonnie++]
 4076 root 5  15   0  3000 1828 1252 D0  0.0   0:00.05 -bash
 4308 root 5  15   0 000 D0  0.0   0:01.40 [pdflush]
 4422 root 6  18   0  2212 1168  860 R0  0.0   0:00.13 top


 [EMAIL PROTECTED]:~ # ps -auxww | grep -C3 bonni
Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html
root  4029  0.0  0.0  0 0 ?S   Aug25   0:01 [xfsbufd]
root  4030  0.0  0.0  0 0 ?S   Aug25   0:00 [xfssyncd]
root  4072  0.0  0.0   2992   848 ?SAug25   0:00 -bash
root  4073  7.3  0.0  0 0 ?DAug25  15:10 [bonnie++]
root  4074  0.1  0.0   6412  1980 ?Ss   Aug25   0:12 sshd: [EMAIL 
PROTECTED]/1
root  4076  0.0  0.0   3000  1828 pts/1Ds+  Aug25   0:00 -bash
root  4302  0.1  0.0  0 0 ?S00:50   0:08 [pdflush]


--
+-+
| James   W.   Laferriere | System   Techniques | Give me VMS |
| NetworkEngineer | 663  Beaumont  Blvd |  Give me Linux  |
| [EMAIL PROTECTED] | Pacifica, CA. 94044 |   only  on  AXP |
+-+

 [EMAIL PROTECTED]:~ # lspci -v
00:00.0 Host bridge: Intel Corporation 5000P Chipset Memory Controller Hub (rev 
b1)
Subsystem: Super Micro Computer Inc Unknown device 8080
Flags: bus master, fast devsel, latency 0
Capabilities: [50] Power Management version 2
	Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable-
Capabilities: [6c] Express Root Port (Slot-) IRQ 0
Capabilities: [100] Advanced Error Reporting

00:02.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 
2-3 (rev b1) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel