Re: 2.6.23.1: mdadm/raid5 hung/d-state

2007-11-09 Thread Justin Piszcz



On Thu, 8 Nov 2007, Carlos Carvalho wrote:


Jeff Lessem ([EMAIL PROTECTED]) wrote on 6 November 2007 22:00:
Dan Williams wrote:
  The following patch, also attached, cleans up cases where the code looks
  at sh->ops.pending when it should be looking at the consistent
  stack-based snapshot of the operations flags.

I tried this patch (against a stock 2.6.23), and it did not work for
me.  Not only did I/O to the affected RAID5 & XFS partition stop, but
also I/O to all other disks.  I was not able to capture any debugging
information, but I should be able to do that tomorrow when I can hook
a serial console to the machine.

I'm not sure if my problem is identical to these others, as mine only
seems to manifest with RAID5+XFS.  The RAID rebuilds with no problem,
and I've not had any problems with RAID5+ext3.

Us too! We're stuck trying to build a disk server with several disks
in a raid5 array, and the rsync from the old machine stops writing to
the new filesystem. It only happens under heavy IO. We can make it
lock without rsync, using 8 simultaneous dd's to the array. All IO
stops, including the resync after a newly created raid or after an
unclean reboot.
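
For reference, 8 parallel dd's along these lines are enough to lock it up
here (the target path and sizes are only illustrative):

for i in $(seq 1 8); do
  # 8 concurrent sequential writers onto the array's filesystem
  dd if=/dev/zero of=/mnt/newraid/ddtest.$i bs=1M count=8192 &
done
wait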

We could not trigger the problem with ext3 or reiser3; it only happens
with xfs.



I'm including the XFS mailing list as well; can you provide more information
to them?



Re: telling mdadm to use spare drive.

2007-11-09 Thread Janek Kozicki
Richard Scobie said: (by the date of Fri, 09 Nov 2007 10:32:08 +1300)

 This was the bug I was thinking of:
 
 http://marc.info/?l=linux-raid&m=116003247912732&w=2

This bug says that it happens only with mdadm 1.x:

   If a drive is added to a raid1 using older tools
(mdadm-1.x or raidtools) then it will be included
in the array without any resync happening.

But I have here:

# mdadm --version
mdadm - v2.5.6 - 9 November 2006

maybe I stumbled on another bug?
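
A quick way to see whether an added drive actually gets resynced (the device
names below are just placeholders for my setup):

# mdadm /dev/md0 --add /dev/sdc1
# cat /proc/mdstat          <- a proper add shows a recovery/resync progress line
# mdadm --detail /dev/md0   <- State should include "recovering" until it finishes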

-- 
Janek Kozicki |


Re: 2.6.23.1: mdadm/raid5 hung/d-state

2007-11-09 Thread Jeff Lessem

Dan Williams wrote:
 On 11/8/07, Bill Davidsen [EMAIL PROTECTED] wrote:
 Jeff Lessem wrote:
 Dan Williams wrote:
 The following patch, also attached, cleans up cases where the code
 looks
 at sh->ops.pending when it should be looking at the consistent
 stack-based snapshot of the operations flags.
 I tried this patch (against a stock 2.6.23), and it did not work for
 me.  Not only did I/O to the affected RAID5 & XFS partition stop, but
 also I/O to all other disks.  I was not able to capture any debugging
 information, but I should be able to do that tomorrow when I can hook
 a serial console to the machine.
 That can't be good! This is worrisome because Joel is giddy with joy
 because it fixes his iSCSI problems. I was going to try it with nbd, but
 perhaps I'll wait a week or so and see if others have more information.
 Applying patches before a holiday weekend is a good way to avoid time
 off. :-(

 We need to see more information on the failure that Jeff is seeing,
 and whether it goes away with the two known patches applied.  He
 applied this most recent patch against stock 2.6.23 which means that
 the platform was still open to the first biofill flags issue.

I applied both of the patches.  The biofill one did not apply cleanly,
as it was adding biofill to one section, and removing it from another,
but it appears that biofill does not need to be removed from a stock
2.6.23 kernel.  The second patch applies with a slight offset, but no
errors.

I can report success so far with both patches applied.  I created an
1100GB RAID5, formatted it with XFS, and successfully tar c | tar x 895GB
of data onto it.  I'm also in the process of rsync-ing the 895GB of
data from the (slightly changed) original.  In the past, I would
always get a hang within 0-50GB of data transfer.
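
The tar copy is just the usual pipe, with source and destination paths here
being placeholders:

tar -C /old/data -cf - . | tar -C /mnt/newraid -xf -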

For each drive in the RAID I also:

echo 128 > /sys/block/$i/queue/max_sectors_kb
echo 512 > /sys/block/$i/queue/nr_requests
echo 1 > /sys/block/$i/device/queue_depth
blockdev --setra 65536 /dev/md3
echo 16384 > /sys/block/md3/md/stripe_cache_size

These changes appear to improve performance, along with a RAID5 chunk
size of 1024k, but these changes alone (without the patches) do not
fix the problem.
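
For completeness, the chunk size is chosen at array creation time; the
command below is only a sketch, with the device names and count made up:

mdadm --create /dev/md3 --level=5 --chunk=1024 --raid-devices=6 /dev/sd[b-g]1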


Re: Raid5 assemble after dual sata port failure

2007-11-09 Thread Chris Eddington

Thanks David.

I've had cable/port failures in the past, and after re-adding the drive 
the order changed. I'm not sure why, but I noticed it some time ago and 
don't remember the exact order.


On my initial attempt to assemble, it came up with only two drives in the 
array.  Then I tried assembling with --force and that brought up 3 of 
the drives.  At that point I thought I was good, so I tried mounting 
/dev/md0 and it failed.  Would that have written to the disk?  I'm using 
XFS.


After that, I tried assembling with different drive orders on the 
command line, e.g. mdadm -Av --force /dev/md0 /dev/sda1, ... thinking 
that the order might not be right.


At the moment I can't access the machine, but I'll try fsck -n and send 
you the other info later this evening.


Many thanks,
Chris

David Greaves wrote:

Chris Eddington wrote:
  

Hi,


Hi
  

While on vacation I had one SATA port/cable fail, and then four hours
later a second one fail.  After fixing/moving the SATA ports, I can
reboot and all drives seem to be OK now, but when assembled it won't
recognize the filesystem.



That's unusual - if the array comes back then you should be OK.
In general if two devices fail then there is a real data loss risk.
However if the drives are good and there was just a cable glitch, then unless
you're unlucky it's usually fsck fixable.

I see
mdadm: /dev/md0 has been started with 3 drives (out of 4).

which means it's now up and running.

And:
sda1  Events : 0.4880374
sdb1  Events : 0.4880374
sdc1  Events : 0.4857597
sdd1  Events : 0.4880374

so sdc1 is way out of date... we'll add/resync that when everything else is 
working.
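
When we get there it should just be a matter of something like:

 mdadm /dev/md0 --add /dev/sdc1

and then watching the rebuild progress in /proc/mdstat.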

but:
  

 After futzing around with assemble options
like --force and disk order I couldn't get it to work.



Let me check... what commands did you use? Just 'assemble' - which doesn't care
about disk order - or did you try to re-'create' the array - which does care
about disk order and leads us down a different path...
err, scratch that:
  

 Creation Time : Sun Nov  5 14:25:01 2006


OK, it was created a year ago... so you did use assemble.


It is slightly odd to see that the drive order is:
/dev/mapper/sda1
/dev/mapper/sdb1
/dev/mapper/sdd1
/dev/mapper/sdc1
Usually people just create them in order.


Have you done any fsck's that involve a write?

What filesystem are you running? What does your 'fsck -n' (readonly) report?
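
If it turns out to be XFS, there is no real fsck for it; the read-only check
would be something like:

 xfs_repair -n /dev/md0   # -n: no-modify mode, reports problems without writing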

Also, please report the results of:
 cat /proc/mdstat
 mdadm -D /dev/md0
 cat /etc/mdadm.conf


David

  




Re: Raid5 assemble after dual sata port failure

2007-11-09 Thread Chris Eddington

Hi David,

I ran xfs_check and get this:
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_check.  If you are unable to mount the filesystem, then use
the xfs_repair -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

After trying to mount (which fails) and re-running xfs_check, it gives the 
same message.


The array info details are below, and it seems to be running correctly(?).  I 
interpret the message above as actually a good sign: it seems that 
xfs_check sees the filesystem, but the log and maybe the most 
recently written data are corrupted or will be lost.  But I'd like to 
hear some advice/guidance before doing anything permanent with 
xfs_repair.  I also would like to confirm somehow that the array is in 
the right order, etc.  I appreciate your feedback.
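
For reference, the order of operations the xfs_check message points at is 
roughly the following (the mount point is illustrative, and -L is a last 
resort since it throws the log away):

mount /dev/md0 /mnt/recover   # a successful mount replays the log
umount /mnt/recover
xfs_repair -n /dev/md0        # read-only check once the log has been replayed
# only if the mount keeps failing and the risk is accepted:
xfs_repair -L /dev/md0        # zeroes the log; may lose recently written data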


Thks,
Chris




cat /etc/mdadm/mdadm.conf
DEVICE /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
ARRAY /dev/md0 level=raid5 num-devices=4 
UUID=bc74c21c:9655c1c6:ba6cc37a:df870496

MAILADDR root

cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sda1[0] sdd1[2] sdb1[1]
 1465151808 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

unused devices: <none>


mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Sun Nov  5 14:25:01 2006
     Raid Level : raid5
     Array Size : 1465151808 (1397.28 GiB 1500.32 GB)
    Device Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Nov  9 16:26:31 2007
          State : clean, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : bc74c21c:9655c1c6:ba6cc37a:df870496
         Events : 0.4880384

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       49        2      active sync   /dev/sdd1
       3       0        0        3      removed


