Re: [PATCH md 2 of 4] Fix raid6 problem

2005-02-13 Thread Richard Scobie
Mike Hardy wrote: It's running x86_64 (Fedora Core 3) and the problem is rooted in the chipset, I believe. I don't think it's Opterons per se, I think it's just the Athlon take two - which is to say that it's a wonderful chip, but some of the chipsets it's saddled with are horrible, and careful

Re: raid5 - failed disks - i'm confusing

2005-04-04 Thread Richard Scobie
Doug Ledford wrote: Anyway, it might or might not hurt the drives to run them well below their designed operating temperature, I don't have schematics and materials lists in front of me to tell for sure. But second guessing mechanical engineers that likely have compensated for thermal issues at a

Re: shared spare

2006-02-11 Thread Richard Scobie
Bill Davidsen wrote: One of the things I like about the IBM ServeRAID controller is spare drive shared between two RAID groups. First to fail gets it. For software RAID is this at all in the future? Hi Bill, Unless I am misunderstanding something, software RAID has this already. I have not
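For reference, md handles this through mdadm's monitor mode rather than in the kernel: arrays given the same spare-group keyword in mdadm.conf will have a spare moved between them by a running mdadm --monitor when one of them loses a disk. A minimal sketch, with placeholder UUIDs and an arbitrary group name:

  DEVICE partitions
  ARRAY /dev/md0 UUID=<uuid-of-md0> spare-group=shared
  ARRAY /dev/md1 UUID=<uuid-of-md1> spare-group=shared

  # the monitor daemon performs the spare migration
  mdadm --monitor --scan --daemonise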

Array will not assemble

2006-07-06 Thread Richard Scobie
Perhaps I am misunderstanding how assemble works, but I have created a new RAID 1 array on a pair of SCSI drives and am having difficulty re-assembling it after a reboot. The relevant mdadm.conf entry looks like this: ARRAY /dev/md3 level=raid1 num-devices=2

Re: Array will not assemble

2006-07-06 Thread Richard Scobie
Neil Brown wrote: Add DEVICE /dev/sd? or similar on a separate line. Remove devices=/dev/sdc,/dev/sdd Thanks. My mistake, I thought after having assembled the arrays initially, that the output of: mdadm --detail --scan > mdadm.conf could be used directly. I'm using CentOS 4.3, which
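For anyone hitting the same thing, a minimal working mdadm.conf for an array like this might look as follows (the UUID is a placeholder; take the real one from mdadm --detail --scan or mdadm --examine --scan):

  DEVICE /dev/sd?
  ARRAY /dev/md3 level=raid1 num-devices=2 UUID=<array-uuid>

The devices= list is best omitted, since device names can change between boots while the UUID does not.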

RAID over Firewire

2006-08-22 Thread Richard Scobie
Has anyone had any experience or comment regarding linux RAID over ieee1394? As a budget backup solution, I am considering using a pair of 500GB drives, each connected to a firewire 400 port, configured as a linear array, to which the contents of an onboard array will be rsynced weekly. In

Re: RAID over Firewire

2006-08-23 Thread Richard Scobie
Gordon Henderson wrote: While I haven't done this, I have a client who uses Firewire drives (Lacie) as a backup solution and they seem to just work, and look like locally attached SCSI drives (Performance is quite good too!) I guess you won't be hot plugging/unplugging them, so those issues

Re: RAID over Firewire

2006-08-23 Thread Richard Scobie
Dexter Filmore wrote: Of all modes I wouldn't use a linear setup for backups. One disk dies - all data is lost. I'd go for an external raid5 solution, tho those tend to be slow and expensive. Unfortunately budget is the overriding factor here. Unlike RAID 0, I thought there may be a way

Re: RAID over Firewire

2006-08-23 Thread Richard Scobie
Mike Hardy wrote: I'm not sure SMART works over firewire anyway. That's a question. http://smartmontools.sourceforge.net/: As for USB and FireWire (ieee1394) disks and tape drives, the news is not good. They appear to Linux as SCSI devices but their implementations do not usually support

Re: Linux: Why software RAID?

2006-08-23 Thread Richard Scobie
Jeff Garzik wrote: Mark Perkel wrote: Running Linux on an AMD AM2 nVidia chipset that supports RAID 0 striping on the motherboard. Just wondering if hardware raid (SATA2) is going to be faster than software raid and why? Jeff, on a slightly related note, is the driver status for the

Re: Linux: Why software RAID?

2006-08-24 Thread Richard Scobie
Jeff Garzik wrote: Richard Scobie wrote: Jeff, on a slightly related note, is the driver status for the NVIDIA as reflected on your site, correct for the new nForce 590/570 AM2 chipset? Unfortunately I rarely have an idea about how marketing names correlate to chipsets. Do you have

Kernel RAID support

2006-09-02 Thread Richard Scobie
I am building 2.6.18rc5-mm1 and I cannot find the entry under make config, to enable the various RAID options. Perhaps there is something I have said N to and it is hidden. Can someone please assist - this is a bit embarrassing... Regards, Richard

Re: Kernel RAID support

2006-09-02 Thread Richard Scobie
Josh Litherland wrote: On Sun, 2006-09-03 at 15:56 +1200, Richard Scobie wrote: I am building 2.6.18rc5-mm1 and I cannot find the entry under make config, to enable the various RAID options. Under Device Drivers, switch on Multi-device support. Thanks. I must be going nuts, as it does
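For the archives, the options in question live under Device Drivers -> Multi-device support (RAID and LVM) in menuconfig; an approximate .config fragment for a kernel of that era (symbol names may vary slightly between versions):

  CONFIG_MD=y
  CONFIG_BLK_DEV_MD=y
  CONFIG_MD_RAID0=m
  CONFIG_MD_RAID1=m
  CONFIG_MD_RAID10=m
  CONFIG_MD_RAID456=m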

Re: RAID over Firewire

2006-09-04 Thread Richard Scobie
Bill Davidsen wrote: It should work, but I don't like it... it leaves you with a lot of exposure between backups. Unless your data change a lot, you might consider a good incremental dump program to DVD or similar. Thanks. I have abandoned this option for various reasons, including

UUID's

2006-09-08 Thread Richard Scobie
If I have specified an array in mdadm.conf using UUID's: ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371 and I replace a failed drive in the array, will the new drive be given the previous UUID, or do I need to update the mdadm.conf entry? Regards, Richard

Re: UUID's

2006-09-08 Thread Richard Scobie
dean gaudet wrote: On Sat, 9 Sep 2006, Richard Scobie wrote: If I have specified an array in mdadm.conf using UUID's: ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371 and I replace a failed drive in the array, will the new drive be given the previous UUID, or do I need to update
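One way to satisfy yourself either way is to compare the array UUID before and after the replacement; the UUID belongs to the array and is written into the superblock of any newly added member, so the mdadm.conf entry should not need changing (device names below are placeholders):

  mdadm --detail /dev/md0 | grep UUID
  mdadm --examine /dev/sdb1 | grep UUID    # run against the replacement drive after it has been added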

Check/repair on composite RAID

2006-09-08 Thread Richard Scobie
If I have a RAID 10, comprising a RAID 0, /dev/md3, made up of RAID1 /dev/md1 and RAID1 /dev/md2, and I do an: echo repair > /sys/block/md3/md/sync_action will this run simultaneous repairs on the underlying RAID 1's, or should separate repairs be done to md1 and 2? Thanks for any
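Assuming the RAID0 layer behaves as expected here, it has no redundancy of its own and so nothing to repair; one approach is to run the pass against each RAID1 directly and then look at their mismatch counts:

  echo repair > /sys/block/md1/md/sync_action
  echo repair > /sys/block/md2/md/sync_action
  cat /sys/block/md1/md/mismatch_cnt /sys/block/md2/md/mismatch_cnt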

Re: future hardware

2006-10-21 Thread Richard Scobie
Dan wrote: What are other users of mdadm using with the PCI-express cards, most cost effective solution? I have been successfully using a pair of Addonics AD2SA3GPX1 cards, with 4 x 500GB drives in a RAID0 stacked on top of a pair of RAID1s. The cards are cheap and use the sil24

Re: new array not starting

2006-11-07 Thread Richard Scobie
Robin Bowes wrote: Robin Bowes wrote: This worked: # mdadm --assemble --auto=yes /dev/md2 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj mdadm: /dev/md2 has been started with 8 drives. However, I'm not sure why it didn't start automatically at boot. Do I need to put

Observations of a failing disk

2006-11-27 Thread Richard Scobie
I have a machine running Fedora 5, kernel 2.6.17-1.2187_FC5smp, with a pair of software RAID 1 arrays (WD 500GB RE2), RAID 0'ed together. Every 14 days, one of the arrays has a repair (echo repair > /sys/block/mdX/md/sync_action) run on it, to hopefully pick up and fix dead sectors. Over
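A sketch of how such a scheduled pass can be set up with cron (array names, user and timing are illustrative only):

  # /etc/cron.d/md-repair
  0 2 * * 0  root  for md in md1 md2; do echo repair > /sys/block/$md/md/sync_action; done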

Re: Observations of a failing disk

2006-11-27 Thread Richard Scobie
dean gaudet wrote: one theory was that i lucked out and the pending sectors in the unused disk near the md superblock... but since that's in general only about 90KB of disk i was kind of skeptical. it's certainly possible, but seems unlikely. I can discount this one in my case, as sectors

Re: RAID1 repair issue with 2.6.16.36 kernel

2007-01-08 Thread Richard Scobie
Michel Lespinasse wrote: Would you by any chance also know why the repair process did not work with 2.6.16.36 ??? Has any related bug been fixed recently ? Should I try again with a newer kernel, or should I rather avoid this for now ? I have not had much luck with repair fixing things, using

5th USENIX Conference on File and Storage Technologies - Paper

2007-02-20 Thread Richard Scobie
Another paper on hard drive failures. http://www.usenix.org/events/fast07/tech/schroeder/schroeder_html/index.html Regards, Richard

Re: Linux Software RAID a bit of a weakness?

2007-02-23 Thread Richard Scobie
Neil Brown wrote: The 'check' process reads all copies and compares them with one another, If there is a difference it is reported. If you use 'repair' instead of 'check', the difference is arbitrarily corrected. If a read error is detected during the 'check', md/raid1 will attempt to write
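In practice that maps onto the sysfs interface like this (md0 as an example); mismatch_cnt shows how much differed on the last pass:

  echo check > /sys/block/md0/md/sync_action     # read and compare only, differences reported
  cat /sys/block/md0/md/mismatch_cnt
  echo repair > /sys/block/md0/md/sync_action    # differences are rewritten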

Re: PATA/SATA Disk Reliability paper

2007-02-25 Thread Richard Scobie
Mark Hahn wrote: this - I checked the seagate 7200.10: 10k feet operating, 40k max. amusingly -200 feet is the min either way... Which means you could not use this drive on the shores of the Dead Sea, which is at about -1300ft. Regards, Richard

Re: Linux Software RAID a bit of a weakness?

2007-02-25 Thread Richard Scobie
Mark Hahn wrote: is it known what a long self-test does? for instance, ultimately you want the disk to be scrubbed over some fairly lengthy period of time. that is, not just read and checked, possibly with parity fixed, but all blocks read and rewritten (with verify, I suppose!) The smartctl
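For what it's worth, starting and reviewing a long self-test from smartmontools looks like this (drive name is a placeholder); as far as I know the test is a firmware-driven surface read/verify, not a rewrite of every block:

  smartctl -t long /dev/sda       # start the extended self-test in the background
  smartctl -l selftest /dev/sda   # check the result, including the LBA of the first error if any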

Re: swap on raid

2007-03-01 Thread Richard Scobie
Peter Rabbitson wrote: Hi, I need to use a raid volume for swap, utilizing partitions from 4 physical drives I have available. From my experience I have three options - raid5, raid10 with 2 offset chunks, and two raid 1 volumes that are swapon-ed with equal priority. However I have a hard
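The third option (two RAID1 volumes swapon-ed with equal priority) is expressed in /etc/fstab like this; giving both swap areas the same priority makes the kernel stripe pages across them (md device names here are placeholders):

  mkswap /dev/md2
  mkswap /dev/md3

  # /etc/fstab
  /dev/md2  none  swap  sw,pri=1  0 0
  /dev/md3  none  swap  sw,pri=1  0 0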

Re: Help with chunksize on raid10 -p o3 array

2007-03-12 Thread Richard Scobie
Peter Rabbitson wrote: Is this anywhere near the top of the todo list, or for now raid10 users are bound to a maximum read speed of a two drive combination? I have not done any testing with the md native RAID10 implementations, so perhaps there are some other advantages, but have you tried

Re: Speed variation depending on disk position

2007-05-06 Thread Richard Scobie
Peter Rabbitson wrote: design of modern drives? I have an array of 4 Maxtor sata drives, and raw read performance at the end of the disk is 38MB/s compared to 62MB/s at the beginning. At least one supplier of terabyte arrays mitigates this effect and improves seek times, by using 750GB

Re: mdadm array not found on reboot

2007-05-07 Thread Richard Scobie
Jeffrey B. Layton wrote: CentOS 4.2. I've been reading something about raidautorun. Would it help in this case? Try adding: DEVICE partitions to the top of your mdadm.conf and: auto=part to the end of your /dev/md1 definition. eg. ARRAY /dev/md1 level=raid1 num-devices=2

Re: very strange (maybe) raid1 testing results

2007-05-30 Thread Richard Scobie
Jon Nelson wrote: I am getting 70-80MB/s read rates as reported via dstat, and 60-80MB/s as reported by dd. What I don't understand is why just one disk is being used here, instead of two or more. I tried different versions of metadata, and using a bitmap makes no difference. I created the

Re: Software based SATA RAID-5 expandable arrays?

2007-06-21 Thread Richard Scobie
Michael wrote: Thank you; Not that I want to, but where did you find a SATA PCI card that fit 15 drives? Areca have a few - a range of PCI-X cards that do up to 24 SATA drives (ARC-1170) and PCI-e up to 24 drives (ARC-1280). Regards, Richard

RAID 5 Grow

2007-06-22 Thread Richard Scobie
I will soon be adding another same sized drive to an existing 3 drive RAID 5 array. The machine is running Fedora Core 6 with kernel 2.6.20-1.2952.fc6 and mdadm 2.5.4, both of which are the latest available Fedora packages. Is anyone aware of any obvious bugs in either of these that will
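The usual sequence for this, assuming a fourth partition /dev/sdd1 (placeholder) and an existing /dev/md0, is to add the disk, grow the array, then grow the filesystem once the reshape finishes:

  mdadm --add /dev/md0 /dev/sdd1
  mdadm --grow /dev/md0 --raid-devices=4
  cat /proc/mdstat                 # watch the reshape progress
  resize2fs /dev/md0               # for ext3; xfs_growfs on the mount point for XFS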

4 Port eSATA RAID5/JBOD PCI-E 8x Controller

2007-08-20 Thread Richard Scobie
This looks like a potentially good, cheap candidate for md use. Although Linux support is not explicitly mentioned, SiI 3124 is used. http://www.addonics.com/products/host_controller/ADSA3GPX8-4e.asp Regards, Richard

Re: MDADM restart at system boot time

2007-09-04 Thread Richard Scobie
Paul wrote: Hi, I hope someone can help. I have manually installed mdadm and now have a working array. I installed mdadm from the tarball, then did 'make install'. My problem is that after a system reboot the array will not start as the daemon is not running, I guess. In order to access my files

Re: Raid-10 mount at startup always has problem

2007-09-09 Thread Richard Scobie
Daniel L. Miller wrote: And you didn't ask, but my mdadm.conf: DEVICE partitions ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4 UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a Hi Daniel, Try adding auto=part at the end of your mdadm.conf ARRAY line. Regards, Richard

Re: reducing the number of disks a RAID1 expects

2007-09-09 Thread Richard Scobie
J. David Beutel wrote: My /dev/hdd started failing its SMART check, so I removed it from a RAID1: # mdadm /dev/md5 -f /dev/hdd2 -r /dev/hdd2 Now when I boot it looks like this in /proc/mdstat: md5 : active raid1 hdc8[2] hdg8[1] 58604992 blocks [3/2] [_UU] and I get a DegradedArray event
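Assuming the aim is simply to stop md expecting the third member, the device count of a RAID1 can be reduced in place (back up first; md5 as in the post):

  mdadm --grow /dev/md5 --raid-devices=2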

Re: raid5 - which disk failed ?

2007-09-23 Thread Richard Scobie
Rainer Fuegenstein wrote: 1) when md starts a resync of the array, shouldn't one drive be marked as down [_UUU] in mdstat instead of reporting it as [] ? or, the other way round: is hde really the faulty drive ? how can I make sure I'm removing and replacing the proper drive ? If it is

Re: Without tweaking ,

2007-09-26 Thread Richard Scobie
Justin Piszcz wrote: For raptors, they are inherently known for their poor speed when NCQ is enabled, I see 20-30MiB/s better performance with NCQ off. Hi Justin, Have you tested this for multiple readers/writers? Regards, Richard

Re: Without tweaking ,

2007-09-26 Thread Richard Scobie
Justin Piszcz wrote: If you have a good repeatable benchmark you want me to run with it on/off let me know, no I only used bonnie++/iozone/tiobench/dd but not any parallelism with those utilities. Perhaps iozone with 5 threads, NCQ on and off? Regards, Richard

Re: Without tweaking ,

2007-09-26 Thread Richard Scobie
Justin Piszcz wrote: With multiple threads, not too much difference.. Thanks for that - as you say not a great deal there, slight improvements for some of the random tests. Regards, Richard

Re: RAID 5 performance issue.

2007-10-03 Thread Richard Scobie
Andrew Clayton wrote: Yeah, I was wondering about that. It certainly hasn't improved things, it's unclear if it's made things any worse.. Many 3124 cards are PCI-X, so if you have one of these (and you seem to be using a server board which may well have PCI-X), bus performance is not going

Re: RAID 5 performance issue.

2007-10-05 Thread Richard Scobie
Have you had a look at the smartctl -a outputs of all the drives? Possibly one drive is being slow to respond due to seek errors etc. but I would perhaps expect to be seeing this in the log. If you have a full backup and a spare drive, I would probably rotate it through the array. Regards,

Re: very degraded RAID5, or increasing capacity by adding discs

2007-10-08 Thread Richard Scobie
Janek Kozicki wrote: Is it possible anyhow to create a very degraded raid array - a one that consists of 4 drives, but has only TWO ? No, but you can make a degraded 3 drive array, containing 2 drives and then add the next drive to complete it. The array can then be grown (man mdadm, GROW
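A sketch of that sequence, with placeholder device names: the keyword missing stands in for the absent disk at creation time, the third disk completes the array, and a further disk is then grown into it:

  mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 missing
  mdadm --add /dev/md0 /dev/sdd1          # rebuilds onto this disk, completing the array
  mdadm --add /dev/md0 /dev/sde1
  mdadm --grow /dev/md0 --raid-devices=4  # then grow the filesystem on top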

Re: How do i limit the bandwidth-usage while resyncing on RAID 1?

2007-10-10 Thread Richard Scobie
Rustedt, Florian wrote: How can I tune this? I want something like nice -n 19 dm-mirror ;) Have a look at man rsync - the --bwlimit=KBPS option. Regards, Richard

Re: AW: How do i limit the bandwidth-usage while resyncing on RAID 1?

2007-10-11 Thread Richard Scobie
Rustedt, Florian wrote: Hi Richard, Seems to me that you misunderstood? There's no rsync in RAID afaik? This is an internal driver...? Hi Florian, My mistake - I read rsync in the Subject of your mail instead of resync. Try echo X > /sys/block/mdY/md/sync_speed_max where X is
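The knobs referred to, with an illustrative 5000 KB/s ceiling; the /sys entry is per-array while the /proc one applies to all arrays:

  echo 5000 > /sys/block/md0/md/sync_speed_max
  echo 5000 > /proc/sys/dev/raid/speed_limit_max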

Re: Software RAID when it works and when it doesn't

2007-10-16 Thread Richard Scobie
Mike Accetta wrote: is at the mercy of the low level disk driver. We've observed abysmal RAID1 recovery times on failing SATA disks because all the time is being spent in the driver retrying operations which will never succeed. Also, read errors don't tend to fail the array so when the bad

Re: Hardware raid; Areca 3ware hardware RAID cards under Linux

2007-10-18 Thread Richard Scobie
Harry Mangalam wrote: This list is usually about software raid, but this analysis of hardware RAID cards under Linux may be of interest as well. Comments/suggestions are very welcome. Thanks Harry. Detailed, well presented articles on this subject are pretty thin on the ground. Regards,

Re: slow raid5 performance

2007-10-22 Thread Richard Scobie
Peter wrote: Thanks Justin, good to hear about some real world experience. Hi Peter, I recently built a 3 drive RAID5 using the onboard SATA controllers on an MCP55 based board and get around 115MB/s write and 141MB/s read. A fourth drive was added some time later and after growing the

Re: Implementing low level timeouts within MD

2007-10-27 Thread Richard Scobie
Alberto Alonso wrote: After 4 different array failures all due to a single drive failure I think it would really be helpful if the md code timed out the driver. Hi Alberto, Sorry you've been having so much trouble. For interest, can you tell us what drives and controllers are involved?

Re: Implementing low level timeouts within MD

2007-10-27 Thread Richard Scobie
Alberto Alonso wrote: What hardware do you use? I was trying to compile a list of known configurations capable to detect and degrade properly. To date I have not yet had a SATA based array drive go faulty - all mine have been PATA arrays on Intel or AMD MB controllers, which as per your

Re: Raid-10 mount at startup always has problem

2007-10-29 Thread Richard Scobie
Daniel L. Miller wrote: Nothing in the documentation (that I read - granted I don't always read everything) stated that partitioning prior to md creation was necessary - in fact references were provided on how to use complete disks. Is there an official position on, To Partition, or Not To

Re: telling mdadm to use spare drive.

2007-11-07 Thread Richard Scobie
Janek Kozicki wrote: Goswin von Brederlow said: (by the date of Wed, 07 Nov 2007 10:17:51 +0100) Strange. That is exactly how I always do it and it always just worked. mdadm should start syncing on any spare as soon as a disk fails or you add the spare to a degraded array afaik. No

Re: telling mdadm to use spare drive.

2007-11-08 Thread Richard Scobie
Janek Kozicki wrote: Richard Scobie said: (by the date of Thu, 08 Nov 2007 08:13:19 +1300) What kernel and RAID level is this? If it's RAID 1, I seem to recall there was a relatively recently fixed bug for this. debian etch, stock install Linux 2.6.18-5-k7 #1 SMP i686 GNU/Linux

Re: Spontaneous rebuild

2007-12-02 Thread Richard Scobie
Justin Piszcz wrote: While we are on the subject of bad blocks, is it possible to do what 3ware raid controllers do without an external card? They know when a block is bad and they remap it to another part of the array etc, where as with software raid you never know this is happening until

Re: mdadm break / restore soft mirror

2007-12-12 Thread Richard Scobie
Brett Maton wrote: Hi, Question for you guys. A brief history: RHEL 4 AS I have a partition with way too many small files on it (usually around a couple of million) that needs to be backed up; standard methods mean that a restore is impossibly slow due to the sheer volume of files.

Re: raid10: unfair disk load?

2007-12-23 Thread Richard Scobie
Jon Nelson wrote: My own tests on identical hardware (same mobo, disks, partitions, everything) and same software, with the only difference being how mdadm is invoked (the only changes here being level and possibly layout) show that raid0 is about 15% faster on reads than the very fast raid10,

Re: raid5 grow reshaping speed is unchangeable

2007-12-27 Thread Richard Scobie
Cody Yellan wrote: I had a 4x500GB SATA2 array, md0. I added one 500GB drive and reshaping began at ~2500K/sec. Changing /proc/sys/dev/raid/speed_limit_m{in,ax} or /sys/block/md0/md/sync_speed_m{in,ax} had no effect. I shut down all unnecessary services and the array is offline (not

Re: raid5 grow reshaping speed is unchangeable

2007-12-28 Thread Richard Scobie
Cody Yellan wrote: You are right, Richard. RHEL5 had a stripe_cache_size of 256 when the reshape began. I increased it to 1024 and the reshape speed doubled to 4500K/s. I did not see any increase in memory usage. I tried 2048 and then 4096 but saw no difference in speed. Sorry, I did not
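For the archives, the setting in question (values are illustrative; memory used is roughly stripe_cache_size x page size x number of member disks, which may explain why the increase was hard to see):

  cat /sys/block/md0/md/stripe_cache_size     # 256 by default at the time
  echo 1024 > /sys/block/md0/md/stripe_cache_size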

Re: New XFS benchmarks using David Chinner's recommendations for XFS-based optimizations.

2007-12-30 Thread Richard Scobie
Justin Piszcz wrote: Why does mdadm still use 64k for the default chunk size? Probably because this is the best balance for average file sizes, which are smaller than you seem to be testing with? Regards, Richard

Re: New XFS benchmarks using David Chinner's recommendations for XFS-based optimizations.

2007-12-31 Thread Richard Scobie
Peter Grandi wrote: In particular if one uses parity-based (not a good idea in general...) arrays, as small chunk sizes (as well as stripe sizes) give a better chance of reducing the frequency of RMW. Thanks for your thoughts - the above was my thinking when I posted. Regards, Richard

Re: RAID 1 and grub

2008-01-30 Thread Richard Scobie
David Rees wrote: Have you tried re-running grub-install after booting from a rescue disk? -Dave Hi David, I have but although I can advance further it seems that the BIOS is doing some strange things as well, switching drive ordering around. With a new hda installed and partitioned,

Re: RAID 1 and grub

2008-01-30 Thread Richard Scobie
A followup for the archives: I found this document very useful: http://lists.us.dell.com/pipermail/linux-poweredge/2003-July/008898.html After modifying my grub.conf to refer to (hd0,0), reinstalling grub on hdc with: grub device (hd0) /dev/hdc grub root (hd0,0) grub (hd0) and rebooting

Re: RAID 1 and grub

2008-01-30 Thread Richard Scobie
David Rees wrote: FWIW, this step is clearly marked in the Software-RAID HOWTO under Booting on RAID: http://tldp.org/HOWTO/Software-RAID-HOWTO-7.html#ss7.3 The one place I didn't look... BTW, I suspect you are missing the command setup from your 3rd command above, it should be: # grub
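Pulling the corrected sequence together in one place (the drive and partition names follow the earlier posts and will differ on other layouts):

  # grub
  grub> device (hd0) /dev/hdc
  grub> root (hd0,0)
  grub> setup (hd0)
  grub> quit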

Re: RAID 1 and grub

2008-02-02 Thread Richard Scobie
Keld Jørn Simonsen wrote: # grub grub device (hd0) /dev/hdc grub root (hd0,0) grub setup (hd0) I do not grasp this. How and where is it said that two disks are involved? hda and hdc should both be involved. There are not two disks involved in this instance. This is used in the scenario

Re: RAID 1 and grub

2008-02-03 Thread Richard Scobie
Bill Davidsen wrote: Have you actually tested this by removing the first hd and booting? Depending on the BIOS I believe that the fallback drive will be called hdc by the BIOS but will be hdd in the system. That was with RHEL3, but worth testing. Hi Bill, I did not try this particular

Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash)

2008-02-04 Thread Richard Scobie
Michael Tokarev wrote: Unfortunately a UPS does not *really* help here. Because unless it has a control program which properly shuts the system down on the loss of input power, and the battery really has the capacity to power the system while it's shutting down (anyone tested this? With a new UPS?

Re: Further clarification of the sync_action check behavior.

2008-02-24 Thread Richard Scobie
Harrell, Thomas wrote: Lastly, under which mounted conditions are 'check' and 'repair' safe/useful? Can I run both on a read-write mounted partition? Should they be read-only, or totally unmounted? I'm assuming since resyncs can occur while the overlying fs is mounted read-write, that it is