Re: RAID tuning?

2006-06-13 Thread Justin Piszcz
mkfs -t xfs -f -d su=128k,sw=14 /dev/md9 Gordon, What speed do you get on your RAID, read and write? When I made my XFS/RAID-5, I accepted the defaults for the XFS filesystem but used a 512kb stripe. I get 80-90MB/s reads and ~39MB/s writes. On 5 x 400GB ATA/100 Seagates (on a regular
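
For anyone comparing numbers, a minimal sketch of aligning XFS to the md stripe and timing a sequential transfer; the device name, chunk size (su) and data-disk count (sw) below are placeholders to adapt to your own array:
mkfs.xfs -f -d su=128k,sw=4 /dev/md9      # su = md chunk size, sw = number of data disks
mount /dev/md9 /mnt/raid
dd if=/dev/zero of=/mnt/raid/testfile bs=1M count=4096 conv=fdatasync   # flushes before reporting MB/s
dd if=/mnt/raid/testfile of=/dev/null bs=1M                             # rough sequential read figure (drop caches first for a fair number)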

SW RAID 5 Bug? - Slow After Rebuild (XFS+2.6.16.20)

2006-06-18 Thread Justin Piszcz
I set a disk faulty and then rebuilt it; afterwards I got horrible performance. I was using 2.6.16.20 during the tests. The FS I use is XFS. # xfs_info /dev/md3 meta-data=/dev/root isize=256 agcount=16, agsize=1097941 blks = sectsz=512 attr=0
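
For context, the fail/rebuild cycle being described looks roughly like this (placeholder device names); the array is only back to full redundancy once /proc/mdstat shows the recovery finished:
mdadm /dev/md3 --fail /dev/sdc1      # mark a member faulty
mdadm /dev/md3 --remove /dev/sdc1    # remove it from the array
mdadm /dev/md3 --add /dev/sdc1       # re-add it; md starts a full rebuild
cat /proc/mdstat                     # watch the recovery progress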

Re: Large single raid and XFS or two small ones and EXT3?

2006-06-26 Thread Justin Piszcz
On Sun, 25 Jun 2006, Bill Davidsen wrote: Justin Piszcz wrote: On Sat, 24 Jun 2006, Neil Brown wrote: On Friday June 23, [EMAIL PROTECTED] wrote: The problem is that there is no cost-effective backup available. One-liner questions: - How does Google make backups? No, Google

When will the Areca RAID driver be merged into mainline?

2006-06-27 Thread Justin Piszcz
Anyone have an ETA on this? I heard soon but was wondering how soon..? kernel-version-2.6.x kernel-version-2.6.x/arcmsr kernel-version-2.6.x/arcmsr/arcmsr.c kernel-version-2.6.x/arcmsr/arcmsr.h kernel-version-2.6.x/arcmsr/Makefile kernel-version-2.6.x/readme.txt The driver is quite small and

Re: Ok to go ahead with this setup?

2006-06-28 Thread Justin Piszcz
On Wed, 28 Jun 2006, [EMAIL PROTECTED] wrote: Mike Dresser wrote: On Fri, 23 Jun 2006, Molle Bestefich wrote: Christian Pernegger wrote: Anything specific wrong with the Maxtors? I'd watch out regarding the Western Digital disks, apparently they have a bad habit of turning themselves

Re: I need a PCI V2.1 4 port SATA card

2006-06-28 Thread Justin Piszcz
On Wed, 28 Jun 2006, Brad Campbell wrote: Guy wrote: Hello group, I am upgrading my disks from old 18 Gig SCSI disks to 300 Gig SATA disks. I need a good SATA controller. My system is old and has PCI V 2.1. I need a 4-port card, or two 2-port cards. My system has multiple PCI buses,

Re: I need a PCI V2.1 4 port SATA card

2006-06-28 Thread Justin Piszcz
On Wed, 28 Jun 2006, Christian Pernegger wrote: My current 15 drive RAID-6 server is built around a KT600 board with an AMD Sempron processor and 4 SATA150TX4 cards. It does the job but it's not the fastest thing around (takes about 10 hours to do a check of the array or about 15 to do a

Linux SATA Support Question - Is the ULI M1575 chip supported?

2006-07-03 Thread Justin Piszcz
In the source: enum { uli_5289= 0, uli_5287= 1, uli_5281= 2, uli_max_ports = 4, /* PCI configuration registers */ ULI5287_BASE= 0x90, /* sata0 phy SCR registers */

Re: Linux SATA Support Question - Is the ULI M1575 chip supported?

2006-07-04 Thread Justin Piszcz
On Mon, 3 Jul 2006, Jeff Garzik wrote: Justin Piszcz wrote: In the source: enum { uli_5289= 0, uli_5287= 1, uli_5281= 2, uli_max_ports = 4, /* PCI configuration registers

Kernel 2.6.17 and RAID5 Grow Problem (critical section backup)

2006-07-07 Thread Justin Piszcz
p34:~# mdadm /dev/md3 -a /dev/hde1 mdadm: added /dev/hde1 p34:~# mdadm -D /dev/md3 /dev/md3: Version : 00.90.03 Creation Time : Fri Jun 30 09:17:12 2006 Raid Level : raid5 Array Size : 1953543680 (1863.04 GiB 2000.43 GB) Device Size : 390708736 (372.61 GiB 400.09 GB)
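
For readers following along, a sketch of the grow being attempted here, with placeholder names; --backup-file tells mdadm where to keep the critical-section backup that gives this thread its title:
mdadm /dev/md3 --add /dev/hde1                                        # new disk joins as a spare
mdadm --grow /dev/md3 --raid-devices=8 --backup-file=/root/md3.bak    # reshape onto the extra device
cat /proc/mdstat                                                      # reshape progress and ETA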

Re: Kernel 2.6.17 and RAID5 Grow Problem (critical section backup)

2006-07-07 Thread Justin Piszcz
On Fri, 7 Jul 2006, Justin Piszcz wrote: p34:~# mdadm /dev/md3 -a /dev/hde1 mdadm: added /dev/hde1 p34:~# mdadm -D /dev/md3 /dev/md3: Version : 00.90.03 Creation Time : Fri Jun 30 09:17:12 2006 Raid Level : raid5 Array Size : 1953543680 (1863.04 GiB 2000.43 GB) Device Size

Re: Kernel 2.6.17 and RAID5 Grow Problem (critical section backup)

2006-07-07 Thread Justin Piszcz
On Fri, 7 Jul 2006, Justin Piszcz wrote: On Fri, 7 Jul 2006, Justin Piszcz wrote: On Fri, 7 Jul 2006, Justin Piszcz wrote: p34:~# mdadm /dev/md3 -a /dev/hde1 mdadm: added /dev/hde1 p34:~# mdadm -D /dev/md3 /dev/md3: Version : 00.90.03 Creation Time : Fri Jun 30 09:17:12 2006

Re: Resizing RAID-1 arrays - some possible bugs and problems

2006-07-07 Thread Justin Piszcz
On Sat, 8 Jul 2006, Reuben Farrelly wrote: I'm just in the process of upgrading the RAID-1 disks in my server, and have started to experiment with the RAID-1 --grow command. The first phase of the change went well, I added the new disks to the old arrays and then increased the size of the

Re: Kernel 2.6.17 and RAID5 Grow Problem (critical section backup)

2006-07-07 Thread Justin Piszcz
On Fri, 7 Jul 2006, Justin Piszcz wrote: On Fri, 7 Jul 2006, Justin Piszcz wrote: On Fri, 7 Jul 2006, Justin Piszcz wrote: On Fri, 7 Jul 2006, Justin Piszcz wrote: p34:~# mdadm /dev/md3 -a /dev/hde1 mdadm: added /dev/hde1 p34:~# mdadm -D /dev/md3 /dev/md3: Version : 00.90.03

Re: Kernel 2.6.17 and RAID5 Grow Problem (critical section backup)

2006-07-07 Thread Justin Piszcz
On Fri, 7 Jul 2006, Justin Piszcz wrote: On Fri, 7 Jul 2006, Justin Piszcz wrote: On Fri, 7 Jul 2006, Justin Piszcz wrote: On Fri, 7 Jul 2006, Justin Piszcz wrote: On Fri, 7 Jul 2006, Justin Piszcz wrote: p34:~# mdadm /dev/md3 -a /dev/hde1 mdadm: added /dev/hde1 p34:~# mdadm -D

Re: Kernel 2.6.17 and RAID5 Grow Problem (critical section backup)

2006-07-07 Thread Justin Piszcz
On Sat, 8 Jul 2006, Neil Brown wrote: On Friday July 7, [EMAIL PROTECTED] wrote: Jul 7 08:44:59 p34 kernel: [4295845.933000] raid5: reshape: not enough stripes. Needed 512 Jul 7 08:44:59 p34 kernel: [4295845.962000] md: couldn't update array info. -28 So the RAID5 reshape only works if
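
A sketch of the workaround that appears to resolve this later in the thread: the reshape needs the stripe cache to hold at least the number of stripes it reports, so raise stripe_cache_size first (the value below is illustrative):
echo 1024 > /sys/block/md3/md/stripe_cache_size   # must cover the "Needed 512" figure
mdadm --grow /dev/md3 --raid-devices=8            # then retry the grow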

Re: Kernel 2.6.17 and RAID5 Grow Problem (critical section backup)

2006-07-07 Thread Justin Piszcz
On Sat, 8 Jul 2006, Neil Brown wrote: On Friday July 7, [EMAIL PROTECTED] wrote: Hey! You're awake :) Yes, and thinking about breakfast (it's 8:30am here). I am going to try it with just 64kb to prove to myself it works with that, but then I will re-create the raid5 again like I had

Re: Kernel 2.6.17 and RAID5 Grow Problem (critical section backup)

2006-07-07 Thread Justin Piszcz
On Sat, 8 Jul 2006, Neil Brown wrote: On Friday July 7, [EMAIL PROTECTED] wrote: I guess one has to wait until the reshape is complete before growing the filesystem..? Yes. The extra space isn't available until the reshape has completed (if it was available earlier, the reshape wouldn't
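
The ordering Neil confirms, sketched with a placeholder mount point:
mdadm --wait /dev/md3    # blocks until the reshape/resync has finished
xfs_growfs /mnt/raid     # only now is the extra space there for the filesystem to claim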

Re: Kernel 2.6.17 and RAID5 Grow Problem (critical section backup)

2006-07-10 Thread Justin Piszcz
On Sat, 8 Jul 2006, Neil Brown wrote: On Friday July 7, [EMAIL PROTECTED] wrote: Jul 7 08:44:59 p34 kernel: [4295845.933000] raid5: reshape: not enough stripes. Needed 512 Jul 7 08:44:59 p34 kernel: [4295845.962000] md: couldn't update array info. -28 So the RAID5 reshape only works if

Re: Kernel 2.6.17 and RAID5 Grow Problem (critical section backup)

2006-07-10 Thread Justin Piszcz
On Tue, 11 Jul 2006, Jan Engelhardt wrote: md3 : active raid5 sdc1[7] sde1[6] sdd1[5] hdk1[2] hdi1[4] hde1[3] hdc1[1] hda1[0] 2344252416 blocks super 0.91 level 5, 512k chunk, algorithm 2 [8/8] [UUUUUUUU] reshape = 0.2% (1099280/390708736) finish=1031.7min

Raid5 Reshape Status + xfs_growfs = Success! (2.6.17.3)

2006-07-11 Thread Justin Piszcz
Neil, It worked, echoing the 600 to the stripe width in /sys. However, how come /dev/md3 says it is 0 MB when I type fdisk -l? Is this normal? Disk /dev/md0 doesn't contain a valid partition table Disk /dev/md3: 0 MB, 0 bytes 2 heads, 4 sectors/track, 0 cylinders Units = cylinders of 8 *

Re: [bug?] raid1 integrity checking is broken on 2.6.18-rc4

2006-08-12 Thread Justin Piszcz
On Sat, 12 Aug 2006, Chuck Ebbert wrote: Doing this on a raid1 array: echo check > /sys/block/md0/md/sync_action On 2.6.16.27: Activity lights on both mirrors show activity for a while, then the array status prints on the console. On 2.6.18-rc4 + the below patch:
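
For reference, the check being exercised here, sketched with a placeholder array name:
echo check > /sys/block/md0/md/sync_action   # read-and-compare pass over the mirror
cat /proc/mdstat                             # progress and estimated finish time
cat /sys/block/md0/md/mismatch_cnt           # non-zero means the copies differed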

Re: raid5 grow problem

2006-08-17 Thread Justin Piszcz
On Thu, 17 Aug 2006, 舒星 wrote: hello all: I installed mdadm 2.5.2 and compiled the 2.6.17.6 kernel. When I run the command to grow a raid5 array, it doesn't work. How can I make the raid5 grow? Thanks for your help. - To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a

Re: raid5 grow problem

2006-08-17 Thread Justin Piszcz
I've only tried growing a RAID5, which was the only RAID level that I remember being supported (to grow) in the kernel; I am not sure if it's possible to grow other types of RAID arrays. On Thu, 17 Aug 2006, 舒星 wrote: dear sir: I tried to run a reshape command with mdadm (version 2.5.2). In

Re: raid5 grow problem

2006-08-18 Thread Justin Piszcz
Adding the XFS mailing list to this e-mail to show that the grow for XFS worked. On Thu, 17 Aug 2006, 舒星 wrote: I've only tried growing a RAID5, which was the only RAID level that I remember being supported (to grow) in the kernel; I am not sure if it's possible to i know this, but how do you grow your

Re: Correct way to create multiple RAID volumes with hot-spare?

2006-08-23 Thread Justin Piszcz
On Tue, 22 Aug 2006, Steve Cousins wrote: Hi, I have a set of 11 500 GB drives. Currently each has two 250 GB partitions (/dev/sd?1 and /dev/sd?2). I have two RAID6 arrays set up, each with 10 drives and then I wanted the 11th drive to be a hot-spare. When I originally created the array

Re: Interesting RAID checking observations

2006-08-27 Thread Justin Piszcz
Second, trying checks on a fast (2.2 GHz AMD64) machine, I'm surprised at how slow it is: The PCI bus is only capable of 133MB/s max. Unless you have dedicated SATA ports, each on its own PCI-e bus, you will not get speeds in excess of 133MB/s. As for 200MB/s+, I have read reports of someone using

Re: 2 Hard Drives RAID

2006-09-09 Thread Justin Piszcz
On Wed, 6 Sep 2006, Sandra L. McGrew wrote: I have two hard drives installed in this DELL GX110 Optiplex computer. I believe that they are configured in RAID5, but am not certain. Is there a graphical method of determining how many drives are being used and how they are configured??? I'm

Re: access array from knoppix

2006-09-12 Thread Justin Piszcz
fdisk -l; then you have to assemble the array: mdadm --assemble /dev/md0 /dev/hda1 /dev/hdb1 (I think; man mdadm). On Tue, 12 Sep 2006, Dexter Filmore wrote: When running Knoppix on my file server, I can't mount /dev/md0 simply because it isn't there. Am I guessing right that I need to recreate
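
A slightly more general sketch of the same recovery-CD procedure, assuming mdadm and the md personalities are available on the live system; device names are placeholders:
modprobe raid1                                  # or raid5, etc., if built as modules
mdadm --examine --scan                          # list arrays recorded in member superblocks
mdadm --assemble --scan                         # assemble everything that was found
mdadm --assemble /dev/md0 /dev/hda1 /dev/hdb1   # or name the members explicitly
mount /dev/md0 /mnt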

Re: access *existing* array from knoppix

2006-09-12 Thread Justin Piszcz
Strange, what knoppix are you using? I recall doing it to fix an XFS bug with 4.x and 5.x. On Tue, 12 Sep 2006, Dexter Filmore wrote: Am Dienstag, 12. September 2006 16:08 schrieb Justin Piszcz: /dev/MAKEDEV /dev/md0 also make sure the SW raid modules etc are loaded if necessary. Won't

Re: Cannot create RAID5 - device or resource busy

2006-10-14 Thread Justin Piszcz
See if it's mounted etc. first; if not: mdadm -S /dev/md0 (stop it). Then try again. On Sat, 14 Oct 2006, Ray Greene wrote: I am having problems creating a RAID 5 array using 3x400GB SATA drives on a Dell SC430 running Xandros 4. I created this once with Webmin and it worked OK but then I
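
The usual checks for a busy device before re-creating, sketched with placeholder names; note that --zero-superblock destroys old md metadata, so only run it on members you mean to reuse:
cat /proc/mdstat                        # was an old array auto-assembled on these disks?
mdadm -S /dev/md0                       # stop it if so
mdadm --zero-superblock /dev/sd[bcd]1   # wipe stale superblocks from the members
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sd[bcd]1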

Re: future hardware

2006-10-21 Thread Justin Piszcz
On Sat, 21 Oct 2006, Dan wrote: I have been using an older 64bit system, socket 754, for a while now. It has the old 33MHz PCI bus. I have two low-cost (no HW RAID) PCI SATA I cards, each with 4 ports, to give me an eight-disk RAID 6. I also have a Gig NIC on the PCI bus. I have Gig

Tweaking/Optimizing MD RAID: 195MB/s write, 181MB/s read (so far)

2007-01-11 Thread Justin Piszcz
With 4 Raptor 150s XFS (default XFS options): # Stripe tests: echo 8192 > /sys/block/md3/md/stripe_cache_size # DD TESTS [WRITE] DEFAULT: $ dd if=/dev/zero of=10gb.no.optimizations.out bs=1M count=10240 10240+0 records in 10240+0 records out 10737418240 bytes (11 GB) copied, 96.6988 seconds,

Re: Tweaking/Optimizing MD RAID: 195MB/s write, 181MB/s read (so far)

2007-01-11 Thread Justin Piszcz
=393216 blocks=0, rtextents=0 On Fri, 12 Jan 2007, David Chinner wrote: On Thu, Jan 11, 2007 at 06:05:36PM -0500, Justin Piszcz wrote: With 4 Raptor 150s XFS (default XFS options): I need more context for this to be meaningful in any way. What type of md config are you using here? RAID0, 1

Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-11 Thread Justin Piszcz
Using 4 raptor 150s: Without the tweaks, I get 111MB/s write and 87MB/s read. With the tweaks, 195MB/s write and 211MB/s read. Using kernel 2.6.19.1. Without the tweaks and with the tweaks: # Stripe tests: echo 8192 > /sys/block/md3/md/stripe_cache_size # DD TESTS [WRITE] DEFAULT: (512K) $ dd
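
A condensed sketch of the kind of tweaks this thread is about, with illustrative values and placeholder device names; the right numbers depend on RAM, chunk size and workload, so treat these as starting points:
echo 8192 > /sys/block/md3/md/stripe_cache_size   # larger RAID5 stripe cache
blockdev --setra 16384 /dev/md3                   # larger readahead on the md device
echo 512 > /sys/block/sda/queue/max_sectors_kb    # per-member request size (repeat for each disk)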

Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-12 Thread Justin Piszcz
On Fri, 12 Jan 2007, Michael Tokarev wrote: Justin Piszcz wrote: Using 4 raptor 150s: Without the tweaks, I get 111MB/s write and 87MB/s read. With the tweaks, 195MB/s write and 211MB/s read. Using kernel 2.6.19.1. Without the tweaks and with the tweaks: # Stripe tests

Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-12 Thread Justin Piszcz
chunk size) On Fri, 12 Jan 2007, Justin Piszcz wrote: On Fri, 12 Jan 2007, Michael Tokarev wrote: Justin Piszcz wrote: Using 4 raptor 150s: Without the tweaks, I get 111MB/s write and 87MB/s read. With the tweaks, 195MB/s write and 211MB/s read. Using kernel 2.6.19.1

Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-12 Thread Justin Piszcz
On Fri, 12 Jan 2007, Al Boldi wrote: Justin Piszcz wrote: RAID 5 TWEAKED: 1:06.41 elapsed @ 60% CPU This should be 1:14, not 1:06 (that was with a similarly sized file, but not the same one); the 1:14 is the same file as used with the other benchmarks, and to get that I used 256mb read-ahead

Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-12 Thread Justin Piszcz
out 10737418240 bytes (11 GB) copied, 398.069 seconds, 27.0 MB/s Awful performance with your numbers/drop_caches settings.. ! What were your tests designed to show? Justin. On Fri, 12 Jan 2007, Justin Piszcz wrote: On Fri, 12 Jan 2007, Al Boldi wrote: Justin Piszcz wrote: RAID 5

Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-12 Thread Justin Piszcz
On Sat, 13 Jan 2007, Al Boldi wrote: Justin Piszcz wrote: Btw, max sectors did improve my performance a little bit but stripe_cache+read_ahead were the main optimizations that made everything go faster by about ~1.5x. I have individual bonnie++ benchmarks of [only] the max_sector_kb

Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-13 Thread Justin Piszcz
On Sat, 13 Jan 2007, Al Boldi wrote: Justin Piszcz wrote: On Sat, 13 Jan 2007, Al Boldi wrote: Justin Piszcz wrote: Btw, max sectors did improve my performance a little bit but stripe_cache+read_ahead were the main optimizations that made everything go faster by about ~1.5x

Re: bad performance on RAID 5

2007-01-17 Thread Justin Piszcz
Sevrin Robstad wrote: I'm suffering from bad performance on my RAID5. An echo check > /sys/block/md0/md/sync_action gives a speed of only about 5000K/sec, and a HIGH load average: # uptime 20:03:55 up 8 days, 19:55, 1 user, load average: 11.70, 4.04, 1.52 kernel is 2.6.18.1.2257.fc5 mdadm is
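
One knob commonly suggested for slow check/resync throughput, sketched with illustrative values (the limits are in kB/s per device); whether it helps depends on what is actually throttling the array:
sysctl -w dev.raid.speed_limit_min=10000    # raise the floor md aims for under load
sysctl -w dev.raid.speed_limit_max=200000   # and the ceiling when the system is idle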

Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-20 Thread Justin Piszcz
My .config is attached, please let me know if any other information is needed and please CC (lkml) as I am not on the list, thanks! Running Kernel 2.6.19.2 on a MD RAID5 volume. Copying files over Samba to the RAID5 running XFS. Any idea what happened here? [473795.214705] BUG: unable to

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-20 Thread Justin Piszcz
On Sat, 20 Jan 2007, Justin Piszcz wrote: My .config is attached, please let me know if any other information is needed and please CC (lkml) as I am not on the list, thanks! Running Kernel 2.6.19.2 on a MD RAID5 volume. Copying files over Samba to the RAID5 running XFS. Any idea

2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible

2007-01-20 Thread Justin Piszcz
Perhaps it's time to go back to a stable kernel (2.6.17.13)? Anyway, when I run a cp 18gb_file 18gb_file.2 on a dual-raptor sw raid1 partition, the OOM killer goes into effect and kills almost all my processes. Completely 100% reproducible. Does 2.6.19.2 have some sort of memory allocation bug as

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible

2007-01-20 Thread Justin Piszcz
On Sat, 20 Jan 2007, Avuton Olrich wrote: On 1/20/07, Justin Piszcz [EMAIL PROTECTED] wrote: Perhaps its time to back to a stable (2.6.17.13 kernel)? Anyway, when I run a cp 18gb_file 18gb_file.2 on a dual raptor sw raid1 partition, the OOM killer goes into effect and kills almost all

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible

2007-01-20 Thread Justin Piszcz
On Sat, 20 Jan 2007, Justin Piszcz wrote: On Sat, 20 Jan 2007, Avuton Olrich wrote: On 1/20/07, Justin Piszcz [EMAIL PROTECTED] wrote: Perhaps its time to back to a stable (2.6.17.13 kernel)? Anyway, when I run a cp 18gb_file 18gb_file.2 on a dual raptor sw raid1 partition

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible

2007-01-21 Thread Justin Piszcz
On Sun, 21 Jan 2007, [EMAIL PROTECTED] wrote: From: Justin Piszcz [EMAIL PROTECTED] Date: Sat, Jan 20, 2007 at 04:03:42PM -0500 My swap is on, 2GB ram and 2GB of swap on this machine. I can't go back to 2.6.17.13 as it does not recognize the NICs in my machine correctly

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible

2007-01-21 Thread Justin Piszcz
On Sun, 21 Jan 2007, [EMAIL PROTECTED] wrote: From: Justin Piszcz [EMAIL PROTECTED] Date: Sat, Jan 20, 2007 at 04:03:42PM -0500 My swap is on, 2GB ram and 2GB of swap on this machine. I can't go back to 2.6.17.13 as it does not recognize the NICs in my machine correctly

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible

2007-01-21 Thread Justin Piszcz
On Sun, 21 Jan 2007, [EMAIL PROTECTED] wrote: From: Justin Piszcz [EMAIL PROTECTED] Date: Sun, Jan 21, 2007 at 11:48:07AM -0500 What about all of the changes with NAT? I see that it operates at level 3 (network-wise); I enabled that and backward compatibility support as well

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible (multi-threaded USB no go)

2007-01-21 Thread Justin Piszcz
On Sun, 21 Jan 2007, Justin Piszcz wrote: Good luck, Jurriaan -- What does ELF stand for (in respect to Linux)? ELF is the first rock group that Ronnie James Dio performed with back in the early 1970's. In contrast, a.out is a misspelling of the French word

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible (multi-threaded USB no go)

2007-01-22 Thread Justin Piszcz
On Sun, 21 Jan 2007, Greg KH wrote: On Sun, Jan 21, 2007 at 12:29:51PM -0500, Justin Piszcz wrote: On Sun, 21 Jan 2007, Justin Piszcz wrote: Good luck, Jurriaan -- What does ELF stand for (in respect to Linux?) ELF is the first rock group

Re: change strip_cache_size freeze the whole raid

2007-01-22 Thread Justin Piszcz
On Mon, 22 Jan 2007, kyle wrote: Hi, Yesterday I tried to increase the value of strip_cache_size to see if I could get better performance or not. I increased the value from 2048 to something like 16384. After I did that, the raid5 froze. Any process reading from / writing to it got stuck in D state.

Re: change strip_cache_size freeze the whole raid

2007-01-22 Thread Justin Piszcz
On Mon, 22 Jan 2007, kyle wrote: On Mon, 22 Jan 2007, kyle wrote: Hi, Yesterday I tried to increase the value of strip_cache_size to see if I could get better performance or not. I increased the value from 2048 to something like 16384. After I did that, the raid5 froze.

Re: change strip_cache_size freeze the whole raid

2007-01-22 Thread Justin Piszcz
On Mon, 22 Jan 2007, Steve Cousins wrote: Justin Piszcz wrote: Yes, I noticed this bug too; if you change it too many times or change it at the 'wrong' time, it hangs up when you echo a number > /proc/stripe_cache_size. Basically, don't run it more than once and don't run

Re: change strip_cache_size freeze the whole raid

2007-01-22 Thread Justin Piszcz
On Mon, 22 Jan 2007, Steve Cousins wrote: Justin Piszcz wrote: Yes, I noticed this bug too; if you change it too many times or change it at the 'wrong' time, it hangs up when you echo a number > /proc/stripe_cache_size. Basically, don't run it more than once and don't run

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-22 Thread Justin Piszcz
On Mon, 22 Jan 2007, Pavel Machek wrote: On Sun 2007-01-21 14:27:34, Justin Piszcz wrote: Why does copying an 18GB on a 74GB raptor raid1 cause the kernel to invoke the OOM killer and kill all of my processes? Doing this on a single disk 2.6.19.2 is OK, no issues. However

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-22 Thread Justin Piszcz
What's that? Software raid or hardware raid? If the latter, which driver? Software RAID (md) On Mon, 22 Jan 2007, Andrew Morton wrote: On Sun, 21 Jan 2007 14:27:34 -0500 (EST) Justin Piszcz [EMAIL PROTECTED] wrote: Why does copying an 18GB on a 74GB raptor raid1 cause the kernel

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-23 Thread Justin Piszcz
On Tue, 23 Jan 2007, Neil Brown wrote: On Monday January 22, [EMAIL PROTECTED] wrote: Justin Piszcz wrote: My .config is attached, please let me know if any other information is needed and please CC (lkml) as I am not on the list, thanks! Running Kernel 2.6.19.2 on a MD RAID5

Re: change strip_cache_size freeze the whole raid

2007-01-23 Thread Justin Piszcz
I can try and do this later this week possibly. Justin. On Tue, 23 Jan 2007, Neil Brown wrote: On Monday January 22, [EMAIL PROTECTED] wrote: Hi, Yesterday I tried to increase the value of strip_cache_size to see if I can get better performance or not. I increase the value from 2048

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-23 Thread Justin Piszcz
On Tue, 23 Jan 2007, Michael Tokarev wrote: Justin Piszcz wrote: [] Is this a bug that can or will be fixed or should I disable pre-emption on critical and/or server machines? Disabling pre-emption on critical and/or server machines seems to be a good idea in the first place. IMHO

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-23 Thread Justin Piszcz
On Tue, 23 Jan 2007, Michael Tokarev wrote: Justin Piszcz wrote: On Tue, 23 Jan 2007, Michael Tokarev wrote: Disabling pre-emption on critical and/or server machines seems to be a good idea in the first place. IMHO anyway.. ;) So bottom line is make sure not to use preemption

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-24 Thread Justin Piszcz
On Mon, 22 Jan 2007, Chuck Ebbert wrote: Justin Piszcz wrote: My .config is attached, please let me know if any other information is needed and please CC (lkml) as I am not on the list, thanks! Running Kernel 2.6.19.2 on a MD RAID5 volume. Copying files over Samba to the RAID5

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-24 Thread Justin Piszcz
chipset also uses some memory; in any event, mem=256 causes the machine to lock up before it can even get to the boot/init processes. The two LEDs on the keyboard (caps lock and scroll lock) were blinking, and I saw no console at all! Justin. On Mon, 22 Jan 2007, Justin Piszcz wrote: On Mon

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-24 Thread Justin Piszcz
On Mon, 22 Jan 2007, Andrew Morton wrote: On Sun, 21 Jan 2007 14:27:34 -0500 (EST) Justin Piszcz [EMAIL PROTECTED] wrote: Why does copying an 18GB on a 74GB raptor raid1 cause the kernel to invoke the OOM killer and kill all of my processes? What's that? Software raid or hardware

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-24 Thread Justin Piszcz
And FYI, yes, I used mem=256M just as you said, not mem=256. Justin. On Wed, 24 Jan 2007, Justin Piszcz wrote: Is it highmem-related? Can you try it with mem=256M? Bad idea, the kernel crashes and burns when I use mem=256; I had to boot 2.6.20-rc5-6 single to get back into my machine, very

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-24 Thread Justin Piszcz
On Mon, 22 Jan 2007, Andrew Morton wrote: On Sun, 21 Jan 2007 14:27:34 -0500 (EST) Justin Piszcz [EMAIL PROTECTED] wrote: Why does copying an 18GB on a 74GB raptor raid1 cause the kernel to invoke the OOM killer and kill all of my processes? What's that? Software raid or hardware

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-24 Thread Justin Piszcz
On Thu, 25 Jan 2007, Pavel Machek wrote: Hi! Is it highmem-related? Can you try it with mem=256M? Bad idea, the kernel crashes and burns when I use mem=256; I had to boot 2.6.20-rc5-6 single to get back into my machine, very nasty. Remember I use an onboard graphics controller

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-24 Thread Justin Piszcz
On Thu, 25 Jan 2007, Pavel Machek wrote: Hi! Is it highmem-related? Can you try it with mem=256M? Bad idea, the kernel crashes and burns when I use mem=256; I had to boot 2.6.20-rc5-6 single to get back into my machine, very nasty. Remember I use an onboard graphics controller

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-25 Thread Justin Piszcz
On Thu, 25 Jan 2007, Pavel Machek wrote: Hi! Is it highmem-related? Can you try it with mem=256M? Bad idea, the kernel crashes and burns when I use mem=256; I had to boot 2.6.20-rc5-6 single to get back into my machine, very nasty. Remember I use an onboard graphics controller

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-25 Thread Justin Piszcz
On Thu, 25 Jan 2007, Nick Piggin wrote: Justin Piszcz wrote: On Mon, 22 Jan 2007, Andrew Morton wrote: After the oom-killing, please see if you can free up the ZONE_NORMAL memory via a few `echo 3 > /proc/sys/vm/drop_caches' commands. See if you can work out what happened

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-25 Thread Justin Piszcz
On Wed, 24 Jan 2007, Bill Cizek wrote: Justin Piszcz wrote: On Mon, 22 Jan 2007, Andrew Morton wrote: On Sun, 21 Jan 2007 14:27:34 -0500 (EST) Justin Piszcz [EMAIL PROTECTED] wrote: Why does copying an 18GB on a 74GB raptor raid1 cause the kernel to invoke the OOM killer

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-25 Thread Justin Piszcz
On Thu, 25 Jan 2007, Mark Hahn wrote: Something is seriously wrong with that OOM killer. do you know you don't have to operate in OOM-slaughter mode? vm.overcommit_memory = 2 in your /etc/sysctl.conf puts you into a mode where the kernel tracks your committed memory needs, and will
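
A sketch of the setting Mark describes, assuming a stock /etc/sysctl.conf; the ratio value is illustrative:
# /etc/sysctl.conf
vm.overcommit_memory = 2    # strict accounting: allocations fail instead of triggering the OOM killer
vm.overcommit_ratio = 80    # commit limit as a percentage of RAM (plus swap)
# load the new values with: sysctl -p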

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-26 Thread Justin Piszcz
On Fri, 26 Jan 2007, Andrew Morton wrote: On Wed, 24 Jan 2007 18:37:15 -0500 (EST) Justin Piszcz [EMAIL PROTECTED] wrote: Without digging too deeply, I'd say you've hit the same bug Sami Farin and others have reported starting with 2.6.19: pages mapped with kmap_atomic() become

Re: slow 'check'

2007-02-10 Thread Justin Piszcz
On Sat, 10 Feb 2007, Eyal Lebedinsky wrote: I have a six-disk RAID5 over sata. First two disks are on the mobo and last four are on a Promise SATA-II-150-TX4. The sixth disk was added recently and I decided to run a 'check' periodically, and started one manually to see how long it should

Re: slow 'check'

2007-02-10 Thread Justin Piszcz
On Sat, 10 Feb 2007, Eyal Lebedinsky wrote: Justin Piszcz wrote: On Sat, 10 Feb 2007, Eyal Lebedinsky wrote: I have a six-disk RAID5 over sata. First two disks are on the mobo and last four are on a Promise SATA-II-150-TX4. The sixth disk was added recently and I decided to run a 'check

Re: Changing chunk size

2007-02-16 Thread Justin Piszcz
On Fri, 16 Feb 2007, Steve Cousins wrote: Bill Davidsen wrote: I'm sure slow is a relative term, compared to backing up TBs of data and trying to restore them. Not to mention the lack of inexpensive TB size backup media. That's totally unavailable at the moment, I'll live with what I

Re: mdadm --grow failed

2007-02-19 Thread Justin Piszcz
On Mon, 19 Feb 2007, Marc Marais wrote: On Sun, 18 Feb 2007 07:13:28 -0500 (EST), Justin Piszcz wrote On Sun, 18 Feb 2007, Marc Marais wrote: On Sun, 18 Feb 2007 20:39:09 +1100, Neil Brown wrote On Sunday February 18, [EMAIL PROTECTED] wrote: Ok, I understand the risks which is why I did

2.6.20: stripe_cache_size goes boom with 32mb

2007-02-23 Thread Justin Piszcz
Each of these is averaged over three runs with 6 SATA disks in a SW RAID 5 configuration: (dd if=/dev/zero of=file_1 bs=1M count=2000) 128k_stripe: 69.2MB/s 256k_stripe: 105.3MB/s 512k_stripe: 142.0MB/s 1024k_stripe: 144.6MB/s 2048k_stripe: 208.3MB/s 4096k_stripe: 223.6MB/s 8192k_stripe:

Re: 2.6.20: stripe_cache_size goes boom with 32mb

2007-02-23 Thread Justin Piszcz
have a good idea on what's happening :-) Cheers, Jason On Fri, 2007-02-23 at 06:41 -0500, Justin Piszcz wrote: Each of these is averaged over three runs with 6 SATA disks in a SW RAID 5 configuration: (dd if=/dev/zero of=file_1 bs=1M count=2000) 128k_stripe: 69.2MB/s 256k_stripe: 105.3MB/s

Re: nonzero mismatch_cnt with no earlier error

2007-02-24 Thread Justin Piszcz
Of course you could just run repair, but then you would never know that mismatch_cnt was > 0. Justin. On Sat, 24 Feb 2007, Justin Piszcz wrote: Perhaps. The way it works (I believe) is as follows: 1. echo check > sync_action 2. If mismatch_cnt > 0 then run: 3. echo repair > sync_action 4. Re-run
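
The same workflow written out as a sketch (md0 is a placeholder, and each pass must finish before the next is started); repair rewrites whichever copy/parity md treats as authoritative, so a follow-up check is what confirms the array is clean:
echo check > /sys/block/md0/md/sync_action    # compare pass, no writes to data
cat /sys/block/md0/md/mismatch_cnt            # > 0 means inconsistent stripes were seen
echo repair > /sys/block/md0/md/sync_action   # rewrite to make members consistent
echo check > /sys/block/md0/md/sync_action    # re-check; mismatch_cnt should now stay 0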

Re: nonzero mismatch_cnt with no earlier error

2007-02-24 Thread Justin Piszcz
. The mismatch_cnt returned to 0 at the start of the resync, but around the same time that it went up to 8 with the check, it went up to 8 in the resync. After the resync, it still is 8. I haven't ordered a check since the resync completed. On Sat, 2007-02-24 at 04:37 -0500, Justin Piszcz wrote: Of course you

Re: nonzero mismatch_cnt with no earlier error

2007-02-24 Thread Justin Piszcz
I called it a resync because that's what /proc/mdstat told me it was doing. On Sat, 2007-02-24 at 04:50 -0500, Justin Piszcz wrote: A resync? You're supposed to run a 'repair' are you not? Justin. On Sat, 24 Feb 2007, Jason Rainforest wrote: I tried doing a check, found a mismatch_cnt of 8 (7

Re: nonzero mismatch_cnt with no earlier error

2007-02-24 Thread Justin Piszcz
On Sat, 24 Feb 2007, Michael Tokarev wrote: Jason Rainforest wrote: I tried doing a check, found a mismatch_cnt of 8 (7*250Gb SW RAID5, multiple controllers on Linux 2.6.19.2, SMP x86-64 on Athlon64 X2 4200 +). I then ordered a resync. The mismatch_cnt returned to 0 at the start of As

Linux Software RAID Bitmap Question

2007-02-25 Thread Justin Piszcz
Anyone have a good explanation for the use of bitmaps? Anyone on the list use them? http://gentoo-wiki.com/HOWTO_Gentoo_Install_on_Software_RAID#Data_Scrubbing Provides an explanation on that page. I believe Neil stated that using bitmaps does incur a 10% performance penalty. If one's box
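
For reference, a minimal sketch of adding and removing a write-intent bitmap on an existing array (md0 is a placeholder); an internal bitmap makes post-crash resyncs much shorter at some cost to write performance:
mdadm --grow /dev/md0 --bitmap=internal    # store the bitmap in the md superblock area
mdadm --detail /dev/md0 | grep -i bitmap   # confirm it is active
mdadm --grow /dev/md0 --bitmap=none        # drop it again if the write overhead is not worth it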

Re: nonzero mismatch_cnt with no earlier error

2007-02-25 Thread Justin Piszcz
On Sun, 25 Feb 2007, Christian Pernegger wrote: Sorry to hijack the thread a little but I just noticed that the mismatch_cnt for my mirror is at 256. I'd always thought the monthly check done by the mdadm Debian package does repair as well - apparently it doesn't. So I guess I should run

Re: trouble creating array

2007-02-25 Thread Justin Piszcz
On Sun, 25 Feb 2007, jahammonds prost wrote: Just built a new FC6 machine, with 5x 320Gb drives and 1x 300Gb drive. Made a 300Gb partition on all the drives /dev/hd{c,d,e} and /dev/sd{a,b,c}... Trying to create an array gave me an error, since it thought there was already an array on some

Re: Linux Software RAID Bitmap Question

2007-02-28 Thread Justin Piszcz
On Wed, 28 Feb 2007, dean gaudet wrote: On Mon, 26 Feb 2007, Neil Brown wrote: On Sunday February 25, [EMAIL PROTECTED] wrote: I believe Neil stated that using bitmaps does incur a 10% performance penalty. If one's box never (or rarely) crashes, is a bitmap needed? I think I said it can

Re: Growing a raid 6 array

2007-03-01 Thread Justin Piszcz
You can only grow a RAID5 array in Linux as of 2.6.20 AFAIK. Justin. On Thu, 1 Mar 2007, Laurent CARON wrote: Hi, As our storage needs are growing i'm in the process of growing a 6TB array to 9TB by changing the disks one by one (500GB to 750GB). I'll have to partition the new drives with

Re: Growing a raid 6 array

2007-03-01 Thread Justin Piszcz
On Fri, 2 Mar 2007, Laurent CARON wrote: Justin Piszcz wrote: You can only grow a RAID5 array in Linux as of 2.6.20 AFAIK. From the man page: Grow Grow (or shrink) an array, or otherwise reshape it in some way. Currently supported growth options including changing the active size
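
For the replace-disks-one-by-one plan described here, a sketch of the final step once every member has been swapped for a larger drive (names are placeholders, and this assumes the kernel supports component-size changes for the array's RAID level); it is a different operation from the RAID5-only device-count reshape mentioned above:
mdadm --grow /dev/md0 --size=max    # let each member use all of its new capacity
cat /proc/mdstat                    # a resync over the newly used space may follow
# then grow the filesystem on top, e.g. xfs_growfs /mountpoint or resize2fs /dev/md0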

Re: RAID1, hot-swap and boot integrity

2007-03-02 Thread Justin Piszcz
On Fri, 2 Mar 2007, Mike Accetta wrote: We are using a RAID1 setup with two SATA disks on x86, using the whole disks as the array components. I'm pondering the following scenario. We will boot from whichever drive the BIOS has first in its boot list (the other drive will be second). In the

Does anyone on this list have a Raptor74/150 sw raid with 8 drives?

2007-03-02 Thread Justin Piszcz
Who runs software raid with PCI-e or dedicated PCI-X controllers? I was wondering what one would get with 10 150GB raptors, each on its own dedicated PCI-e card... When I max out two separate raids on my PCI-e based motherboard, I see speeds in excess of 430-450MB/s. 1. 4 raptor 150s 2.

Re: detecting/correcting _slightly_ flaky disks

2007-03-05 Thread Justin Piszcz
On Mon, 5 Mar 2007, Michael Stumpf wrote: I'm trying to assemble an array (raid 5) of 8 older, but not yet old age ATA 120 gig disks, but there is intermittent flakiness in one or more of the drives. Symptoms: * Won't boot sometimes. Even after moving to 2 power supplies and monitoring

Re: detecting/correcting _slightly_ flaky disks

2007-03-05 Thread Justin Piszcz
Other than its long run time, I don't see anything strange about this drive. Justin. On Mon, 5 Mar 2007, Michael Stumpf wrote: This is the drive I think is most suspect. What isn't obvious, because it isn't listed in the self test log, is that between #1 and #2 there was an aborted, hung

Re: high mismatch count after scrub

2007-03-06 Thread Justin Piszcz
On Tue, 6 Mar 2007, Dexter Filmore wrote: xerxes:/sys/block/md0/md# cat mismatch_cnt 147248 Need to worry? If you have a swap file on this array, then that could explain it, so don't worry. Nope, swap is not on the array. Couple of loops tho. If not... maybe worry? I assume you did a

Re: sw raid0 read bottleneck

2007-03-13 Thread Justin Piszcz
On Tue, 13 Mar 2007, Tomka Gergely wrote: Hi! I am currently testing 3ware raid cards. Now i have 15 disks, and on these a swraid0. The write speed seems good (700 MBps), but the read performance only 350 MBps. Another problem when i try to read with two process, then the _sum_ of the read

Re: sw raid0 read bottleneck

2007-03-13 Thread Justin Piszcz
Nice. On Tue, 13 Mar 2007, Tomka Gergely wrote: On Tue, 13 Mar 2007, Tomka Gergely wrote: On Tue, 13 Mar 2007, Justin Piszcz wrote: Have you tried increasing your readahead values for the md device? Yes. No real change. According to my humble mental image, readahead is not a too useful

Re: Data corruption on software raid.

2007-03-18 Thread Justin Piszcz
On Sun, 18 Mar 2007, Sander Smeenk wrote: Hello! Long story. Get some coke. I'm having an odd problem with using software raid on two Western Digital disks type WD2500JD-00F (250gb) connected to a Silicon Image Sil3112 PCI SATA controller running with Linux 2.6.20, mdadm 2.5.6 [[ .. snip

LILO 22.6.1-9.3 not compatible with SW RAID1 metadata = 1.0

2007-03-26 Thread Justin Piszcz
Neil, Using: Debian Etch. I picked this up via http://anti.teamidiot.de/nei/2006/10/softraid_lilo/ via google cache. Basically, LILO will not even run correctly if the metadata is not 0.90. After I had done that, LILO ran successfully for the boot md device, but I still could not boot my

Re: Software RAID (non-preempt) server blocking question. (2.6.20.4)

2007-03-29 Thread Justin Piszcz
On Thu, 29 Mar 2007, Neil Brown wrote: On Tuesday March 27, [EMAIL PROTECTED] wrote: I ran a check on my SW RAID devices this morning. However, when I did so, I had a few lftp sessions open pulling files. After I executed the check, the lftp processes entered 'D' state and I could do
