Re: below 10MB/s write on raid5

2007-06-11 Thread Justin Piszcz
On Mon, 11 Jun 2007, Jon Nelson wrote: On Mon, 11 Jun 2007, Justin Piszcz wrote: On Mon, 11 Jun 2007, Dexter Filmore wrote: On Monday 11 June 2007 14:47:50 Justin Piszcz wrote: On Mon, 11 Jun 2007, Dexter Filmore wrote: I recently upgraded my file server, yet I'm still unsatisfied

Re: RAID 6 grow problem

2007-06-02 Thread Justin Piszcz
On Sat, 2 Jun 2007, Iain Rauch wrote: Hello, when I run: mdadm /dev/md1 --grow --raid-devices 16 --backup-file=/md1backup I get: mdadm: Need to backup 1792K of critical section.. mdadm: Cannot set device size/shape for /dev/md1: Invalid argument Any help? Iain - To unsubscribe from
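
For reference, the general shape of a grow-with-backup-file invocation (whether the running kernel supports reshaping a given RAID level is a separate question, as this thread shows); the backup path below is only illustrative and must not live on the array being reshaped:

  # reshape md1 to 16 active devices; the critical section is staged in the backup file
  mdadm --grow /dev/md1 --raid-devices=16 --backup-file=/root/md1-grow.backup
  # watch reshape progress
  cat /proc/mdstat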

Re: RAID 6 grow problem

2007-06-02 Thread Justin Piszcz
On Sat, 2 Jun 2007, Iain Rauch wrote: For the critical section part, it may be your syntax.. When I had the problem, Neil showed me the path! :) I don't think it is incorrect; before, I thought it was supposed to specify an actual file so I 'touch'ed one and it says file exists. For your

Re: mismatch_cnt = 128 for root (/) md raid1 device

2007-05-30 Thread Justin Piszcz
Asking again.. On Sat, 26 May 2007, Justin Piszcz wrote: Kernel 2.6.21.3 Fri May 25 20:00:02 EDT 2007: Executing RAID health check for /dev/md0... Fri May 25 20:00:03 EDT 2007: Executing RAID health check for /dev/md1... Fri May 25 20:00:04 EDT 2007: Executing RAID health check for /dev/md2

mismatch_cnt = 128 for root (/) md raid1 device

2007-05-26 Thread Justin Piszcz
Kernel 2.6.21.3 Fri May 25 20:00:02 EDT 2007: Executing RAID health check for /dev/md0... Fri May 25 20:00:03 EDT 2007: Executing RAID health check for /dev/md1... Fri May 25 20:00:04 EDT 2007: Executing RAID health check for /dev/md2... Fri May 25 20:00:05 EDT 2007: Executing RAID health check
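
The timestamped lines above are the sort of output a small scheduled check script produces; a minimal sketch of one, assuming arrays md0-md2 and a log path chosen only for illustration:

  #!/bin/sh
  # start an md consistency check on each array and note when it began
  for md in md0 md1 md2; do
      echo "$(date): Executing RAID health check for /dev/$md..." >> /var/log/raid-check.log
      echo check > /sys/block/$md/md/sync_action
  done
  # when the checks finish, look at the mismatch counters
  grep . /sys/block/md?/md/mismatch_cnt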

Re: Neil, bug in the minimum guaranteed patch?

2007-05-12 Thread Justin Piszcz
On Fri, 11 May 2007, Justin Piszcz wrote: It worked for a while but this time I ran my raid check while doing an rsync, for a while, it guaranteed 1MB/s but then: $ cat /proc/mdstat Personalities : [raid1] [raid6] [raid5] [raid4] md1 : active raid1 sdb2[1] sda2[0] 136448 blocks [2/2

Neil, bug in the minimum guaranteed patch?

2007-05-11 Thread Justin Piszcz
It worked for a while but this time I ran my raid check while doing an rsync, for a while, it guaranteed 1MB/s but then: $ cat /proc/mdstat Personalities : [raid1] [raid6] [raid5] [raid4] md1 : active raid1 sdb2[1] sda2[0] 136448 blocks [2/2] [UU] resync=DELAYED md2 : active

Re: Questions about the speed when MD-RAID array is being initialized.

2007-05-10 Thread Justin Piszcz
On Thu, 10 May 2007, Liang Yang wrote: Hi, I created an MD-RAID5 array using 8 Maxtor SAS Disk Drives (chunk size is 256k). I have measured the data transfer speed for a single SAS disk drive (physical drive, no filesystem on it), it is roughly about 80~90MB/s. However, I notice MD also

Re: Questions about the speed when MD-RAID array is being initialized.

2007-05-10 Thread Justin Piszcz
in each disk platter? Thanks, Liang - Original Message - From: Justin Piszcz [EMAIL PROTECTED] To: Liang Yang [EMAIL PROTECTED] Cc: linux-raid@vger.kernel.org Sent: Thursday, May 10, 2007 2:33 PM Subject: Re: Questions about the speed when MD-RAID array is being initialized. On Thu, 10

Chaining sg lists for big I/O commands: Question

2007-05-09 Thread Justin Piszcz
http://kerneltrap.org/node/8176 I am a mdadm/disk/hard drive fanatic, I was curious: On i386, we can at most fit 256 scatterlist elements into a page, and on x86-64 we are stuck with 128. So that puts us somewhere between 512kb and 1024kb for a single IO. How come 32bit is 256 and 64 is only

Linux MD Raid Bug(?) w/Kernel sync_speed_min Option

2007-05-08 Thread Justin Piszcz
Kernel: 2.6.21.1 Here is the bug: md2: RAID1 (works fine) md3: RAID5 (only syncs at the sync_speed_min set by the kernel) If I do not run this command: echo 55000 > /sys/block/md3/md/sync_speed_min I will get 2 megabytes per second check speed for RAID 5. However, the odd part is I can leave
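
For reference, the tunables involved; a sketch assuming /dev/md3, with example speeds in KB/s:

  # per-array floor and ceiling for check/resync throughput
  echo 55000  > /sys/block/md3/md/sync_speed_min
  echo 200000 > /sys/block/md3/md/sync_speed_max
  # system-wide defaults live in /proc/sys/dev/raid/
  cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
  # current check/resync speed shows up in /proc/mdstat
  cat /proc/mdstat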

Re: Linux MD Raid Bug(?) w/Kernel sync_speed_min Option

2007-05-08 Thread Justin Piszcz
On Tue, 8 May 2007, Neil Brown wrote: On Tuesday May 8, [EMAIL PROTECTED] wrote: Kernel: 2.6.21.1 Here is the bug: md2: RAID1 (works fine) md3: RAID5 (only syncs at the sync_speed_min set by the kernel) If I do not run this command: echo 55000 > /sys/block/md3/md/sync_speed_min I will get

Re: mdadm array not found on reboot

2007-05-07 Thread Justin Piszcz
On Mon, 7 May 2007, Jeffrey B. Layton wrote: Hello, I apologize if this is a FAQ question or a typical newbie question, but my google efforts haven't yielded anything yet. I built a RAID-1 using mdadm (Centos 4.2 with 2.6.16.19 kernel and mdadm 1.6.0-2). It's just two SATA drives that I

Re: mdadm array not found on reboot

2007-05-07 Thread Justin Piszcz
On Mon, 7 May 2007, Jeffrey B. Layton wrote: Justin Piszcz wrote: On Mon, 7 May 2007, Jeffrey B. Layton wrote: Hello, I apologize if this is a FAQ question or a typical newbie question, but my google efforts haven't yielded anything yet. I built a RAID-1 using mdadm (Centos 4.2

Re: mdadm array not found on reboot

2007-05-07 Thread Justin Piszcz
On Mon, 7 May 2007, Jeffrey B. Layton wrote: Justin Piszcz wrote: On Mon, 7 May 2007, Jeffrey B. Layton wrote: Justin Piszcz wrote: On Mon, 7 May 2007, Jeffrey B. Layton wrote: Hello, I apologize if this is a FAQ question or a typical newbie question, but my google efforts have
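
The usual fix for an array that is not found on reboot is to record it in mdadm.conf so the init scripts can assemble it; a minimal sketch, assuming the config lives at /etc/mdadm.conf (Debian-based systems use /etc/mdadm/mdadm.conf):

  # append ARRAY lines for everything mdadm can currently see
  mdadm --detail --scan >> /etc/mdadm.conf
  # verify the array assembles from the config
  mdadm --assemble --scan
  cat /proc/mdstat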

Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?

2007-05-05 Thread Justin Piszcz
Question, I currently have a 965 chipset-based motherboard, use 4 port onboard and several PCI-e x1 controller cards for a raid 5 of 10 raptor drives. I get pretty decent speeds: [EMAIL PROTECTED] time dd if=/dev/zero of=100gb bs=1M count=102400 102400+0 records in 102400+0 records out

Re: Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?

2007-05-05 Thread Justin Piszcz
On Sat, 5 May 2007, Emmanuel Florac wrote: On Sat, 5 May 2007 12:33:49 -0400 (EDT), you wrote: However, if I want to upgrade to more than 12 disks, I am out of PCI-e slots, so I was wondering, does anyone on this list run a 16 port Areca or 3ware card and use it for JBOD? I don't use

Re: raid10 on centos 5

2007-05-04 Thread Justin Piszcz
cat /proc/mdstat. Is the raid10 personality installed? On Fri, 4 May 2007, Ruslan Sivak wrote: [EMAIL PROTECTED] wrote: No, LVM over the two RAID 1's is more like RAID 1c, which is just a concatenation of RAID 1's. You don't get the striping that you get in RAID 10. That's what I guessed.

Re: raid10 on centos 5

2007-05-04 Thread Justin Piszcz
Compile into the kernel, boot new kernel then create your RAID 10 volume with mdadm :) On Fri, 4 May 2007, Ruslan Sivak wrote: Justin Piszcz wrote: cat /proc/mdstat is the raid10 personality installed? No, it's not. How would I go about installing it? Personalities: [raid0] [raid1

Re: raid10 on centos 5

2007-05-04 Thread Justin Piszcz
Unsure for CentOS, I use Debian and always compile my own kernel. Justin. On Fri, 4 May 2007, Ruslan Sivak wrote: Justin Piszcz wrote: Compile into the kernel, boot new kernel then create your RAID 10 volume with mdadm :) So a custom kernel is needed? Is there a way to do a kickstart
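
If the distribution kernel built raid10 as a module, loading it is enough and no rebuild is needed; a sketch with illustrative device names:

  # load the raid10 personality if it was built as a module
  modprobe raid10
  grep Personalities /proc/mdstat    # [raid10] should now be listed
  # create a 4-disk RAID10 (default near-2 layout)
  mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[abcd]1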

RE: [PATCH 00/16] raid acceleration and asynchronous offload api for 2.6.22

2007-05-02 Thread Justin Piszcz
On Wed, 2 May 2007, Williams, Dan J wrote: From: Nick Piggin [mailto:[EMAIL PROTECTED] I am pleased to release this latest spin of the raid acceleration patches for merge consideration. This release aims to address all pending review items including MD bug fixes and async_tx api changes

RE: [PATCH 00/16] raid acceleration and asynchronous offload api for 2.6.22

2007-05-02 Thread Justin Piszcz
On Wed, 2 May 2007, Williams, Dan J wrote: From: Justin Piszcz [mailto:[EMAIL PROTECTED] I have not been following this closely, must you have an CONFIG_INTEL_IOP_ADMA piece of hardware and/or chipset to use this feature or can regular desktop users take hold of it as well? Currently

Re: XFS on x86_64 Linux Question

2007-04-28 Thread Justin Piszcz
Wow, probably better to stick with 32bit then? On Sat, 28 Apr 2007, Raz Ben-Jehuda(caro) wrote: Justin hello I have tested 32 to 64 bit porting of linux raid5 and xfs and LVM it worked. Though I cannot say I have tested thoroughly. It was a POC. On 4/28/07, Justin Piszcz [EMAIL PROTECTED

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-22 Thread Justin Piszcz
On Sun, 22 Apr 2007, Pallai Roland wrote: On Sunday 22 April 2007 02:18:09 Justin Piszcz wrote: On Sat, 21 Apr 2007, Pallai Roland wrote: RAID5, chunk size 128k: # mdadm -C -n8 -l5 -c128 -z 1200 /dev/md/0 /dev/sd[ijklmnop] (waiting for sync, then mount, mkfs, etc) # blockdev --setra
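
The blockdev line quoted above is cut off; for reference, read-ahead on an md device is set like this (the value is only an example and is given in 512-byte sectors):

  # 16384 sectors = 8 MiB of read-ahead
  blockdev --setra 16384 /dev/md0
  # confirm the current setting
  blockdev --getra /dev/md0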

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-22 Thread Justin Piszcz
On Sun, 22 Apr 2007, Pallai Roland wrote: On Sunday 22 April 2007 10:47:59 Justin Piszcz wrote: On Sun, 22 Apr 2007, Pallai Roland wrote: On Sunday 22 April 2007 02:18:09 Justin Piszcz wrote: How did you run your read test? I ran 100 parallel reader processes (dd) on top of XFS file

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-22 Thread Justin Piszcz
On Sun, 22 Apr 2007, Pallai Roland wrote: On Sunday 22 April 2007 12:23:12 Justin Piszcz wrote: On Sun, 22 Apr 2007, Pallai Roland wrote: On Sunday 22 April 2007 10:47:59 Justin Piszcz wrote: On Sun, 22 Apr 2007, Pallai Roland wrote: On Sunday 22 April 2007 02:18:09 Justin Piszcz wrote

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-22 Thread Justin Piszcz
On Sun, 22 Apr 2007, Pallai Roland wrote: On Sunday 22 April 2007 13:42:43 Justin Piszcz wrote: http://www.rhic.bnl.gov/hepix/talks/041019pm/schoen.pdf Check page 13 of 20. Thanks, interesting presentation. I'm working in the same area now, big media files and many clients. I spent some

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-22 Thread Justin Piszcz
On Sun, 22 Apr 2007, Pallai Roland wrote: On Sunday 22 April 2007 16:48:11 Justin Piszcz wrote: Have you also optimized your stripe cache for writes? Not yet. Is it worth it? -- d Yes, it is-- well, if write speed is important to you that is? Each of these write tests are averaged

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-22 Thread Justin Piszcz
On Sun, 22 Apr 2007, Mr. James W. Laferriere wrote: Hello Justin , On Sun, 22 Apr 2007, Justin Piszcz wrote: On Sun, 22 Apr 2007, Pallai Roland wrote: On Sunday 22 April 2007 16:48:11 Justin Piszcz wrote: Have you also optimized your stripe cache for writes? Not yet. Is it worth

Re: major performance drop on raid5 due to context switches caused by small max_hw_sectors [partially resolved]

2007-04-21 Thread Justin Piszcz
On Sat, 21 Apr 2007, Pallai Roland wrote: On Saturday 21 April 2007 07:47:49 you wrote: On 4/21/07, Pallai Roland [EMAIL PROTECTED] wrote: I made a software RAID5 array from 8 disks on top of an HPT2320 card driven by hpt's driver. max_hw_sectors is 64Kb in this proprietary driver. I began to

Re: raid1 does not seem faster

2007-04-09 Thread Justin Piszcz
On Thu, 5 Apr 2007, Justin Piszcz wrote: On Thu, 5 Apr 2007, Iustin Pop wrote: On Wed, Apr 04, 2007 at 07:11:50PM -0400, Bill Davidsen wrote: You are correct, but I think if an optimization were to be done, some balance between the read time, seek time, and read size could be done. Using

Re: raid1 does not seem faster

2007-04-05 Thread Justin Piszcz
On Thu, 5 Apr 2007, Iustin Pop wrote: On Wed, Apr 04, 2007 at 07:11:50PM -0400, Bill Davidsen wrote: You are correct, but I think if an optimization were to be done, some balance between the read time, seek time, and read size could be done. Using more than one drive only makes sense when

Re: raid1 does not seem faster

2007-04-05 Thread Justin Piszcz
On Thu, 5 Apr 2007, Iustin Pop wrote: On Thu, Apr 05, 2007 at 04:11:35AM -0400, Justin Piszcz wrote: On Thu, 5 Apr 2007, Iustin Pop wrote: On Wed, Apr 04, 2007 at 07:11:50PM -0400, Bill Davidsen wrote: You are correct, but I think if an optimization were to be done, some balance between

Re: Kernel 2.6.20.4: Software RAID 5: ata13.00: (irq_stat 0x00020002, failed to transmit command FIS)

2007-04-05 Thread Justin Piszcz
On Thu, 5 Apr 2007, Justin Piszcz wrote: Had a quick question, this is the first time I have seen this happen, and it was not even during heavy I/O; hardly anything was going on with the box at the time. .. snip .. # /usr/bin/time badblocks -b 512 -s -v -w /dev/sdl Checking for bad
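
A note for anyone reusing the command above: badblocks -w is a destructive write test and will erase the device, so it only belongs on a disk that holds no data (the device name below is a placeholder):

  # non-destructive read-only scan
  badblocks -b 512 -s -v /dev/sdX
  # destructive read-write pattern test -- wipes the disk
  /usr/bin/time badblocks -b 512 -s -v -w /dev/sdX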

Re: Any Intel folks on the list? Intel PCI-E bridge ACPI resource question

2007-04-05 Thread Justin Piszcz
My .config is attached.. I cannot reproduce this problem, it only happened once, but I want to find out how to make sure it does not happen again. On Thu, 5 Apr 2007, Justin Piszcz wrote: On Thu, 5 Apr 2007, Justin Piszcz wrote: http://www.ussg.iu.edu/hypermail/linux/kernel/0701.3/0315

Re: Software RAID (non-preempt) server blocking question. (2.6.20.4)

2007-03-30 Thread Justin Piszcz
On Fri, 30 Mar 2007, Neil Brown wrote: On Thursday March 29, [EMAIL PROTECTED] wrote: Did you look at cat /proc/mdstat ?? What sort of speed was the check running at? Around 44MB/s. I do use the following optimization, perhaps a bad idea if I want other processes to 'stay alive'? echo

Re: is this raid5 OK ?

2007-03-30 Thread Justin Piszcz
On Fri, 30 Mar 2007, Rainer Fuegenstein wrote: Bill Davidsen wrote: This still looks odd, why should it behave like this. I have created a lot of arrays (when I was doing the RAID5 speed testing thread), and never had anything like this. I'd like to see dmesg to see if there was an error

Re: Software RAID (non-preempt) server blocking question. (2.6.20.4)

2007-03-29 Thread Justin Piszcz
On Thu, 29 Mar 2007, Neil Brown wrote: On Tuesday March 27, [EMAIL PROTECTED] wrote: I ran a check on my SW RAID devices this morning. However, when I did so, I had a few lftp sessions open pulling files. After I executed the check, the lftp processes entered 'D' state and I could do

Re: Software RAID (non-preempt) server blocking question. (2.6.20.4)

2007-03-29 Thread Justin Piszcz
On Thu, 29 Mar 2007, Henrique de Moraes Holschuh wrote: On Thu, 29 Mar 2007, Justin Piszcz wrote: Did you look at cat /proc/mdstat ?? What sort of speed was the check running at? Around 44MB/s. I do use the following optimization, perhaps a bad idea if I want other processes to 'stay alive

Re: is this raid5 OK ?

2007-03-29 Thread Justin Piszcz
On Thu, 29 Mar 2007, Rainer Fuegenstein wrote: hi, I manually created my first raid5 on 4 400 GB pata harddisks: [EMAIL PROTECTED] ~]# mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 --spare-devices=0 /dev/hde1 /dev/hdf1 /dev/hdg1 /dev/hdh1 mdadm: layout defaults to

LILO 22.6.1-9.3 not compatible with SW RAID1 metadata = 1.0

2007-03-26 Thread Justin Piszcz
Neil, Using: Debian Etch. I picked this up via http://anti.teamidiot.de/nei/2006/10/softraid_lilo/ via google cache. Basically, LILO will not even run correctly if the metadata is not 0.90. After I had done that, LILO ran successfully for the boot md device, but I still could not boot my
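
The workaround described comes down to keeping version 0.90 superblocks on the array LILO boots from; a sketch of creating such a mirror (device names are illustrative):

  # 0.90 metadata sits at the end of each member, so the boot loader can read
  # the partition as if it were a plain filesystem
  mdadm --create /dev/md0 --metadata=0.90 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
  # check which superblock version an existing array uses
  mdadm --detail /dev/md0 | grep Version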

Re: Data corruption on software raid.

2007-03-18 Thread Justin Piszcz
On Sun, 18 Mar 2007, Sander Smeenk wrote: Hello! Long story. Get some coke. I'm having an odd problem with using software raid on two Western Digital disks type WD2500JD-00F (250gb) connected to a Silicon Image Sil3112 PCI SATA controller running with Linux 2.6.20, mdadm 2.5.6 [[ .. snip

Re: sw raid0 read bottleneck

2007-03-13 Thread Justin Piszcz
On Tue, 13 Mar 2007, Tomka Gergely wrote: Hi! I am currently testing 3ware raid cards. Now I have 15 disks, and on these a swraid0. The write speed seems good (700 MBps), but the read performance is only 350 MBps. Another problem: when I try to read with two processes, the _sum_ of the read

Re: sw raid0 read bottleneck

2007-03-13 Thread Justin Piszcz
Nice. On Tue, 13 Mar 2007, Tomka Gergely wrote: On Tue, 13 Mar 2007, Tomka Gergely wrote: On Tue, 13 Mar 2007, Justin Piszcz wrote: Have you tried increasing your readahead values for the md device? Yes. No real change. According to my humble mental image, readahead is not a too useful

Re: high mismatch count after scrub

2007-03-06 Thread Justin Piszcz
On Tue, 6 Mar 2007, Dexter Filmore wrote: xerxes:/sys/block/md0/md# cat mismatch_cnt 147248 Need to worry? If you have a swap file on this array, then that could explain it, so don't worry. Nope, swap is not on the array. Couple of loops tho. If not... maybe worry? I assume you did a

Re: detecting/correcting _slightly_ flaky disks

2007-03-05 Thread Justin Piszcz
On Mon, 5 Mar 2007, Michael Stumpf wrote: I'm trying to assemble an array (raid 5) of 8 older, but not yet old age ATA 120 gig disks, but there is intermittent flakiness in one or more of the drives. Symptoms: * Won't boot sometimes. Even after moving to 2 power supplies and monitoring

Re: detecting/correcting _slightly_ flaky disks

2007-03-05 Thread Justin Piszcz
Besides being run for a long time, I don't see anything strange with this drive. Justin. On Mon, 5 Mar 2007, Michael Stumpf wrote: This is the drive I think is most suspect. What isn't obvious, because it isn't listed in the self test log, is between #1 and #2 there was an aborted, hung

Re: RAID1, hot-swap and boot integrity

2007-03-02 Thread Justin Piszcz
On Fri, 2 Mar 2007, Mike Accetta wrote: We are using a RAID1 setup with two SATA disks on x86, using the whole disks as the array components. I'm pondering the following scenario. We will boot from whichever drive the BIOS has first in its boot list (the other drive will be second). In the

Does anyone on this list have a Raptor74/150 sw raid with 8 drives?

2007-03-02 Thread Justin Piszcz
Who runs software raid with PCI-e or dedicated PCI-X controllers? I was wondering what one would get with 10 150GB raptors, each on their own dedicated PCI-e card... When I max out two separate raids on my PCI-e based motherboard, I see speeds in excess of 430-450MB/s. 1. 4 raptor 150s 2.

Re: Growing a raid 6 array

2007-03-01 Thread Justin Piszcz
You can only grow a RAID5 array in Linux as of 2.6.20 AFAIK. Justin. On Thu, 1 Mar 2007, Laurent CARON wrote: Hi, As our storage needs are growing i'm in the process of growing a 6TB array to 9TB by changing the disks one by one (500GB to 750GB). I'll have to partition the new drives with

Re: Growing a raid 6 array

2007-03-01 Thread Justin Piszcz
On Fri, 2 Mar 2007, Laurent CARON wrote: Justin Piszcz wrote: You can only grow a RAID5 array in Linux as of 2.6.20 AFAIK. From the man page: Grow Grow (or shrink) an array, or otherwise reshape it in some way. Currently supported growth options including changing the active size
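
The disk-by-disk upgrade described in this thread follows a standard pattern; one iteration looks roughly like this (placeholder device names), with the final --grow run only after every member has been replaced:

  # retire one member and physically swap the disk
  mdadm /dev/md0 --fail /dev/sdc1 --remove /dev/sdc1
  # partition the new, larger disk at least as big as the old one, then re-add it
  mdadm /dev/md0 --add /dev/sdc1
  cat /proc/mdstat                 # wait for the rebuild to finish before the next disk
  # once all members are larger, expand the array (the filesystem still needs its own resize)
  mdadm --grow /dev/md0 --size=max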

Re: Linux Software RAID Bitmap Question

2007-02-28 Thread Justin Piszcz
On Wed, 28 Feb 2007, dean gaudet wrote: On Mon, 26 Feb 2007, Neil Brown wrote: On Sunday February 25, [EMAIL PROTECTED] wrote: I believe Neil stated that using bitmaps does incur a 10% performance penalty. If one's box never (or rarely) crashes, is a bitmap needed? I think I said it can

Linux Software RAID Bitmap Question

2007-02-25 Thread Justin Piszcz
Anyone have a good explanation for the use of bitmaps? Anyone on the list use them? http://gentoo-wiki.com/HOWTO_Gentoo_Install_on_Software_RAID#Data_Scrubbing Provides an explanation on that page. I believe Neil stated that using bitmaps does incur a 10% performance penalty. If one's box
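
Write-intent bitmaps can be switched on and off at runtime, which makes it easy to measure their cost for a given workload; a sketch against a hypothetical /dev/md0:

  # add an internal bitmap: after a crash or a temporarily removed member,
  # only dirty regions are resynced instead of the whole array
  mdadm --grow /dev/md0 --bitmap=internal
  # remove it again if the write overhead is not worth it
  mdadm --grow /dev/md0 --bitmap=none
  # bitmap status appears in /proc/mdstat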

Re: nonzero mismatch_cnt with no earlier error

2007-02-25 Thread Justin Piszcz
On Sun, 25 Feb 2007, Christian Pernegger wrote: Sorry to hijack the thread a little but I just noticed that the mismatch_cnt for my mirror is at 256. I'd always thought the monthly check done by the mdadm Debian package does repair as well - apparently it doesn't. So I guess I should run

Re: trouble creating array

2007-02-25 Thread Justin Piszcz
On Sun, 25 Feb 2007, jahammonds prost wrote: Just built a new FC6 machine, with 5x 320Gb drives and 1x 300Gb drive. Made a 300Gb partition on all the drives /dev/hd{c,d,e} and /dev/sd{a,b,c}... Trying to create an array gave me an error, since it thought there was already an array on some

Re: nonzero mismatch_cnt with no earlier error

2007-02-24 Thread Justin Piszcz
Of course you could just run repair but then you would never know whether mismatch_cnt was 0. Justin. On Sat, 24 Feb 2007, Justin Piszcz wrote: Perhaps. The way it works (I believe) is as follows: 1. echo check > sync_action 2. If mismatch_cnt > 0 then run: 3. echo repair > sync_action 4. Re-run
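
Spelled out, the check-then-repair sequence sketched in this message looks like the following (md0 used as an example):

  cd /sys/block/md0/md
  echo check > sync_action     # read and compare every stripe
  # ...wait for the check to finish; progress is visible in /proc/mdstat...
  cat mismatch_cnt             # non-zero means copies/parity disagreed somewhere
  echo repair > sync_action    # rewrite parity (or mirror copies) to make them consistent
  # re-running check afterwards starts a fresh count, confirming whether it returns to 0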

Re: nonzero mismatch_cnt with no earlier error

2007-02-24 Thread Justin Piszcz
. The mismatch_cnt returned to 0 at the start of the resync, but around the same time that it went up to 8 with the check, it went up to 8 in the resync. After the resync, it still is 8. I haven't ordered a check since the resync completed. On Sat, 2007-02-24 at 04:37 -0500, Justin Piszcz wrote: Of course you

Re: nonzero mismatch_cnt with no earlier error

2007-02-24 Thread Justin Piszcz
I called it a resync because that's what /proc/mdstat told me it was doing. On Sat, 2007-02-24 at 04:50 -0500, Justin Piszcz wrote: A resync? You're supposed to run a 'repair' are you not? Justin. On Sat, 24 Feb 2007, Jason Rainforest wrote: I tried doing a check, found a mismatch_cnt of 8 (7

Re: nonzero mismatch_cnt with no earlier error

2007-02-24 Thread Justin Piszcz
On Sat, 24 Feb 2007, Michael Tokarev wrote: Jason Rainforest wrote: I tried doing a check, found a mismatch_cnt of 8 (7*250Gb SW RAID5, multiple controllers on Linux 2.6.19.2, SMP x86-64 on Athlon64 X2 4200 +). I then ordered a resync. The mismatch_cnt returned to 0 at the start of As

2.6.20: stripe_cache_size goes boom with 32mb

2007-02-23 Thread Justin Piszcz
Each of these are averaged over three runs with 6 SATA disks in a SW RAID 5 configuration: (dd if=/dev/zero of=file_1 bs=1M count=2000) 128k_stripe: 69.2MB/s 256k_stripe: 105.3MB/s 512k_stripe: 142.0MB/s 1024k_stripe: 144.6MB/s 2048k_stripe: 208.3MB/s 4096k_stripe: 223.6MB/s 8192k_stripe:
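
The figures above come from sweeping stripe_cache_size and timing a streaming write; a minimal sketch of that kind of sweep (sizes, paths and the target array are examples, and very large values eat low memory, which is what the subject line is about):

  for size in 256 1024 4096 8192; do
      echo $size > /sys/block/md0/md/stripe_cache_size
      sync
      dd if=/dev/zero of=/raid5/file_$size bs=1M count=2000
  done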

Re: 2.6.20: stripe_cache_size goes boom with 32mb

2007-02-23 Thread Justin Piszcz
have a good idea on what's happening :-) Cheers, Jason On Fri, 2007-02-23 at 06:41 -0500, Justin Piszcz wrote: Each of these are averaged over three runs with 6 SATA disks in a SW RAID 5 configuration: (dd if=/dev/zero of=file_1 bs=1M count=2000) 128k_stripe: 69.2MB/s 256k_stripe: 105.3MB/s

Re: mdadm --grow failed

2007-02-19 Thread Justin Piszcz
On Mon, 19 Feb 2007, Marc Marais wrote: On Sun, 18 Feb 2007 07:13:28 -0500 (EST), Justin Piszcz wrote On Sun, 18 Feb 2007, Marc Marais wrote: On Sun, 18 Feb 2007 20:39:09 +1100, Neil Brown wrote On Sunday February 18, [EMAIL PROTECTED] wrote: Ok, I understand the risks which is why I did

Re: Changing chunk size

2007-02-16 Thread Justin Piszcz
On Fri, 16 Feb 2007, Steve Cousins wrote: Bill Davidsen wrote: I'm sure slow is a relative term, compared to backing up TBs of data and trying to restore them. Not to mention the lack of inexpensive TB size backup media. That's totally unavailable at the moment, I'll live with what I

Re: slow 'check'

2007-02-10 Thread Justin Piszcz
On Sat, 10 Feb 2007, Eyal Lebedinsky wrote: I have a six-disk RAID5 over sata. First two disks are on the mobo and last four are on a Promise SATA-II-150-TX4. The sixth disk was added recently and I decided to run a 'check' periodically, and started one manually to see how long it should

Re: slow 'check'

2007-02-10 Thread Justin Piszcz
On Sat, 10 Feb 2007, Eyal Lebedinsky wrote: Justin Piszcz wrote: On Sat, 10 Feb 2007, Eyal Lebedinsky wrote: I have a six-disk RAID5 over sata. First two disks are on the mobo and last four are on a Promise SATA-II-150-TX4. The sixth disk was added recently and I decided to run a 'check

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-26 Thread Justin Piszcz
On Fri, 26 Jan 2007, Andrew Morton wrote: On Wed, 24 Jan 2007 18:37:15 -0500 (EST) Justin Piszcz [EMAIL PROTECTED] wrote: Without digging too deeply, I'd say you've hit the same bug Sami Farin and others have reported starting with 2.6.19: pages mapped with kmap_atomic() become

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.6.19.2

2007-01-25 Thread Justin Piszcz
On Thu, 25 Jan 2007, Pavel Machek wrote: Hi! Is it highmem-related? Can you try it with mem=256M? Bad idea, the kernel crashes and burns when I use mem=256, I had to boot 2.6.20-rc5-6 single to get back into my machine, very nasty. Remember I use an onboard graphics controller

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.6.19.2

2007-01-25 Thread Justin Piszcz
On Thu, 25 Jan 2007, Nick Piggin wrote: Justin Piszcz wrote: On Mon, 22 Jan 2007, Andrew Morton wrote: After the oom-killing, please see if you can free up the ZONE_NORMAL memory via a few `echo 3 > /proc/sys/vm/drop_caches' commands. See if you can work out what happened

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.6.19.2

2007-01-25 Thread Justin Piszcz
On Wed, 24 Jan 2007, Bill Cizek wrote: Justin Piszcz wrote: On Mon, 22 Jan 2007, Andrew Morton wrote: On Sun, 21 Jan 2007 14:27:34 -0500 (EST) Justin Piszcz [EMAIL PROTECTED] wrote: Why does copying an 18GB on a 74GB raptor raid1 cause the kernel to invoke the OOM killer

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.6.19.2

2007-01-25 Thread Justin Piszcz
On Thu, 25 Jan 2007, Mark Hahn wrote: Something is seriously wrong with that OOM killer. do you know you don't have to operate in OOM-slaughter mode? vm.overcommit_memory = 2 in your /etc/sysctl.conf puts you into a mode where the kernel tracks your committed memory needs, and will
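
For reference, the strict-overcommit setting being suggested, applied both persistently and on the running system (2 enables strict accounting; 50 is the kernel's default overcommit_ratio):

  # /etc/sysctl.conf
  #   vm.overcommit_memory = 2
  #   vm.overcommit_ratio  = 50
  # apply immediately without a reboot
  sysctl -w vm.overcommit_memory=2
  sysctl -w vm.overcommit_ratio=50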

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-24 Thread Justin Piszcz
On Mon, 22 Jan 2007, Chuck Ebbert wrote: Justin Piszcz wrote: My .config is attached, please let me know if any other information is needed and please CC (lkml) as I am not on the list, thanks! Running Kernel 2.6.19.2 on a MD RAID5 volume. Copying files over Samba to the RAID5

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.6.19.2

2007-01-24 Thread Justin Piszcz
chipset also uses some memory, in any event mem=256 causes the machine to lockup before it can even get to the boot/init processes, the two leds on the keyboard were blinking, caps lock and scroll lock and I saw no console at all! Justin. On Mon, 22 Jan 2007, Justin Piszcz wrote: On Mon

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.6.19.2

2007-01-24 Thread Justin Piszcz
On Mon, 22 Jan 2007, Andrew Morton wrote: On Sun, 21 Jan 2007 14:27:34 -0500 (EST) Justin Piszcz [EMAIL PROTECTED] wrote: Why does copying an 18GB file on a 74GB raptor raid1 cause the kernel to invoke the OOM killer and kill all of my processes? What's that? Software raid or hardware

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.6.19.2

2007-01-24 Thread Justin Piszcz
And FYI yes I used mem=256M just as you said, not mem=256. Justin. On Wed, 24 Jan 2007, Justin Piszcz wrote: Is it highmem-related? Can you try it with mem=256M? Bad idea, the kernel crashes and burns when I use mem=256, I had to boot 2.6.20-rc5-6 single to get back into my machine, very

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.6.19.2

2007-01-24 Thread Justin Piszcz
On Mon, 22 Jan 2007, Andrew Morton wrote: On Sun, 21 Jan 2007 14:27:34 -0500 (EST) Justin Piszcz [EMAIL PROTECTED] wrote: Why does copying an 18GB file on a 74GB raptor raid1 cause the kernel to invoke the OOM killer and kill all of my processes? What's that? Software raid or hardware

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.6.19.2

2007-01-24 Thread Justin Piszcz
On Thu, 25 Jan 2007, Pavel Machek wrote: Hi! Is it highmem-related? Can you try it with mem=256M? Bad idea, the kernel crashes and burns when I use mem=256, I had to boot 2.6.20-rc5-6 single to get back into my machine, very nasty. Remember I use an onboard graphics controller

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.6.19.2

2007-01-24 Thread Justin Piszcz
On Thu, 25 Jan 2007, Pavel Machek wrote: Hi! Is it highmem-related? Can you try it with mem=256M? Bad idea, the kernel crashes and burns when I use mem=256, I had to boot 2.6.20-rc5-6 single to get back into my machine, very nasty. Remember I use an onboard graphics controller

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-23 Thread Justin Piszcz
On Tue, 23 Jan 2007, Neil Brown wrote: On Monday January 22, [EMAIL PROTECTED] wrote: Justin Piszcz wrote: My .config is attached, please let me know if any other information is needed and please CC (lkml) as I am not on the list, thanks! Running Kernel 2.6.19.2 on a MD RAID5

Re: change stripe_cache_size freezes the whole raid

2007-01-23 Thread Justin Piszcz
I can try and do this later this week possibly. Justin. On Tue, 23 Jan 2007, Neil Brown wrote: On Monday January 22, [EMAIL PROTECTED] wrote: Hi, Yesterday I tried to increase the value of stripe_cache_size to see if I could get better performance or not. I increased the value from 2048

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-23 Thread Justin Piszcz
On Tue, 23 Jan 2007, Michael Tokarev wrote: Justin Piszcz wrote: [] Is this a bug that can or will be fixed or should I disable pre-emption on critical and/or server machines? Disabling pre-emption on critical and/or server machines seems to be a good idea in the first place. IMHO

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-23 Thread Justin Piszcz
On Tue, 23 Jan 2007, Michael Tokarev wrote: Justin Piszcz wrote: On Tue, 23 Jan 2007, Michael Tokarev wrote: Disabling pre-emption on critical and/or server machines seems to be a good idea in the first place. IMHO anyway.. ;) So bottom line is make sure not to use preemption

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible (multi-threaded USB no go)

2007-01-22 Thread Justin Piszcz
On Sun, 21 Jan 2007, Greg KH wrote: On Sun, Jan 21, 2007 at 12:29:51PM -0500, Justin Piszcz wrote: On Sun, 21 Jan 2007, Justin Piszcz wrote: Good luck, Jurriaan -- What does ELF stand for (in respect to Linux?) ELF is the first rock group

Re: change stripe_cache_size freezes the whole raid

2007-01-22 Thread Justin Piszcz
On Mon, 22 Jan 2007, kyle wrote: Hi, Yesterday I tried to increase the value of stripe_cache_size to see if I could get better performance or not. I increased the value from 2048 to something like 16384. After I did that, the raid5 froze. Any process reading from or writing to it got stuck in D state.

Re: change stripe_cache_size freezes the whole raid

2007-01-22 Thread Justin Piszcz
On Mon, 22 Jan 2007, kyle wrote: On Mon, 22 Jan 2007, kyle wrote: Hi, Yesterday I tried to increase the value of stripe_cache_size to see if I could get better performance or not. I increased the value from 2048 to something like 16384. After I did that, the raid5 froze.

Re: change stripe_cache_size freezes the whole raid

2007-01-22 Thread Justin Piszcz
On Mon, 22 Jan 2007, Steve Cousins wrote: Justin Piszcz wrote: Yes, I noticed this bug too, if you change it too many times or change it at the 'wrong' time, it hangs up when you echo a number > /proc/stripe_cache_size. Basically don't run it more than once and don't run

Re: change stripe_cache_size freezes the whole raid

2007-01-22 Thread Justin Piszcz
On Mon, 22 Jan 2007, Steve Cousins wrote: Justin Piszcz wrote: Yes, I noticed this bug too, if you change it too many times or change it at the 'wrong' time, it hangs up when you echo a number > /proc/stripe_cache_size. Basically don't run it more than once and don't run

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.6.19.2

2007-01-22 Thread Justin Piszcz
On Mon, 22 Jan 2007, Pavel Machek wrote: On Sun 2007-01-21 14:27:34, Justin Piszcz wrote: Why does copying an 18GB file on a 74GB raptor raid1 cause the kernel to invoke the OOM killer and kill all of my processes? Doing this on a single disk 2.6.19.2 is OK, no issues. However

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.6.19.2

2007-01-22 Thread Justin Piszcz
What's that? Software raid or hardware raid? If the latter, which driver? Software RAID (md). On Mon, 22 Jan 2007, Andrew Morton wrote: On Sun, 21 Jan 2007 14:27:34 -0500 (EST) Justin Piszcz [EMAIL PROTECTED] wrote: Why does copying an 18GB file on a 74GB raptor raid1 cause the kernel

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible

2007-01-21 Thread Justin Piszcz
On Sun, 21 Jan 2007, [EMAIL PROTECTED] wrote: From: Justin Piszcz [EMAIL PROTECTED] Date: Sat, Jan 20, 2007 at 04:03:42PM -0500 My swap is on, 2GB ram and 2GB of swap on this machine. I can't go back to 2.6.17.13 as it does not recognize the NICs in my machine correctly

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible

2007-01-21 Thread Justin Piszcz
On Sun, 21 Jan 2007, [EMAIL PROTECTED] wrote: From: Justin Piszcz [EMAIL PROTECTED] Date: Sat, Jan 20, 2007 at 04:03:42PM -0500 My swap is on, 2GB ram and 2GB of swap on this machine. I can't go back to 2.6.17.13 as it does not recognize the NICs in my machine correctly

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible

2007-01-21 Thread Justin Piszcz
On Sun, 21 Jan 2007, [EMAIL PROTECTED] wrote: From: Justin Piszcz [EMAIL PROTECTED] Date: Sun, Jan 21, 2007 at 11:48:07AM -0500 What about all of the changes with NAT? I see that it operates on level-3/network wise, I enabled that and backward compatiblity support as well

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible (multi-threaded USB no go)

2007-01-21 Thread Justin Piszcz
On Sun, 21 Jan 2007, Justin Piszcz wrote: Good luck, Jurriaan -- What does ELF stand for (in respect to Linux?) ELF is the first rock group that Ronnie James Dio performed with back in the early 1970's. In contrast, a.out is a misspelling of the French word

Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-20 Thread Justin Piszcz
My .config is attached, please let me know if any other information is needed and please CC (lkml) as I am not on the list, thanks! Running Kernel 2.6.19.2 on a MD RAID5 volume. Copying files over Samba to the RAID5 running XFS. Any idea what happened here? [473795.214705] BUG: unable to

Re: Kernel 2.6.19.2 New RAID 5 Bug (oops when writing Samba - RAID5)

2007-01-20 Thread Justin Piszcz
On Sat, 20 Jan 2007, Justin Piszcz wrote: My .config is attached, please let me know if any other information is needed and please CC (lkml) as I am not on the list, thanks! Running Kernel 2.6.19.2 on a MD RAID5 volume. Copying files over Samba to the RAID5 running XFS. Any idea

2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible

2007-01-20 Thread Justin Piszcz
Perhaps it's time to go back to a stable kernel (2.6.17.13)? Anyway, when I run a cp 18gb_file 18gb_file.2 on a dual raptor sw raid1 partition, the OOM killer goes into effect and kills almost all my processes. Completely 100% reproducible. Does 2.6.19.2 have some sort of memory allocation bug as

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible

2007-01-20 Thread Justin Piszcz
On Sat, 20 Jan 2007, Avuton Olrich wrote: On 1/20/07, Justin Piszcz [EMAIL PROTECTED] wrote: Perhaps its time to back to a stable (2.6.17.13 kernel)? Anyway, when I run a cp 18gb_file 18gb_file.2 on a dual raptor sw raid1 partition, the OOM killer goes into effect and kills almost all

Re: 2.6.19.2, cp 18gb_file 18gb_file.2 = OOM killer, 100% reproducible

2007-01-20 Thread Justin Piszcz
On Sat, 20 Jan 2007, Justin Piszcz wrote: On Sat, 20 Jan 2007, Avuton Olrich wrote: On 1/20/07, Justin Piszcz [EMAIL PROTECTED] wrote: Perhaps its time to back to a stable (2.6.17.13 kernel)? Anyway, when I run a cp 18gb_file 18gb_file.2 on a dual raptor sw raid1 partition

Re: bad performance on RAID 5

2007-01-17 Thread Justin Piszcz
Sevrin Robstad wrote: I'm suffering from bad performance on my RAID5. An echo check > /sys/block/md0/md/sync_action gives a speed of only about 5000K/sec, and HIGH load average: # uptime 20:03:55 up 8 days, 19:55, 1 user, load average: 11.70, 4.04, 1.52 kernel is 2.6.18.1.2257.fc5 mdadm is

Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-13 Thread Justin Piszcz
On Sat, 13 Jan 2007, Al Boldi wrote: Justin Piszcz wrote: On Sat, 13 Jan 2007, Al Boldi wrote: Justin Piszcz wrote: Btw, max sectors did improve my performance a little bit but stripe_cache+read_ahead were the main optimizations that made everything go faster by about ~1.5x

Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read 195MB/s write)

2007-01-12 Thread Justin Piszcz
On Fri, 12 Jan 2007, Michael Tokarev wrote: Justin Piszcz wrote: Using 4 raptor 150s: Without the tweaks, I get 111MB/s write and 87MB/s read. With the tweaks, 195MB/s write and 211MB/s read. Using kernel 2.6.19.1. Without the tweaks and with the tweaks: # Stripe tests
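
The "tweaks" referred to in this thread are the stripe cache, read-ahead and max-sectors settings; a sketch with example values, assuming a hypothetical /dev/md0 built from sda-sdd:

  # larger stripe cache helps RAID5 writes
  echo 8192 > /sys/block/md0/md/stripe_cache_size
  # bigger read-ahead on the md device (512-byte sectors)
  blockdev --setra 16384 /dev/md0
  # larger per-request size on the member disks
  for d in sda sdb sdc sdd; do
      echo 512 > /sys/block/$d/queue/max_sectors_kb
  done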
