Re: [BUG] OOPS 2.6.24.2 raid5 write with ioatdma

2008-02-15 Thread Dan Williams
this locally. Regards, Dan ioat: fix 'ack' handling, driver must ensure that 'ack' is zero From: Dan Williams [EMAIL PROTECTED] Initialize 'ack' to zero in case the descriptor has been recycled. Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/dma/ioat_dma.c |2 ++ 1 files changed, 2
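
(A minimal sketch of the fix described above, with illustrative names rather than the actual ioat_dma.c fields: a descriptor taken from a recycle list may still carry the 'ack' flag set by its previous owner, so the driver resets it to zero before reissuing the descriptor.)

    /* Sketch only -- simplified model of the 'ack' reset, not the real driver code. */
    struct demo_desc {
        int ack;        /* set by the client once the descriptor may be freed */
        int cookie;     /* completion cookie, illustrative */
    };

    static struct demo_desc *demo_get_next_descriptor(struct demo_desc *pool, int idx)
    {
        struct demo_desc *new = &pool[idx];

        new->ack = 0;   /* descriptor may have been recycled: forget old state */
        new->cookie = 0;
        return new;
    }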

Re: [PATCH 001 of 6] md: Fix an occasional deadlock in raid5

2008-01-15 Thread Dan Williams
heheh. it's really easy to reproduce the hang without the patch -- i could hang the box in under 20 min on 2.6.22+ w/XFS and raid5 on 7x750GB. i'll try with ext3... Dan's experiences suggest it won't happen with ext3 (or is even more rare), which would explain why this is overall a rare

Re: 2.6.24-rc6 reproducible raid5 hang

2008-01-10 Thread Dan Williams
do not see this change making the situation any worse. In fact, it may make it a bit better since there is a higher chance for the thread submitting i/o to MD to do its own i/o to the backing disks. Reviewed-by: Dan Williams [EMAIL PROTECTED] - To unsubscribe from this list: send the line

Re: 2.6.24-rc6 reproducible raid5 hang

2008-01-09 Thread Dan Williams
On Sun, 2007-12-30 at 10:58 -0700, dean gaudet wrote: On Sat, 29 Dec 2007, Dan Williams wrote: On Dec 29, 2007 1:58 PM, dean gaudet [EMAIL PROTECTED] wrote: On Sat, 29 Dec 2007, Dan Williams wrote: On Dec 29, 2007 9:48 AM, dean gaudet [EMAIL PROTECTED] wrote: hmm bummer

Re: 2.6.24-rc6 reproducible raid5 hang

2008-01-09 Thread Dan Williams
On Jan 9, 2008 5:09 PM, Neil Brown [EMAIL PROTECTED] wrote: On Wednesday January 9, [EMAIL PROTECTED] wrote: On Sun, 2007-12-30 at 10:58 -0700, dean gaudet wrote: i have evidence pointing to d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1

Re: 2.6.24-rc6 reproducible raid5 hang

2008-01-09 Thread Dan Williams
On Wed, 2008-01-09 at 20:57 -0700, Neil Brown wrote: So I'm inclined to leave it as do as much work as is available to be done as that is simplest. But I can probably be talked out of it with a convincing argument. Well, in an age of CFS and CFQ it smacks of 'unfairness'. But does that

Re: Raid 1, new disk can't be added after replacing faulty disk

2008-01-07 Thread Dan Williams
On Jan 7, 2008 6:44 AM, Radu Rendec [EMAIL PROTECTED] wrote: I'm experiencing trouble when trying to add a new disk to a raid 1 array after having replaced a faulty disk. [..] # mdadm --version mdadm - v2.6.2 - 21st May 2007 [..] However, this happens with both mdadm 2.6.2 and 2.6.4. I

Re: [PATCH] md: Fix data corruption when a degraded raid5 array is reshaped.

2008-01-03 Thread Dan Williams
: [EMAIL PROTECTED] Cc: Dan Williams [EMAIL PROTECTED] Signed-off-by: Neil Brown [EMAIL PROTECTED] Acked-by: Dan Williams [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a message to [EMAIL PROTECTED] More majordomo info at http

Re: [PATCH] md: Fix data corruption when a degraded raid5 array is reshaped.

2008-01-03 Thread Dan Williams
now is safer long-term. This bug exists in 2.6.23 and 2.6.24-rc Cc: [EMAIL PROTECTED] Cc: Dan Williams [EMAIL PROTECTED] Signed-off-by: Neil Brown [EMAIL PROTECTED] Acked-by: Dan Williams [EMAIL PROTECTED] On closer look the safer test is: !test_bit(STRIPE_OP_COMPUTE_BLK, sh

mdadm: unable to add a disk to degraded raid1 array

2007-12-29 Thread Dan Williams
In case someone else happens upon this I have found that mdadm = v2.6.2 cannot add a disk to a degraded raid1 array created with mdadm 2.6.2. I bisected the problem down to mdadm git commit 2fb749d1b7588985b1834e43de4ec5685d0b8d26 which appears to make an incompatible change to the super block's

Re: 2.6.24-rc6 reproducible raid5 hang

2007-12-29 Thread Dan Williams
On Dec 29, 2007 9:48 AM, dean gaudet [EMAIL PROTECTED] wrote: hmm bummer, i'm doing another test (rsync 3.5M inodes from another box) on the same 64k chunk array and had raised the stripe_cache_size to 1024... and got a hang. this time i grabbed stripe_cache_active before bumping the size

Re: 2.6.24-rc6 reproducible raid5 hang

2007-12-29 Thread Dan Williams
On Dec 29, 2007 1:58 PM, dean gaudet [EMAIL PROTECTED] wrote: On Sat, 29 Dec 2007, Dan Williams wrote: On Dec 29, 2007 9:48 AM, dean gaudet [EMAIL PROTECTED] wrote: hmm bummer, i'm doing another test (rsync 3.5M inodes from another box) on the same 64k chunk array and had raised

Re: HELP! New disks being dropped from RAID 6 array on every reboot

2007-11-23 Thread Dan Williams
On Nov 23, 2007 11:19 AM, Joshua Johnson [EMAIL PROTECTED] wrote: Greetings, long time listener, first time caller. I recently replaced a disk in my existing 8 disk RAID 6 array. Previously, all disks were PATA drives connected to the motherboard IDE and 3 promise Ultra 100/133 controllers.

Re: PROBLEM: raid5 hangs

2007-11-14 Thread Dan Williams
On Nov 14, 2007 5:05 PM, Justin Piszcz [EMAIL PROTECTED] wrote: On Wed, 14 Nov 2007, Bill Davidsen wrote: Justin Piszcz wrote: This is a known bug in 2.6.23 and should be fixed in 2.6.23.2 if the RAID5 bio* patches are applied. Note below he's running 2.6.22.3 which doesn't have the bug

Re: kernel panic (2.6.23.1-fc7) in drivers/md/raid5.c:144

2007-11-13 Thread Dan Williams
[ Adding Neil, stable@, DaveJ, and GregKH to the cc ] On Nov 13, 2007 11:20 AM, Peter [EMAIL PROTECTED] wrote: Hi I had a 3 disc raid5 array running fine with Fedora 7 (32bit) kernel 2.6.23.1-fc7 on an old Athlon XP using two sata_sil cards. I replaced the hardware with an Athlon64 X2

Re: [stable] [PATCH 000 of 2] md: Fixes for md in 2.6.23

2007-11-13 Thread Dan Williams
On Nov 13, 2007 5:23 PM, Greg KH [EMAIL PROTECTED] wrote: On Tue, Nov 13, 2007 at 04:22:14PM -0800, Greg KH wrote: On Mon, Oct 22, 2007 at 05:15:27PM +1000, NeilBrown wrote: It appears that a couple of bugs slipped in to md for 2.6.23. These two patches fix them and are appropriate for

Re: [stable] [PATCH 000 of 2] md: Fixes for md in 2.6.23

2007-11-13 Thread Dan Williams
void handle_stripe5(struct stripe_head *sh) raid5-fix-unending-write-sequence.patch is in -mm and I believe is waiting on an Acked-by from Neil? thanks, greg k-h Thanks, Dan raid5: fix clearing of biofill operations From: Dan Williams [EMAIL PROTECTED] ops_complete_biofill() runs outside

[PATCH] raid5: fix unending write sequence

2007-11-08 Thread Dan Williams
From: Dan Williams [EMAIL PROTECTED] debug output from Joël's system handling stripe 7629696, state=0x14 cnt=1, pd_idx=2 ops=0:0:0 check 5: state 0x6 toread read write f800ffcffcc0 written check 4: state 0x6 toread read

Re: 2.6.23.1: mdadm/raid5 hung/d-state

2007-11-06 Thread Dan Williams
, pending)) - ops_run_postxor(sh, tx); + ops_run_postxor(sh, tx, pending); if (test_bit(STRIPE_OP_CHECK, pending)) ops_run_check(sh); raid5: fix unending write sequence From: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 16

Re: 2.6.23.1: mdadm/raid5 hung/d-state

2007-11-05 Thread Dan Williams
On 11/4/07, Justin Piszcz [EMAIL PROTECTED] wrote: On Mon, 5 Nov 2007, Neil Brown wrote: On Sunday November 4, [EMAIL PROTECTED] wrote: # ps auxww | grep D USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 273 0.0 0.0 0 0 ? D Oct21

Re: 2.6.23.1: mdadm/raid5 hung/d-state

2007-11-05 Thread Dan Williams
On 11/5/07, Justin Piszcz [EMAIL PROTECTED] wrote: [..] Are you seeing the same md thread takes 100% of the CPU that Joël is reporting? Yes, in another e-mail I posted the top output with md3_raid5 at 100%. This seems too similar to Joël's situation for them not to be correlated, and it

Re: Bug in processing dependencies by async_tx_submit() ?

2007-11-01 Thread Dan Williams
On 11/1/07, Yuri Tikhonov [EMAIL PROTECTED] wrote: Hi Dan, Honestly I tried to fix this quickly using an approach similar to the one you proposed, with one addition though (in fact, deletion of BUG_ON(chan == tx->chan) in async_tx_run_dependencies()). And this led to a kernel stack overflow.

Re: [BUG] Raid1/5 over iSCSI trouble

2007-10-27 Thread Dan Williams
On 10/27/07, BERTRAND Joël [EMAIL PROTECTED] wrote: Dan Williams wrote: Can you collect some oprofile data, as Ming suggested, so we can maybe see what md_d0_raid5 and istd1 are fighting about? Hopefully it is as painless to run on sparc as it is on IA: opcontrol --start --vmlinux

Re: MD driver document

2007-10-24 Thread Dan Williams
On 10/24/07, tirumalareddy marri [EMAIL PROTECTED] wrote: Hi, I am looking for the best way of understanding the MD driver (including raid5/6) architecture. I am developing a driver for a PPC-based SoC. I have done some code reading and tried to use a HW debugger to walk through the code.

Re: [BUG] Raid1/5 over iSCSI trouble

2007-10-24 Thread Dan Williams
On 10/24/07, BERTRAND Joël [EMAIL PROTECTED] wrote: Hello, Any news about this trouble? Any idea? I'm trying to fix it, but I don't see any specific interaction between raid5 and istd. Has anyone tried to reproduce this bug on an arch other than sparc64? I only use sparc32

Re: async_tx: get best channel

2007-10-23 Thread Dan Williams
On Fri, 2007-10-19 at 05:23 -0700, Yuri Tikhonov wrote: Hello Dan, Hi Yuri, sorry it has taken me so long to get back to you... I have a suggestion regarding the async_tx_find_channel() procedure. First, a little introduction. Some processors (e.g. ppc440spe) have several DMA

Re: [BUG] Raid1/5 over iSCSI trouble

2007-10-19 Thread Dan Williams
On Fri, 2007-10-19 at 14:04 -0700, BERTRAND Joël wrote: Sorry for this last mail. I have found another mistake, but I don't know if this bug comes from iscsi-target or raid5 itself. iSCSI target is disconnected because istd1 and md_d0_raid5 kernel threads use 100% of CPU each !

Re: [BUG] Raid5 trouble

2007-10-19 Thread Dan Williams
On Fri, 2007-10-19 at 01:04 -0700, BERTRAND Joël wrote: I never see any oops with this patch. But I cannot create a RAID1 array with a local RAID5 volume and a foreign RAID5 array exported by iSCSI. iSCSI seems to work fine, but RAID1 creation randomly aborts due to an unknown SCSI

Re: [BUG] Raid5 trouble

2007-10-17 Thread Dan Williams
On 10/17/07, Dan Williams [EMAIL PROTECTED] wrote: On 10/17/07, BERTRAND Joël [EMAIL PROTECTED] wrote: BERTRAND Joël wrote: Hello, I run 2.6.23 linux kernel on two T1000 (sparc64) servers. Each server has a partitionable raid5 array (/dev/md/d0) and I have to synchronize

Re: [BUG] Raid5 trouble

2007-10-17 Thread Dan Williams
into handle_stripe5 and adds some debug information. -- Dan raid5: fix clearing of biofill operations (try2) From: Dan Williams [EMAIL PROTECTED] ops_complete_biofill() runs outside of spin_lock(&sh->lock) and clears the 'pending' and 'ack' bits. Since the test_and_ack_op() macro only checks against 'complete
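
(Hedged illustration of the bookkeeping described above; the names follow the preview, but the real 2.6.23 raid5.c differs in detail. An operation moves through 'pending' -> 'ack' -> 'complete'; if a completion handler clears 'pending' and 'ack' without holding the stripe lock, a check that only looks at 'complete' can retire a freshly issued request before it has run.)

    /* Simplified model, not the kernel macro itself. */
    struct demo_stripe {
        unsigned long pending;   /* operation requested */
        unsigned long ack;       /* operation handed to raid5_run_ops */
        unsigned long complete;  /* operation finished */
    };

    /* Acknowledge 'op' only when it is requested and not yet complete, otherwise
     * drop it from the local work mask.  Clearing sh->pending and sh->ack from a
     * completion path that runs outside the lock defeats this test, which is what
     * the patch above guards against. */
    static void demo_test_and_ack_op(int op, struct demo_stripe *sh, unsigned long *work)
    {
        if (test_bit(op, &sh->pending) && !test_bit(op, &sh->complete)) {
            if (test_and_set_bit(op, &sh->ack))
                clear_bit(op, work);
        } else {
            clear_bit(op, work);
        }
    }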

Re: mdadm: /dev/sda1 is too small: 0K

2007-10-13 Thread Dan Williams
On 10/13/07, Hod Greeley [EMAIL PROTECTED] wrote: Hello, I tried to create a raid device starting with foo:~ 1032% mdadm --create -l1 -n2 /dev/md1 /dev/sda1 missing mdadm: /dev/sda1 is too small: 0K mdadm: create aborted Quick sanity check, is /dev/sda1 still a block device node with

Re: [PATCH -mm 0/4] raid5: stripe_queue (+20% to +90% write performance)

2007-10-09 Thread Dan Williams
On Mon, 2007-10-08 at 23:21 -0700, Neil Brown wrote: On Saturday October 6, [EMAIL PROTECTED] wrote: Neil, Here is the latest spin of the 'stripe_queue' implementation. Thanks to raid6+bitmap testing done by Mr. James W. Laferriere there have been several cleanups and fixes since the

Re: [PATCH -mm 0/4] raid5: stripe_queue (+20% to +90% write performance)

2007-10-07 Thread Dan Williams
On 10/6/07, Justin Piszcz [EMAIL PROTECTED] wrote: On Sat, 6 Oct 2007, Dan Williams wrote: Neil, Here is the latest spin of the 'stripe_queue' implementation. Thanks to raid6+bitmap testing done by Mr. James W. Laferriere there have been several cleanups and fixes since the last

[PATCH -mm 0/4] raid5: stripe_queue (+20% to +90% write performance)

2007-10-06 Thread Dan Williams
. Andrew, These are updated in the git-md-accel tree, but I will work the finalized versions through Neil's 'Signed-off-by' path. Dan Williams (4): raid5: add the stripe_queue object for tracking raid io requests (rev3) raid5: split allocation of stripe_heads and stripe_queues

[PATCH -mm 1/4] raid5: add the stripe_queue object for tracking raid io requests (rev3)

2007-10-06 Thread Dan Williams
-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 564 +++- include/linux/raid/raid5.h | 28 +- 2 files changed, 364 insertions(+), 228 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index f96dea9..a13de7d 100644

[PATCH -mm 2/4] raid5: split allocation of stripe_heads and stripe_queues

2007-10-06 Thread Dan Williams
. Laferriere [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 316 include/linux/raid/raid5.h | 11 +- 2 files changed, 239 insertions(+), 88 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md

[PATCH -mm 3/4] raid5: convert add_stripe_bio to add_queue_bio

2007-10-06 Thread Dan Williams
a stripe_head is attached. Tested-by: Mr. James W. Laferriere [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 53 include/linux/raid/raid5.h |6 + 2 files changed, 40 insertions(+), 19 deletions

[PATCH -mm 4/4] raid5: use stripe_queues to prioritize the most deserving requests (rev7)

2007-10-06 Thread Dan Williams
and release_queue to remove STRIPE_QUEUE_HANDLE and sq->sh back references * kill init_sh and allocate init_sq on the stack Tested-by: Mr. James W. Laferriere [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 843

[GIT PULL] async-tx/md-accel fixes and documentation for 2.6.23

2007-09-24 Thread Dan Williams
Linus, please pull from: git://lost.foo-projects.org/~dwillia2/git/iop async-tx-fixes-for-linus to receive: Dan Williams (3): async_tx: usage documentation and developer notes (v2) async_tx: fix dma_wait_for_async_tx raid5: fix 2 bugs in ops_complete_biofill The raid5

[PATCH 2.6.23-rc7 0/3] async_tx and md-accel fixes for 2.6.23

2007-09-20 Thread Dan Williams
Fix a couple bugs and provide documentation for the async_tx api. Neil, please 'ack' patch #3. git://lost.foo-projects.org/~dwillia2/git/iop async-tx-fixes-for-linus Dan Williams (3): async_tx: usage documentation and developer notes async_tx: fix dma_wait_for_async_tx raid5

[PATCH 2.6.23-rc7 1/3] async_tx: usage documentation and developer notes

2007-09-20 Thread Dan Williams
Signed-off-by: Dan Williams [EMAIL PROTECTED] --- Documentation/crypto/async-tx-api.txt | 217 + 1 files changed, 217 insertions(+), 0 deletions(-) diff --git a/Documentation/crypto/async-tx-api.txt b/Documentation/crypto/async-tx-api.txt new file mode 100644

[PATCH 2.6.23-rc7 2/3] async_tx: fix dma_wait_for_async_tx

2007-09-20 Thread Dan Williams
Fix dma_wait_for_async_tx to not loop forever in the case where a dependency chain is longer than two entries. This condition will not happen with current in-kernel drivers, but fix it for future drivers. Found-by: Saeed Bishara [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED
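
(A sketch of the principle behind the fix, using made-up types and a hypothetical demo_sync_wait() helper: instead of polling only the descriptor that was passed in, walk the parent pointers back to the oldest dependency that has not finished yet, wait for it, and repeat until the requested descriptor completes. With a chain of two links the difference is invisible, which is why the bug only shows up on longer chains.)

    struct demo_tx {
        struct demo_tx *parent;  /* descriptor this one depends on */
        int done;
    };

    static void demo_sync_wait(struct demo_tx *tx);  /* hypothetical: poll until tx->done */

    static void demo_wait_for_tx(struct demo_tx *tx)
    {
        struct demo_tx *iter;

        do {
            /* walk back to the oldest dependency that is still outstanding */
            for (iter = tx; iter->parent && !iter->parent->done; iter = iter->parent)
                ;
            demo_sync_wait(iter);
        } while (!tx->done);
    }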

[PATCH 2.6.23-rc7 3/3] raid5: fix ops_complete_biofill

2007-09-20 Thread Dan Williams
). ops_complete_biofill can run in tasklet context, so rather than upgrading all the stripe locks from spin_lock to spin_lock_bh this patch just moves read completion handling back into handle_stripe. Found-by: Yuri Tikhonov [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c
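
(General locking background for the trade-off mentioned above, not the raid5 patch itself: once a lock can be taken from tasklet/softirq context, every process-context holder needs the _bh variant, otherwise the tasklet can interrupt the holder on the same CPU and spin on the lock forever. Moving the completion work back into handle_stripe avoids paying that cost on every stripe lock acquisition.)

    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(demo_lock);

    static void demo_tasklet_handler(unsigned long data)
    {
        spin_lock(&demo_lock);          /* runs in softirq context */
        /* ... completion bookkeeping ... */
        spin_unlock(&demo_lock);
    }

    static void demo_process_context_path(void)
    {
        spin_lock_bh(&demo_lock);       /* keep the tasklet out while we hold it */
        /* ... stripe state updates ... */
        spin_unlock_bh(&demo_lock);
    }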

Re: md raid acceleration and the async_tx api

2007-09-13 Thread Dan Williams
On 9/13/07, Yuri Tikhonov [EMAIL PROTECTED] wrote: Hi Dan, On Friday 07 September 2007 20:02, you wrote: You need to fetch from the 'md-for-linus' tree. But I have attached them as well. git fetch git://lost.foo-projects.org/~dwillia2/git/iop md-for-linus:md-for-linus Thanks.

Re: md raid acceleration and the async_tx api

2007-08-30 Thread Dan Williams
-for-linus:refs/heads/md-for-linus raid5: fix the 'more_to_read' case in ops_complete_biofill From: Dan Williams [EMAIL PROTECTED] Prevent ops_complete_biofill from running concurrently with add_queue_bio --- drivers/md/raid5.c | 33 +++-- 1 files changed, 19

Re: [md-accel PATCH 16/19] dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines

2007-08-30 Thread Dan Williams
On 8/30/07, saeed bishara [EMAIL PROTECTED] wrote: you are right, I've another question regarding the function dma_wait_for_async_tx from async_tx.c, here is the body of the code: /* poll through the dependency chain, return when tx is complete */ 1.do { 2. iter

Re: raid5:md3: kernel BUG , followed by , Silent halt .

2007-08-27 Thread Dan Williams
On 8/25/07, Mr. James W. Laferriere [EMAIL PROTECTED] wrote: Hello Dan , On Mon, 20 Aug 2007, Dan Williams wrote: On 8/18/07, Mr. James W. Laferriere [EMAIL PROTECTED] wrote: Hello All , Here we go again . Again attempting to do bonnie++ testing on a small array

Re: Patch for boot-time assembly of v1.x-metadata-based soft (MD) arrays

2007-08-26 Thread Dan Williams
On 8/26/07, Justin Piszcz [EMAIL PROTECTED] wrote: On Sun, 26 Aug 2007, Abe Skolnik wrote: Dear Mr./Dr./Prof. Brown et al, I recently had the unpleasant experience of creating an MD array for the purpose of booting off it and then not being able to do so. Since I had already made

Re: Patch for boot-time assembly of v1.x-metadata-based soft (MD) arrays: reasoning and future plans

2007-08-26 Thread Dan Williams
On 8/26/07, Abe Skolnik [EMAIL PROTECTED] wrote: Dear Mr./Dr. Williams, Just Dan is fine :-) Because you can rely on the configuration file to be certain about which disks to pull in and which to ignore. Without the config file the auto-detect routine may not always do the right thing

Re: raid5:md3: kernel BUG , followed by , Silent halt .

2007-08-20 Thread Dan Williams
(stripe-queue) , Dan Williams [EMAIL PROTECTED] Hello James, Thanks for the report. I tried to reproduce this on my system, no luck. However it looks like there is a potential race between 'handle_queue' and 'add_queue_bio'. The attached patch moves these critical sections under spin_lock(&sq->lock

Re: bonnie++ benchmarks for ext2,ext3,ext4,jfs,reiserfs,xfs,zfs on software raid 5

2007-07-30 Thread Dan Williams
[trimmed all but linux-raid from the cc] On 7/30/07, Justin Piszcz [EMAIL PROTECTED] wrote: CONFIG: Software RAID 5 (400GB x 6): Default mkfs parameters for all filesystems. Kernel was 2.6.21 or 2.6.22, did these a while ago. Can you give 2.6.22.1-iop1 a try to see what effect it has on

[GIT PATCH 0/2] stripe-queue for 2.6.23 consideration

2007-07-22 Thread Dan Williams
insertions(+), 407 deletions(-) Dan Williams (2): raid5: add the stripe_queue object for tracking raid io requests (take2) raid5: use stripe_queues to prioritize the most deserving requests (take4) I initially considered them 2.6.24 material but after fixing the sync+io data corruption

[GIT PATCH 1/2] raid5: add the stripe_queue object for tracking raid io requests (take2)

2007-07-22 Thread Dan Williams
. Pre-patch throughput hovers at ~85MB/s for this dd command. Changes in take2: * leave the flags with the buffers, prevents a data corruption issue whereby stale buffer state flags are attached to newly initialized buffers Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c
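
(Hedged sketch of the take2 change called out above, with illustrative structures: keeping each buffer's state flags inside the buffer itself means a freshly initialized buffer cannot inherit flags left behind in a separately managed flag array.)

    struct demo_dev_buffer {
        void *page;              /* backing page, illustrative */
        unsigned long flags;     /* R5_* style state, lives and dies with the buffer */
    };

    static void demo_init_buffer(struct demo_dev_buffer *buf, void *page)
    {
        buf->page = page;
        buf->flags = 0;          /* no stale state can leak in from a previous user */
    }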

[RFT] 2.6.22.1-iop1 for improved sequential write performance (stripe-queue)

2007-07-19 Thread Dan Williams
Per Bill Davidsen's request I have made available a 2.6.22.1 based kernel with the current raid5 performance changes I have been working on: 1/ Offload engine acceleration (recently merged for the 2.6.23 development cycle) 2/ Stripe-queue, an evolutionary change to the raid5 queuing model (take4)

[GIT PULL] ioat fixes, raid5 acceleration, and the async_tx api

2007-07-13 Thread Dan Williams
offload copies for TCP when there will be a context switch Dan Aloni (1): I/OAT: fix I/OAT for kexec Dan Williams (20): dmaengine: refactor dmaengine around dma_async_tx_descriptor dmaengine: make clients responsible for managing channels xor: make 'xor_blocks' a library

[-mm PATCH 0/2] 74% decrease in dispatched writes, stripe-queue take3

2007-07-13 Thread Dan Williams
Neil, Andrew, The following patches replace the stripe-queue patches currently in -mm. Following your suggestion, Neil, I gathered blktrace data on the number of reads generated by sequential write stimulus. It turns out that reduced pre-reading is not the cause of the performance increase, but

[-mm PATCH 1/2] raid5: add the stripe_queue object for tracking raid io requests (take2)

2007-07-13 Thread Dan Williams
compared to ~120MB/s without. Pre-patch throughput hovers at ~85MB/s for this dd command. Changes in take2: * leave the flags with the buffers, prevents a data corruption issue whereby stale buffer state flags are attached to newly initialized buffers Signed-off-by: Dan Williams [EMAIL PROTECTED

[RFC PATCH 0/2] raid5: 65% sequential-write performance improvement, stripe-queue take2

2007-07-03 Thread Dan Williams
The first take of the stripe-queue implementation[1] had a performance limiting bug in __wait_for_inactive_queue. Fixing that issue drastically changed the performance characteristics. The following data from tiobench shows the relative performance difference of the stripe-queue patchset. Unit

Re: [md-accel PATCH 03/19] xor: make 'xor_blocks' a library routine for use with async_tx

2007-06-27 Thread Dan Williams
[ trimmed the cc ] On 6/26/07, Satyam Sharma [EMAIL PROTECTED] wrote: Hi Dan, [ Minor thing ... ] Not a problem, thanks for taking a look... On 6/27/07, Dan Williams [EMAIL PROTECTED] wrote: The async_tx api tries to use a dma engine for an operation, but will fall back to an optimized

[RFC PATCH 0/2] An evolutionary change to the raid456 queuing model

2007-06-27 Thread Dan Williams
Raz's stripe-deadline patch illuminated the fact that the current queuing model leaves write performance on the table in some cases. The following patches introduce a new queuing model which attempts to recover this performance. On an ARM-based iop13xx platform I see an average 14.7% increase

[md-accel PATCH 00/19] md raid acceleration and the async_tx api

2007-06-26 Thread Dan Williams
to pull for 2.6.23. git://lost.foo-projects.org/~dwillia2/git/iop md-accel-linus Dan Williams (19): dmaengine: refactor dmaengine around dma_async_tx_descriptor dmaengine: make clients responsible for managing channels xor: make 'xor_blocks' a library routine for use

[md-accel PATCH 01/19] dmaengine: refactor dmaengine around dma_async_tx_descriptor

2007-06-26 Thread Dan Williams
, set_dest, and tx_submit descriptor specific methods Cc: Jeff Garzik [EMAIL PROTECTED] Cc: Chris Leech [EMAIL PROTECTED] Cc: Shannon Nelson [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/dma/dmaengine.c | 182 ++ drivers/dma/ioatdma.c

[md-accel PATCH 03/19] xor: make 'xor_blocks' a library routine for use with async_tx

2007-06-26 Thread Dan Williams
xor_block = xor_blocks, suggested by Adrian Bunk * ensure that xor.o initializes before md.o in the built-in case * checkpatch.pl fixes * mark calibrate_xor_blocks __init, Adrian Bunk Cc: Adrian Bunk [EMAIL PROTECTED] Cc: NeilBrown [EMAIL PROTECTED] Cc: Herbert Xu [EMAIL PROTECTED] Signed-off-by: Dan

[md-accel PATCH 04/19] async_tx: add the async_tx api

2007-06-26 Thread Dan Williams
share algorithms in the future * move large inline functions into c files * checkpatch.pl fixes * gpl v2 only correction Cc: Herbert Xu [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED] Acked-By: NeilBrown [EMAIL PROTECTED] --- crypto/Kconfig |6 crypto/Makefile

[md-accel PATCH 06/19] raid5: replace custom debug PRINTKs with standard pr_debug

2007-06-26 Thread Dan Williams
Replaces PRINTK with pr_debug, and kills the RAID5_DEBUG definition in favor of the global DEBUG definition. To get local debug messages just add '#define DEBUG' to the top of the file. Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 116
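
(Example of the convention described above: pr_debug() only produces output when DEBUG is defined for the translation unit, so defining it before the includes switches on per-file debug messages, and leaving it out compiles the calls away.)

    #define DEBUG                     /* enable pr_debug() for this file only */
    #include <linux/kernel.h>

    static void demo_trace_stripe(unsigned long long sector, int state)
    {
        pr_debug("handling stripe %llu, state=%#x\n", sector, state);
    }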

[md-accel PATCH 07/19] md: raid5_run_ops - run stripe operations outside sh-lock

2007-06-26 Thread Dan Williams
ops_complete_biofill * remove test_and_set/test_and_clear BUG_ONs, Neil Brown * remove explicit interrupt handling for channel switching, this feature was absorbed (i.e. it is now implicit) by the async_tx api Signed-off-by: Dan Williams [EMAIL PROTECTED] Acked-By: NeilBrown [EMAIL PROTECTED] --- drivers

[md-accel PATCH 08/19] md: common infrastructure for running operations with raid5_run_ops

2007-06-26 Thread Dan Williams
and completion of operations. Signed-off-by: Dan Williams [EMAIL PROTECTED] Acked-By: NeilBrown [EMAIL PROTECTED] --- drivers/md/raid5.c | 67 +--- 1 files changed, 58 insertions(+), 9 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c

[md-accel PATCH 10/19] md: handle_stripe5 - add request/completion logic for async compute ops

2007-06-26 Thread Dan Williams
the R5_Wantcompute flag there is no facility to pass the async_tx dependency chain across successive calls to raid5_run_ops. The req_compute variable protects against this case. Changelog: * remove the req_compute BUG_ON Signed-off-by: Dan Williams [EMAIL PROTECTED] Acked-By: NeilBrown [EMAIL PROTECTED

[md-accel PATCH 11/19] md: handle_stripe5 - add request/completion logic for async check ops

2007-06-26 Thread Dan Williams
. Changelog: * remove test_and_set/test_and_clear BUG_ONs, Neil Brown Signed-off-by: Dan Williams [EMAIL PROTECTED] Acked-By: NeilBrown [EMAIL PROTECTED] --- drivers/md/raid5.c | 84 1 files changed, 65 insertions(+), 19 deletions(-) diff --git

[md-accel PATCH 12/19] md: handle_stripe5 - add request/completion logic for async read ops

2007-06-26 Thread Dan Williams
arrive while raid5_run_ops is running they will not be handled until handle_stripe is scheduled to run again. Changelog: * cleanup to_read and to_fill accounting * do not fail reads that have reached the cache Signed-off-by: Dan Williams [EMAIL PROTECTED] Acked-By: NeilBrown [EMAIL PROTECTED

[md-accel PATCH 13/19] md: handle_stripe5 - add request/completion logic for async expand ops

2007-06-26 Thread Dan Williams
differentiate expand operations from normal write operations. Signed-off-by: Dan Williams [EMAIL PROTECTED] Acked-By: NeilBrown [EMAIL PROTECTED] --- drivers/md/raid5.c | 50 ++ 1 files changed, 38 insertions(+), 12 deletions(-) diff --git a/drivers/md

[md-accel PATCH 15/19] md: remove raid5 compute_block and compute_parity5

2007-06-26 Thread Dan Williams
replaced by raid5_run_ops Signed-off-by: Dan Williams [EMAIL PROTECTED] Acked-By: NeilBrown [EMAIL PROTECTED] --- drivers/md/raid5.c | 124 1 files changed, 0 insertions(+), 124 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md

[md-accel PATCH 05/19] raid5: refactor handle_stripe5 and handle_stripe6 (v2)

2007-06-26 Thread Dan Williams
. The following new routines are shared between raid5 and raid6: handle_completed_write_requests handle_requests_to_failed_array handle_stripe_expansion Changes in v2: * fixed 'conf->raid_disk-1' for the raid6 'handle_stripe_expansion' path Signed-off-by: Dan Williams [EMAIL

[md-accel PATCH 17/19] iop13xx: surface the iop13xx adma units to the iop-adma driver

2007-06-26 Thread Dan Williams
error fix from Kirill A. Shutemov * rebase for async_tx changes * add interrupt support * do not call platform register macros in driver code * remove unnecessary ARM assembly statement * checkpatch.pl fixes * gpl v2 only correction Cc: Russell King [EMAIL PROTECTED] Signed-off-by: Dan Williams

[md-accel PATCH 18/19] iop3xx: surface the iop3xx DMA and AAU units to the iop-adma driver

2007-06-26 Thread Dan Williams
] Signed-off-by: Dan Williams [EMAIL PROTECTED] --- arch/arm/mach-iop32x/glantank.c|2 arch/arm/mach-iop32x/iq31244.c |5 arch/arm/mach-iop32x/iq80321.c |3 arch/arm/mach-iop32x/n2100.c |2 arch/arm/mach-iop33x/iq80331.c |3 arch/arm

[md-accel PATCH 19/19] ARM: Add drivers/dma to arch/arm/Kconfig

2007-06-26 Thread Dan Williams
Cc: Russell King [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED] --- arch/arm/Kconfig |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 50d9f3e..0cb2d4f 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig

Re: [md-accel PATCH 00/19] md raid acceleration and the async_tx api

2007-06-26 Thread Dan Williams
On 6/26/07, Mr. James W. Laferriere [EMAIL PROTECTED] wrote: Hello Dan , On Tue, 26 Jun 2007, Dan Williams wrote: Greetings, Per Andrew's suggestion this is the md raid5 acceleration patch set updated with more thorough changelogs to lower the barrier to entry for reviewers

Re: stripe_cache_size and performance

2007-06-25 Thread Dan Williams
7. And now, the question: the best absolute 'write' performance comes with a stripe_cache_size value of 4096 (for my setup). However, any value of stripe_cache_size above 384 really, really hurts 'check' (and rebuild, one can assume) performance. Why? Question: After performance goes bad does
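
(For reference, the tunable under discussion is exposed per array through sysfs; a hedged user-space example, assuming the array is md0 and the node is /sys/block/md0/md/stripe_cache_size.)

    #include <stdio.h>

    int main(void)
    {
        /* path assumes the array is md0; adjust for the array under test */
        FILE *f = fopen("/sys/block/md0/md/stripe_cache_size", "w");

        if (!f) {
            perror("stripe_cache_size");
            return 1;
        }
        fprintf(f, "4096\n");   /* the value the poster found best for writes */
        return fclose(f) ? 1 : 0;
    }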

[PATCH git-md-accel 0/2] raid5 refactor, and pr_debug cleanup

2007-06-18 Thread Dan Williams
Andrew's concerns about the commit messages. Dan Williams (2): raid5: refactor handle_stripe5 and handle_stripe6 raid5: replace custom debug print with standard pr_debug - To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a message to [EMAIL PROTECTED] More

[PATCH git-md-accel 1/2] raid5: refactor handle_stripe5 and handle_stripe6

2007-06-18 Thread Dan Williams
. The following new routines are shared between raid5 and raid6: handle_completed_write_requests handle_requests_to_failed_array handle_stripe_expansion Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 1484

[PATCH git-md-accel 2/2] raid5: replace custom debug print with standard pr_debug

2007-06-18 Thread Dan Williams
Replaces PRINTK with pr_debug, and kills the RAID5_DEBUG definition in favor of the global DEBUG definition. To get local debug messages just add '#define DEBUG' to the top of the file. Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 116

[PATCH] md: comment add_stripe_bio

2007-06-05 Thread Dan Williams
From: Dan Williams [EMAIL PROTECTED] Document the overloading of struct bio fields. Signed-off-by: Dan Williams [EMAIL PROTECTED] --- [ drop this if you think it is too much commenting/unnecessary, but I figured I would leave some breadcrumbs for the next guy. ] drivers/md/raid5.c | 26

[PATCH 00/16] raid acceleration and asynchronous offload api for 2.6.22

2007-05-02 Thread Dan Williams
I am pleased to release this latest spin of the raid acceleration patches for merge consideration. This release aims to address all pending review items including MD bug fixes and async_tx api changes from Neil, and concerns on channel management from Chris and others. Data integrity tests using

[PATCH 01/16] dmaengine: add base support for the async_tx api

2007-05-02 Thread Dan Williams
to interoperate with async_tx calls * hookup ioat_tx_submit * convert channel capabilities to a 'cpumask_t like' bitmap Cc: Chris Leech [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/dma/dmaengine.c | 182 + drivers/dma/ioatdma.c

[PATCH 02/16] dmaengine: move channel management to the client

2007-05-02 Thread Dan Williams
to ignore a channel if it does not meet extra client specific constraints beyond simple base capabilities. This patch also fixes up the NET_DMA client to use the new mechanism. Cc: Chris Leech [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/dma/dmaengine.c | 206

[PATCH 03/16] ARM: Add drivers/dma to arch/arm/Kconfig

2007-05-02 Thread Dan Williams
Cc: Russell King [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED] --- arch/arm/Kconfig |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index e7baca2..74077e3 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig

[PATCH 04/16] dmaengine: add the async_tx api

2007-05-02 Thread Dan Williams
counts = 1 * implicitly handle hardware concerns like channel switching and interrupts, Neil Brown * remove the per operation type list, and distribute operation capabilities evenly amongst the available channels * simplify async_tx_find_channel to optimize the fast path Signed-off-by: Dan Williams

[PATCH 05/16] md: add raid5_run_ops and support routines

2007-05-02 Thread Dan Williams
* remove test_and_set/test_and_clear BUG_ONs, Neil Brown * remove explicit interrupt handling Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 539 include/linux/raid/raid5.h | 63 + 2 files changed, 599 insertions

[PATCH 06/16] md: use raid5_run_ops for stripe cache operations

2007-05-02 Thread Dan Williams
to clear 'pending' and 'ack'. Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 65 +--- 1 files changed, 56 insertions(+), 9 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 0251bca..14e9f6a 100644

[PATCH 07/16] md: move write operations to raid5_run_ops

2007-05-02 Thread Dan Williams
* remove test_and_set/test_and_clear BUG_ONs, Neil Brown Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 151 +--- 1 files changed, 130 insertions(+), 21 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index

[PATCH 08/16] md: move raid5 compute block operations to raid5_run_ops

2007-05-02 Thread Dan Williams
. Changelog: * remove the req_compute BUG_ON Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 126 +++- 1 files changed, 94 insertions(+), 32 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 03a435d

[PATCH 09/16] md: move raid5 parity checks to raid5_run_ops

2007-05-02 Thread Dan Williams
-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 80 1 files changed, 61 insertions(+), 19 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 844bd9b..f8a4522 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c

[PATCH 10/16] md: satisfy raid5 read requests via raid5_run_ops

2007-05-02 Thread Dan Williams
Use raid5_run_ops to carry out the memory copies for a raid5 read request. Changelog: * cleanup to_read and to_fill accounting * do not fail reads that have reached the cache Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 61

[PATCH 11/16] md: use async_tx and raid5_run_ops for raid5 expansion operations

2007-05-02 Thread Dan Williams
. The bulk copy operation to the new stripe is handled inline by async_tx. Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 48 1 files changed, 36 insertions(+), 12 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md

[PATCH 12/16] md: move raid5 io requests to raid5_run_ops

2007-05-02 Thread Dan Williams
handle_stripe now only updates the state of stripes. All execution of operations is moved to raid5_run_ops. Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 68 1 files changed, 10 insertions(+), 58 deletions(-) diff

[PATCH 13/16] md: remove raid5 compute_block and compute_parity5

2007-05-02 Thread Dan Williams
replaced by raid5_run_ops Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/md/raid5.c | 124 1 files changed, 0 insertions(+), 124 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index c9b91e3..74ce354 100644

[PATCH 14/16] dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines

2007-05-02 Thread Dan Williams
* convert capabilities over to dma_cap_mask_t Signed-off-by: Dan Williams [EMAIL PROTECTED] --- drivers/dma/Kconfig |8 drivers/dma/Makefile|1 drivers/dma/iop-adma.c | 1464 +++ include/asm-arm/hardware

[PATCH 15/16] iop13xx: Surface the iop13xx adma units to the iop-adma driver

2007-05-02 Thread Dan Williams
error fix from Kirill A. Shutemov * rebase for async_tx changes * add interrupt support * do not call platform register macros in driver code Cc: Russell King [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED] --- arch/arm/mach-iop13xx/setup.c | 208 include/asm

[PATCH 16/16] iop3xx: Surface the iop3xx DMA and AAU units to the iop-adma driver

2007-05-02 Thread Dan Williams
boards * do not call platform register macros in driver code * remove switch() statements for compatible register offsets/layouts * change over to bitmap based capabilities Cc: Russell King [EMAIL PROTECTED] Signed-off-by: Dan Williams [EMAIL PROTECTED] --- arch/arm/mach-iop32x/glantank.c

Re: RAID rebuild on Create

2007-04-30 Thread Dan Williams
On 4/30/07, Jan Engelhardt [EMAIL PROTECTED] wrote: Hi list, when a user does `mdadm -C /dev/md0 -l any -n whatever fits devices`, the array gets rebuilt for at least RAID1 and RAID5, even if the disk contents are most likely not of importance (otherwise we would not be creating a raid array
