Re: [md-accel PATCH 16/19] dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines

2007-08-30 Thread Dan Williams
On 8/30/07, saeed bishara <[EMAIL PROTECTED]> wrote:
> You are right. I have another question regarding the function
> dma_wait_for_async_tx() from async_tx.c; here is the body of the code:
>
>    /* poll through the dependency chain, return when tx is complete */
> 1.    do {
> 2.        iter = tx;
> 3.        while (iter->cookie == -EBUSY)
> 4.            iter = iter->parent;
> 5.
> 6.        status = dma_sync_wait(iter->chan, iter->cookie);
> 7.    } while (status == DMA_IN_PROGRESS || (iter != tx));
>
> assume that:
> - The interrupt capability is not provided.
> - Request A was sent to chan 0
> - Request B that depends on A is sent to chan 1
> - Request C that depends on B is sent to chan 2.
> - Also, assume that when C is handled by async_tx_submit(), B has still
> not been queued to the dmaengine (its cookie equals -EBUSY).
>
> In this case, dma_wait_for_async_tx() will be called for C. It looks
> to me like the do/while will loop forever, even once A completes: iter
> will point to B after line 4, so the iter != tx (i.e. iter != C)
> condition will always hold.
>
You are right.  There are no drivers in the tree that can hit this,
but it needs to be fixed up.
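
To make the livelock concrete, here is a minimal, self-contained
user-space model of that walk (hypothetical code, not the kernel
source; A <- B <- C with B never submitted, and the dma_sync_wait()
call elided since A is assumed already complete):

#include <stdio.h>

#define EBUSY 16

struct desc {
	int cookie;		/* -EBUSY until the descriptor is submitted */
	struct desc *parent;	/* the descriptor we depend on */
};

int main(void)
{
	struct desc a = { .cookie = 1,      .parent = NULL };	/* submitted and complete */
	struct desc b = { .cookie = -EBUSY, .parent = &a };	/* never submitted */
	struct desc c = { .cookie = -EBUSY, .parent = &b };
	struct desc *tx = &c, *iter;
	int spins = 0;

	do {
		iter = tx;
		while (iter->cookie == -EBUSY)
			iter = iter->parent;	/* walks back to A every pass */
		/* dma_sync_wait(iter->chan, iter->cookie) would return
		 * DMA_SUCCESS here (A is done), yet iter != tx keeps the
		 * outer loop spinning
		 */
	} while (iter != tx && ++spins < 10);	/* cap added so the model exits */

	printf("gave up after %d passes; iter stuck at A, never tx\n", spins);
	return 0;
}

With the change below, the inner walk instead stops at the last
unsubmitted descriptor whose parent has already been submitted (B in
this example), so the outer loop polls B's channel and can make
progress, rather than re-waiting forever on the already-complete A.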

I'll submit the following change:

diff --git a/crypto/async_tx/async_tx.c b/crypto/async_tx/async_tx.c
index 0350071..bc18cbb 100644
--- a/crypto/async_tx/async_tx.c
+++ b/crypto/async_tx/async_tx.c
@@ -80,6 +80,7 @@ dma_wait_for_async_tx(struct dma_async_tx_descriptor *tx)
 {
 	enum dma_status status;
 	struct dma_async_tx_descriptor *iter;
+	struct dma_async_tx_descriptor *parent;
 
 	if (!tx)
 		return DMA_SUCCESS;
@@ -87,8 +88,15 @@ dma_wait_for_async_tx(struct dma_async_tx_descriptor *tx)
 	/* poll through the dependency chain, return when tx is complete */
 	do {
 		iter = tx;
-		while (iter->cookie == -EBUSY)
-			iter = iter->parent;
+
+		/* find the root of the unsubmitted dependency chain */
+		while (iter->cookie == -EBUSY) {
+			parent = iter->parent;
+			if (parent && parent->cookie == -EBUSY)
+				iter = iter->parent;
+			else
+				break;
+		}
 
 		status = dma_sync_wait(iter->chan, iter->cookie);
 	} while (status == DMA_IN_PROGRESS || (iter != tx));

> saeed

Regards,
Dan


PAGE_SIZE=8K and bitmap

2007-08-30 Thread Jan Engelhardt
Hi,


a while back I reported a bug against 2.6.21 where creating an MD RAID
array with an internal bitmap on a sparc64 system does not work. I have
not yet heard back (or have forgotten the answer); has this been
addressed yet?
(mdadm -C /dev/md0 -l 1 -n 2 -e 1.0 -b internal /dev/ram[01])


thanks,
Jan


Re: md raid acceleration and the async_tx api

2007-08-30 Thread Dan Williams
On 8/30/07, Yuri Tikhonov <[EMAIL PROTECTED]> wrote:
>
>  Hi Dan,
>
> On Monday 27 August 2007 23:12, you wrote:
> > This still looks racy...  I think the complete fix is to make the
> > R5_Wantfill and dev_q->toread accesses atomic.  Please test the
> > following patch (also attached) and let me know if it fixes what you are
> > seeing:
>
>  Your approach doesn't help; the Bonnie++ utility hangs during the
> ReWriting stage.
>
Looking at it again, I see that what I added would not affect the
failure you are seeing.  However, I noticed that you are using a broken
version of the stripe-queue and cache_arbiter patches.  In the current
revisions the dev_q->flags field has been moved back to dev->flags,
which fixes a data corruption issue and could potentially address the
hang you are seeing.  The latest revisions are:
raid5: add the stripe_queue object for tracking raid io requests (rev2)
raid5: use stripe_queues to prioritize the "most deserving" requests (rev6)

>  Note that before applying your patch I rolled back my fix in the
> ops_complete_biofill() function. Do I understand correctly that your
> patch should be used *instead of* mine rather than *together with* it?
>
You understood correctly.  The attached patch integrates your change
to keep R5_Wantfill set while also protecting the 'more_to_read' case.
Please try it on top of the latest stripe-queue changes [1] (instead
of the other proposed patches).

>  Regards, Yuri

Thanks,
Dan

[1] git fetch -f git://lost.foo-projects.org/~dwillia2/git/iop
md-for-linus:refs/heads/md-for-linus
raid5: fix the 'more_to_read' case in ops_complete_biofill

From: Dan Williams <[EMAIL PROTECTED]>

Prevent ops_complete_biofill from running concurrently with add_queue_bio

---

 drivers/md/raid5.c |   33 +++--
 1 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 2f9022d..1c591d3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -828,22 +828,19 @@ static void ops_complete_biofill(void *stripe_head_ref)
 		struct r5dev *dev = &sh->dev[i];
 		struct r5_queue_dev *dev_q = &sq->dev[i];
 
-		/* check if this stripe has new incoming reads */
+		/* 1/ acknowledge completion of a biofill operation
+		 * 2/ check if we need to reply to a read request.
+		 * 3/ check if we need to reschedule handle_stripe
+		 */
 		if (dev_q->toread)
 			more_to_read++;
 
-		/* acknowledge completion of a biofill operation */
-		/* and check if we need to reply to a read request
-		*/
-		if (test_bit(R5_Wantfill, &dev->flags) && !dev_q->toread) {
+		if (test_bit(R5_Wantfill, &dev->flags)) {
 			struct bio *rbi, *rbi2;
-			clear_bit(R5_Wantfill, &dev->flags);
 
-			/* The access to dev->read is outside of the
-			 * spin_lock_irq(&conf->device_lock), but is protected
-			 * by the STRIPE_OP_BIOFILL pending bit
-			 */
-			BUG_ON(!dev->read);
+			if (!dev_q->toread)
+				clear_bit(R5_Wantfill, &dev->flags);
+
 			rbi = dev->read;
 			dev->read = NULL;
 			while (rbi && rbi->bi_sector <
@@ -899,8 +896,15 @@ static void ops_run_biofill(struct stripe_head *sh)
 	}
 
 	atomic_inc(&sh->count);
+
+	/* spin_lock prevents ops_complete_biofill from running concurrently
+	 * with add_queue_bio in the synchronous case
+	 */
+	spin_lock(&sq->lock);
 	async_trigger_callback(ASYNC_TX_DEP_ACK | ASYNC_TX_ACK, tx,
 		ops_complete_biofill, sh);
+	spin_unlock(&sq->lock);
+
 }
 
 static void ops_complete_compute5(void *stripe_head_ref)
@@ -2279,7 +2283,8 @@ static int add_queue_bio(struct stripe_queue *sq, struct bio *bi, int dd_idx,
 		(unsigned long long)bi->bi_sector,
 		(unsigned long long)sq->sector);
 
-	spin_lock(&sq->lock);
+	/* prevent asynchronous completions from running */
+	spin_lock_bh(&sq->lock);
 	spin_lock_irq(&conf->device_lock);
 	sh = sq->sh;
 	if (forwrite) {
@@ -2306,7 +2311,7 @@ static int add_queue_bio(struct stripe_queue *sq, struct bio *bi, int dd_idx,
 	*bip = bi;
 	bi->bi_phys_segments ++;
 	spin_unlock_irq(&conf->device_lock);
-	spin_unlock(&sq->lock);
+	spin_unlock_bh(&sq->lock);
 
 	pr_debug("added bi b#%llu to stripe s#%llu, disk %d.\n",
 		(unsigned long long)bi->bi_sector,
@@ -2339,7 +2344,7 @@ static int add_queue_bio(struct stripe_queue *sq, struct bio *bi, int dd_idx,
  overlap:
 	set_bit(R5_Overlap, &sh->dev[dd_idx].flags);
 	spin_unlock_irq(&conf->device_lock);
-	spin_unlock(&sq->lock);
+	spin_unlock_bh(&sq->lock);
 	return 0;
 }
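
For anyone following along, below is a minimal user-space model of the
locking scheme in this patch (a hypothetical, pthread-based sketch, not
kernel code: sq->lock becomes a mutex, the completion side becomes a
second thread, and the bitops collapse to plain fields).  It only
illustrates why the completion's test-and-clear of R5_Wantfill must be
serialized against add_queue_bio() publishing new reads:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t sq_lock = PTHREAD_MUTEX_INITIALIZER; /* models sq->lock */
static bool wantfill;	/* models R5_Wantfill */
static void *toread;	/* models dev_q->toread */

/* process context: queue a new read (spin_lock_bh(&sq->lock) in the patch) */
static void add_queue_bio(void *bio)
{
	pthread_mutex_lock(&sq_lock);
	toread = bio;
	wantfill = true;
	pthread_mutex_unlock(&sq_lock);
}

/* completion context: clear Wantfill only if no new read slipped in.
 * Because both sides hold the lock, "toread set but Wantfill already
 * cleared" can never be observed -- the lost work this patch prevents.
 */
static void *complete_biofill(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&sq_lock);
	if (wantfill && !toread)
		wantfill = false;
	pthread_mutex_unlock(&sq_lock);
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, complete_biofill, NULL);
	add_queue_bio((void *)0x1);
	pthread_join(t, NULL);
	printf("wantfill=%d toread=%p\n", wantfill, toread);
	return 0;
}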
 


Re: [md-accel PATCH 16/19] dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines

2007-08-30 Thread saeed bishara
You are right. I have another question regarding the function
dma_wait_for_async_tx() from async_tx.c; here is the body of the code:

   /* poll through the dependency chain, return when tx is complete */
1.    do {
2.        iter = tx;
3.        while (iter->cookie == -EBUSY)
4.            iter = iter->parent;
5.
6.        status = dma_sync_wait(iter->chan, iter->cookie);
7.    } while (status == DMA_IN_PROGRESS || (iter != tx));

assume that:
- The interrupt capability is not provided.
- Request A was sent to chan 0
- Request B that depends on A is sent to chan 1
- Request C that depends on B is sent to chan 2.
- Also, assume that when C is handled by async_tx_submit(), B has still
not been queued to the dmaengine (its cookie equals -EBUSY).

In this case, dma_wait_for_async_tx() will be called for C. It looks
to me like the do/while will loop forever, even once A completes: iter
will point to B after line 4, so the iter != tx (i.e. iter != C)
condition will always hold.

saeed


Re: md raid acceleration and the async_tx api

2007-08-30 Thread Yuri Tikhonov

 Hi Dan,

On Monday 27 August 2007 23:12, you wrote:
> This still looks racy...  I think the complete fix is to make the
> R5_Wantfill and dev_q->toread accesses atomic.  Please test the
> following patch (also attached) and let me know if it fixes what you are
> seeing:

 Your approach doesn't help; the Bonnie++ utility hangs during the
ReWriting stage.

 Note that before applying your patch I rolled back my fix in the
ops_complete_biofill() function. Do I understand correctly that your
patch should be used *instead of* mine rather than *together with* it?

 Regards, Yuri