Re: [md-accel PATCH 16/19] dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines
On 8/30/07, saeed bishara <[EMAIL PROTECTED]> wrote:
> you are right. I have another question, regarding the function
> dma_wait_for_async_tx() from async_tx.c. Here is the body of the code:
>
> /* poll through the dependency chain, return when tx is complete */
> 1. do {
> 2.         iter = tx;
> 3.         while (iter->cookie == -EBUSY)
> 4.                 iter = iter->parent;
> 5.
> 6.         status = dma_sync_wait(iter->chan, iter->cookie);
> 7. } while (status == DMA_IN_PROGRESS || (iter != tx));
>
> Assume that:
> - The interrupt capability is not provided.
> - Request A was sent to chan 0.
> - Request B, which depends on A, was sent to chan 1.
> - Request C, which depends on B, was sent to chan 2.
> - Also assume that when C is handled by async_tx_submit(), B has still
>   not been queued to the dmaengine (its cookie equals -EBUSY).
>
> In this case dma_wait_for_async_tx() will be called for C. It looks to
> me like the do/while will loop forever, even after A completes: after
> line 4, iter points at B, so the (iter != tx) test against C will
> always be true.

You are right. There are no drivers in the tree that can hit this, but
it needs to be fixed up.
I'll submit the following change:

diff --git a/crypto/async_tx/async_tx.c b/crypto/async_tx/async_tx.c
index 0350071..bc18cbb 100644
--- a/crypto/async_tx/async_tx.c
+++ b/crypto/async_tx/async_tx.c
@@ -80,6 +80,7 @@ dma_wait_for_async_tx(struct dma_async_tx_descriptor *tx)
 {
 	enum dma_status status;
 	struct dma_async_tx_descriptor *iter;
+	struct dma_async_tx_descriptor *parent;
 
 	if (!tx)
 		return DMA_SUCCESS;
@@ -87,8 +88,15 @@ dma_wait_for_async_tx(struct dma_async_tx_descriptor *tx)
 	/* poll through the dependency chain, return when tx is complete */
 	do {
 		iter = tx;
-		while (iter->cookie == -EBUSY)
-			iter = iter->parent;
+
+		/* find the root of the unsubmitted dependency chain */
+		while (iter->cookie == -EBUSY) {
+			parent = iter->parent;
+			if (parent && parent->cookie == -EBUSY)
+				iter = iter->parent;
+			else
+				break;
+		}

> saeed

Regards,
Dan
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
PAGE_SIZE=8K and bitmap
Hi,

a while back I reported a bug for 2.6.21 where creating an MD RAID
array with an internal bitmap on a sparc64 system does not work. I
have not yet heard back (or I have forgotten); has this been addressed
yet?

(mdadm -C /dev/md0 -l 1 -n 2 -e 1.0 -b internal /dev/ram[01])

thanks,
Jan
Re: md raid acceleration and the async_tx api
On 8/30/07, Yuri Tikhonov <[EMAIL PROTECTED]> wrote:
>
> Hi Dan,
>
> On Monday 27 August 2007 23:12, you wrote:
> > This still looks racy... I think the complete fix is to make the
> > R5_Wantfill and dev_q->toread accesses atomic. Please test the
> > following patch (also attached) and let me know if it fixes what
> > you are seeing:
>
> Your approach doesn't help; the Bonnie++ utility hangs up during the
> ReWriting stage.
>
Looking at it again, I see that what I added would not affect the
failure you are seeing. However, I noticed that you are using a broken
version of the stripe-queue and cache_arbiter patches. In the current
revisions the dev_q->flags field has been moved back to dev->flags,
which fixes a data corruption issue and could potentially address the
hang you are seeing. The latest revisions are:

  raid5: add the stripe_queue object for tracking raid io requests (rev2)
  raid5: use stripe_queues to prioritize the "most deserving" requests (rev6)

> Note that before applying your patch I rolled back my fix in the
> ops_complete_biofill() function. Do I understand it right that your
> patch should be used *instead* of mine rather than *with* it?
>
You understood correctly. The attached patch integrates your change to
keep R5_Wantfill set while also protecting the 'more_to_read' case.
Please try it on top of the latest stripe-queue changes [1] (instead of
the other proposed patches).
> Regards, Yuri

Thanks,
Dan

[1] git fetch -f git://lost.foo-projects.org/~dwillia2/git/iop
    md-for-linus:refs/heads/md-for-linus

raid5: fix the 'more_to_read' case in ops_complete_biofill

From: Dan Williams <[EMAIL PROTECTED]>

Prevent ops_complete_biofill from running concurrently with add_queue_bio
---
 drivers/md/raid5.c |   33 +++++++++++++++++++--------------
 1 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 2f9022d..1c591d3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -828,22 +828,19 @@ static void ops_complete_biofill(void *stripe_head_ref)
 		struct r5dev *dev = &sh->dev[i];
 		struct r5_queue_dev *dev_q = &sq->dev[i];
 
-		/* check if this stripe has new incoming reads */
+		/* 1/ acknowledge completion of a biofill operation
+		 * 2/ check if we need to reply to a read request.
+		 * 3/ check if we need to reschedule handle_stripe
+		 */
 		if (dev_q->toread)
 			more_to_read++;
 
-		/* acknowledge completion of a biofill operation */
-		/* and check if we need to reply to a read request
-		 */
-		if (test_bit(R5_Wantfill, &dev->flags) && !dev_q->toread) {
+		if (test_bit(R5_Wantfill, &dev->flags)) {
 			struct bio *rbi, *rbi2;
-			clear_bit(R5_Wantfill, &dev->flags);
-			/* The access to dev->read is outside of the
-			 * spin_lock_irq(&conf->device_lock), but is protected
-			 * by the STRIPE_OP_BIOFILL pending bit
-			 */
-			BUG_ON(!dev->read);
+
+			if (!dev_q->toread)
+				clear_bit(R5_Wantfill, &dev->flags);
+
 			rbi = dev->read;
 			dev->read = NULL;
 			while (rbi && rbi->bi_sector <
@@ -899,8 +896,15 @@ static void ops_run_biofill(struct stripe_head *sh)
 	}
 
 	atomic_inc(&sh->count);
+
+	/* spin_lock prevents ops_complete_biofill from running concurrently
+	 * with add_queue_bio in the synchronous case
+	 */
+	spin_lock(&sq->lock);
 	async_trigger_callback(ASYNC_TX_DEP_ACK | ASYNC_TX_ACK, tx,
 		ops_complete_biofill, sh);
+	spin_unlock(&sq->lock);
+
 }
 
 static void ops_complete_compute5(void *stripe_head_ref)
@@ -2279,7 +2283,8 @@ static int add_queue_bio(struct stripe_queue *sq, struct bio *bi, int dd_idx,
 		(unsigned long long)bi->bi_sector,
 		(unsigned long long)sq->sector);
 
-	spin_lock(&sq->lock);
+	/* prevent asynchronous completions from running */
+	spin_lock_bh(&sq->lock);
 	spin_lock_irq(&conf->device_lock);
 	sh = sq->sh;
 	if (forwrite) {
@@ -2306,7 +2311,7 @@ static int add_queue_bio(struct stripe_queue *sq, struct bio *bi, int dd_idx,
 		*bip = bi;
 		bi->bi_phys_segments ++;
 		spin_unlock_irq(&conf->device_lock);
-		spin_unlock(&sq->lock);
+		spin_unlock_bh(&sq->lock);
 
 		pr_debug("added bi b#%llu to stripe s#%llu, disk %d.\n",
 			(unsigned long long)bi->bi_sector,
@@ -2339,7 +2344,7 @@ static int add_queue_bio(struct stripe_queue *sq, struct bio *bi, int dd_idx,
 overlap:
 	set_bit(R5_Overlap, &sh->dev[dd_idx].flags);
 	spin_unlock_irq(&conf->device_lock);
-	spin_unlock(&sq->lock);
+	spin_unlock_bh(&sq->lock);
 	return 0;
 }
Re: [md-accel PATCH 16/19] dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines
you are right. I have another question, regarding the function
dma_wait_for_async_tx() from async_tx.c. Here is the body of the code:

/* poll through the dependency chain, return when tx is complete */
1. do {
2.         iter = tx;
3.         while (iter->cookie == -EBUSY)
4.                 iter = iter->parent;
5.
6.         status = dma_sync_wait(iter->chan, iter->cookie);
7. } while (status == DMA_IN_PROGRESS || (iter != tx));

Assume that:
- The interrupt capability is not provided.
- Request A was sent to chan 0.
- Request B, which depends on A, was sent to chan 1.
- Request C, which depends on B, was sent to chan 2.
- Also assume that when C is handled by async_tx_submit(), B has still
  not been queued to the dmaengine (its cookie equals -EBUSY).

In this case dma_wait_for_async_tx() will be called for C. It looks to
me like the do/while will loop forever, even after A completes: after
line 4, iter points at B, so the (iter != tx) test against C will
always be true.

saeed
Re: md raid acceleration and the async_tx api
Hi Dan,

On Monday 27 August 2007 23:12, you wrote:
> This still looks racy... I think the complete fix is to make the
> R5_Wantfill and dev_q->toread accesses atomic. Please test the
> following patch (also attached) and let me know if it fixes what you
> are seeing:

Your approach doesn't help; the Bonnie++ utility hangs up during the
ReWriting stage.

Note that before applying your patch I rolled back my fix in the
ops_complete_biofill() function. Do I understand it right that your
patch should be used *instead* of mine rather than *with* it?

Regards,
Yuri