Re[2]: Bug in processing dependencies by async_tx_submit() ?

2007-11-12 Thread Yuri Tikhonov

 Hi Dan,

On 02.11.2007, 3:36:50 you wrote:

> I am preparing a new patch that replaces ASYNC_TX_DEP_ACK with
> ASYNC_TX_CHAIN_ACK.  The plan is to make the entire chain of
> dependencies available up until the last transaction is submitted.
> This allows the entire dependency chain to be walked at
> async_tx_submit time so that we can properly handle these multiple
> dependency cases.  I'll send it out when it passes my internal
> tests...
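(For reference, the submit-time chain walk being described might look roughly
like the sketch below. It is only an illustration: the next_dep field and the
submit_whole_chain() helper are made-up names, not part of Dan's patch.)

/* Sketch only: climb from the newest descriptor up to the oldest
 * unsubmitted ancestor, then submit the whole chain in order.
 * 'next_dep' and 'submit_whole_chain' are hypothetical.
 */
static void submit_whole_chain(struct dma_async_tx_descriptor *tx)
{
	struct dma_async_tx_descriptor *iter = tx;

	/* find the root of the unsubmitted dependency chain */
	while (iter->parent && iter->parent->cookie == -EBUSY)
		iter = iter->parent;

	/* submit from the root down so ordering is preserved */
	while (iter) {
		struct dma_async_tx_descriptor *next = iter->next_dep;

		iter->tx_submit(iter);
		iter->parent = NULL;	/* no pending parent any more */
		iter = next;
	}
}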

 Meanwhile I've implemented my fix for this issue. With it, the tests that
previously failed (mkfs got stuck) now pass successfully for me. Would you
please take a look? Thanks in advance.

diff --git a/crypto/async_tx/async_tx.c b/crypto/async_tx/async_tx.c
index bc18cbb..6d77ae6 100644
--- a/crypto/async_tx/async_tx.c
+++ b/crypto/async_tx/async_tx.c
@@ -92,7 +92,7 @@ dma_wait_for_async_tx(struct dma_async_tx_descriptor *tx)
 	/* find the root of the unsubmitted dependency chain */
 	while (iter->cookie == -EBUSY) {
 		parent = iter->parent;
-		if (parent && parent->cookie == -EBUSY)
+		if (parent)
 			iter = iter->parent;
 		else
 			break;
@@ -120,10 +120,11 @@ async_tx_run_dependencies(struct dma_async_tx_descriptor *tx)
 			depend_node) {
 		chan = dep_tx->chan;
 		dev = chan->device;
-		/* we can't depend on ourselves */
-		BUG_ON(chan == tx->chan);
 		list_del(&dep_tx->depend_node);
-		tx->tx_submit(dep_tx);
+		dep_tx->tx_submit(dep_tx);
+
+		/* we no longer have a parent */
+		dep_tx->parent = NULL;
 
 		/* we need to poke the engine as client code does not
 		 * know about dependency submission events
@@ -409,25 +410,41 @@ async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
 	/* set this new tx to run after depend_tx if:
 	 * 1/ a dependency exists (depend_tx is !NULL)
 	 * 2/ the tx can not be submitted to the current channel
+	 * 3/ the depend_tx has a parent
 	 */
-	if (depend_tx && depend_tx->chan != chan) {
+	if (depend_tx && (depend_tx->chan != chan || depend_tx->parent)) {
 		/* if ack is already set then we cannot be sure
 		 * we are referring to the correct operation
 		 */
 		BUG_ON(depend_tx->ack);
 
-		tx->parent = depend_tx;
 		spin_lock_bh(&depend_tx->lock);
+		tx->parent = depend_tx;
 		list_add_tail(&tx->depend_node, &depend_tx->depend_list);
 		if (depend_tx->cookie == 0) {
-			struct dma_chan *dep_chan = depend_tx->chan;
-			struct dma_device *dep_dev = dep_chan->device;
-			dep_dev->device_dependency_added(dep_chan);
-		}
-		spin_unlock_bh(&depend_tx->lock);
+			/* depend_tx has been completed, run our dep
+			 * manually
+			 */
+			async_tx_run_dependencies(depend_tx);
+			spin_unlock_bh(&depend_tx->lock);
+		} else {
+			/* depend_tx still in flight */
+			spin_unlock_bh(&depend_tx->lock);
 
-		/* schedule an interrupt to trigger the channel switch */
-		async_trigger_callback(ASYNC_TX_ACK, depend_tx, NULL, NULL);
+			/* schedule an interrupt to trigger the channel
+			 * switch or dependency submission
+			 */
+			if (!(flags & ASYNC_TX_INT) && (depend_tx->chan != chan ||
+			    !depend_tx->callback))
+				async_trigger_callback(ASYNC_TX_ACK | ASYNC_TX_INT,
+					depend_tx, NULL, NULL);
+
+			/* flush the parent if it's not submitted yet */
+			spin_lock_bh(&depend_tx->lock);
+			depend_tx->chan->device->device_issue_pending(
+				depend_tx->chan);
+			spin_unlock_bh(&depend_tx->lock);
+		}
 	} else {
 		tx->parent = NULL;
 		tx->tx_submit(tx);
diff --git a/include/linux/async_tx.h b/include/linux/async_tx.h
index aea0402..ee09315 100644
--- a/include/linux/async_tx.h
+++ b/include/linux/async_tx.h
@@ -67,6 +67,7 @@ enum async_tx_flags {
 	ASYNC_TX_KMAP_SRC    = (1 << 5),
 	ASYNC_TX_KMAP_DST    = (1 << 6),
 	ASYNC_TX_ASYNC_ONLY  = (1 << 7),
+	ASYNC_TX_INT         = (1 << 8),
 };

 #ifdef CONFIG_DMA_ENGINE
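
For what it's worth, the sort of cross-channel dependency chain that exercises
this path could be built along the following lines. This is only a sketch
against the 2.6.23-era async_tx client API (async_memcpy()/async_xor() from
<linux/async_tx.h>); the page pointers and the function itself are placeholders,
not part of the patch.

#include <linux/async_tx.h>

/* Illustrative only: three chained operations, where the second and
 * third each depend on a descriptor that may itself still be waiting
 * on an unsubmitted parent -- the case the hunks above now handle.
 */
static void build_dependency_chain(struct page *pa, struct page *pb,
				   struct page *pdest)
{
	struct page *srcs[2] = { pa, pb };
	struct dma_async_tx_descriptor *tx;

	/* op 1: plain copy, no dependency */
	tx = async_memcpy(pdest, pa, 0, 0, PAGE_SIZE, 0, NULL, NULL, NULL);

	/* op 2: xor that must run after the copy */
	tx = async_xor(pdest, srcs, 0, 2, PAGE_SIZE,
		       ASYNC_TX_XOR_DROP_DST | ASYNC_TX_DEP_ACK,
		       tx, NULL, NULL);

	/* op 3: copy depending on the xor; its depend_tx may still have
	 * a parent (op 1) that has not been submitted to hardware yet
	 */
	tx = async_memcpy(pb, pdest, 0, 0, PAGE_SIZE,
			  ASYNC_TX_ACK | ASYNC_TX_DEP_ACK, tx, NULL, NULL);

	async_tx_issue_pending_all();
}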



Re: Building a new raid6 with bitmap does not clear bits during resync

2007-11-12 Thread Bill Davidsen

Neil Brown wrote:

> On Thursday November 8, [EMAIL PROTECTED] wrote:

> > Hi,
> > 
> > I have created a new raid6:
> > 
> > md0 : active raid6 sdb1[0] sdl1[5] sdj1[4] sdh1[3] sdf1[2] sdd1[1]
> >       6834868224 blocks level 6, 512k chunk, algorithm 2 [6/6] [UUUUUU]
> >       [====>................]  resync = 21.5% (368216964/1708717056) finish=448.5min speed=49808K/sec
> >       bitmap: 204/204 pages [816KB], 4096KB chunk
> > 
> > The raid is totally idle, not mounted and nothing.
> > 
> > So why does the bitmap count (204/204) not shrink? I would expect it to
> > clear bits as it resyncs, so it should count slowly down to 0. As a side
> > effect of the bitmap being all dirty, the resync will restart from the
> > beginning when the system is hard reset. As you can imagine that is
> > pretty annoying.
> > 
> > On the other hand, on a clean shutdown it seems the bitmap gets updated
> > before stopping the array:
> > 
> > md3 : active raid6 sdc1[0] sdm1[5] sdk1[4] sdi1[3] sdg1[2] sde1[1]
> >       6834868224 blocks level 6, 512k chunk, algorithm 2 [6/6] [UUUUUU]
> >       [=======>.............]  resync = 38.4% (656155264/1708717056) finish=17846.4min speed=982K/sec
> >       bitmap: 187/204 pages [748KB], 4096KB chunk
> > 
> > Consequently the rebuild did restart and is already further along.

> Thanks for the report.
> 
> > Any ideas why that is so?
> 
> Yes.  The following patch should explain (a bit tersely) why this was
> so, and should also fix it so it will no longer be so.  Test reports
> always welcome.
> 
> NeilBrown
> 
> Status: ok
> 
> Update md bitmap during resync.
> 
> Currently an md array with a write-intent bitmap does not update
> that bitmap to reflect successful partial resync.  Rather, the entire
> bitmap is updated when the resync completes.
> 
> This is because there is no guarantee that resync requests will
> complete in order, and tracking each request individually is
> unnecessarily burdensome.
> 
> However there is value in regularly updating the bitmap, so add code
> to periodically pause while all pending sync requests complete, then
> update the bitmap.  Doing this only every few seconds (the same as the
> bitmap update time) does not noticeably affect resync performance.

I wonder if a minimum time and minimum number of stripes would be 
better. If a resync is going slowly because it's going over a slow link 
to iSCSI, nbd, or a box of cheap drives fed off a single USB port, just 
writing the updated bitmap may represent as much data as has been 
resynced in the time slice.
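
Roughly the gate I have in mind, sketched with made-up names (this is not
actual md code, just the shape of the idea):

/* Hypothetical helper: only pause the resync and write the bitmap out
 * once a minimum time has passed AND a minimum amount of data has
 * actually been resynced, so a slow array does not spend most of its
 * bandwidth rewriting the bitmap.
 */
static int should_update_bitmap(unsigned long last_update_jiffies,
				sector_t sectors_since_update,
				unsigned int delay_secs,
				sector_t min_sectors)
{
	return time_after(jiffies, last_update_jiffies + delay_secs * HZ) &&
	       sectors_since_update >= min_sectors;
}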


Not a suggestion, but a request for your thoughts on that.

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Building a new raid6 with bitmap does not clear bits during resync

2007-11-12 Thread Neil Brown
On Monday November 12, [EMAIL PROTECTED] wrote:
> Neil Brown wrote:
> 
> > However there is value in regularly updating the bitmap, so add code
> > to periodically pause while all pending sync requests complete, then
> > update the bitmap.  Doing this only every few seconds (the same as the
> > bitmap update time) does not noticeably affect resync performance.
> 
> I wonder if a minimum time and minimum number of stripes would be
> better. If a resync is going slowly because it's going over a slow link
> to iSCSI, nbd, or a box of cheap drives fed off a single USB port, just
> writing the updated bitmap may represent as much data as has been
> resynced in the time slice.
> 
> Not a suggestion, but a request for your thoughts on that.

Thanks for your thoughts.
Choosing how often to update the bitmap during a sync is certainly not
trivial.   In different situations, different requirements might rule.

I chose to base it on time, and particularly on the time we already
have for how soon to write back clean bits to the bitmap, because it
is fairly easy for users to understand the implications (if I set the
time to 30 seconds, then I might have to repeat 30 seconds of resync)
and it is already configurable (via the --delay option to --create
--bitmap).

Presumably if someone has a very slow system and wanted to use
bitmaps, they would set --delay relatively large to reduce the cost
and still provide significant benefits.  This would affect both normal
clean-bit writeback and during-resync clean-bit writeback.
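
For example (an illustrative command only, using the mdadm option names of
this era), such a system might be created with something like:

  mdadm --create /dev/md0 --level=6 --raid-devices=6 \
        --bitmap=internal --bitmap-chunk=4096 --delay=30 \
        /dev/sdb1 /dev/sdd1 /dev/sdf1 /dev/sdh1 /dev/sdj1 /dev/sdl1

so that clean bits are written back no more often than every 30 seconds.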

Hope that clarifies my approach.

Thanks,
NeilBrown