Re: Building a new raid6 with bitmap does not clear bits during resync

2007-11-18 Thread Goswin von Brederlow
Neil Brown [EMAIL PROTECTED] writes:

 On Monday November 12, [EMAIL PROTECTED] wrote:
 Neil Brown wrote:
 
  However there is value in regularly updating the bitmap, so add code
  to periodically pause while all pending sync requests complete, then
  update the bitmap.  Doing this only every few seconds (the same as the
  bitmap update time) does not noticeably affect resync performance.

 
 I wonder if a minimum time and minimum number of stripes would be 
 better. If a resync is going slowly because it's going over a slow link 
 to iSCSI, nbd, or a box of cheap drives fed off a single USB port, just 
 writing the updated bitmap may represent as much data as has been 
 resynced in the time slice.
 
 Not a suggestion, but a request for your thoughts on that.

 Thanks for your thoughts.
 Choosing how often to update the bitmap during a sync is certainly not
 trivial.   In different situations, different requirements might rule.

 I chose to base it on time, and particularly on the time we already
 have for how soon to write back clean bits to the bitmap, because it
 is fairly easy for users to understand the implications (if I set the
 time to 30 seconds, then I might have to repeat 30 seconds of resync)
 and it is already configurable (via the --delay option to --create
 --bitmap).

 Presumably if someone has a very slow system and wants to use
 bitmaps, they would set --delay relatively large to reduce the cost
 and still provide significant benefits.  This would affect both normal
 clean-bit writeback and during-resync clean-bit writeback.

 Hope that clarifies my approach.

 Thanks,
 NeilBrown

We are talking about 12-24h resync times here under idle
conditions. Updating the bitmap only once per minute is perfectly acceptable.
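
As a rough back-of-envelope, at the roughly 50MB/sec resync speed shown in
the mdstat output above, a one-minute bitmap update delay means at most about
3GB of resync (one minute's worth) would ever have to be repeated after a
crash, out of several terabytes of array.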

MfG
Goswin


Re: Building a new raid6 with bitmap does not clear bits during resync

2007-11-14 Thread Bill Davidsen

Neil Brown wrote:

On Monday November 12, [EMAIL PROTECTED] wrote:
  

Neil Brown wrote:


However there is value in regularly updating the bitmap, so add code
to periodically pause while all pending sync requests complete, then
update the bitmap.  Doing this only every few seconds (the same as the
bitmap update time) does not noticeably affect resync performance.
  
  
I wonder if a minimum time and minimum number of stripes would be 
better. If a resync is going slowly because it's going over a slow link 
to iSCSI, nbd, or a box of cheap drives fed off a single USB port, just 
writing the updated bitmap may represent as much data as has been 
resynced in the time slice.


Not a suggestion, but a request for your thoughts on that.



Thanks for your thoughts.
Choosing how often to update the bitmap during a sync is certainly not
trivial.   In different situations, different requirements might rule.

I chose to base it on time, and particularly on the time we already
have for how soon to write back clean bits to the bitmap, because it
is fairly easy for users to understand the implications (if I set the
time to 30 seconds, then I might have to repeat 30 seconds of resync)
and it is already configurable (via the --delay option to --create
--bitmap).
  


Sounds right, that part of it is pretty user friendly.

Presumably if someone has a very slow system and wants to use
bitmaps, they would set --delay relatively large to reduce the cost
and still provide significant benefits.  This would affect both normal
clean-bit writeback and during-resync clean-bit writeback.

Hope that clarifies my approach.
  


Easy to implement and understand is always a strong point, and a user 
can make an informed decision. Thanks for the discussion.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Building a new raid6 with bitmap does not clear bits during resync

2007-11-12 Thread Bill Davidsen

Neil Brown wrote:

On Thursday November 8, [EMAIL PROTECTED] wrote:
  

Hi,

I have created a new raid6:

md0 : active raid6 sdb1[0] sdl1[5] sdj1[4] sdh1[3] sdf1[2] sdd1[1]
  6834868224 blocks level 6, 512k chunk, algorithm 2 [6/6] [UU]
  []  resync = 21.5% (368216964/1708717056) 
finish=448.5min speed=49808K/sec
  bitmap: 204/204 pages [816KB], 4096KB chunk

The raid is totally idle, not even mounted.

So why does the bitmap count of 204/204 not shrink? I would expect it to clear
bits as it resyncs so it should count slowly down to 0. As a side
effect of the bitmap being all dirty the resync will restart from the
beginning when the system is hard reset. As you can imagine that is
pretty annoying.

On the other hand on a clean shutdown it seems the bitmap gets updated
before stopping the array:

md3 : active raid6 sdc1[0] sdm1[5] sdk1[4] sdi1[3] sdg1[2] sde1[1]
  6834868224 blocks level 6, 512k chunk, algorithm 2 [6/6] [UU]
  [===.]  resync = 38.4% (656155264/1708717056) 
finish=17846.4min speed=982K/sec
  bitmap: 187/204 pages [748KB], 4096KB chunk

Consequently the rebuild did restart and is already further along.




Thanks for the report.

  

Any ideas why that is so?



Yes.  The following patch should explain (a bit tersely) why this was
so, and should also fix it so it will no longer be so.  Test reports
always welcome.

NeilBrown

Status: ok

Update md bitmap during resync.

Currently an md array with a write-intent bitmap does not update
that bitmap to reflect successful partial resync.  Rather, the entire
bitmap is updated when the resync completes.

This is because there is no guarantee that resync requests will
complete in order, and tracking each request individually is
unnecessarily burdensome.

However there is value in regularly updating the bitmap, so add code
to periodically pause while all pending sync requests complete, then
update the bitmap.  Doing this only every few seconds (the same as the
bitmap update time) does not noticeably affect resync performance.
  


I wonder if a minimum time and minimum number of stripes would be 
better. If a resync is going slowly because it's going over a slow link 
to iSCSI, nbd, or a box of cheap drives fed off a single USB port, just 
writing the updated bitmap may represent as much data as has been 
resynced in the time slice.


Not a suggestion, but a request for your thoughts on that.
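
Just to make the idea concrete, here is a rough standalone sketch (the names
and thresholds are made up, and it is not anything from the patch) of the kind
of combined gate I have in mind:

#include <stdbool.h>

/*
 * Hypothetical only: write the bitmap out during resync when BOTH a
 * minimum time has passed AND a minimum amount of data has actually
 * been resynced since the last write, so that on a very slow link the
 * bitmap write itself never dominates the work done in a time slice.
 */
bool should_flush_bitmap(unsigned long now_sec, unsigned long last_flush_sec,
                         unsigned long min_interval_sec,      /* e.g. the --delay value */
                         unsigned long long synced_sectors,   /* resynced since last flush */
                         unsigned long long min_sectors)      /* e.g. a few bitmap chunks */
{
        if (now_sec - last_flush_sec < min_interval_sec)
                return false;   /* time gate: what the patch already does */
        if (synced_sectors < min_sectors)
                return false;   /* progress gate: too little resynced to bother */
        return true;
}

The time-only rule in the patch is obviously simpler; this is only meant to
illustrate the "minimum time and minimum number of stripes" idea.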

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Building a new raid6 with bitmap does not clear bits during resync

2007-11-12 Thread Neil Brown
On Monday November 12, [EMAIL PROTECTED] wrote:
 Neil Brown wrote:
 
  However there is value in regularly updating the bitmap, so add code
  to periodically pause while all pending sync requests complete, then
  update the bitmap.  Doing this only every few seconds (the same as the
  bitmap update time) does not noticeably affect resync performance.

 
 I wonder if a minimum time and minimum number of stripes would be 
 better. If a resync is going slowly because it's going over a slow link 
 to iSCSI, nbd, or a box of cheap drives fed off a single USB port, just 
 writing the updated bitmap may represent as much data as has been 
 resynced in the time slice.
 
 Not a suggestion, but a request for your thoughts on that.

Thanks for your thoughts.
Choosing how often to update the bitmap during a sync is certainly not
trivial.   In different situations, different requirements might rule.

I chose to base it on time, and particularly on the time we already
have for how soon to write back clean bits to the bitmap, because it
is fairly easy for users to understand the implications (if I set the
time to 30 seconds, then I might have to repeat 30 seconds of resync)
and it is already configurable (via the --delay option to --create
--bitmap).

Presumably if someone has a very slow system and wants to use
bitmaps, they would set --delay relatively large to reduce the cost
and still provide significant benefits.  This would affect both normal
clean-bit writeback and during-resync clean-bit writeback.
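
For example, something along these lines (the device list just mirrors the
array from the original report, and the exact syntax may vary with the mdadm
version):

  mdadm --create /dev/md0 --level=6 --raid-devices=6 \
        --bitmap=internal --delay=60 /dev/sd[bdfhjl]1

With --delay=60, at most about one minute's worth of resync should ever need
to be repeated after a crash.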

Hope that clarifies my approach.

Thanks,
NeilBrown


Re: Building a new raid6 with bitmap does not clear bits during resync

2007-11-11 Thread Neil Brown
On Thursday November 8, [EMAIL PROTECTED] wrote:
 Hi,
 
 I have created a new raid6:
 
 md0 : active raid6 sdb1[0] sdl1[5] sdj1[4] sdh1[3] sdf1[2] sdd1[1]
   6834868224 blocks level 6, 512k chunk, algorithm 2 [6/6] [UU]
   []  resync = 21.5% (368216964/1708717056) 
 finish=448.5min speed=49808K/sec
   bitmap: 204/204 pages [816KB], 4096KB chunk
 
 The raid is totally idle, not even mounted.
 
 So why does the bitmap count of 204/204 not shrink? I would expect it to clear
 bits as it resyncs so it should count slowly down to 0. As a side
 effect of the bitmap being all dirty the resync will restart from the
 beginning when the system is hard reset. As you can imagine that is
 pretty annoying.
 
 On the other hand on a clean shutdown it seems the bitmap gets updated
 before stopping the array:
 
 md3 : active raid6 sdc1[0] sdm1[5] sdk1[4] sdi1[3] sdg1[2] sde1[1]
   6834868224 blocks level 6, 512k chunk, algorithm 2 [6/6] [UU]
   [===.]  resync = 38.4% (656155264/1708717056) 
 finish=17846.4min speed=982K/sec
   bitmap: 187/204 pages [748KB], 4096KB chunk
 
 Consequently the rebuild did restart and is already further along.
 

Thanks for the report.

 
 Any ideas why that is so?

Yes.  The following patch should explain (a bit tersely) why this was
so, and should also fix it so it will no longer be so.  Test reports
always welcome.

NeilBrown

Status: ok

Update md bitmap during resync.

Currently an md array with a write-intent bitmap does not update
that bitmap to reflect successful partial resync.  Rather, the entire
bitmap is updated when the resync completes.

This is because there is no guarantee that resync requests will
complete in order, and tracking each request individually is
unnecessarily burdensome.

However there is value in regularly updating the bitmap, so add code
to periodically pause while all pending sync requests complete, then
update the bitmap.  Doing this only every few seconds (the same as the
bitmap update time) does not noticeably affect resync performance.

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/bitmap.c         |   34 +++++++++++++++++++++++++++++-----
 ./drivers/md/raid1.c          |    1 +
 ./drivers/md/raid10.c         |    2 ++
 ./drivers/md/raid5.c          |    3 +++
 ./include/linux/raid/bitmap.h |    3 +++
 5 files changed, 38 insertions(+), 5 deletions(-)

diff .prev/drivers/md/bitmap.c ./drivers/md/bitmap.c
--- .prev/drivers/md/bitmap.c   2007-10-22 16:55:52.0 +1000
+++ ./drivers/md/bitmap.c   2007-11-12 16:36:30.0 +1100
@@ -1349,14 +1349,38 @@ void bitmap_close_sync(struct bitmap *bi
 	 */
 	sector_t sector = 0;
 	int blocks;
-	if (!bitmap) return;
+	if (!bitmap)
+		return;
 	while (sector < bitmap->mddev->resync_max_sectors) {
 		bitmap_end_sync(bitmap, sector, &blocks, 0);
-/*
-		if (sector < 500) printk("bitmap_close_sync: sec %llu blks %d\n",
-					 (unsigned long long)sector, blocks);
-*/		sector += blocks;
+		sector += blocks;
+	}
+}
+
+void bitmap_cond_end_sync(struct bitmap *bitmap, sector_t sector)
+{
+	sector_t s = 0;
+	int blocks;
+
+	if (!bitmap)
+		return;
+	if (sector == 0) {
+		bitmap->last_end_sync = jiffies;
+		return;
+	}
+	if (time_before(jiffies, (bitmap->last_end_sync
+				  + bitmap->daemon_sleep * HZ)))
+		return;
+	wait_event(bitmap->mddev->recovery_wait,
+		   atomic_read(&bitmap->mddev->recovery_active) == 0);
+
+	sector &= ~((1ULL << CHUNK_BLOCK_SHIFT(bitmap)) - 1);
+	s = 0;
+	while (s < sector && s < bitmap->mddev->resync_max_sectors) {
+		bitmap_end_sync(bitmap, s, &blocks, 0);
+		s += blocks;
 	}
+	bitmap->last_end_sync = jiffies;
 }
 
 static void bitmap_set_memory_bits(struct bitmap *bitmap, sector_t offset, int needed)

diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
--- .prev/drivers/md/raid10.c   2007-10-30 13:50:45.0 +1100
+++ ./drivers/md/raid10.c   2007-11-12 16:06:39.0 +1100
@@ -1671,6 +1671,8 @@ static sector_t sync_request(mddev_t *md
 	if (!go_faster && conf->nr_waiting)
 		msleep_interruptible(1000);
 
+	bitmap_cond_end_sync(mddev->bitmap, sector_nr);
+
/* Again, very different code for resync and recovery.
 * Both must result in an r10bio with a list of bios that
 * have bi_end_io, bi_sector, bi_bdev set,

diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
--- .prev/drivers/md/raid1.c   2007-10-30 13:50:45.0 +1100
+++ ./drivers/md/raid1.c   2007-11-12 16:06:12.0 +1100
@@ -1685,6 +1685,7 @@ static sector_t sync_request(mddev_t *md
 	if (!go_faster && conf->nr_waiting)
msleep_interruptible(1000);