Re: [patch] md: pass down BIO_RW_SYNC in raid{1,10}

2007-01-09 Thread Mike Snitzer

On 1/8/07, Andrew Morton <[EMAIL PROTECTED]> wrote:

On Mon, 8 Jan 2007 10:08:34 +0100
Lars Ellenberg <[EMAIL PROTECTED]> wrote:

> md raidX make_request functions strip off the BIO_RW_SYNC flag,
> thus introducing additional latency.
>
> fixing this in raid1 and raid10 seems to be straightforward enough.
>
> for our particular use case in DRBD, passing this flag brought one
> initialization step down from ~5 minutes to ~5 seconds.

That sounds like a significant fix.


So will this fix also improve performance associated with raid1's
internal bitmap support?  What is the scope of the performance
problems this fix will address?  That is, what are some other examples
of where users might see a benefit from this patch?

regards,
Mike


Re: [patch] md: pass down BIO_RW_SYNC in raid{1,10}

2007-01-09 Thread Lars Ellenberg
/ 2007-01-09 08:12:25 +0100
\ Jens Axboe:
> Ack from me as well, it's really quite a nasty bug from a performance
> POV. Not just for DRDB, but for io schedulers as well.

it's DRBD, btw... we still work on the database, too, of course :)

something should be done with the other raid levels as well,
but how to do that was not as obvious to me, unfortunately.

-- 
: Lars Ellenberg                        Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH  Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :


Re: [patch] md: pass down BIO_RW_SYNC in raid{1,10}

2007-01-08 Thread Jens Axboe
On Tue, Jan 09 2007, Neil Brown wrote:
> On Monday January 8, [EMAIL PROTECTED] wrote:
> > On Mon, 8 Jan 2007 10:08:34 +0100
> > Lars Ellenberg <[EMAIL PROTECTED]> wrote:
> > 
> > > md raidX make_request functions strip off the BIO_RW_SYNC flag,
> > > thus introducing additional latency.
> > > 
> > > fixing this in raid1 and raid10 seems to be straightforward enough.
> > > 
> > > for our particular use case in DRBD, passing this flag brought one
> > > initialization step down from ~5 minutes to ~5 seconds.
> > 
> > That sounds like a significant fix.
> > 
> > This patch also applies to 2.6.19 and I have tagged it for a -stable
> > backport.  Neil, are you OK with that?
> 
> Yes, I'm OK with that, thanks.

Ack from me as well, it's really quite a nasty bug from a performance
POV. Not just for DRDB, but for io schedulers as well.

-- 
Jens Axboe
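
For reference, a minimal sketch of what a submitter does to mark a bio
synchronous under the 2.6.19-era block API (bdev, sector, page and the
my_end_io completion handler are placeholders here, not DRBD code):

	struct bio *bio = bio_alloc(GFP_NOIO, 1);	/* room for one bio_vec */
	bio->bi_bdev = bdev;				/* target block device */
	bio->bi_sector = sector;			/* starting sector on bdev */
	bio_add_page(bio, page, PAGE_SIZE, 0);		/* attach one page of payload */
	bio->bi_end_io = my_end_io;			/* completion callback */
	submit_bio(WRITE | (1 << BIO_RW_SYNC), bio);	/* sync write: the io
							 * scheduler dispatches it
							 * promptly instead of
							 * batching it as async */

Before the patch below, the raid1/raid10 make_request functions rebuilt
bi_rw for the per-disk bios without that bit, so the queues of the
underlying devices never saw it.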



Re: [patch] md: pass down BIO_RW_SYNC in raid{1,10}

2007-01-08 Thread Neil Brown
On Monday January 8, [EMAIL PROTECTED] wrote:
> On Mon, 8 Jan 2007 10:08:34 +0100
> Lars Ellenberg <[EMAIL PROTECTED]> wrote:
> 
> > md raidX make_request functions strip off the BIO_RW_SYNC flag,
> > thus introducing additional latency.
> > 
> > fixing this in raid1 and raid10 seems to be straightforward enough.
> > 
> > for our particular use case in DRBD, passing this flag brought one
> > initialization step down from ~5 minutes to ~5 seconds.
> 
> That sounds like a significant fix.
> 
> This patch also applies to 2.6.19 and I have tagged it for a -stable
> backport.  Neil, are you OK with that?

Yes, I'm OK with that, thanks.

NeilBrown


Re: [patch] md: pass down BIO_RW_SYNC in raid{1,10}

2007-01-08 Thread Andrew Morton
On Mon, 8 Jan 2007 10:08:34 +0100
Lars Ellenberg <[EMAIL PROTECTED]> wrote:

> md raidX make_request functions strip off the BIO_RW_SYNC flag,
> thus introducing additional latency.
> 
> fixing this in raid1 and raid10 seems to be straightforward enough.
> 
> for our particular use case in DRBD, passing this flag brought one
> initialization step down from ~5 minutes to ~5 seconds.

That sounds like a significant fix.

This patch also applies to 2.6.19 and I have tagged it for a -stable
backport.  Neil, are you OK with that?

> [patch snipped; see the original posting below]

[patch] md: pass down BIO_RW_SYNC in raid{1,10}

2007-01-08 Thread Lars Ellenberg
md raidX make_request functions strip off the BIO_RW_SYNC flag,
thus introducing additional latency.

fixing this in raid1 and raid10 seems to be straightforward enough.

for our particular use case in DRBD, passing this flag brought one
initialization step down from ~5 minutes to ~5 seconds.

Acked-by: NeilBrown <[EMAIL PROTECTED]>
Signed-off-by: Lars Ellenberg <[EMAIL PROTECTED]>

---
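
A note on the mechanics of the hunks below: bio_sync() returns the masked
flag bit itself rather than a 0/1 boolean, which is what lets the patch
OR do_sync straight back into bi_rw.  A minimal sketch, assuming the
2.6.19-era definitions in include/linux/bio.h:

	#define BIO_RW_SYNC	4
	#define bio_sync(bio)	((bio)->bi_rw & (1 << BIO_RW_SYNC))

So do_sync is either 0 or (1 << BIO_RW_SYNC), and "READ | do_sync" re-sets
exactly the bit the upper layer had set.  The added md_wakeup_thread()
call covers the write path: queued write bios are normally submitted by
raid1d on unplug, so a sync request kicks the thread immediately instead
of waiting for the unplug timer.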

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index b30f74b..164b25d 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -775,6 +775,7 @@ static int make_request(request_queue_t 
 	struct bio_list bl;
 	struct page **behind_pages = NULL;
 	const int rw = bio_data_dir(bio);
+	const int do_sync = bio_sync(bio);
 	int do_barriers;
 
 	/*
@@ -835,7 +836,7 @@ static int make_request(request_queue_t 
 		read_bio->bi_sector = r1_bio->sector + mirror->rdev->data_offset;
 		read_bio->bi_bdev = mirror->rdev->bdev;
 		read_bio->bi_end_io = raid1_end_read_request;
-		read_bio->bi_rw = READ;
+		read_bio->bi_rw = READ | do_sync;
 		read_bio->bi_private = r1_bio;
 
 		generic_make_request(read_bio);
@@ -906,7 +907,7 @@ #endif
 		mbio->bi_sector = r1_bio->sector + conf->mirrors[i].rdev->data_offset;
 		mbio->bi_bdev = conf->mirrors[i].rdev->bdev;
 		mbio->bi_end_io = raid1_end_write_request;
-		mbio->bi_rw = WRITE | do_barriers;
+		mbio->bi_rw = WRITE | do_barriers | do_sync;
 		mbio->bi_private = r1_bio;
 
 		if (behind_pages) {
@@ -941,6 +942,8 @@ #endif
 		blk_plug_device(mddev->queue);
 	spin_unlock_irqrestore(&conf->device_lock, flags);
 
+	if (do_sync)
+		md_wakeup_thread(mddev->thread);
 #if 0
 	while ((bio = bio_list_pop(&bl)) != NULL)
 		generic_make_request(bio);
@@ -1541,6 +1544,7 @@ static void raid1d(mddev_t *mddev)
 			 * We already have a nr_pending reference on these rdevs.
 			 */
 			int i;
+			const int do_sync = bio_sync(r1_bio->master_bio);
 			clear_bit(R1BIO_BarrierRetry, &r1_bio->state);
 			clear_bit(R1BIO_Barrier, &r1_bio->state);
 			for (i=0; i < conf->raid_disks; i++)
@@ -1561,7 +1565,7 @@ static void raid1d(mddev_t *mddev)
 						conf->mirrors[i].rdev->data_offset;
 					bio->bi_bdev = conf->mirrors[i].rdev->bdev;
 					bio->bi_end_io = raid1_end_write_request;
-					bio->bi_rw = WRITE;
+					bio->bi_rw = WRITE | do_sync;
 					bio->bi_private = r1_bio;
 					r1_bio->bios[i] = bio;
 					generic_make_request(bio);
@@ -1593,6 +1597,7 @@ static void raid1d(mddev_t *mddev)
 				       (unsigned long long)r1_bio->sector);
 				raid_end_bio_io(r1_bio);
 			} else {
+				const int do_sync = bio_sync(r1_bio->master_bio);
 				r1_bio->bios[r1_bio->read_disk] =
 					mddev->ro ? IO_BLOCKED : NULL;
 				r1_bio->read_disk = disk;
@@ -1608,7 +1613,7 @@ static void raid1d(mddev_t *mddev)
 				bio->bi_sector = r1_bio->sector + rdev->data_offset;
 				bio->bi_bdev = rdev->bdev;
 				bio->bi_end_io = raid1_end_read_request;
-				bio->bi_rw = READ;
+				bio->bi_rw = READ | do_sync;
 				bio->bi_private = r1_bio;
 				unplug = 1;
 				generic_make_request(bio);
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index f014191..a9401c0 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -782,6 +782,7 @@ static int make_request(request_queue_t 
 	int i;
 	int chunk_sects = conf->chunk_mask + 1;
 	const int rw = bio_data_dir(bio);
+	const int do_sync = bio_sync(bio);
 	struct bio_list bl;
 	unsigned long flags;
 
@@ -863,7 +864,7 @@ static int make_request(request_queue_t 
 			mirror->rdev->data_offset;
 		read_bio->bi_bdev = mirror->rdev->bdev;
 		read_bio->bi_end_io = raid10_end_read_request;
-		read_bio->bi_rw = READ;
+		read_bio->bi_rw = READ | do_sync;
 		read_bio->bi_private = r10_bio;
 
 		generic_make_request(read_bio);
@@ -909,7 +910,7 @@ static int make_request(request_queue_t