Re: (reiserfs) Re: An elevator algorithm (patch)

2000-09-19 Thread Jens Axboe

On Tue, Sep 19 2000, Andrea Arcangeli wrote:
> > 7[3] 8[2] 9[1] 10[0] 3[3] 4[2] 5[1] 6[0] 1[3] 2[2]
>  p
> With point `p' I mean the request after last barrier in the queue.

Ah, I suspected we were talking past each other.

> Then when we try to insert 99 it goes here:
> 
>   100[0] 102[3] 103[3] 104[3] 99[3]
>   p
> 
> So we have two low peaks in the not starving queue and we should move the p
> to the latest on the right.

Ok good, I've read Peter's patch now. Looks good, I've put it in my
tree as well and will do some testing.

> Also we should make different cases in function of what p->prev is
> (barrier/head/real_head/normalreq).
> 
> I don't think it's worthwhile (even with the current algorithm where it's easy
> to account for p).

I suspect you are right, it's marginal.

-- 
* Jens Axboe <[EMAIL PROTECTED]>
* SuSE Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: (reiserfs) Re: An elevator algorithm (patch)

2000-09-19 Thread Andrea Arcangeli

On Tue, Sep 19, 2000 at 10:38:52PM +0200, Jens Axboe wrote:
> I haven't had a chance to really look at Peter's mods yet, but surely
> the current elevator can have many entries with 0 sequence. As an
> example, say the start sequence is 3 and we received request sector
> 10...1 in descending order. The sorted order would then be
> 
> 7[3] 8[2] 9[1] 10[0] 3[3] 4[2] 5[1] 6[0] 1[3] 2[2]
   p
With point `p' I mean the request after last barrier in the queue.

With the current algorithm we can trivially account for p each time
we do elevator_sequence--. elevator_sequence can't go below zero
and we never play with stuff before the last barrier. So the first
time !--req->elevator_sequence is true we would know that req->next
is the new p (if it's not real_head).

Then we would insert at the left of p if the new req is lower than p.


With the modification instead we could have this queue:

100[0]

Then we insert 102 103 and 104:

100[0] 102[3] 103[3] 104[3]
   p

Then when we try to insert 99 it goes here:

100[0] 102[3] 103[3] 104[3] 99[3]
p

So we have two low peaks in the not starving queue and we should move the p
to the latest on the right.

Also we should make different cases in function of what p->prev is
(barrier/head/real_head/normalreq).

I don't think it's worthwhile (even with the current algorithm where it's easy
to account for p).

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: (reiserfs) Re: An elevator algorithm (patch)

2000-09-19 Thread Jens Axboe

On Tue, Sep 19 2000, Andrea Arcangeli wrote:
> > I don't see any reason why there can't be multiple p points in the current
> > elevator.
> 
> Without the proposed modification after the last barrier in the queue all the
> requests should be in order.

I haven't had a chance to really look at Peter's mods yet, but surely
the current elevator can have many entries with 0 sequence. As an
example, say the start sequence is 3 and we received request sector
10...1 in descending order. The sorted order would then be

7[3] 8[2] 9[1] 10[0] 3[3] 4[2] 5[1] 6[0] 1[3] 2[2]

(sector[sequence])

-- 
* Jens Axboe <[EMAIL PROTECTED]>
* SuSE Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: (reiserfs) Re: An elevator algorithm (patch)

2000-09-19 Thread Andrea Arcangeli

On Tue, Sep 19, 2000 at 09:41:17PM +0200, Jens Axboe wrote:
> I don't see any reason why there can't be multiple p points in the current
> elevator.

Without the proposed modification after the last barrier in the queue all the
requests should be in order.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: (reiserfs) Re: An elevator algorithm (patch)

2000-09-19 Thread Jens Axboe

On Tue, Sep 19 2000, Andrea Arcangeli wrote:
> > But there may be several p points in the queue, how are you going
> > to keep track of all of them?
> 
> With the current elevator there should be only 1 p point, but with the

I don't see any reason why there can't be multiple p points in the current
elevator.

> modification Peter suggested there can be more and we should track the one
> more on the back of the queue. I don't think it's worthwhile.

Agree, I don't think the added complexity would be worth it.

-- 
* Jens Axboe <[EMAIL PROTECTED]>
* SuSE Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: (reiserfs) Re: An elevator algorithm (patch)

2000-09-19 Thread Andrea Arcangeli

On Tue, Sep 19, 2000 at 09:17:51PM +0200, Jens Axboe wrote:
> But there may be several p points in the queue, how are you going
> to keep track of all of them?

With the current elevator there should be only 1 p point, but with the
modification Peter suggested there can be more and we should track the one more
on the back of the queue. I don't think it's worthwhile.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-19 Thread Andrea Arcangeli

On Tue, Sep 19, 2000 at 08:30:05PM +0200, Peter Osterlund wrote:
> It is however possible to decide in O(1) time if the correct insertion
> point is at the end of the queue. We have to keep track of the point,

Right.

> [..] How long
> can the request queue be? Does it have a fixed upper size, or is it

It have a fixed upper size (it's a #define).

> [..] Note that the worst case
> complexity is still O(n) for all algorithms discussed so far.

Right.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: (reiserfs) Re: An elevator algorithm (patch)

2000-09-19 Thread Jens Axboe

On Tue, Sep 19 2000, Peter Osterlund wrote:
> > 2) the block number is smaller than head (or head->next
> >if the current request is unplugged)
> 
> The second condition is not so simple in the case of latency control.
> Consider the following queue:
> 
> sector:   100 200 300 400 10 20 30
> sequence: 1   1   1   1   0  1  1
> 
> In this case it would be correct to insert 150 at the end even though
> it is >100, because no more requests are allowed to pass the "10"
> request.
> 
> It is however possible to decide in O(1) time if the correct insertion
> point is at the end of the queue. We have to keep track of the point,
> p, where no more requests may pass. (10 in the example above.) The logic
> would then be:
> 
>   int insert_at_tail = 0;
>   if (IN_ORDER(p, last)) {
>   if (IN_ORDER(last, req) || IN_ORDER(req, p))
>   insert_at_tail = 1;
>   } else {
>   if (IN_ORDER(last, req) && IN_ORDER(req, p))
>   insert_at_tail = 1;
>   }
>   if (insert_at_tail) {
>   /* Do it in O(1) */
>   } else {
>   /* Do normal O(n) scan and update latencies */
>   }

But there may be several p points in the queue, how are you going
to keep track of all of them?

> The question is if this is worth the extra code complexity. How long
> can the request queue be? Does it have a fixed upper size, or is it
> limited only by available memory? If the request queue is always
> short, the O(n) complexity shouldn't matter. Note that the worst case
> complexity is still O(n) for all algorithms discussed so far.

See QUEUE_NR_REQUESTS in blkdev.h -- the standard size is 256 requests
per queue.

-- 
* Jens Axboe <[EMAIL PROTECTED]>
* SuSE Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: (reiserfs) Re: An elevator algorithm (patch)

2000-09-19 Thread Jens Axboe

On Tue, Sep 19 2000, Rik van Riel wrote:
> This is a bug in Andrea's idea.  The request should only
> be inserted at the end of the list if:
> 
> 1) the block numbre is bigger than head->prev (which you
>already have)

Of course.

> 2) the block number is smaller than head (or head->next
>if the current request is unplugged)

Not necessarily, that's what we have the sequencing numbers for. It
still allows sorting, as long as we are allowed to pass the request.
And note that it is only head->next when the queue is unplugged with
an active head.

-- 
* Jens Axboe <[EMAIL PROTECTED]>
* SuSE Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-19 Thread Peter Osterlund

Rik van Riel <[EMAIL PROTECTED]> writes:

> This is a bug in Andrea's idea.  The request should only
> be inserted at the end of the list if:
> 
> 1) the block numbre is bigger than head->prev (which you
>already have)
> 
> AND
> 
> 2) the block number is smaller than head (or head->next
>if the current request is unplugged)

The second condition is not so simple in the case of latency control.
Consider the following queue:

sector:   100 200 300 400 10 20 30
sequence: 1   1   1   1   0  1  1

In this case it would be correct to insert 150 at the end even though
it is >100, because no more requests are allowed to pass the "10"
request.

It is however possible to decide in O(1) time if the correct insertion
point is at the end of the queue. We have to keep track of the point,
p, where no more requests may pass. (10 in the example above.) The logic
would then be:

int insert_at_tail = 0;
if (IN_ORDER(p, last)) {
if (IN_ORDER(last, req) || IN_ORDER(req, p))
insert_at_tail = 1;
} else {
if (IN_ORDER(last, req) && IN_ORDER(req, p))
insert_at_tail = 1;
}
if (insert_at_tail) {
/* Do it in O(1) */
} else {
/* Do normal O(n) scan and update latencies */
}

The question is if this is worth the extra code complexity. How long
can the request queue be? Does it have a fixed upper size, or is it
limited only by available memory? If the request queue is always
short, the O(n) complexity shouldn't matter. Note that the worst case
complexity is still O(n) for all algorithms discussed so far.

-- 
Peter Österlund  Email: [EMAIL PROTECTED]
Sköndalsvägen 35[EMAIL PROTECTED]
S-128 66 Sköndal Home page: http://home1.swipnet.se/~w-15919
Sweden   Phone: +46 8 942647

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-19 Thread Andrea Arcangeli

On Tue, Sep 19, 2000 at 07:17:42AM -0300, Rik van Riel wrote:
> This is a bug in Andrea's idea.  The request should only
> be inserted at the end of the list if:
> 
> 1) the block numbre is bigger than head->prev (which you
>already have)

If you read the code you'll see that in his previous patch he wasn't doing
that.  That's what I suggested to change to return in O(1) behaviour.

> 2) the block number is smaller than head (or head->next
>if the current request is unplugged)

You're wrong. While the queue is unplugged there are peaks in the queue caused
by the latency control and head->next is not guarnateed to be the lower block
in the queue.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-19 Thread Andrea Arcangeli

On Tue, Sep 19, 2000 at 04:50:26AM -0300, Rik van Riel wrote:
> When the latency gets above 10 minutes, I'd call it starvation ;)

Me too, no argument about that.

> Not average latency no, but the starvation issue should be
> fixed...

2.2.18pre9aa1 delivers acceptable latency for me with the same
latency control logic of 2.4.x. I will check test9 soon.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-19 Thread Rik van Riel

On 18 Sep 2000, Peter Osterlund wrote:
> Andrea Arcangeli <[EMAIL PROTECTED]> writes:
> 
> > > The only disadvantage I can see is that the new patch doesn't handle
> > > consecutive insertions in O(1) time, but then again, the pre-latency
> > 
> > We can still do that by trivially fixing a bit your code. You should first
> > check if the new inserted request is over the last in the current queue before
> > entering the tmp1/tmp2 logic.
> 
> Yes this can be done, but it will affect where requests are inserted.
> Suppose the queue currently contains:
> 
>   100 200 300 400 10 20 30
> 
> If request 150 is to be inserted, then with my previous patch it
> will be inserted between 100 and 200, but with the proposed
> change it will instead be inserted at the end.

This is a bug in Andrea's idea.  The request should only
be inserted at the end of the list if:

1) the block numbre is bigger than head->prev (which you
   already have)

AND

2) the block number is smaller than head (or head->next
   if the current request is unplugged)

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-19 Thread Rik van Riel

On Mon, 18 Sep 2000, Andrea Arcangeli wrote:
> On Sun, Sep 17, 2000 at 05:38:17PM -0300, Rik van Riel wrote:
> > Yes, I run my system with elvtune -r 250 -w 500.
> 
> Ok (sorry for asking, it was just to be sure).
> 
> > But even with -r 5 -w 5, I saw starvation. This, and
> 
> I'd call it "too high latency", not starvation.

When the latency gets above 10 minutes, I'd call it starvation ;)

> > the code, is a clear sign to me that Peter is absolutely
> > right and his corrections to the code should be merged.
> 
> Peter correction can improve performance but it shouldn't
> decrease the latency significantly.

Not average latency no, but the starvation issue should be
fixed...

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-19 Thread Rik van Riel

On Mon, 18 Sep 2000, Andrea Arcangeli wrote:
 On Sun, Sep 17, 2000 at 05:38:17PM -0300, Rik van Riel wrote:
  Yes, I run my system with elvtune -r 250 -w 500.
 
 Ok (sorry for asking, it was just to be sure).
 
  But even with -r 5 -w 5, I saw starvation. This, and
 
 I'd call it "too high latency", not starvation.

When the latency gets above 10 minutes, I'd call it starvation ;)

  the code, is a clear sign to me that Peter is absolutely
  right and his corrections to the code should be merged.
 
 Peter correction can improve performance but it shouldn't
 decrease the latency significantly.

Not average latency no, but the starvation issue should be
fixed...

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-19 Thread Rik van Riel

On 18 Sep 2000, Peter Osterlund wrote:
 Andrea Arcangeli [EMAIL PROTECTED] writes:
 
   The only disadvantage I can see is that the new patch doesn't handle
   consecutive insertions in O(1) time, but then again, the pre-latency
  
  We can still do that by trivially fixing a bit your code. You should first
  check if the new inserted request is over the last in the current queue before
  entering the tmp1/tmp2 logic.
 
 Yes this can be done, but it will affect where requests are inserted.
 Suppose the queue currently contains:
 
   100 200 300 400 10 20 30
 
 If request 150 is to be inserted, then with my previous patch it
 will be inserted between 100 and 200, but with the proposed
 change it will instead be inserted at the end.

This is a bug in Andrea's idea.  The request should only
be inserted at the end of the list if:

1) the block numbre is bigger than head-prev (which you
   already have)

AND

2) the block number is smaller than head (or head-next
   if the current request is unplugged)

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-19 Thread Andrea Arcangeli

On Tue, Sep 19, 2000 at 07:17:42AM -0300, Rik van Riel wrote:
 This is a bug in Andrea's idea.  The request should only
 be inserted at the end of the list if:
 
 1) the block numbre is bigger than head-prev (which you
already have)

If you read the code you'll see that in his previous patch he wasn't doing
that.  That's what I suggested to change to return in O(1) behaviour.

 2) the block number is smaller than head (or head-next
if the current request is unplugged)

You're wrong. While the queue is unplugged there are peaks in the queue caused
by the latency control and head-next is not guarnateed to be the lower block
in the queue.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-19 Thread Peter Osterlund

Rik van Riel [EMAIL PROTECTED] writes:

 This is a bug in Andrea's idea.  The request should only
 be inserted at the end of the list if:
 
 1) the block numbre is bigger than head-prev (which you
already have)
 
 AND
 
 2) the block number is smaller than head (or head-next
if the current request is unplugged)

The second condition is not so simple in the case of latency control.
Consider the following queue:

sector:   100 200 300 400 10 20 30
sequence: 1   1   1   1   0  1  1

In this case it would be correct to insert 150 at the end even though
it is 100, because no more requests are allowed to pass the "10"
request.

It is however possible to decide in O(1) time if the correct insertion
point is at the end of the queue. We have to keep track of the point,
p, where no more requests may pass. (10 in the example above.) The logic
would then be:

int insert_at_tail = 0;
if (IN_ORDER(p, last)) {
if (IN_ORDER(last, req) || IN_ORDER(req, p))
insert_at_tail = 1;
} else {
if (IN_ORDER(last, req)  IN_ORDER(req, p))
insert_at_tail = 1;
}
if (insert_at_tail) {
/* Do it in O(1) */
} else {
/* Do normal O(n) scan and update latencies */
}

The question is if this is worth the extra code complexity. How long
can the request queue be? Does it have a fixed upper size, or is it
limited only by available memory? If the request queue is always
short, the O(n) complexity shouldn't matter. Note that the worst case
complexity is still O(n) for all algorithms discussed so far.

-- 
Peter Österlund  Email: [EMAIL PROTECTED]
Sköndalsvägen 35[EMAIL PROTECTED]
S-128 66 Sköndal Home page: http://home1.swipnet.se/~w-15919
Sweden   Phone: +46 8 942647

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: (reiserfs) Re: An elevator algorithm (patch)

2000-09-19 Thread Andrea Arcangeli

On Tue, Sep 19, 2000 at 10:38:52PM +0200, Jens Axboe wrote:
 I haven't had a chance to really look at Peter's mods yet, but surely
 the current elevator can have many entries with 0 sequence. As an
 example, say the start sequence is 3 and we received request sector
 10...1 in descending order. The sorted order would then be
 
 7[3] 8[2] 9[1] 10[0] 3[3] 4[2] 5[1] 6[0] 1[3] 2[2]
   p
With point `p' I mean the request after last barrier in the queue.

With the current algorithm we can trivially account for p each time
we do elevator_sequence--. elevator_sequence can't go below zero
and we never play with stuff before the last barrier. So the first
time !--req-elevator_sequence is true we would know that req-next
is the new p (if it's not real_head).

Then we would insert at the left of p if the new req is lower than p.


With the modification instead we could have this queue:

100[0]

Then we insert 102 103 and 104:

100[0] 102[3] 103[3] 104[3]
   p

Then when we try to insert 99 it goes here:

100[0] 102[3] 103[3] 104[3] 99[3]
p

So we have two low peaks in the not starving queue and we should move the p
to the latest on the right.

Also we should make different cases in function of what p-prev is
(barrier/head/real_head/normalreq).

I don't think it's worthwhile (even with the current algorithm where it's easy
to account for p).

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: (reiserfs) Re: An elevator algorithm (patch)

2000-09-19 Thread Jens Axboe

On Tue, Sep 19 2000, Andrea Arcangeli wrote:
  7[3] 8[2] 9[1] 10[0] 3[3] 4[2] 5[1] 6[0] 1[3] 2[2]
  p
 With point `p' I mean the request after last barrier in the queue.

Ah, I suspected we were talking past each other.

 Then when we try to insert 99 it goes here:
 
   100[0] 102[3] 103[3] 104[3] 99[3]
   p
 
 So we have two low peaks in the not starving queue and we should move the p
 to the latest on the right.

Ok good, I've read Peter's patch now. Looks good, I've put it in my
tree as well and will do some testing.

 Also we should make different cases in function of what p-prev is
 (barrier/head/real_head/normalreq).
 
 I don't think it's worthwhile (even with the current algorithm where it's easy
 to account for p).

I suspect you are right, it's marginal.

-- 
* Jens Axboe [EMAIL PROTECTED]
* SuSE Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-18 Thread Peter Osterlund

Andrea Arcangeli <[EMAIL PROTECTED]> writes:

> > The only disadvantage I can see is that the new patch doesn't handle
> > consecutive insertions in O(1) time, but then again, the pre-latency
> 
> We can still do that by trivially fixing a bit your code. You should first
> check if the new inserted request is over the last in the current queue before
> entering the tmp1/tmp2 logic.

Yes this can be done, but it will affect where requests are inserted.
Suppose the queue currently contains:

100 200 300 400 10 20 30

If request 150 is to be inserted, then with my previous patch it will
be inserted between 100 and 200, but with the proposed change it will
instead be inserted at the end. This is good for latency because there
will be less reordering, but potentially harmfull for streaming
performance because the total disk head traveling distance increases.

Anyway, I tested a new patch with your suggestion, and indeed this
change doesn't seem to cause too many extra seeks. I set the r/w
latencies to 1000/5000, and the iozone and Bonnie test produced almost
the same values as before.

I also noted that I could (almost) play an mp3 file with a 48Kb
buffer during "cp /dev/zero tmpfile". There is one rather large
skip a few seconds after the copy starts, but after that there are
only small occasional skips. I also ran iozone while playing the mp3,
and there were no major sound skips, and iozone only ran 7% slower
than without mpg123 running.

Thanks again for you comments. Here is the new patch:

--- linux-2.4.0-test8/drivers/block/elevator.c.orig Sun Sep 17 00:05:03 2000
+++ linux-2.4.0-test8/drivers/block/elevator.c  Mon Sep 18 22:40:53 2000
@@ -34,20 +34,57 @@
struct list_head *real_head,
struct list_head *head, int orig_latency)
 {
-   struct list_head *entry = real_head;
-   struct request *tmp;
+   struct list_head *last = real_head->prev;
+   struct list_head *insert_after = last;
+   struct list_head *entry;
+   struct request *tmp1, *tmp2;
+   int do_insert;
 
req->elevator_sequence = orig_latency;
 
-   while ((entry = entry->prev) != head) {
-   tmp = blkdev_entry_to_request(entry);
-   if (IN_ORDER(tmp, req))
+   if (last == head)
+   goto out;
+
+   entry = last;
+   tmp1 = blkdev_entry_to_request(entry);
+   if (IN_ORDER(tmp1, req))
+   goto out;
+   do {
+   tmp2 = tmp1;
+   entry = entry->prev;
+   tmp1 = blkdev_entry_to_request(entry);
+   if (!tmp2->elevator_sequence)
break;
-   if (!tmp->elevator_sequence)
+   do_insert = 0;
+   if (entry == real_head) {
+   /*
+* Since we don't know were the disk head is
+* currently located, we can not really know
+* if the request should be inserted here. The
+* best bet is probably not to insert the
+* request here, because otherwise the
+* elevator would be unfair to sectors at the
+* end of the disk.
+*/
+   } else if (IN_ORDER(tmp1, tmp2)) {
+   if (IN_ORDER(tmp1, req) && IN_ORDER(req, tmp2))
+   do_insert = 1;
+   } else {
+   if (IN_ORDER(tmp1, req) || IN_ORDER(req, tmp2))
+   do_insert = 1;
+   }
+   if (do_insert) {
+   insert_after = entry;
+   do {
+   entry = entry->next;
+   tmp1 = blkdev_entry_to_request(entry);
+   tmp1->elevator_sequence--;
+   } while (entry != last);
break;
-   tmp->elevator_sequence--;
-   }
-   list_add(>queue, entry);
+   }
+   } while (entry != head);
+out:
+   list_add(>queue, insert_after);
 }
 
 int elevator_linus_merge(request_queue_t *q, struct request **req,

-- 
Peter Österlund  Email: [EMAIL PROTECTED]
Sköndalsvägen 35[EMAIL PROTECTED]
S-128 66 Sköndal Home page: http://home1.swipnet.se/~w-15919
Sweden   Phone: +46 8 942647

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-18 Thread Andrea Arcangeli

On Sun, Sep 17, 2000 at 08:57:00PM +0200, Peter Osterlund wrote:
> The new patch will obey the latency requirement but still keep disk
> seeks to a minimum. This makes it possible to use much smaller latency

Yes I agree that's a much nicer behaviour with strict latency requirements
(that's what 2.2.1[67] is doing btw).

> values without losing streaming performance. (I used "elvtune -r 1
> -w 20" and didn't notice any slowdown in iozone and Bonnie. It is
> still faster than plain 2.4.0-test8)

You should be able to use 250/500 as well without slowdown.

> Yes, that would be even better, but IMHO the new patch is already
> better than what's currently in the kernel. (I don't think the

With scsi the new patch won't make any difference with the default latency
settings but I agree it's better for the other cases.

> The only disadvantage I can see is that the new patch doesn't handle
> consecutive insertions in O(1) time, but then again, the pre-latency

We can still do that by trivially fixing a bit your code. You should first
check if the new inserted request is over the last in the current queue before
entering the tmp1/tmp2 logic. If the new request is higher than read_head->prev
then you should insert at the tail of the queue in O(1).  Please re-do the
patch doing this O(1) optimization for the common case then I'll recommend it
for integration (probably I can integrate it into my tree with lots of other
fixes from me and Jens and then I can upload a jumbo patch to test). I can
also integrate the suggestion from Stephen of accouting a "pass" due a merged
request as 1 and a "pass" due a separate scsi command as 2. Such heuristic
makes perfect sense to me too.

> + if (entry == real_head) {
> + /*
> +  * Since we don't know were the disk head is
> +  * currently located, we can not really know
> +  * if the request should be inserted here. The
> +  * best bet is probably not to insert the
> +  * request here, because otherwise the
> +  * elevator would be unfair to sectors at the
> +  * end of the disk.
> +  */

Agreed. Then once we'll add some more memory about the last position of the head
to the request_queue_t we'll be able to do even better choice.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-18 Thread Peter Osterlund

Andrea Arcangeli [EMAIL PROTECTED] writes:

  The only disadvantage I can see is that the new patch doesn't handle
  consecutive insertions in O(1) time, but then again, the pre-latency
 
 We can still do that by trivially fixing a bit your code. You should first
 check if the new inserted request is over the last in the current queue before
 entering the tmp1/tmp2 logic.

Yes this can be done, but it will affect where requests are inserted.
Suppose the queue currently contains:

100 200 300 400 10 20 30

If request 150 is to be inserted, then with my previous patch it will
be inserted between 100 and 200, but with the proposed change it will
instead be inserted at the end. This is good for latency because there
will be less reordering, but potentially harmfull for streaming
performance because the total disk head traveling distance increases.

Anyway, I tested a new patch with your suggestion, and indeed this
change doesn't seem to cause too many extra seeks. I set the r/w
latencies to 1000/5000, and the iozone and Bonnie test produced almost
the same values as before.

I also noted that I could (almost) play an mp3 file with a 48Kb
buffer during "cp /dev/zero tmpfile". There is one rather large
skip a few seconds after the copy starts, but after that there are
only small occasional skips. I also ran iozone while playing the mp3,
and there were no major sound skips, and iozone only ran 7% slower
than without mpg123 running.

Thanks again for you comments. Here is the new patch:

--- linux-2.4.0-test8/drivers/block/elevator.c.orig Sun Sep 17 00:05:03 2000
+++ linux-2.4.0-test8/drivers/block/elevator.c  Mon Sep 18 22:40:53 2000
@@ -34,20 +34,57 @@
struct list_head *real_head,
struct list_head *head, int orig_latency)
 {
-   struct list_head *entry = real_head;
-   struct request *tmp;
+   struct list_head *last = real_head-prev;
+   struct list_head *insert_after = last;
+   struct list_head *entry;
+   struct request *tmp1, *tmp2;
+   int do_insert;
 
req-elevator_sequence = orig_latency;
 
-   while ((entry = entry-prev) != head) {
-   tmp = blkdev_entry_to_request(entry);
-   if (IN_ORDER(tmp, req))
+   if (last == head)
+   goto out;
+
+   entry = last;
+   tmp1 = blkdev_entry_to_request(entry);
+   if (IN_ORDER(tmp1, req))
+   goto out;
+   do {
+   tmp2 = tmp1;
+   entry = entry-prev;
+   tmp1 = blkdev_entry_to_request(entry);
+   if (!tmp2-elevator_sequence)
break;
-   if (!tmp-elevator_sequence)
+   do_insert = 0;
+   if (entry == real_head) {
+   /*
+* Since we don't know were the disk head is
+* currently located, we can not really know
+* if the request should be inserted here. The
+* best bet is probably not to insert the
+* request here, because otherwise the
+* elevator would be unfair to sectors at the
+* end of the disk.
+*/
+   } else if (IN_ORDER(tmp1, tmp2)) {
+   if (IN_ORDER(tmp1, req)  IN_ORDER(req, tmp2))
+   do_insert = 1;
+   } else {
+   if (IN_ORDER(tmp1, req) || IN_ORDER(req, tmp2))
+   do_insert = 1;
+   }
+   if (do_insert) {
+   insert_after = entry;
+   do {
+   entry = entry-next;
+   tmp1 = blkdev_entry_to_request(entry);
+   tmp1-elevator_sequence--;
+   } while (entry != last);
break;
-   tmp-elevator_sequence--;
-   }
-   list_add(req-queue, entry);
+   }
+   } while (entry != head);
+out:
+   list_add(req-queue, insert_after);
 }
 
 int elevator_linus_merge(request_queue_t *q, struct request **req,

-- 
Peter Österlund  Email: [EMAIL PROTECTED]
Sköndalsvägen 35[EMAIL PROTECTED]
S-128 66 Sköndal Home page: http://home1.swipnet.se/~w-15919
Sweden   Phone: +46 8 942647

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Peter Osterlund

On Sun, 17 Sep 2000, Andrea Arcangeli wrote:

: And with the default latency values ("infinite") with the test2
: elevator if you're using scsi as your device, the patch can't make
: runtime differences either.

The test2 elevator (assuming it is the same as the test8 version) in the
infinite latency case will always send the request with the lowest sector
number to the drive. (The request queue will always be sorted, since the
elevator function degenerates to insertion sort.) Do you really suggest
that this is as good as a real elevator algorithm?

-- 
Peter Österlund  Email: [EMAIL PROTECTED]
Sköndalsvägen 35[EMAIL PROTECTED]
S-128 66 Sköndal Home page: http://home1.swipnet.se/~w-15919
Sweden   Phone: +46 8 942647


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Andrea Arcangeli

On Sun, Sep 17, 2000 at 04:05:55PM -0300, Rik van Riel wrote:
> to be addressed ASAP. I've witnessed this starvation happen
> a couple of times and it's a really big problem...

Did you enabled the latency control as I suggested you a few days ago?

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Rik van Riel

On Sun, 17 Sep 2000, Andrea Arcangeli wrote:
> On Sun, Sep 17, 2000 at 04:05:55PM -0300, Rik van Riel wrote:
> > to be addressed ASAP. I've witnessed this starvation happen
> > a couple of times and it's a really big problem...
> 
> Did you enabled the latency control as I suggested you a few days ago?

Yes, I run my system with elvtune -r 250 -w 500.

But even with -r 5 -w 5, I saw starvation. This, and
the code, is a clear sign to me that Peter is absolutely
right and his corrections to the code should be merged.

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Marcelo Tosatti


On Sun, 17 Sep 2000, Andrea Arcangeli wrote:

 

> If nobody does that before me I will try this "remeber last position of the
> head" idea in my blkdev tree (there are many other pending elevator fixes in
> it) as soon as I finished with 2.2.18pre9aa1 LFS nfsv3 and as soon as I finish
> the fix for the spinlocks (this spinlock "memory" thing got starved over and
> over again unfortunately, at least it's not an urgent fix :).

Andrea, 

The 2.2 elevator code suffers from this problem too? 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Andrea Arcangeli

On Sun, Sep 17, 2000 at 04:39:35PM -0300, Marcelo Tosatti wrote:
> 
> On Sun, 17 Sep 2000, Andrea Arcangeli wrote:
> 
>  
> 
> > If nobody does that before me I will try this "remeber last position of the
> > head" idea in my blkdev tree (there are many other pending elevator fixes in
> > it) as soon as I finished with 2.2.18pre9aa1 LFS nfsv3 and as soon as I finish
> > the fix for the spinlocks (this spinlock "memory" thing got starved over and
> > over again unfortunately, at least it's not an urgent fix :).
> 
> Andrea, 
> 
> The 2.2 elevator code suffers from this problem too? 

In the 2.2.x patches I didn't changed the ordering algorithm, see:

const int after_current = IN_ORDER(tmp,req);
const int before_next = IN_ORDER(req,tmp->next);

if (!IN_ORDER(tmp,tmp->next)) {
if (after_current || before_next)
break;
} else {
if (after_current && before_next)
break;
}

so it doesn't suffer form that problem. Also the elevator before
test2 didn't sufferred from that problem, see:

-   while ((entry = entry->prev) != head) {
-   if (!point && latency >= 0) {
-   point = entry;
-   point_latency = latency;
-   }
-   tmp = blkdev_entry_to_request(entry);
-   if (elevator_sequence_before(tmp->elevator_sequence, sequence) || 
!tmp->q)
-   break;
-   if (latency >= 0) {
-   if (IN_ORDER(tmp, req) ||
-   (pass && !IN_ORDER(tmp, blkdev_next_request(tmp
-   goto link;
-   }
-   latency += tmp->nr_segments;
-   pass = 1;
-   }
-
-   if (point) {
-   entry = point;
-   latency = point_latency;
-   }
-
- link:
-   list_add(>queue, entry);
-   req->elevator_sequence = elevator_sequence(elevator, latency);


And with the default latency values ("infinite") with the test2 elevator if
you're using scsi as your device, the patch can't make runtime differences
either.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Andrea Arcangeli

On Sun, Sep 17, 2000 at 05:38:17PM -0300, Rik van Riel wrote:
> Yes, I run my system with elvtune -r 250 -w 500.

Ok (sorry for asking, it was just to be sure).

> But even with -r 5 -w 5, I saw starvation. This, and

I'd call it "too high latency", not starvation. Well, strictly speaking it's
not even correct to call starvation the behaviour of 2.2.15 because the latency
provided by CSCAN (without any additional I/O scheduler logic) have an high
bound that depends on the size of the disk and on the speed of the disk, but
the higher possible latency of the old elevator was so big that calling it
"starvation" was not so badly wrong :).

> the code, is a clear sign to me that Peter is absolutely
> right and his corrections to the code should be merged.

Peter correction can improve performance but it shouldn't decrease the latency
significantly. Of course I see that running the I/O faster you have a chance to
reduce also the latency but as said not of a significant factor (not at the
levels of 2.2.16 or 2.4.0-test1).

The fact is that to reduce the latency of a pagein while a background
write is happening, we have to generate seeks at higher frequency. If we do
that we hurt the performance of the write and we change the tiotest numbers
(that's what happens in 2.2.16 for example). If you increase the latency
then tiotest returns to run fast as before. I don't see an easy solution
to apply to a global algorithm that doesn't know the semantics of the I/O
(so that doesn't know when somebody (human) is waiting I/O completion while
looking the screen).

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Gérard Roudier



On Sun, 17 Sep 2000, Rik van Riel wrote:

> On 17 Sep 2000, Peter Osterlund wrote:
> > Andrea Arcangeli <[EMAIL PROTECTED]> writes:
> > 
> > > While the queue is plugged or with things like SCSI your logic
> > > change won't work because in such case if your request is lower the
> > > lowest in the queue, you can put it at the head of the queue and you
> > > have no way to know where your "tmp1" was placed so you can't make
> > > any assumption (that's why the current code makes sense).
> > 
> > I still don't think the current code makes sense.
> 
> [snip examples]
> 
> Indeed, it's obvious that your code will give better
> results (and it's more readable too). I like it...
> 
> > The new patch is also not unfair to requests near the end of the
> > disk. The current kernel code can starve those requests a very
> > long time if the request queue never becomes empty.
> 
> This is a very very big problem, which definately needs
> to be addressed ASAP. I've witnessed this starvation happen
> a couple of times and it's a really big problem...
> 
> > The only disadvantage I can see is that the new patch doesn't
> > handle consecutive insertions in O(1) time, but then again, the
> > pre-latency elevator code didn't do that either. Is this really
> > important? How long can the request queue be? Apparently we gain
> > more by avoiding disk seeks than we lose by wasting some CPU
> > cycles during request insertion.
> 
> Well DUH. ;)
> 
> A disk seek takes ~10 milliseconds on a modern drive,
> that's about an /eternity/ as far as the CPU is concerned.

This was under MS/DOS 10 years ago probably.

Nowadays you can connect 30 disks of about 4 ms average seek time to a
single Ultra 160 2 channels 64 bit 66MHz PCI controller. With appropriate
benchmarks that gain advantage of disk prefetching, you can easily observe
30,000 short IOs per seconds and even more (15000 per channel). This
happens using no more than 3 disks per BUS.

For real work, with disks actually seeking, such an possible system should  
probably be capable of performing more than 6000 IOs per seconds (That's 
theory from me, since I never saw such a system :-) ).
This gives some margin for wasting CPU cycles but far less than you seem 
to assume, in my opinion.

> Anyone who thinks saving on CPU time is worth it for
> something as critical to disk seeks as the elevator code
> should be locked up inside a microVAX ;)

Not wasting uselessly CPU cycles is still worth it, in my opinion.

Regards,
  Gerard.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Andrea Arcangeli

On Sun, Sep 17, 2000 at 10:53:17PM +0200, Peter Osterlund wrote:
> The test2 elevator (assuming it is the same as the test8 version) in the

Yes it's the same in such respect.

> infinite latency case will always send the request with the lowest sector
> number to the drive. (The request queue will always be sorted, since the

Correct. And infact that's what is happening to 2.4.0-test8.

> elevator function degenerates to insertion sort.) Do you really suggest
> that this is as good as a real elevator algorithm?

No, I wasn't suggesting anything like that, I was only saying "don't complain
the elevator if the number you got out of test8 aren't good and you were
running on scsi". I guess the latency defaults are been set to a too high
default value exactly to be able to say that :). Jens also verified that a
default of 250 and 500 is good enough to not degenerate the tiotest numbers
(tested on 2.2.17 + elv-simple-1 patch), and with a 250/500 pair a `find` with
a `cp /dev/zero .` in background is not too bad (not like an deadlock and you
can visibly see the fix in action, however you'll definitely notice the cp in
background :).

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Rik van Riel

On 17 Sep 2000, Peter Osterlund wrote:
> Andrea Arcangeli <[EMAIL PROTECTED]> writes:
> 
> > While the queue is plugged or with things like SCSI your logic
> > change won't work because in such case if your request is lower the
> > lowest in the queue, you can put it at the head of the queue and you
> > have no way to know where your "tmp1" was placed so you can't make
> > any assumption (that's why the current code makes sense).
> 
> I still don't think the current code makes sense.

[snip examples]

Indeed, it's obvious that your code will give better
results (and it's more readable too). I like it...

> The new patch is also not unfair to requests near the end of the
> disk. The current kernel code can starve those requests a very
> long time if the request queue never becomes empty.

This is a very very big problem, which definately needs
to be addressed ASAP. I've witnessed this starvation happen
a couple of times and it's a really big problem...

> The only disadvantage I can see is that the new patch doesn't
> handle consecutive insertions in O(1) time, but then again, the
> pre-latency elevator code didn't do that either. Is this really
> important? How long can the request queue be? Apparently we gain
> more by avoiding disk seeks than we lose by wasting some CPU
> cycles during request insertion.

Well DUH. ;)

A disk seek takes ~10 milliseconds on a modern drive,
that's about an /eternity/ as far as the CPU is concerned.

Anyone who thinks saving on CPU time is worth it for
something as critical to disk seeks as the elevator code
should be locked up inside a microVAX ;)

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Peter Osterlund

Andrea Arcangeli <[EMAIL PROTECTED]> writes:

> The patch is buggy for non headactive devices like SCSI and also for
> IDE while the queue is plugged.

Thanks for looking at the patch. Yes, sorry, I misunderstood the linux
linked list implementation. This is easy to fix though. See new patch
at the end.

> While the queue is plugged or with things like SCSI your logic
> change won't work because in such case if your request is lower the
> lowest in the queue, you can put it at the head of the queue and you
> have no way to know where your "tmp1" was placed so you can't make
> any assumption (that's why the current code makes sense).

I still don't think the current code makes sense. Consider what will
happen if you insert request 100, 200, 101, 201, 102, 102, ... in an
empty plugged queue. (Assume max allowed latency is 2)

queue:   100 101 102 200 103 201 104 202 105 203 204 205 ...
latency: 2   2   2   0   2   0   2   0   2   0   1   2   ...

So you end up with one seek between each request. The same thing
happens no matter how large the latency is. It just takes longer
before the behaviour starts. With the new logic you will instead get:

queue:   100 101 102 200 201 202 203 204 103 104 105 205 ...
latency: 2   2   2   0   1   2   2   2   0   1   2   2   ...

So in this case there will be 5 (2*latency+1) requests between each
seek.

The new patch will obey the latency requirement but still keep disk
seeks to a minimum. This makes it possible to use much smaller latency
values without losing streaming performance. (I used "elvtune -r 1
-w 20" and didn't notice any slowdown in iozone and Bonnie. It is
still faster than plain 2.4.0-test8)

The new patch is also not unfair to requests near the end of the disk.
The current kernel code can starve those requests a very long time if
the request queue never becomes empty.

> But I had an idea to generalize the algorithm so that we could
> optimize SCSI as well and also IDE in the plugged case: nobody
> forbid us to remeber the last position of the drive in the
> request_queue_t to still be able to know your "tmp1" that actually
> we don't know about.

Yes, that would be even better, but IMHO the new patch is already
better than what's currently in the kernel. (I don't think the
elevator code before your latency fix handled this correctly either.)

The only disadvantage I can see is that the new patch doesn't handle
consecutive insertions in O(1) time, but then again, the pre-latency
elevator code didn't do that either. Is this really important? How
long can the request queue be? Apparently we gain more by avoiding
disk seeks than we lose by wasting some CPU cycles during request
insertion.

Here is the new patch:

--- linux-2.4.0-test8/drivers/block/elevator.c.orig Sun Sep 17 00:05:03 2000
+++ linux-2.4.0-test8/drivers/block/elevator.c  Sun Sep 17 20:27:48 2000
@@ -34,20 +34,55 @@
struct list_head *real_head,
struct list_head *head, int orig_latency)
 {
-   struct list_head *entry = real_head;
-   struct request *tmp;
-
-   req->elevator_sequence = orig_latency;
-
-   while ((entry = entry->prev) != head) {
-   tmp = blkdev_entry_to_request(entry);
-   if (IN_ORDER(tmp, req))
-   break;
-   if (!tmp->elevator_sequence)
-   break;
-   tmp->elevator_sequence--;
-   }
-   list_add(>queue, entry);
+   struct list_head *last = real_head->prev;
+   struct list_head *insert_after = last;
+   struct list_head *entry;
+   struct request *tmp1, *tmp2;
+   int do_insert;
+ 
+   req->elevator_sequence = orig_latency;
+ 
+   if (last == head)
+   goto out;
+ 
+   entry = last;
+   tmp1 = blkdev_entry_to_request(entry);
+   do {
+   tmp2 = tmp1;
+   entry = entry->prev;
+   tmp1 = blkdev_entry_to_request(entry);
+   if (!tmp2->elevator_sequence)
+   break;
+   do_insert = 0;
+   if (entry == real_head) {
+   /*
+* Since we don't know were the disk head is
+* currently located, we can not really know
+* if the request should be inserted here. The
+* best bet is probably not to insert the
+* request here, because otherwise the
+* elevator would be unfair to sectors at the
+* end of the disk.
+*/
+   } else if (IN_ORDER(tmp1, tmp2)) {
+   if (IN_ORDER(tmp1, req) && IN_ORDER(req, tmp2))
+   do_insert = 1;
+   } else {
+   if (IN_ORDER(tmp1, req) || IN_ORDER(req, tmp2))
+   do_insert = 1;
+   }
+   if (do_insert) {
+

Re: An elevator algorithm (patch)

2000-09-17 Thread Andrea Arcangeli

On Sun, Sep 17, 2000 at 01:26:22AM +0200, Peter Osterlund wrote:
> Indeed, the elevator logic is somewhat flawed. There are two problems
> with the current code:
> 
> 1. The test that decides if we have found a good spot to insert the
>current request doesn't handle the wraparound case correctly. (The
>case when the elevator reaches the end of the disk and starts over
>from the beginning.)
> 
> 2. If we can't find a good spot to insert the new request, we
>currently insert it as early as possible in the queue. If no good
>spot is found, it is more efficient and more fair to insert the new
>request last in the queue.

The patch is buggy for non headactive devices like SCSI and also for IDE
while the queue is plugged.

Assume the queue looks like this (it's scsi or IDE plugged):

HEAD A

HEAD is the >queue_head, A is a requests in the queue (not under processing
in the IDE case).

A runtime trace would look like this:

head = HEAD
real_head = HEAD

last = A
insert_after = A
entry = A
tmp1 = A
tmp2 = A
entry = HEAD
tmp1 = HEAD
read tmp1 -> you're reading random kernel memory here

While the queue is plugged or with things like SCSI your logic change won't
work because in such case if your request is lower the lowest in the queue, you
can put it at the head of the queue and you have no way to know where your
"tmp1" was placed so you can't make any assumption (that's why the current code
makes sense).

But I had an idea to generalize the algorithm so that we could optimize
SCSI as well and also IDE in the plugged case: nobody forbid us to remeber
the last position of the drive in the request_queue_t to still be able to
know your "tmp1" that actually we don't know about.

If nobody does that before me I will try this "remeber last position of the
head" idea in my blkdev tree (there are many other pending elevator fixes in
it) as soon as I finished with 2.2.18pre9aa1 LFS nfsv3 and as soon as I finish
the fix for the spinlocks (this spinlock "memory" thing got starved over and
over again unfortunately, at least it's not an urgent fix :).

Thanks for your comments.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Andrea Arcangeli

On Sun, Sep 17, 2000 at 01:26:22AM +0200, Peter Osterlund wrote:
 Indeed, the elevator logic is somewhat flawed. There are two problems
 with the current code:
 
 1. The test that decides if we have found a good spot to insert the
current request doesn't handle the wraparound case correctly. (The
case when the elevator reaches the end of the disk and starts over
from the beginning.)
 
 2. If we can't find a good spot to insert the new request, we
currently insert it as early as possible in the queue. If no good
spot is found, it is more efficient and more fair to insert the new
request last in the queue.

The patch is buggy for non headactive devices like SCSI and also for IDE
while the queue is plugged.

Assume the queue looks like this (it's scsi or IDE plugged):

HEAD A

HEAD is the q-queue_head, A is a requests in the queue (not under processing
in the IDE case).

A runtime trace would look like this:

head = HEAD
real_head = HEAD

last = A
insert_after = A
entry = A
tmp1 = A
tmp2 = A
entry = HEAD
tmp1 = HEAD
read tmp1 - you're reading random kernel memory here

While the queue is plugged or with things like SCSI your logic change won't
work because in such case if your request is lower the lowest in the queue, you
can put it at the head of the queue and you have no way to know where your
"tmp1" was placed so you can't make any assumption (that's why the current code
makes sense).

But I had an idea to generalize the algorithm so that we could optimize
SCSI as well and also IDE in the plugged case: nobody forbid us to remeber
the last position of the drive in the request_queue_t to still be able to
know your "tmp1" that actually we don't know about.

If nobody does that before me I will try this "remeber last position of the
head" idea in my blkdev tree (there are many other pending elevator fixes in
it) as soon as I finished with 2.2.18pre9aa1 LFS nfsv3 and as soon as I finish
the fix for the spinlocks (this spinlock "memory" thing got starved over and
over again unfortunately, at least it's not an urgent fix :).

Thanks for your comments.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Andrea Arcangeli

On Sun, Sep 17, 2000 at 05:38:17PM -0300, Rik van Riel wrote:
 Yes, I run my system with elvtune -r 250 -w 500.

Ok (sorry for asking, it was just to be sure).

 But even with -r 5 -w 5, I saw starvation. This, and

I'd call it "too high latency", not starvation. Well, strictly speaking it's
not even correct to call starvation the behaviour of 2.2.15 because the latency
provided by CSCAN (without any additional I/O scheduler logic) have an high
bound that depends on the size of the disk and on the speed of the disk, but
the higher possible latency of the old elevator was so big that calling it
"starvation" was not so badly wrong :).

 the code, is a clear sign to me that Peter is absolutely
 right and his corrections to the code should be merged.

Peter correction can improve performance but it shouldn't decrease the latency
significantly. Of course I see that running the I/O faster you have a chance to
reduce also the latency but as said not of a significant factor (not at the
levels of 2.2.16 or 2.4.0-test1).

The fact is that to reduce the latency of a pagein while a background
write is happening, we have to generate seeks at higher frequency. If we do
that we hurt the performance of the write and we change the tiotest numbers
(that's what happens in 2.2.16 for example). If you increase the latency
then tiotest returns to run fast as before. I don't see an easy solution
to apply to a global algorithm that doesn't know the semantics of the I/O
(so that doesn't know when somebody (human) is waiting I/O completion while
looking the screen).

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Andrea Arcangeli

On Sun, Sep 17, 2000 at 04:39:35PM -0300, Marcelo Tosatti wrote:
 
 On Sun, 17 Sep 2000, Andrea Arcangeli wrote:
 
 snip 
 
  If nobody does that before me I will try this "remeber last position of the
  head" idea in my blkdev tree (there are many other pending elevator fixes in
  it) as soon as I finished with 2.2.18pre9aa1 LFS nfsv3 and as soon as I finish
  the fix for the spinlocks (this spinlock "memory" thing got starved over and
  over again unfortunately, at least it's not an urgent fix :).
 
 Andrea, 
 
 The 2.2 elevator code suffers from this problem too? 

In the 2.2.x patches I didn't changed the ordering algorithm, see:

const int after_current = IN_ORDER(tmp,req);
const int before_next = IN_ORDER(req,tmp-next);

if (!IN_ORDER(tmp,tmp-next)) {
if (after_current || before_next)
break;
} else {
if (after_current  before_next)
break;
}

so it doesn't suffer form that problem. Also the elevator before
test2 didn't sufferred from that problem, see:

-   while ((entry = entry-prev) != head) {
-   if (!point  latency = 0) {
-   point = entry;
-   point_latency = latency;
-   }
-   tmp = blkdev_entry_to_request(entry);
-   if (elevator_sequence_before(tmp-elevator_sequence, sequence) || 
!tmp-q)
-   break;
-   if (latency = 0) {
-   if (IN_ORDER(tmp, req) ||
-   (pass  !IN_ORDER(tmp, blkdev_next_request(tmp
-   goto link;
-   }
-   latency += tmp-nr_segments;
-   pass = 1;
-   }
-
-   if (point) {
-   entry = point;
-   latency = point_latency;
-   }
-
- link:
-   list_add(req-queue, entry);
-   req-elevator_sequence = elevator_sequence(elevator, latency);


And with the default latency values ("infinite") with the test2 elevator if
you're using scsi as your device, the patch can't make runtime differences
either.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Marcelo Tosatti


On Sun, 17 Sep 2000, Andrea Arcangeli wrote:

snip 

 If nobody does that before me I will try this "remeber last position of the
 head" idea in my blkdev tree (there are many other pending elevator fixes in
 it) as soon as I finished with 2.2.18pre9aa1 LFS nfsv3 and as soon as I finish
 the fix for the spinlocks (this spinlock "memory" thing got starved over and
 over again unfortunately, at least it's not an urgent fix :).

Andrea, 

The 2.2 elevator code suffers from this problem too? 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-17 Thread Peter Osterlund

On Sun, 17 Sep 2000, Andrea Arcangeli wrote:

: And with the default latency values ("infinite") with the test2
: elevator if you're using scsi as your device, the patch can't make
: runtime differences either.

The test2 elevator (assuming it is the same as the test8 version) in the
infinite latency case will always send the request with the lowest sector
number to the drive. (The request queue will always be sorted, since the
elevator function degenerates to insertion sort.) Do you really suggest
that this is as good as a real elevator algorithm?

-- 
Peter Österlund  Email: [EMAIL PROTECTED]
Sköndalsvägen 35[EMAIL PROTECTED]
S-128 66 Sköndal Home page: http://home1.swipnet.se/~w-15919
Sweden   Phone: +46 8 942647


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-16 Thread Peter Osterlund

Ragnar Kjørstad <[EMAIL PROTECTED]> writes:

> If I understand the current code correctly, it works like this:
[ example deleted ]
> So we've ended up with a very silly queue

Indeed, the elevator logic is somewhat flawed. There are two problems
with the current code:

1. The test that decides if we have found a good spot to insert the
   current request doesn't handle the wraparound case correctly. (The
   case when the elevator reaches the end of the disk and starts over
   from the beginning.)

2. If we can't find a good spot to insert the new request, we
   currently insert it as early as possible in the queue. If no good
   spot is found, it is more efficient and more fair to insert the new
   request last in the queue.

The following patch fixes both problems. Here are the results from
'iozone 100' and Bonnie with and without this patch.

test8:
--

iozone:
4626750 bytes/second for writing the file
7431438 bytes/second for reading the file

bonnie:
  ---Sequential Output ---Sequential Input-- --Random--
  -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
MachineMB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
  100  5278 60.6  7940 18.3  2876  9.7  5635 49.3  8344 16.7 179.3  2.0

test8 + modified elevator:
--

iozone:
4801172 bytes/second for writing the file
7613088 bytes/second for reading the file

bonnie:
  ---Sequential Output ---Sequential Input-- --Random--
  -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
MachineMB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
  100  5727 66.6  8365 18.1  2921  9.2  5991 52.2  8171 16.0 171.0  2.0

Here is the patch:

--- linux-2.4.0-test8/drivers/block/elevator.c.orig Sun Sep 17 00:05:03 2000
+++ linux-2.4.0-test8/drivers/block/elevator.c  Sun Sep 17 00:06:31 2000
@@ -34,20 +34,38 @@
struct list_head *real_head,
struct list_head *head, int orig_latency)
 {
-   struct list_head *entry = real_head;
-   struct request *tmp;
-
-   req->elevator_sequence = orig_latency;
-
-   while ((entry = entry->prev) != head) {
-   tmp = blkdev_entry_to_request(entry);
-   if (IN_ORDER(tmp, req))
-   break;
-   if (!tmp->elevator_sequence)
-   break;
-   tmp->elevator_sequence--;
-   }
-   list_add(>queue, entry);
+   struct list_head *last = real_head->prev;
+   struct list_head *insert_after = last;
+   struct list_head *entry;
+   struct request *tmp1, *tmp2;
+
+   req->elevator_sequence = orig_latency;
+
+   if (last == head)
+   goto out;
+
+   entry = last;
+   tmp1 = blkdev_entry_to_request(entry);
+   do {
+   tmp2 = tmp1;
+   entry = entry->prev;
+   tmp1 = blkdev_entry_to_request(entry);
+   if (!tmp2->elevator_sequence)
+   break;
+   if (IN_ORDER(tmp1, tmp2) ?
+   (IN_ORDER(tmp1, req) && IN_ORDER(req, tmp2)) :
+   (IN_ORDER(tmp1, req) || IN_ORDER(req, tmp2))) {
+   insert_after = entry;
+   do {
+   entry = entry->next;
+   tmp1 = blkdev_entry_to_request(entry);
+   tmp1->elevator_sequence--;
+   } while (entry != last);
+   break;
+   }
+   } while (entry != head);
+out:
+   list_add(>queue, insert_after);
 }
 
 int elevator_linus_merge(request_queue_t *q, struct request **req,


-- 
Peter Österlund  Email: [EMAIL PROTECTED]
Sköndalsvägen 35[EMAIL PROTECTED]
S-128 66 Sköndal Home page: http://home1.swipnet.se/~w-15919
Sweden   Phone: +46 8 942647

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: An elevator algorithm (patch)

2000-09-16 Thread Peter Osterlund

Ragnar Kjørstad [EMAIL PROTECTED] writes:

 If I understand the current code correctly, it works like this:
[ example deleted ]
 So we've ended up with a very silly queue

Indeed, the elevator logic is somewhat flawed. There are two problems
with the current code:

1. The test that decides if we have found a good spot to insert the
   current request doesn't handle the wraparound case correctly. (The
   case when the elevator reaches the end of the disk and starts over
   from the beginning.)

2. If we can't find a good spot to insert the new request, we
   currently insert it as early as possible in the queue. If no good
   spot is found, it is more efficient and more fair to insert the new
   request last in the queue.

The following patch fixes both problems. Here are the results from
'iozone 100' and Bonnie with and without this patch.

test8:
--

iozone:
4626750 bytes/second for writing the file
7431438 bytes/second for reading the file

bonnie:
  ---Sequential Output ---Sequential Input-- --Random--
  -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
MachineMB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
  100  5278 60.6  7940 18.3  2876  9.7  5635 49.3  8344 16.7 179.3  2.0

test8 + modified elevator:
--

iozone:
4801172 bytes/second for writing the file
7613088 bytes/second for reading the file

bonnie:
  ---Sequential Output ---Sequential Input-- --Random--
  -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
MachineMB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
  100  5727 66.6  8365 18.1  2921  9.2  5991 52.2  8171 16.0 171.0  2.0

Here is the patch:

--- linux-2.4.0-test8/drivers/block/elevator.c.orig Sun Sep 17 00:05:03 2000
+++ linux-2.4.0-test8/drivers/block/elevator.c  Sun Sep 17 00:06:31 2000
@@ -34,20 +34,38 @@
struct list_head *real_head,
struct list_head *head, int orig_latency)
 {
-   struct list_head *entry = real_head;
-   struct request *tmp;
-
-   req-elevator_sequence = orig_latency;
-
-   while ((entry = entry-prev) != head) {
-   tmp = blkdev_entry_to_request(entry);
-   if (IN_ORDER(tmp, req))
-   break;
-   if (!tmp-elevator_sequence)
-   break;
-   tmp-elevator_sequence--;
-   }
-   list_add(req-queue, entry);
+   struct list_head *last = real_head-prev;
+   struct list_head *insert_after = last;
+   struct list_head *entry;
+   struct request *tmp1, *tmp2;
+
+   req-elevator_sequence = orig_latency;
+
+   if (last == head)
+   goto out;
+
+   entry = last;
+   tmp1 = blkdev_entry_to_request(entry);
+   do {
+   tmp2 = tmp1;
+   entry = entry-prev;
+   tmp1 = blkdev_entry_to_request(entry);
+   if (!tmp2-elevator_sequence)
+   break;
+   if (IN_ORDER(tmp1, tmp2) ?
+   (IN_ORDER(tmp1, req)  IN_ORDER(req, tmp2)) :
+   (IN_ORDER(tmp1, req) || IN_ORDER(req, tmp2))) {
+   insert_after = entry;
+   do {
+   entry = entry-next;
+   tmp1 = blkdev_entry_to_request(entry);
+   tmp1-elevator_sequence--;
+   } while (entry != last);
+   break;
+   }
+   } while (entry != head);
+out:
+   list_add(req-queue, insert_after);
 }
 
 int elevator_linus_merge(request_queue_t *q, struct request **req,


-- 
Peter Österlund  Email: [EMAIL PROTECTED]
Sköndalsvägen 35[EMAIL PROTECTED]
S-128 66 Sköndal Home page: http://home1.swipnet.se/~w-15919
Sweden   Phone: +46 8 942647

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/