On Thu, Jun 20, 2013 at 04:17:13PM -0400, Matthew Wilcox wrote:
>
> A paper at FAST2012
> (http://static.usenix.org/events/fast12/tech/full_papers/Yang.pdf) pointed
> out the performance overhead of taking interrupts for low-latency block
> I/Os. The solution the author investigated was to spin waiting for each
> I/O to complete.
On 06/20/2013 04:17 PM, Matthew Wilcox wrote:
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4527,6 +4527,36 @@ long __sched io_schedule_timeout(long timeout)
>  	return ret;
>  }
> +/*
> + * Wait for an I/O to complete against this backing_dev_info. If the
> + * task exhausts its timeslice
On 06/23/2013 02:29 PM, Linus Torvalds wrote:
> You could try to do that either *in* the idle thread (which would take
> the context switch overhead - maybe negating some of the advantages),
> or alternatively hook into the scheduler idle logic before actually
> doing the switch.
> But anything that star
On Mon, Jun 24 2013, Matthew Wilcox wrote:
> On Mon, Jun 24, 2013 at 10:07:51AM +0200, Ingo Molnar wrote:
> > I'm wondering, how will this scheme work if the IO completion latency is a
> > lot more than the 5 usecs in the testcase? What if it takes 20 usecs or
> > 100 usecs or more?
>
> There's clearly a threshold at which it stops making sense, and
On Mon, Jun 24 2013, Steven Rostedt wrote:
> On Mon, Jun 24, 2013 at 09:17:18AM +0200, Jens Axboe wrote:
> > On Sun, Jun 23 2013, Linus Torvalds wrote:
> > >
> > > You could try to do that either *in* the idle thread (which would take
> > > the context switch overhead - maybe negating some of the advantages),
On Mon, Jun 24 2013, Matthew Wilcox wrote:
> On Mon, Jun 24, 2013 at 09:15:45AM +0200, Jens Axboe wrote:
> > Willy, I think the general design is fine, hooking in via the bdi is the
> > only way to get back to the right place from where you need to sleep.
> > Some thoughts:
> >
> > - This should be hooked in via blk-iopoll, both of them should call in
On Mon, 2013-06-24 at 23:07 -0400, Matthew Wilcox wrote:
> On Mon, Jun 24, 2013 at 08:11:02PM -0400, Steven Rostedt wrote:
> > What about hooking into the idle_balance code? That happens if we are
> > about to go to idle but before the full schedule switch to the idle
> > task.
> >
> >
> > In __schedule(void):
On 06/25/13 05:18, Matthew Wilcox wrote:
> On Mon, Jun 24, 2013 at 10:07:51AM +0200, Ingo Molnar wrote:
> > I'm wondering, how will this scheme work if the IO completion latency is a
> > lot more than the 5 usecs in the testcase? What if it takes 20 usecs or
> > 100 usecs or more?
> There's clearly a threshold at which it stops making sense, and
On Mon, Jun 24, 2013 at 10:07:51AM +0200, Ingo Molnar wrote:
> I'm wondering, how will this scheme work if the IO completion latency is a
> lot more than the 5 usecs in the testcase? What if it takes 20 usecs or
> 100 usecs or more?
There's clearly a threshold at which it stops making sense, and
On Mon, Jun 24, 2013 at 08:11:02PM -0400, Steven Rostedt wrote:
> What about hooking into the idle_balance code? That happens if we are
> about to go to idle but before the full schedule switch to the idle
> task.
>
>
> In __schedule(void):
>
> if (unlikely(!rq->nr_running))
>
On Mon, Jun 24, 2013 at 09:15:45AM +0200, Jens Axboe wrote:
> Willy, I think the general design is fine, hooking in via the bdi is the
> only way to get back to the right place from where you need to sleep.
> Some thoughts:
>
> - This should be hooked in via blk-iopoll, both of them should call in
On Mon, Jun 24, 2013 at 09:17:18AM +0200, Jens Axboe wrote:
> On Sun, Jun 23 2013, Linus Torvalds wrote:
> >
> > You could try to do that either *in* the idle thread (which would take
> > the context switch overhead - maybe negating some of the advantages),
> > or alternatively hook into the scheduler idle logic before actually
> > doing the switch.
* David Ahern wrote:
> On 6/23/13 3:09 AM, Ingo Molnar wrote:
> > If an IO driver is implemented properly then it will batch up requests for
> > the controller, and gets IRQ-notified on a (sub-)batch of buffers
> > completed.
> >
> > If there's any spinning done then it should be NAPI-alike polling:
* Jens Axboe wrote:
> - With the former note, the app either needs to opt in (and hence
> willingly sacrifice CPU cycles of its scheduling slice) or it needs to
> be nicer in when it gives up and goes back to irq driven IO.
The scheduler could look at sleep latency averages of the task in
* Linus Torvalds wrote:
> On Sun, Jun 23, 2013 at 12:09 AM, Ingo Molnar wrote:
> >
> > The spinning approach you add has the disadvantage of actively wasting
> > CPU time, which could be used to run other tasks. In general it's much
> > better to make sure the completion IRQs are rate-limited and just schedule.
On Sun, Jun 23 2013, Linus Torvalds wrote:
> nothing in common. Networking very very seldom
> has the kind of "submit and wait for immediate result" issues that
> disk reads do.
>
> That said, I dislike the patch intensely. I do not think it's at all a
> good idea to look at "need_resched" to say
On Sun, Jun 23 2013, Ingo Molnar wrote:
> I'm wondering why this makes such a performance difference.
The key ingredient here is simply not going to sleep, only to get an
IRQ and get woken up very shortly again. NAPI and similar approaches
work great for high IOPS cases, where you maintain a cert
On 6/23/13 3:09 AM, Ingo Molnar wrote:
> If an IO driver is implemented properly then it will batch up requests for
> the controller, and gets IRQ-notified on a (sub-)batch of buffers
> completed.
> If there's any spinning done then it should be NAPI-alike polling: a
> single "is stuff completed" polling
On Sun, Jun 23, 2013 at 12:09 AM, Ingo Molnar wrote:
>
> The spinning approach you add has the disadvantage of actively wasting CPU
> time, which could be used to run other tasks. In general it's much better
> to make sure the completion IRQs are rate-limited and just schedule. This
> (combined wi
* Matthew Wilcox wrote:
>
> A paper at FAST2012
> (http://static.usenix.org/events/fast12/tech/full_papers/Yang.pdf) pointed
> out the performance overhead of taking interrupts for low-latency block
> I/Os. The solution the author investigated was to spin waiting for each
> I/O to complete. T
A paper at FAST2012
(http://static.usenix.org/events/fast12/tech/full_papers/Yang.pdf) pointed
out the performance overhead of taking interrupts for low-latency block
I/Os. The solution the author investigated was to spin waiting for each
I/O to complete. This is inefficient as Linux submits man
21 matches