On Wed, Jun 04, 2014 at 11:17:24AM +0100, Peter Zijlstra wrote:
> On Wed, Jun 04, 2014 at 11:32:10AM +0200, Vincent Guittot wrote:
> > On 4 June 2014 10:08, Peter Zijlstra <pet...@infradead.org> wrote:
> > > On Wed, Jun 04, 2014 at 09:47:26AM +0200, Vincent Guittot wrote:
> > >> On 3 June 2014 17:50, Peter Zijlstra <pet...@infradead.org> wrote:
> > >> > On Wed, May 28, 2014 at 04:47:03PM +0100, Morten Rasmussen wrote:
> > >> >> Since we may do periodic load-balance every 10 ms or so, we will
> > >> >> perform a number of load-balances where runnable_avg_sum will
> > >> >> mostly be reflecting the state of the world before a change (new
> > >> >> task queued or moved a task to a different cpu). If you have two
> > >> >> tasks continuously on one cpu and your other cpu is idle, and you
> > >> >> move one of the tasks to the other cpu, runnable_avg_sum will
> > >> >> remain unchanged, 47742, on the first cpu while it starts from 0
> > >> >> on the other one. 10 ms later it will have increased a bit, 32 ms
> > >> >> later it will be 47742/2, and 345 ms later it reaches 47742. In
> > >> >> the meantime the cpu doesn't appear fully utilized and we might
> > >> >> decide to put more tasks on it because we don't know if
> > >> >> runnable_avg_sum represents a partially utilized cpu (for example
> > >> >> a 50% task) or if it will continue to rise and eventually get to
> > >> >> 47742.
> > >> >
> > >> > Ah, no, since we track per task, and update the per-cpu ones when we
> > >> > migrate tasks, the per-cpu values should be instantly updated.
> > >> >
> > >> > If we were to increase per task storage, we might as well also track
> > >> > running_avg not only runnable_avg.
> > >>
> > >> I agree that the removed running_avg should give more useful
> > >> information about the load of a CPU.
> > >>
> > >> The main issue with running_avg is that it's disturbed by other tasks
> > >> (as pointed out previously). As a typical example, if we have 2 tasks
> > >> with a load of 25% on 1 CPU, the unweighted runnable_load_avg will be
> > >> in the range of [100% - 50%] depending on the parallelism of the
> > >> runtime of the tasks whereas the reality is 50% and the use of
> > >> running_avg will return this value
> > >
> > > I'm not sure I see how 100% is possible, but yes I agree that runnable
> > > can indeed be inflated due to this queueing effect.
>
> Let me explain the 75%, take any one of the above scenarios. Let's call
> the two tasks A and B, and let us for a moment assume A always wins and
> runs first, and then B.
>
> So A will be runnable for 25%, B otoh will be runnable the entire time A
> is actually running plus its own running time, giving 50%. Together that
> makes 75%.
>
> If you release the assumption that A runs first, but instead assume they
> equally win the first execution, you get them averaging at 37.5% each,
> which combined will still give 75%.

But that is assuming that the first task gets to run to completion of
its busy period. If it uses up its sched_slice and we switch to the
other task, they both get to wait.

For example, if the sched_slice is 5 ms and the busy period is 10 ms,
the execution pattern would be: A, B, A, B, idle, ... In that case A is
runnable for 15 ms and B for 20 ms. Assuming that the overall period is
40 ms, A is runnable for 37.5% of the time and B for 50%, which adds up
to 87.5% rather than 75%.
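
To make the arithmetic easy to check, here is a rough sketch of the two
cases (run to completion vs. a 5 ms slice). It is only a toy model, not
kernel code: runnable_fractions() is a made-up helper, all tasks are
assumed to wake at t=0, the slice is treated as a fixed round-robin
quantum, and the total busy time is assumed to fit within the period.

```python
def runnable_fractions(busy_ms, slice_ms, period_ms, n_tasks=2):
    """Fraction of the period each task spends runnable (toy model).

    Hypothetical helper: every task wakes at t=0, needs busy_ms of CPU
    time, and the CPU round-robins between tasks in slice_ms chunks.
    A task counts as runnable from wake-up until its last chunk ends.
    """
    remaining = [busy_ms] * n_tasks
    finished_at = [0] * n_tasks
    now = 0
    while any(r > 0 for r in remaining):
        for i in range(n_tasks):
            if remaining[i] > 0:
                ran = min(slice_ms, remaining[i])
                now += ran
                remaining[i] -= ran
                if remaining[i] == 0:
                    # Task i stops being runnable here.
                    finished_at[i] = now
    return [t / period_ms for t in finished_at]


if __name__ == "__main__":
    # slice == busy period: A runs to completion, then B.
    # slice == 5 ms: A, B, A, B, idle.
    for slice_ms in (10, 5):
        a, b = runnable_fractions(busy_ms=10, slice_ms=slice_ms, period_ms=40)
        print(f"slice={slice_ms} ms: A={a:.1%} B={b:.1%} sum={a + b:.1%}")
```

With a 10 ms slice this gives 25% + 50% = 75%, and with a 5 ms slice it
gives 37.5% + 50% = 87.5%, so the runnable sum depends on how the busy
periods get interleaved.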