On Sun, Aug 18, 2013 at 06:36:39PM +0200, Oleg Nesterov wrote: > On 08/16, Frederic Weisbecker wrote: > > > > On Fri, Aug 16, 2013 at 06:49:22PM +0200, Oleg Nesterov wrote: > > > > > > Personally I am fine either way. > > > > Me too. > > > > So my proposition is that we can keep the existing patches as they fix > > other distinct races > > perhaps... but it would be nice to fix the "goes backward" problem. > > This "race" is not theoretical, at least for get_cpu_iowait_time_us(). > nr_iowait_cpu() can change from !0 to 0 very easily.
Actually yes. Now I think we should fix the iowait race in the same changeset because the bugs I'm fixing in this patchset were probably lowering the iowait issue in some case. > > And just in case, note that get_cpu_idle_time_us() has the same problem > too, because it can also change from 0 to !0 although this case is much > less likely. Right. > > However, right now I do not see a simple solution. It seems that, if > get_cpu_idle_time_us() does ktime_add(ts->idle_sleeptime) it should > actually update ts->idle_sleeptime/entrytime to avoid these races > (the same for ->idle_sleeptime), but then we need the locking. It seems that wouldn't solve the issue. Imagine that task A waits for IO on CPU 0. It waits 10 seconds. Then it's finally woken on CPU 1. A further call to get_cpu_idle_time_us() on CPU 0, say 5 seconds later, will miss the io sleeptime part. For now I can't figure out how to avoid flushing the io sleeptime when the iowait task is moved to another CPU. This requires a lock (moving from seqcount to seqlock) and may be involve quite some cache issues in the idle path, resulting in higher latencies on wakeup. We really need another solution. > > > (and we add fixes on what peterz just reported) > > Do you mean his comments about 4/4 or I missed something else? Yep, just removing the cpu arguments from the functions that only use ts locally. > > Ah and I'll wait for > > your review first. > > If only I could understand this code ;) It looks really simple and > I hope I can understand what it does. But not why. I simply can't > understand why idle/iowait are "mutually exclusive". I believe this can be changed such that iowait is included in the idle sleeptime. We can do that if that's needed. > > And if we do this, then perhaps io_schedule() should do > "if (atomic_dec(&rq->nr_iowait)) update_iowait_sleeptime()" but > this means the locking again. Yep. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

