On 08/15, Rik van Riel wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 08/15/2014 12:49 PM, Oleg Nesterov wrote: > > > Just in case... Yes, sure, "seqlock_t stats_lock" is more scalable. > > Just I do not know it's worth the trouble. > > If we don't know whether it is worth the trouble, it is probably best > to stick to a well-known generic locking algorithm, instead of brewing > our own and trying to maintain it.
Perhaps. I am obviously biased and can't judge ;) Plus, again, I do understand that your approach has some advantages too. > Now to see if this change to cputime_adjust does the trick :) > > +++ b/kernel/sched/cputime.c > @@ -605,9 +605,12 @@ static void cputime_adjust(struct task_cputime *curr, > * If the tick based count grows faster than the scheduler one, > * the result of the scaling may go backward. > * Let's enforce monotonicity. > + * Atomic exchange protects against concurrent cputime_adjust. > */ > - - prev->stime = max(prev->stime, stime); > - - prev->utime = max(prev->utime, utime); > + while (stime > (rtime = ACCESS_ONCE(prev->stime))) > + cmpxchg(&prev->stime, rtime, stime); > + while (utime > (rtime = ACCESS_ONCE(prev->utime))) > + cmpxchg(&prev->utime, rtime, utime); Yes, perhaps we need something like this in any case. To remind, at least do_task_stat() calls task_cputime_adjusted() lockless, although we could fix this separately. But I do not think the change above is enough. With this change cputime_adjust() can race with itself. Yes, this guarantees monotonicity even if it is called lockless, but this can lead to "obviously inconsistent" numbers. And I don't think we can ignore this. If we could, then we can remove the scale_stime recalculation and change cputime_adjust() to simply do: static void cputime_adjust(struct task_cputime *curr, struct cputime *prev, cputime_t *ut, cputime_t *st) { /* enforce monotonicity */ *ut = prev->stime = max(prev->stime, curr->stime); *st = prev->utime = max(prev->utime, curr->utime); } Yes, we have this problem either way. And personally I think that this "enforce monotonicity" logic is pointless, userspace could take care, but it is too late to complain. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/