* Matt Mackall <[EMAIL PROTECTED]> wrote: > On Tue, Jun 05, 2007 at 09:50:15PM +0200, Ingo Molnar wrote: > > Matt, could you run this for 1-2 minutes and send us the sched_debug.txt > > output? > > http://selenic.com/sched_debug.txt.gz
thanks. It shows the anomaly in action: now at 365294215491977 nsecs .jiffies : 91248553 .next_balance : 0 .curr->pid : 18589 .clock : 125652924079659272 .prev_clock_raw : 365201238127457 .clock_warps : 9 .clock_unstable_events : 61896358 .clock_max_delta : 3997813 next one is: now at 365295219388142 nsecs .jiffies : 91248804 .next_balance : 0 .curr->pid : 18591 .clock : 125653018059166371 .prev_clock_raw : 365295217642619 .clock_warps : 9 .clock_unstable_events : 61896359 .clock_max_delta : 92976502936 251 jiffies passed, at 250 Hz that's 1 second - this proves that the sample is indeed an accurate once-per-second sample according to the timer interrupt. The 'now' timestamp (ktime_get() based) shows 1003896165 nanosecs passed - this too is showing a precise 1 second sample, according to GTOD. So all the time references we have show that (no surprise here) 1 second passed between the two samples. But sched_clock() shows a _large_ jump: .clock : 125652924079659272 .clock : 125653018059166371 also reflected in .clock_max_delta: .clock_max_delta : 92976502936 that's a 93 seconds jump (!) in a single 1-second sample. We also had a single sched-clock-unstable event: .clock_unstable_events : 61896358 .clock_unstable_events : 61896359 could you please try a test-boot with 'notsc' - do the scheduling weirdnesses go away? Also, There are two reasons why the sched_clock() in -mm could behave like that - either the sched-clock-share patches in it are buggy and we do not smoothly switch over from sc->unstable == 1 to sc->unstable == 0, or the TSC itself is unstable. To test the latter theory, could you run a 1 minute tsc-dump on your box: ./tsc-dump > tsc-dump.txt You can pick tsc-dump up from: http://redhat.com/~mingo/time-warp-test/ (please run this on a recent -mm kernel so that we have the same ACPI-idle characteristics as on the buggy kernel.) to test the former theory, could you boot with 'notsc' - do the weirdnesses go away? (please create another sched-debug.txt as well) Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/