Li, Aubrey wrote:
> Liu, Jiang wrote:
>
>>>>> Thanks for your reminder.
>>>>> After reading the relevant code, I have some questions about the
>>>>> DTrace probe trigger points in the deep C-state path. On SPARC and
>>>>> in the non-deep-C idle path, DTrace probes have been placed as
>>>>> close as possible to the point where the CPU enters or exits the
>>>>> hardware idle state. In the deep C-state path, the probe trigger
>>>>> points have been pulled out a little. I heard there were some
>>>>> discussions about probe trigger points, but I missed them. Could
>>>>> anybody give some hints about those discussions?
>>>>>
>>>>
>>>> What does PowerTop measure and report on other operating systems?
>>>> Does PowerTop's C-state data include the software latency to
>>>> enter/exit C-state(s) on other operating systems? My current
>>>> thought is that Solaris should report the same measurement as
>>>> other OSs. :-)
>>>
>>> Each OS uses a different time source.
>>> My concern is that the idle-exit DTrace probe was added to
>>> do_interrupt, which will add too much latency when enabled. That
>>> might affect the current report.
>>>
>>> We had better move the idle-exit DTrace probe back to the idle
>>> thread.
>> You pointed out a very important and interesting issue; let's do
>> more investigation and discussion about it.
>>
>> First, based on the following factors, I think it's OK to trigger
>> the DTrace probe in do_interrupt():
>> 1) During every idle enter/exit loop, the DTrace probe will be
>>    triggered in do_interrupt at most once, namely when an
>>    interrupt wakes the CPU up from the idle state.
>> 2) The DTrace probe costs non-negligible latency only when it is
>>    enabled.
>> 3) There are already DTrace probes in the interrupt path, which
>>    implies that a DTrace probe in the interrupt path is acceptable.
>>
>> Second, I think your question has actually revealed a possible
>> design flaw in the current deep C driver/powertop implementation
>> under some extreme conditions.
>> Consider the following possible extreme case:
>> 1) The idle thread puts the CPU into the idle state.
>> 2) The CPU sleeps in the idle state.
>> 3) A hardware interrupt wakes the CPU up from the idle state.
>> 4) do_interrupt calls the hardware interrupt handler.
>> 5) More interrupts arrive and are served by do_interrupt.
>> 6) The CPU returns to the idle thread after serving all interrupts.
>> 7) The deep C driver gets the wake-up timestamp, calculates CPU
>>    utilization, and also triggers the DTrace probe for powertop.
>> 8) Go to step 1.
>>
>> In the above example, let's say:
>> steps 1), 7) and 8) occupy 10%, which is the latency introduced by
>> software;
>> step 2) occupies 30%, which is the actual CPU sleep time;
>> steps 3), 4), 5) and 6) occupy 60%, which is used to serve
>> interrupts.
>>
>> The actual idle percentage is about 40%. With the current
>> implementation, it will be calculated as about 100% idle, which
>> will cause the CPU to falsely enter a deep C state and powertop to
>> report the wrong idle percentage.
>>
>> With the new patchset, DTrace probes will be triggered more
>> precisely to reflect the actual CPU idle time. More cooperation
>> between the deep C driver and CPU idle notification is still needed
>> to fix the above possible flaw.
>>
>> Any comments here?
>>
>
> Actually, as long as interrupts are restored after the dtrace probe
> (by swapping the places of hpet.use_lapic_timer and the dtrace
> probe), we shouldn't have the problem you described. I personally
> prefer to force the dtrace probe into the idle path. We need more
> thought here, and some benchmark results, e.g. from libmicro.

Yes, benchmark results will definitely give us more information for
making the choice. I have tried to run libmicro, but found that it's
not very stable and the results are confusing.
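
To make the arithmetic in the quoted example concrete, here is a rough
sketch in Python (not kernel code). The names SOFTWARE, SLEEP, INTERRUPT
and idle_percent are made up for illustration, and it assumes, as the
"about 40%" figure does, that the software enter/exit overhead is
counted as idle time.

```python
# Shares of one idle-loop iteration, taken from the quoted example.
SOFTWARE = 10   # steps 1, 7, 8: idle-thread enter/exit overhead
SLEEP = 30      # step 2: CPU actually sleeping
INTERRUPT = 60  # steps 3-6: interrupts served before the idle thread resumes


def idle_percent(wake_stamp_after_interrupts: bool) -> int:
    """Idle share seen by code that stamps idle entry in step 1 and idle
    exit at the wake-up timestamp.

    Assumption: software overhead counts as idle. If the wake-up
    timestamp is taken only after all interrupts have been served
    (step 7), the interrupt time is wrongly counted as idle too.
    """
    idle = SOFTWARE + SLEEP
    if wake_stamp_after_interrupts:
        idle += INTERRUPT
    return 100 * idle // (SOFTWARE + SLEEP + INTERRUPT)


print(idle_percent(wake_stamp_after_interrupts=True))
print(idle_percent(wake_stamp_after_interrupts=False))
```

The first call models the timestamp being taken in step 7; the second
models it being taken before any interrupt handler runs.
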
You are right: with the current implementation, if we swap
hpet.use_lapic_timer and the DTrace probe, the issue mentioned above
doesn't exist. On the other hand, with the introduction of CPU idle
notification, that simple solution is not sufficient, because CPU idle
notification may temporarily enable interrupts within callbacks. So
more collaboration is needed here.
Thanks!

>
> Thanks,
> -Aubrey

Liu Jiang (Gerry)
OpenSolaris, OTC, SSG, Intel
