Hi Stephane,

I have made some progress in tracking this problem down. The big picture is that pfm_arch_ctxswin_thread() is never called, so when the thread is switched out and later switched back in, the saved PMU context is not restored to the PMU registers, and the counters stay stopped until the end of the run.

pfm_arch_ctxswin_thread() is not getting called because of the following code in perfmon_ctxsw.c:

    /*
     * TIF flag was removed since switch_to
     * context is detaching, skip everything,
     * keep oncpu=-1
     */
    if (!test_thread_flag(TIF_PERFMON_CTXSW))
        goto skip_all;

Apparently the TIF_PERFMON_CTXSW flag is always cleared. I haven't tracked any farther back than this yet, but was hoping this might trigger a thought or two in your mind as to what might be going on. I also noticed that this code appears to have changed from 2.6.29 to 2.6.30.

Anyway, I'd appreciate any thoughts you might have on this. I may not get back to looking at this till Monday afternoon, so no huge rush.

Thanks for your consideration,

- Corey

stephane eranian wrote:
> Corey,
>
> On Wed, Aug 26, 2009 at 1:55 AM, Corey
> Ashford<cjash...@linux.vnet.ibm.com> wrote:
>> Corey Ashford wrote:
>>> stephane eranian wrote:
>>>> On Mon, Aug 24, 2009 at 8:48 PM, Corey
>>>> Ashford<cjash...@linux.vnet.ibm.com> wrote:
>>>>> stephane eranian wrote:
>>>>>> Corey,
>>>>>>
>>>>> [snip]
>>>>>> Here are a couple of tests you could try and run to narrow it down:
>>>>>> - taskset -c 0 self
>>>>>> - syst
>>>>>>
>>>>> "taskset -c 0 self" doesn't improve the behavior. The results are
>>>>> still all over the place.
>>>>>
>>>> That's strange, must be something really central.
>>>> You need to enable debugging. Careful as this has changed again in 2.6.30
>>>> because of the dynamic_printk stuff. The good thing is that now you can
>>>> turn on/off individual printk.
>>> I'm not familiar with dynamic_printk, so that will take some research.
>>>
>>>>> "syst" is giving me an error, which may be something completely
>>>>> unrelated:
>>>>>
>>>>> [r...@elm3c4 examples_v2.x]# ./syst
>>>>> cannot set affinity to CPU0: Invalid argument
>>>>>
>>>> Weird. You have a CPU0, don't you?
>>> Yes :) I'm still debugging this to figure out what's going on. No
>>> results yet (took me awhile to get systemtap running due to many
>>> pilot errors)
>> Ok, I tracked the syst problem down. There is an error in syst.c which
>> manifests itself on big-endian machines when syst.c is compiled in 32-bit
>> mode.
>>
>> The bit vector which is used to describe the cpus that you want to set the
>> affinity for is an array of 32-bit words (when using the
>> compat_sys_sched_setaffinity system call in 32-bit mode). syst programs a
>> vector of 64-bit words. On a little endian machine, this wouldn't matter,
>> because the least significant byte of the 32-bit or 64-bit word is always at
>> offset 0. But on a big-endian machine, the least significant byte is at
>> offset 0x3 or 0x7 depending on the word size. So the result is that the bit
>> vector is interpreted as setting the affinity for a cpu which does not
>> exist.
>>
> I think nowadays, we should simply use the libc cpu_set and call the
> regular sched_setaffinity() instead of having a custom version. That
> was from a long time ago. Hopefully, the official API will work on 32-bit
> big-endian systems.
>
>> There are a couple of ways to fix this, and I will post a patch which
>> contains both versions.
>>
>> So, after fixing this problem, syst does produce reliable results on 2.6.30.
>> So I am assuming now that the problem with the self test (and others)
>> is that something is messed up with the per-thread context code.
>>
> Yes, most likely. That is why I asked you to try taskset -c 0 self to avoid
> switching from one CPU to another. But obviously you can be switched in
> and out.
>
>
>> I will start working on this.
>>
>> - Corey
>>
>>

-- 
Regards,

- Corey

Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
503-578-3507
cjash...@us.ibm.com

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
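
As an illustration of the syst.c affinity problem described in the quoted thread, here is a minimal sketch in C. It is not the code from syst.c, and the function names are hypothetical. It simulates the byte order in software, so it can be run on any machine: a CPU mask is written as one 64-bit word and then read back as an array of 32-bit words, which is roughly what happens when a 32-bit program passes a 64-bit mask to the compat system call.

```c
#include <stdint.h>

/* Store a 64-bit word into buf using the requested byte order. */
static void store_u64(unsigned char buf[8], uint64_t v, int big_endian)
{
    for (int i = 0; i < 8; i++) {
        int shift = big_endian ? 8 * (7 - i) : 8 * i;
        buf[i] = (unsigned char)(v >> shift);
    }
}

/* Load the 32-bit word at index idx from buf using the requested byte order. */
static uint32_t load_u32(const unsigned char *buf, int idx, int big_endian)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        v |= (uint32_t)buf[4 * idx + i] << shift;
    }
    return v;
}

/*
 * Hypothetical helper: return the lowest CPU number that a reader of
 * 32-bit words would see when the mask was written as a single 64-bit
 * word, or -1 if no bit is set.
 */
int cpu_seen_by_32bit_reader(uint64_t mask64, int big_endian)
{
    unsigned char buf[8];

    store_u64(buf, mask64, big_endian);
    for (int word = 0; word < 2; word++) {
        uint32_t w = load_u32(buf, word, big_endian);
        for (int bit = 0; bit < 32; bit++)
            if (w & (UINT32_C(1) << bit))
                return 32 * word + bit;
    }
    return -1;
}
```

Under these assumptions, cpu_seen_by_32bit_reader(1, 0) gives 0 (little-endian: CPU 0 is still CPU 0), while cpu_seen_by_32bit_reader(1, 1) gives 32 (big-endian: the request for CPU 0 is read as CPU 32). On a machine with 32 or fewer CPUs that CPU does not exist, which would be consistent with the "Invalid argument" error from ./syst.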