Hello, the program I was using when I discovered the problem is pretty complex, but I will try to see if I can make a small test program which can reproduce the error.
In the meanwhile, I have created a very small patch which can fix the problem. I am attaching it. It tries to emulate the Intel behaviour when stopping (i.e., PMDs are saved). Do you think it is OK? Regards, Victor On Wed, May 20, 2009 at 7:44 PM, Corey J Ashford <cjash...@us.ibm.com> wrote: > victor jimenez <betaband...@gmail.com> wrote on 05/20/2009 09:24:43 AM: > >> Hello, >> >> I did more tests and I think I found the reason why it is not working. >> I am attaching a log with the debug information. The log shows >> information between a call to sys_pfm_start() and sys_pfm_stop(), plus >> a following call to sys_pfm_read_pmds(). >> >> In file perfmon/perfmon_ctxsw.c:242, pfm_save_pmds() is called >> depending on the value returned by pfm_arch_ctxswout_thread(). This >> second function returns the value returned by pfm_arch_is_active(). >> pfm_arch_is_active() returns 1 when the process is being monitor >> (after calling pfm_start) and 0 if the process is not (after calling >> pfm_stop). >> >> The problem is that using only pfm_arch_is_active() to determine >> whether we should save the PMDs or not is not enough. When pfm_stop() >> is called the PMDs are not being saved in the powerpc implementation >> of perfmon2 (they are saved in the x86 implementation). Therefore, if >> after calling pfm_stop() a the process is switched out >> pfm_arch_is_active() will return 0 (because the process is not being >> monitored; we already call pfm_stop()). This will cause that the PMDs >> are not saved at the context switch time and, thus the wrong ones will >> be restored later. >> >> All this can be seen in the attached log. At line 103 sys_pfm_stop() >> is called. At line 112 sys_pfm_read_pmds() is called. For instance, >> the value for counter 5 (#instructions) is 1218670811. Then a context >> switch comes, but at line 128 we can see how need_save_pmds is 0 so >> the PMDs are not saved. Next time they are restored, at line 135 (for >> counter 5) the value is 1207034044. This value is the value which was >> saved before at line 83. >> >> I hope this can help to find an solution. > > > It definitely does! I haven't looked at this code for awhile, so I'm > working on getting back up to speed. I see the exact issue you are > talking about and how the Power implementation is different from the x86 > implementation. The structure of the calls is different: x86 has a > stop_save per-pmu-type call, whereas Power has a single pfm_stop_active > call. You can see that the comments in the Power code originally came > from the x86 implementation, because it still refers to "stop_save()" in > pfm_arch_stop, but obviously we took a slightly different tack after the > two diverged. > > I think the fix needs to go in pfm_stop_active, though, since that is the > equivalent spot in the Power implementation. I will try out a fix and see > how it goes. > > Do you have a test case I could try that reproduces this? If not, I'm > sure > I could come up with one, but a ready-made test case would save me some > time. > > > Regards, > > - Corey > > Corey Ashford > Software Engineer > IBM Linux Technology Center, Linux Toolchain > cjash...@us.ibm.com > >
diff -uNr linux-2.6.28/arch/powerpc/perfmon/perfmon.c linux-2.6.28-patched/arch/powerpc/perfmon/perfmon.c --- linux-2.6.28/arch/powerpc/perfmon/perfmon.c 2009-05-20 17:44:14.000000000 +0200 +++ linux-2.6.28-patched/arch/powerpc/perfmon/perfmon.c 2009-05-21 01:02:40.000000000 +0200 @@ -25,7 +25,33 @@ #include <linux/interrupt.h> #include <linux/perfmon_kern.h> -static void pfm_stop_active(struct task_struct *task, +static void pfm_save_pmds_no_clear(struct pfm_context *ctx, struct pfm_event_set *set) +{ + u64 val, ovfl_mask; + u64 *used_pmds, *cnt_pmds; + u16 i, num; + + ovfl_mask = pfm_pmu_conf->ovfl_mask; + num = set->nused_pmds; + cnt_pmds = ctx->regs.cnt_pmds; + used_pmds = set->used_pmds; + + /* + * save HW PMD, for counters, reconstruct 64-bit value + */ + for (i = 0; num; i++) { + if (test_bit(i, cast_ulp(used_pmds))) { + val = pfm_arch_read_pmd(ctx, i); + if (likely(test_bit(i, cast_ulp(cnt_pmds)))) + val = (set->pmds[i].value & ~ovfl_mask) | + (val & ovfl_mask); + set->pmds[i].value = val; + num--; + } + } +} + +static int pfm_stop_active(struct task_struct *task, struct pfm_context *ctx, struct pfm_event_set *set) { struct pfm_arch_pmu_info *arch_info; @@ -36,9 +62,15 @@ arch_info->disable_counters(ctx, set); if (set->npend_ovfls) - return; + return 1; arch_info->get_ovfl_pmds(ctx, set); + + /* PMDs must be saved when stopping. */ + pfm_save_pmds_no_clear(ctx, set); + + /* no need to save PMDs at a upper level. */ + return 0; } /* @@ -77,6 +109,7 @@ int pfm_arch_ctxswout_thread(struct task_struct *task, struct pfm_context *ctx) { struct pfm_arch_pmu_info *arch_info; + int need_save_pmds; arch_info = pfm_pmu_info(); /* @@ -87,12 +120,12 @@ if (ctx->state == PFM_CTX_MASKED) return 1; - pfm_stop_active(task, ctx, ctx->active_set); + need_save_pmds = pfm_stop_active(task, ctx, ctx->active_set); if (arch_info->ctxswout_thread) arch_info->ctxswout_thread(task, ctx, ctx->active_set); - return pfm_arch_is_active(ctx); + return need_save_pmds && pfm_arch_is_active(ctx); } /*
------------------------------------------------------------------------------ Register Now for Creativity and Technology (CaT), June 3rd, NYC. CaT is a gathering of tech-side developers & brand creativity professionals. Meet the minds behind Google Creative Lab, Visual Complexity, Processing, & iPhoneDevCamp asthey present alongside digital heavyweights like Barbarian Group, R/GA, & Big Spaceship. http://www.creativitycat.com
_______________________________________________ perfmon2-devel mailing list perfmon2-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/perfmon2-devel