Hello,

the program I was using when I discovered the problem is pretty
complex, but I will try to see if I can make a small test program
which can reproduce the error.

In the meanwhile, I have created a very small patch which can fix the
problem. I am attaching it. It tries to emulate the Intel behaviour
when stopping (i.e., PMDs are saved).

Do you think it is OK?

Regards,
Victor

On Wed, May 20, 2009 at 7:44 PM, Corey J Ashford <cjash...@us.ibm.com> wrote:
> victor jimenez <betaband...@gmail.com> wrote on 05/20/2009 09:24:43 AM:
>
>> Hello,
>>
>> I did more tests and I think I found the reason why it is not working.
>> I am attaching a log with the debug information. The log shows
>> information between a call to sys_pfm_start() and sys_pfm_stop(), plus
>> a following call to sys_pfm_read_pmds().
>>
>> In file perfmon/perfmon_ctxsw.c:242, pfm_save_pmds() is called
>> depending on the value returned by pfm_arch_ctxswout_thread(). This
>> second function returns the value returned by pfm_arch_is_active().
>> pfm_arch_is_active() returns 1 when the process is being monitor
>> (after calling pfm_start) and 0 if the process is not (after calling
>> pfm_stop).
>>
>> The problem is that using only pfm_arch_is_active() to determine
>> whether we should save the PMDs or not is not enough. When pfm_stop()
>> is called the PMDs are not being saved in the powerpc implementation
>> of perfmon2 (they are saved in the x86 implementation). Therefore, if
>> after calling pfm_stop() a the process is switched out
>> pfm_arch_is_active() will return 0 (because the process is not being
>> monitored; we already call pfm_stop()). This will cause that the PMDs
>> are not saved at the context switch time and, thus the wrong ones will
>> be restored later.
>>
>> All this can be seen in the attached log. At line 103 sys_pfm_stop()
>> is called. At line 112 sys_pfm_read_pmds() is called. For instance,
>> the value for counter 5 (#instructions) is 1218670811. Then a context
>> switch comes, but at line 128 we can see how need_save_pmds is 0 so
>> the PMDs are not saved. Next time they are restored, at line 135 (for
>> counter 5) the value is 1207034044. This value is the value which was
>> saved before at line 83.
>>
>> I hope this can help to find an solution.
>
>
> It definitely does!  I haven't looked at this code for awhile, so I'm
> working on getting back up to speed.  I see the exact issue you are
> talking about and how the Power implementation is different from the x86
> implementation.  The structure of the calls is different: x86 has a
> stop_save per-pmu-type call, whereas Power has a single pfm_stop_active
> call.  You can see that the comments in the Power code originally came
> from the x86 implementation, because it still refers to "stop_save()" in
> pfm_arch_stop, but obviously we took a slightly different tack after the
> two diverged.
>
> I think the fix needs to go in pfm_stop_active, though, since that is the
> equivalent spot in the Power implementation.  I will try out a fix and see
> how it goes.
>
> Do you have a test case I could try that reproduces this?  If not, I'm
> sure
> I could come up with one, but a ready-made test case would save me some
> time.
>
>
> Regards,
>
> - Corey
>
> Corey Ashford
> Software Engineer
> IBM Linux Technology Center, Linux Toolchain
> cjash...@us.ibm.com
>
>
diff -uNr linux-2.6.28/arch/powerpc/perfmon/perfmon.c linux-2.6.28-patched/arch/powerpc/perfmon/perfmon.c
--- linux-2.6.28/arch/powerpc/perfmon/perfmon.c	2009-05-20 17:44:14.000000000 +0200
+++ linux-2.6.28-patched/arch/powerpc/perfmon/perfmon.c	2009-05-21 01:02:40.000000000 +0200
@@ -25,7 +25,33 @@
 #include <linux/interrupt.h>
 #include <linux/perfmon_kern.h>
 
-static void pfm_stop_active(struct task_struct *task,
+static void pfm_save_pmds_no_clear(struct pfm_context *ctx, struct pfm_event_set *set)
+{
+	u64 val, ovfl_mask;
+	u64 *used_pmds, *cnt_pmds;
+	u16 i, num;
+
+	ovfl_mask = pfm_pmu_conf->ovfl_mask;
+	num = set->nused_pmds;
+	cnt_pmds = ctx->regs.cnt_pmds;
+	used_pmds = set->used_pmds;
+
+	/*
+	 * save HW PMD, for counters, reconstruct 64-bit value
+	 */
+	for (i = 0; num; i++) {
+		if (test_bit(i, cast_ulp(used_pmds))) {
+			val = pfm_arch_read_pmd(ctx, i);
+			if (likely(test_bit(i, cast_ulp(cnt_pmds))))
+				val = (set->pmds[i].value & ~ovfl_mask) |
+					(val & ovfl_mask);
+			set->pmds[i].value = val;
+			num--;
+		}
+	}
+}
+
+static int pfm_stop_active(struct task_struct *task,
 			    struct pfm_context *ctx, struct pfm_event_set *set)
 {
 	struct pfm_arch_pmu_info *arch_info;
@@ -36,9 +62,15 @@
 	arch_info->disable_counters(ctx, set);
 
 	if (set->npend_ovfls)
-		return;
+		return 1;
 
 	arch_info->get_ovfl_pmds(ctx, set);
+
+	/* PMDs must be saved when stopping. */
+	pfm_save_pmds_no_clear(ctx, set);
+
+	/* no need to save PMDs at a upper level. */
+	return 0;
 }
 
 /*
@@ -77,6 +109,7 @@
 int pfm_arch_ctxswout_thread(struct task_struct *task, struct pfm_context *ctx)
 {
 	struct pfm_arch_pmu_info *arch_info;
+	int need_save_pmds;
 
 	arch_info = pfm_pmu_info();
 	/*
@@ -87,12 +120,12 @@
 	if (ctx->state == PFM_CTX_MASKED)
 		return 1;
 
-	pfm_stop_active(task, ctx, ctx->active_set);
+	need_save_pmds = pfm_stop_active(task, ctx, ctx->active_set);
 
 	if (arch_info->ctxswout_thread)
 		arch_info->ctxswout_thread(task, ctx, ctx->active_set);
 
-	return pfm_arch_is_active(ctx);
+	return need_save_pmds && pfm_arch_is_active(ctx);
 }
 
 /*
------------------------------------------------------------------------------
Register Now for Creativity and Technology (CaT), June 3rd, NYC. CaT
is a gathering of tech-side developers & brand creativity professionals. Meet
the minds behind Google Creative Lab, Visual Complexity, Processing, & 
iPhoneDevCamp asthey present alongside digital heavyweights like Barbarian
Group, R/GA, & Big Spaceship. http://www.creativitycat.com 
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel

Reply via email to