Hello
When doing some experiments on a SPARC Niagara 1 machine, I noticed that
for large runs the INSTR_CNT count is wrong: it comes up short by exactly 2^31.
For example, I have a microbenchmark that runs a 4-instruction loop a
given number of times (code attached). The results are as follows:

  g2 = 1,000           result = 4,005           (correct)
  g2 = 10,000          result = 40,006          (correct)
  g2 = 100,000         result = 400,006         (correct)
  g2 = 1,000,000       result = 4,000,006       (correct)
  g2 = 10,000,000      result = 40,000,006      (correct)
  g2 = 100,000,000     result = 400,000,006     (correct)
  g2 = 1,000,000,000   result = 4,000,000,006   (correct)
  g2 = 2,000,000,000   result = 5,852,516,357   (incorrect)
                              = 0x1 5cd6 5005
                    should be = 0x1 dcd6 5005
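
As a quick sanity check on the failing run, the arithmetic can be replayed in a few lines of Python. This only re-derives the numbers listed above; the variable names are mine and nothing here touches perfmon itself.

```python
# Values taken from the g2 = 2,000,000,000 run listed above.
expected = 2_000_000_000 * 4 + 5   # 4-instruction loop plus 5 setup/exit instructions
observed = 5_852_516_357           # value reported for INSTR_CNT

diff_bits = expected ^ observed    # bits that differ between the two values
shortfall = expected - observed    # how far the reported count falls short

print(hex(expected))               # 0x1dcd65005
print(hex(observed))               # 0x15cd65005
print(hex(diff_bits))              # 0x80000000
print(shortfall == 2**31)          # True
```

The two values differ in bit 31 only, so the reported count is short by exactly 2^31.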
From the SPARC T1 (Niagara) documentation I see that the PIC
(counter) register overflows at 32 bits, but in the perfmon code we have
counter_width set to 31. Thus each time the counter overflows, we add only
2^31 to the emulated 64-bit counter instead of the proper 2^32.
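
To illustrate that explanation, here is a rough Python model of the emulation. It is a simplification of my own and not the kernel code: it assumes the hardware PIC wraps at 32 bits and that software adds 2^counter_width for every wrap, then adds whatever is left in the PIC. The function name emulated_count is hypothetical.

```python
PIC_BITS = 32  # hardware PIC width, per the documentation mentioned above

def emulated_count(true_count, counter_width):
    """Simplified model (an assumption, not the kernel code): every hardware
    wrap adds 2**counter_width to a software total, and the value still in
    the PIC is added on top."""
    wraps = true_count >> PIC_BITS              # times the 32-bit PIC wrapped
    pic = true_count & ((1 << PIC_BITS) - 1)    # value left in the PIC at the end
    return wraps * (1 << counter_width) + pic

true_count = 2_000_000_000 * 4 + 5
print(emulated_count(true_count, 31))  # 5852516357
print(emulated_count(true_count, 32))  # 8000000005
```

With counter_width = 31 the model reproduces the incorrect value from the 2,000,000,000 run, which is at least consistent with the theory.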
I tried changing counter_width to 32 in arch/sparc64/perfmon/perfmon.c,
but this doesn't fix things. In __pfm_get_ovfl_pmds() we detect whether
an overflow has occurred by ANDing against 1ULL<<counter_width, but since
the PIC register is only 32 bits wide, bit 32 can never be set and the
test is never true.
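
The detection problem can be shown with a small Python sketch of the mask test. This is only a model of the check as I described it, not the actual __pfm_get_ovfl_pmds() code; overflow_bit_set and PIC_MASK are hypothetical names.

```python
PIC_MASK = (1 << 32) - 1  # the PIC can hold only 32 bits

def overflow_bit_set(pic_value, counter_width):
    """Model of the test described above: AND the raw register value
    against 1 << counter_width (hypothetical helper, not the kernel function)."""
    raw = pic_value & PIC_MASK          # what a 32-bit register can actually hold
    return (raw & (1 << counter_width)) != 0

samples = (0, 1, 0x7fffffff, 0x80000000, 0xffffffff)
print([overflow_bit_set(v, 31) for v in samples])  # [False, False, False, True, True]
print([overflow_bit_set(v, 32) for v in samples])  # [False, False, False, False, False]
```

With counter_width = 32 the mask selects bit 32, which a 32-bit register cannot hold, so in this model no register value ever reports an overflow.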
So, is there a proper way to fix this? I'm not sure how other
architectures handle cases like this.
Thanks,
Vince
! as -o instr_count.o instr_count.s ; ld -o instr_count instr_count.o
! loop has 4 instructions
! so total instruction count should be
! g2*4 + 5 (g2*4 + 6 when "set" expands to sethi + or)
! g2 = 1,000 result = 4,005 (correct)
! g2 = 10,000 result = 40,006 (correct)
! g2 = 100,000 result = 400,006 (correct)
! g2 = 1,000,000 result = 4,000,006 (correct)
! g2 = 10,000,000 result = 40,000,006 (correct)
! g2 = 100,000,000 result = 400,000,006 (correct)
! g2 = 1,000,000,000 result = 4,000,000,006 (correct)
! g2 = 2,000,000,000 result = 5,852,516,357 (incorrect)
! = 0x1 5cd6 5005
! should be = 0x1 dcd6 5005
        .equ    SYSCALL_EXIT,1

        .globl  _start
_start:
        set     2000000000,%g2
        set     0,%g3
loop:
        inc     %g3
        cmp     %g2,%g3
        bne     loop
        inc     %g4                     ! branch delay slot

        !================================
        ! Exit
        !================================
exit:
        mov     0,%o0                   ! exit value
        mov     SYSCALL_EXIT,%g1        ! put the exit syscall number in g1
        ta      0x10                    ! and exit
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel