Hello

While running some experiments on a SPARC Niagara 1 (T1) machine, I noticed that for large runs the INSTR_CNT count comes out too low by 2^31.

For example, I have a microbenchmark that runs a 4-instruction loop a given number of times (code attached). The results are as follows:

! g2 =          1,000   result =         4,005 (correct)
! g2 =         10,000   result =        40,006 (correct)
! g2 =        100,000   result =       400,006 (correct)
! g2 =      1,000,000   result =     4,000,006 (correct)
! g2 =     10,000,000   result =    40,000,006 (correct)
! g2 =    100,000,000   result =   400,000,006 (correct)
! g2 =  1,000,000,000   result = 4,000,000,006 (correct)

! g2 =  2,000,000,000   result = 5,852,516,357 (incorrect)
!                              = 0x1 5cd6 5005
!                   should be  = 0x1 dcd6 5005


From the SPARC T1 (Niagara) documentation I see that the PIC (counter)
register overflows at 32 bits, but in the perfmon code counter_width is
set to 31. So when the counter overflows, we add only 2^31 to the emulated
64-bit counter instead of the proper 2^32.
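
To make the arithmetic concrete, here is a minimal sketch of the effect. It is a simplified model, not the actual perfmon code: the function name emulated_count is hypothetical, and it assumes the hardware counter wraps every 2^32 events while software adds 2^counter_width per wrap.

```c
#include <stdint.h>

/*
 * Hypothetical model (not taken from perfmon): the hardware register
 * wraps every 2^32 events, and on each wrap the software adds
 * 2^counter_width to the emulated 64-bit counter.
 */
static uint64_t emulated_count(uint64_t true_events, unsigned counter_width)
{
    uint64_t wraps = true_events >> 32;            /* 32-bit hardware overflows */
    uint64_t low   = true_events & 0xffffffffULL;  /* value left in the register */

    return wraps * (1ULL << counter_width) + low;
}
```

Under this model, 8,000,000,005 true events with counter_width = 31 gives 5,852,516,357, which matches the incorrect result above. The run with g2 = 1,000,000,000 stays below 2^32 events, so it never wraps and is reported correctly.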

I tried changing counter_width to 32 in arch/sparc64/perfmon/perfmon.c,
but that doesn't fix things. In __pfm_get_ovfl_pmds() we detect whether
an overflow has occurred by ANDing against 1ULL<<counter_width, but since
the PIC register is only 32 bits wide, bit 32 can never be set and the
test is never true.
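
The masking problem can be sketched in isolation. This is not the real __pfm_get_ovfl_pmds() code, only an illustration of the bit test described above; the helper name overflow_bit_set is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from perfmon): widen a 32-bit PIC value to
 * 64 bits and test the bit at position counter_width.
 */
static int overflow_bit_set(uint32_t pic_value, unsigned counter_width)
{
    uint64_t widened = pic_value;  /* upper 32 bits are always zero */

    return (widened & (1ULL << counter_width)) != 0;
}
```

With counter_width = 32 this returns 0 for every possible 32-bit value, so the overflow is never seen. With counter_width = 31 it returns 1 once bit 31 is set, which is the only bit position such a test can observe on a 32-bit register.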

So, is there a proper way to fix this? I'm not sure how other architectures handle cases like this.

Thanks,

Vince


! as -o instr_count.o instr_count.s ; ld -o instr_count instr_count.o

! loop has 4 instructions
! so total instruction count should be
!   g2*4 + 5
        
! g2 =          1,000   result =         4,005 (correct)
! g2 =         10,000   result =        40,006 (correct)
! g2 =        100,000   result =       400,006 (correct)
! g2 =      1,000,000   result =     4,000,006 (correct)
! g2 =     10,000,000   result =    40,000,006 (correct)
! g2 =    100,000,000   result =   400,000,006 (correct)
! g2 =  1,000,000,000   result = 4,000,000,006 (correct)

! g2 =  2,000,000,000   result = 5,852,516,357 (incorrect)
!                              = 0x1 5cd6 5005
!                   should be  = 0x1 dcd6 5005


.equ SYSCALL_EXIT,1     

        .globl _start
_start:
                
        set     2000000000,%g2
        set     0,%g3
        
loop:   
        inc     %g3             
        cmp     %g2,%g3 
        bne     loop
        inc     %g4     ! branch delay slot
                
        !================================
        ! Exit
        !================================
exit:           
        mov     0,%o0                   ! exit value
        mov     SYSCALL_EXIT,%g1        ! put the exit syscall number in g1
        ta      0x10                    ! and exit