TaoJie wrote:
> Dear all:
>
> My platform is:
> Intel Pentium 4 CPU
> OpenSolaris B74, built by myself
> Sun Studio 11
>
> In my program, I use asm("rdtsc") to measure the time elapsed between two rdtsc instructions.
> for example:
> int some_func(...)
> {
> long long time1, time2;
> int i = 3198, j = 324;
>
> asm volatile("rdtsc" : "=A" (time1));
>
> ....
> i = i + j * i / j;
>
> asm volatile("rdtsc" : "=A" (time2));
>
> return i;
> }
>
> int main(...)
> {
> ....
> some_func();
> ....
> }
>
> When I compile this program using "cc example.c" and disassemble a.out
> with dis, the program logic is ok. The output is
> some_func()
> main+0x36: 0f 31 rdtsc
> main+0x38: 89 45 f4 movl %eax,-0xc(%ebp)
> main+0x3b: 89 55 f8 movl %edx,-0x8(%ebp)
> main+0x3e: 8b 45 e8 movl -0x18(%ebp),%eax
> main+0x41: 03 45 e4 addl -0x1c(%ebp),%eax
> main+0x44: 89 45 e8 movl %eax,-0x18(%ebp)
> main+0x47: 8b 45 e8 movl -0x18(%ebp),%eax
> main+0x4a: 0f af 45 e4 imull -0x1c(%ebp),%eax
> main+0x4e: 89 45 e8 movl %eax,-0x18(%ebp)
> main+0x51: 8b 45 e8 movl -0x18(%ebp),%eax
> main+0x54: 99 cltd
> main+0x55: f7 7d e4 idivl -0x1c(%ebp)
> main+0x58: 8b d0 movl %eax,%edx
> main+0x5a: 89 55 e8 movl %edx,-0x18(%ebp)
> main+0x5d: 0f 31 rdtsc
> main+0x5f: 89 45 ec movl %eax,-0x14(%ebp)
> main+0x62: 89 55 f0 movl %edx,-0x10(%ebp)
>
> When I compile this program using "cc -xO5", the dis output is
> some_func()
> main+0x7: 0f 31 rdtsc
> main+0x9: 89 45 e8 movl %eax,-0x18(%ebp)
> main+0xc: 89 55 ec movl %edx,-0x14(%ebp)
> main+0xf: 0f 31 rdtsc
> main+0x11: 89 45 f0 movl %eax,-0x10(%ebp)
> main+0x14: 89 55 f4 movl %edx,-0xc(%ebp)
> main+0x17: 8b 5d f0 movl -0x10(%ebp),%ebx
> main+0x1a: 8b 45 f4 movl -0xc(%ebp),%eax
> main+0x1d: 8b 4d e8 movl -0x18(%ebp),%ecx
> main+0x20: 8b 55 ec movl -0x14(%ebp),%edx
> main+0x23: 2b d9 subl %ecx,%ebx
> main+0x25: 1b c2 sbbl %edx,%eax
> main+0x27: 89 5d e0 movl %ebx,-0x20(%ebp)
> main+0x2a: 89 45 e4 movl %eax,-0x1c(%ebp)
>
> Now the program logic is wrong! Sun cc treats the rdtsc instructions as
> unrelated to the rest of some_func, and so it moves the second
> asm("rdtsc") ahead of the computation!
> In this case, I can't measure the time cost.
>
> So how can I keep Sun cc from optimizing across these two asm
> statements while still using -xO5 for the whole program?
> I mean the second rdtsc should be placed strictly after the statement
> i = i + j * i / j. (I know the x86 CPU executes instructions
> out of order, so the result may not be very precise, but it
> still works.)
> Any good ideas?
>
> TIA
>
> Regards,
> TJ
> _______________________________________________
> opensolaris-discuss mailing list
> [email protected]
You're going to be very frustrated with this approach because:
1) rdtsc is not a serializing instruction; the CPU may execute it
earlier than program order suggests.
2) you'll need to bind your program to a single CPU, as the TSC
counters on different CPUs are not synchronized at boot.
My suggestion is to repeat the activity enough times that gethrtime()
can measure the total interval accurately, then divide by the number
of repetitions. This is the approach we took with libmicro (see the
performance community), and it has worked reasonably well.
- Bart
--
Bart Smaalders Solaris Kernel Performance
[EMAIL PROTECTED] http://blogs.sun.com/barts