The compiler calculates the function result at *compile time* here.
3198+324*3198/324 is not that hard to calculate without running the program,
so there is nothing left to execute at run time.
In fact the program just issues two rdtscs in a row.
You see, the very point of optimization is for the program
to do less and to run faster. You wanted to know how much time is spent at run time,
and in this case the time spent doing the calculation at run time is 0.
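
To make the folding concrete, here is a minimal sketch. It is not the original testcase; the names folded, not_folded, input_i and input_j are made up for illustration. In the first function both operands are compile-time constants, so an optimizing compiler is free to replace the whole expression with 6396. In the second the operands are read from volatile objects, so the compiler cannot know their values and at least the loads have to happen at run time. An optimizer may still simplify the arithmetic itself.

```c
/* Both operands are known at compile time, so the optimizer is free to
   fold the whole expression into the constant 6396. */
int folded(void)
{
    int i = 3198, j = 324;
    return i + j * i / j;
}

/* Hypothetical variant: the operands live in volatile objects, so their
   values are unknown to the compiler and must be loaded at run time. */
volatile int input_i = 3198;
volatile int input_j = 324;

int not_folded(void)
{
    int i = input_i;   /* volatile load, cannot be folded away */
    int j = input_j;   /* volatile load, cannot be folded away */
    return i + j * i / j;
}
```

Both functions return the same value; only the second one leaves work for the run time, which is what you would want to put between the two rdtscs.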
It is not clear to me what those subtractions are.
Probably they come from the omitted parts of the testcase.
It is always a good idea to provide the complete testcase
when you are asking about compiler optimizations.
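
For illustration only, a complete testcase could look roughly like the sketch below. It assumes a compiler that accepts GCC-style extended asm on x86; whether a given Sun Studio release handles it the same way is not something I have checked. The names read_tsc, timed_calc, in_i, in_j and out are hypothetical. The inputs are loaded from volatile objects after the first rdtsc and the result is stored to a volatile object before the second one, and each asm carries a "memory" clobber, so the compiler should not move those memory accesses across the asm statements.

```c
#include <stdint.h>

/* Read the time-stamp counter. GCC-style extended asm is assumed; the
   "memory" clobber makes the statement a compiler-level barrier for
   memory accesses. On non-x86 targets this stub simply returns 0. */
static uint64_t read_tsc(void)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi) : : "memory");
    return ((uint64_t)hi << 32) | lo;
#else
    return 0;
#endif
}

/* Hypothetical inputs and output, volatile so that the loads and the
   store really happen at run time. */
volatile int in_i = 3198;
volatile int in_j = 324;
volatile int out;

/* Returns the computed value and stores the elapsed tick count
   (second reading minus first reading) in *cycles. */
int timed_calc(uint64_t *cycles)
{
    uint64_t t1 = read_tsc();
    int i = in_i;              /* loaded after the first rdtsc */
    int j = in_j;
    out = i + j * i / j;       /* stored before the second rdtsc */
    uint64_t t2 = read_tsc();
    *cycles = t2 - t1;
    return out;
}
```

This only constrains the compiler. The CPU may still execute instructions out of order around rdtsc, so the count is an approximation.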
Alexander
On Mon, 29 Oct 2007, TaoJie wrote:
Dear all:
My platform is:
Intel Pentium 4 CPU
OpenSolaris B74, built by myself
Sun Studio 11
In my program, I use asm("rdtsc") to measure the time elapsed between two rdtsc
instructions, for example:
int some_func(...)
{
    long long time1, time2;
    int i = 3198, j = 324;
    asm volatile("rdtsc" : "=A" (time1));
    ....
    i = i + j * i / j;
    asm volatile("rdtsc" : "=A" (time2));
    return i;
}
int main(...)
{
    ....
    some_func();
    ....
}
When I compile this program using "cc example.c" and disassemble a.out
with dis, the program logic is OK. The output is
some_func()
main+0x36: 0f 31 rdtsc
main+0x38: 89 45 f4 movl %eax,-0xc(%ebp)
main+0x3b: 89 55 f8 movl %edx,-0x8(%ebp)
main+0x3e: 8b 45 e8 movl -0x18(%ebp),%eax
main+0x41: 03 45 e4 addl -0x1c(%ebp),%eax
main+0x44: 89 45 e8 movl %eax,-0x18(%ebp)
main+0x47: 8b 45 e8 movl -0x18(%ebp),%eax
main+0x4a: 0f af 45 e4 imull -0x1c(%ebp),%eax
main+0x4e: 89 45 e8 movl %eax,-0x18(%ebp)
main+0x51: 8b 45 e8 movl -0x18(%ebp),%eax
main+0x54: 99 cltd
main+0x55: f7 7d e4 idivl -0x1c(%ebp)
main+0x58: 8b d0 movl %eax,%edx
main+0x5a: 89 55 e8 movl %edx,-0x18(%ebp)
main+0x5d: 0f 31 rdtsc
main+0x5f: 89 45 ec movl %eax,-0x14(%ebp)
main+0x62: 89 55 f0 movl %edx,-0x10(%ebp)
When I compile this program using "cc -xO5", the dis output is
some_func()
main+0x7: 0f 31 rdtsc
main+0x9: 89 45 e8 movl %eax,-0x18(%ebp)
main+0xc: 89 55 ec movl %edx,-0x14(%ebp)
main+0xf: 0f 31 rdtsc
main+0x11: 89 45 f0 movl %eax,-0x10(%ebp)
main+0x14: 89 55 f4 movl %edx,-0xc(%ebp)
main+0x17: 8b 5d f0 movl -0x10(%ebp),%ebx
main+0x1a: 8b 45 f4 movl -0xc(%ebp),%eax
main+0x1d: 8b 4d e8 movl -0x18(%ebp),%ecx
main+0x20: 8b 55 ec movl -0x14(%ebp),%edx
main+0x23: 2b d9 subl %ecx,%ebx
main+0x25: 1b c2 sbbl %edx,%eax
main+0x27: 89 5d e0 movl %ebx,-0x20(%ebp)
main+0x2a: 89 45 e4 movl %eax,-0x1c(%ebp)
Now the program logic is wrong! Sun cc thinks the rdtscs are unrelated
to the other parts of some_func, and so it moves the second
asm("rdtsc") earlier!
In this case, I can't measure the time cost.
How can I stop Sun cc from optimizing the code between these two asm
statements while still using -xO5 for the whole program?
I mean the second rdtsc should be placed strictly after the statement
i = i + j * i / j. (I know the x86 CPU executes instructions
out of order, so the result may not be very precise, but it
still works.)
Any good ideas?
TIA
Regards,
TJ
_______________________________________________
tools-discuss mailing list
_______________________________________________
opensolaris-discuss mailing list