I have been trying to determine why scp runs slower on T1000 and T2000 
systems than on other SPARC systems. I have been using the profile 
provider to determine which instructions run "hot" on the T1000 I have 
available to test.

Here is the script I am using:

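#!/usr/sbin/dtrace -s

/* Global count of samples taken in the target process. */
int cnter;

/*
 * arg1 is the user-level PC; it is non-zero only when the sample
 * landed in user code. Count hits per PC for the target process.
 */
profile-997
/arg1 && pid == $target/
{
         @aes[ustack(1)] = count();
         cnter++;
}

/* Stop once at least 40000 samples have been recorded. */
profile-997
/cnter >= 40000/
{
         exit(0);
}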


I then take this data to find the hot functions and the hot 
instructions within those functions.
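
For the per-function and per-instruction breakdown, a variant along 
these lines might save some of the post-processing. This is only a 
sketch: it assumes a DTrace release that provides the ufunc() and 
uaddr() actions, and the aggregation names @func and @insn are made 
up for illustration.

```dtrace
#!/usr/sbin/dtrace -s

/* Sketch only: assumes ufunc() and uaddr() are available. */
profile-997
/arg1 && pid == $target/
{
        /* arg1 is the user-level PC at the time of the sample. */
        @func[ufunc(arg1)] = count();   /* hits per function */
        @insn[uaddr(arg1)] = count();   /* hits per symbol+offset */
}
```

Run with -c so that $target is set; both aggregations are printed 
when dtrace exits.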

Looking at the data, the hottest function is AES_encrypt (no surprise 
there). It has 50% more hits on the T1000 than a V220 I am testing on.
This is interesting, because using timex, scp takes about 50% longer 
on the T1000 than the V220, overall, even though the T2000 has a clock 
of 1GHz and the V220 has only 450MHz.

Even more interesting, the AES_encrypt function shows fewer hotspots on 
the T1000 than on the V220. Here is a bit of output from the listing of 
AES_encrypt from both systems. The numbers are the number of hits from 
5 runs of the profile script above. Blanks are "no hits".

V220:

     311 AES_encrypt+0x110:  e0 02 c0 01  ld     [%o3 + %g1], %l0
       2 AES_encrypt+0x114:  97 34 60 0e  srl    %l1, 0xe, %o3
         AES_encrypt+0x118:  84 04 fc 00  add    %l3, -0x400, %g2
     342 AES_encrypt+0x11c:  c6 02 80 16  ld     [%o2 + %l6], %g3
       1 AES_encrypt+0x120:  90 0a 63 fc  and    %o1, 0x3fc, %o0
       4 AES_encrypt+0x124:  88 0d 20 ff  and    %l4, 0xff, %g4
     338 AES_encrypt+0x128:  b1 29 20 02  sll    %g4, 0x2, %i0
         AES_encrypt+0x12c:  f8 02 00 02  ld     [%o0 + %g2], %i4
         AES_encrypt+0x130:  9e 0b 7f fc  and    %o5, -0x4, %o7
     314 AES_encrypt+0x134:  93 35 20 06  srl    %l4, 0x6, %o1
         AES_encrypt+0x138:  fa 06 00 13  ld     [%i0 + %l3], %i5
         AES_encrypt+0x13c:  b6 1c 00 03  xor    %l0, %g3, %i3
     317 AES_encrypt+0x140:  c6 03 c0 01  ld     [%o7 + %g1], %g3
      11 AES_encrypt+0x144:  94 0a e3 fc  and    %o3, 0x3fc, %o2
         AES_encrypt+0x148:  98 1e c0 1c  xor    %i3, %i4, %o4

T1000:

     123 AES_encrypt+0x110:  e0 02 c0 01  ld        [%o3 + %g1], %l0
     257 AES_encrypt+0x114:  97 34 60 0e  srl       %l1, 0xe, %o3
     109 AES_encrypt+0x118:  84 04 fc 00  add       %l3, -0x400, %g2
     123 AES_encrypt+0x11c:  c6 02 80 16  ld        [%o2 + %l6], %g3
     285 AES_encrypt+0x120:  90 0a 63 fc  and       %o1, 0x3fc, %o0
     137 AES_encrypt+0x124:  88 0d 20 ff  and       %l4, 0xff, %g4
     116 AES_encrypt+0x128:  b1 29 20 02  sll       %g4, 0x2, %i0
     122 AES_encrypt+0x12c:  f8 02 00 02  ld        [%o0 + %g2], %i4
     257 AES_encrypt+0x130:  9e 0b 7f fc  and       %o5, -0x4, %o7
     123 AES_encrypt+0x134:  93 35 20 06  srl       %l4, 0x6, %o1
     109 AES_encrypt+0x138:  fa 06 00 13  ld        [%i0 + %l3], %i5
     309 AES_encrypt+0x13c:  b6 1c 00 03  xor       %l0, %g3, %i3
     136 AES_encrypt+0x140:  c6 03 c0 01  ld        [%o7 + %g1], %g3
     196 AES_encrypt+0x144:  94 0a e3 fc  and       %o3, 0x3fc, %o2
     116 AES_encrypt+0x148:  98 1e c0 1c  xor       %i3, %i4, %o4

Notice that particular instructions on the V220 take a while, and then 
the next couple have no hits. On the T1000, every instruction is hit, 
with counts ranging from about 100 to 300. I suspect this has something 
to do with CMT: where a cache miss on the V220 stalls the instruction, 
on the T1000 it causes a thread switch, and when the profile probe 
then fires, the script above won't record it. Does the profile 
provider fire for the virtual CPUs?

Anyway, I am at a loss. Is there any way to cause the profile provider 
to record the PC of a process even if that process is not on the CPU? 
Perhaps the tick provider would be better?
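
One check I could run for the virtual-CPU question is to key the 
counts on the cpu built-in variable. This is a minimal sketch, 
assuming each hardware strand shows up as its own CPU id; the 
aggregation names and the 10-second cutoff are arbitrary choices for 
illustration.

```dtrace
#!/usr/sbin/dtrace -s

/* Every firing of the probe, keyed by the CPU it fired on. */
profile-997
{
        @fired[cpu] = count();
}

/* Only the firings that found the target process on that CPU. */
profile-997
/pid == $target/
{
        @ontarget[cpu] = count();
}

/* Arbitrary cutoff so the run ends even if the target keeps going. */
tick-10s
{
        exit(0);
}
```

If @fired has an entry for every virtual CPU, the probe is firing on 
each of them, and @ontarget shows how many of those samples actually 
found scp running.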

Any ideas how to figure this out?




-- 
blu

There are two rules in life:
Rule 1- Don't tell people everything you know
----------------------------------------------------------------------
Brian Utterback - Solaris RPE, Sun Microsystems, Inc.
Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
_______________________________________________
dtrace-discuss mailing list