On Fri, Aug 14, 2009 at 7:31 PM, Paul Yuan<yingbo....@gmail.com> wrote:
> Hi all,
>
> I used cachegrind to evaluate the cache behavior. But the Dw number is
> very strange.

I suspect the assembly code doesn't look the way you think it does:
it is probably doing more memory writes than you expect, particularly
around the malloc() calls. Perhaps this is due to an argument being
passed on the stack?

I suggest annotating the assembly code rather than the C code to
really understand what's happening;  see
http://www.valgrind.org/docs/manual/cg-manual.html#cg-manual.assembler
for details on how to do this.

Nick


>
> //test1.c
> #include <stdio.h>
> #include <stdlib.h>
>
> #define N 20000
> int *parray[N];
>
> int main ()
> {
>   int i;
>
>   for (i = 0; i < N; i++)
>     parray[i] = (int *) malloc (10 * sizeof (int));
>   for (i = 0; i < N; i++)
>     parray[i] = (int *) (i);
>
>   return 0;
> }
>
> valgrind --tool=cachegrind ./test1
>
> 1) On 64-bit Intel Xeon CPU 3050 2.13GHz dual core:  cache line size
> is 64 bytes.
> gcc -O2 -g -o test1 test1.c
> sizeof (int*) = 8.
>
>     Ir I1mr I2mr Dr D1mr D2mr     Dw  D1mw  D2mw
> ----------------------------------------------------
> 39,998    0    0  0    0    0      0     0     0    for (i = 0; i < N; i++)
> 80,000    1    1  0    0    0 40,000 2,501 2,501      parray[i] = (int *) malloc (10 * sizeof (int));
>
> 39,998    0    0  0    0    0      0     0     0    for (i = 0; i < N; i++)
> 40,000    0    0  0    0    0 20,000 2,501     0      parray[i] = (int *) (i);
>
>
> Question: Why is Dw 40,000 for the malloc() line? The corresponding
> assembly is "movq  %rax, parray(,%rbx,8)". The D1mw is reasonable:
> 20,000 * 8 / 64 = 2,500.
>
> 2) On Dual Core AMD Opteron Processor 270: cache line size is 64 bytes.
> gcc -m32 -O2 -g -o test1 test1.c
> sizeof (int*) = 4.
>
>     Ir I1mr I2mr Dr D1mr D2mr     Dw  D1mw  D2mw
> ----------------------------------------------------
> 80,002    0    0  0    0    0      0     0     0    for (i = 0; i < N; i++)
> 80,000    0    0  0    0    0 60,000 1,250 1,250      parray[i] = (int *) malloc (10 * sizeof (int));
>
> 60,002    0    0  0    0    0      0     0     0    for (i = 0; i < N; i++)
> 20,000    0    0  0    0    0 20,000 1,250     0      parray[i] = (int *) (i);
>
> Both the Ir and Dw numbers look wrong to me. The D1mw is reasonable:
> 20,000 * 4 / 64 = 1,250.
>
> Any suggestions are welcome.
>
> --
> Regards,
> Paul Yuan (袁鹏)
>
> ------------------------------------------------------------------------------
> Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
> trial. Simplify your report design, integration and deployment - and focus on
> what you do best, core application coding. Discover what's new with
> Crystal Reports now.  http://p.sf.net/sfu/bobj-july
> _______________________________________________
> Valgrind-users mailing list
> Valgrind-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/valgrind-users
>
