On Fri, Aug 14, 2009 at 7:31 PM, Paul Yuan<yingbo....@gmail.com> wrote:
> Hi all,
>
> I used cachegrind to evaluate the cache behavior. But the Dw number is
> very strange.

I suspect the assembly code doesn't look like you think it does: it is
probably doing more memory writes than you expect, particularly for the
malloc() calls. Perhaps this is due to an argument being passed on the
stack? I suggest annotating the assembly code rather than the C code to
really understand what's happening; see
http://www.valgrind.org/docs/manual/cg-manual.html#cg-manual.assembler
for details on how.

Nick

>
> //test1.c
> #include <stdio.h>
> #include <stdlib.h>
>
> #define N 20000
> int *parray[N];
>
> int main ()
> {
>   int i;
>
>   for (i = 0; i < N; i++)
>     parray[i] = (int *) malloc (10 * sizeof (int));
>   for (i = 0; i < N; i++)
>     parray[i] = (int *) (i);
>
>   return 0;
> }
>
> valgrind --tool=cachegrind ./test1
>
> 1) On 64-bit Intel Xeon CPU 3050 2.13GHz dual core: cache line size
> is 64 bytes.
> gcc -O2 -g -o test1 test1.c
> sizeof (int*) = 8.
>
>     Ir I1mr I2mr Dr D1mr D2mr     Dw  D1mw  D2mw
> ----------------------------------------------------
> 39,998    0    0  0    0    0      0     0     0  for (i = 0; i < N; i++)
> 80,000    1    1  0    0    0 40,000 2,501 2,501  parray[i] = (int *) malloc (10 * sizeof (int));
>
> 39,998    0    0  0    0    0      0     0     0  for (i = 0; i < N; i++)
> 40,000    0    0  0    0    0 20,000 2,501     0  parray[i] = (int *) (i);
>
>
> Question: Why is Dw 40,000 for the line of malloc()? The
> corresponding assembly is "movq %rax, parray(,%rbx,8)". The D1mw is
> reasonable. 20,000 * 8 / 64 = 2500.
>
> 2) On Dual Core AMD Opteron Processor 270: cache line size is 64 bytes.
> gcc -m32 -O2 -g -o test1 test1.c
> sizeof (int*) = 4.
>
>     Ir I1mr I2mr Dr D1mr D2mr     Dw  D1mw  D2mw
> ----------------------------------------------------
> 80,002    0    0  0    0    0      0     0     0  for (i = 0; i < N; i++)
> 80,000    0    0  0    0    0 60,000 1,250 1,250  parray[i] = (int *) malloc (10 * sizeof (int));
>
> 60,002    0    0  0    0    0      0     0     0  for (i = 0; i < N; i++)
> 20,000    0    0  0    0    0 20,000 1,250     0  parray[i] = (int *) (i);
>
> Both Ir and Dw numbers are wrong. The D1mw is reasonable. 20,000 * 4 /
> 64 = 1250.
>
> Any suggestion is welcome.
>
> --
> Regards,
> Paul Yuan (袁鹏)
>
> _______________________________________________
> Valgrind-users mailing list
> Valgrind-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/valgrind-users