Hi,
I am doing some memory performance measurements on our custom MPC5200B board, which runs at 396 MHz internally and is connected to DDR RAM. The RAM is clocked at 132 MHz. With the attached program (compile with -lrt) I am testing memcpy() throughput.

In theory the memory throughput should be twice the memcpy() throughput if the source and destination buffers are the same size and both reside in DDR RAM, because every byte copied is read once and written once. So one can make the simple calculation:

132 MHz * 32 bit (data bus width) * 2 (DDR) = 1056 MB/sec, i.e. ~1 GByte/sec gross memory throughput.

For a memcpy() this should then be ~500 MB/sec. Of course in real-world scenarios we cannot reach the theoretical limit, but I guess we should get to within about 30 % of it.

I get the following values on my board:

bash-2.05b# ./memcpy_perf
Test (10000) memcpy of sizes (1024) .... 10000 memcpy. Time per memcpy: 1567 [nsec] (653 MB/sec) finished.
Test (10000) memcpy of sizes (2048) .... 10000 memcpy. Time per memcpy: 2939 [nsec] (696 MB/sec) finished.
Test (10000) memcpy of sizes (4096) .... 10000 memcpy. Time per memcpy: 5706 [nsec] (717 MB/sec) finished.
Test (10000) memcpy of sizes (8192) .... 10000 memcpy. Time per memcpy: 17077 [nsec] (479 MB/sec) finished.
Test (10000) memcpy of sizes (16384) .... 10000 memcpy. Time per memcpy: 133314 [nsec] (122 MB/sec) finished.
Test (1000) memcpy of sizes (32768) .... 1000 memcpy. Time per memcpy: 243417 [nsec] (134 MB/sec) finished.
Test (1000) memcpy of sizes (51200) .... 1000 memcpy. Time per memcpy: 403455 [nsec] (126 MB/sec) finished.
Test (1000) memcpy of sizes (102400) .... 1000 memcpy. Time per memcpy: 713316 [nsec] (143 MB/sec) finished.
Test (100) memcpy of sizes (1048576) .... 100 memcpy. Time per memcpy: 7210570 [nsec] (145 MB/sec) finished.
Test (10) memcpy of sizes (10485760) .... 10 memcpy. Time per memcpy: 78162400 [nsec] (134 MB/sec) finished.
Test (5) memcpy of sizes (52428800) .... 5 memcpy. Time per memcpy: 425281800 [nsec] (123 MB/sec) finished.

The first four values are due to the data cache.
So here we are testing cache performance. All other values test the memory controller interface.

All in all, I am not sure why memory access is so much slower than I expected. Which factors did I miss in my calculation? Can anybody run this program on their 5200B-based board for comparison?

Best regards,
Daniel Schnell.
Attachment: memcpy_perf.c
_______________________________________________
Linuxppc-embedded mailing list
Linuxppc-embedded@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-embedded