A function call itself for every word written or read+written in these loops is bad enough. But since the memory test must be run with dcache disabled, the schedule() call, traversing the linked list of registered cyclic clients, and accessing the 'struct cyclic_info' for each to see if any are due for a callback, is quite expensive. On a beagleboneblack, testing a modest 16MiB region takes 2.5 minutes:

  => dcache off
  => time mtest 0x81000000 0x82000000 0 1
  Testing 81000000 ... 82000000:
  Iteration: 1
  Tested 1 iteration(s) with 0 errors.

  time: 2 minutes, 28.946 seconds

There is really no need for calling schedule() so frequently. It is quite easy to limit the calls to once for every 256 words by using a u8 variable. With that, the same test as above becomes 37 times faster:

  => dcache off
  => time mtest 0x81000000 0x82000000 0 1
  Testing 81000000 ... 82000000:
  Iteration: 1
  Tested 1 iteration(s) with 0 errors.

  time: 4.052 seconds

Note that we are still making a total of

  3 loops * (4 * 2^20 words/loop) / (256 words/call) = 49152 calls

during those ~4000 milliseconds, so the schedule() calls are still done less than 0.1ms apart. These numbers are just for a beagleboneblack, other boards may have a slower memory, but we are _two orders of magnitude_ away from schedule() "only" being called at 100Hz, which is still more than enough to ensure any watchdog is kept happy.

Signed-off-by: Rasmus Villemoes <r...@prevas.dk>
---
 cmd/mem.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/cmd/mem.c b/cmd/mem.c
index 9a1cfa4534c..013e93f09df 100644
--- a/cmd/mem.c
+++ b/cmd/mem.c
@@ -730,6 +730,8 @@ static ulong mem_test_alt(vu_long *buf, ulong start_addr, ulong end_addr,
 		0x00000055,	/* four non-adjacent bits */
 		0xaaaaaaaa,	/* alternating 1/0 */
 	};
+	/* Rate-limit schedule() calls to one for every 256 words. */
+	u8 count = 0;
 
 	num_words = (end_addr - start_addr) / sizeof(vu_long);
 
@@ -885,7 +887,8 @@ static ulong mem_test_alt(vu_long *buf, ulong start_addr, ulong end_addr,
 	 * Fill memory with a known pattern.
 	 */
 	for (pattern = 1, offset = 0; offset < num_words; pattern++, offset++) {
-		schedule();
+		if (!count++)
+			schedule();
 		addr[offset] = pattern;
 	}
 
@@ -893,7 +896,8 @@ static ulong mem_test_alt(vu_long *buf, ulong start_addr, ulong end_addr,
 	 * Check each location and invert it for the second pass.
 	 */
 	for (pattern = 1, offset = 0; offset < num_words; pattern++, offset++) {
-		schedule();
+		if (!count++)
+			schedule();
 		temp = addr[offset];
 		if (temp != pattern) {
 			printf("\nFAILURE (read/write) @ 0x%.8lx:"
@@ -913,7 +917,8 @@ static ulong mem_test_alt(vu_long *buf, ulong start_addr, ulong end_addr,
 	 * Check each location for the inverted pattern and zero it.
 	 */
 	for (pattern = 1, offset = 0; offset < num_words; pattern++, offset++) {
-		schedule();
+		if (!count++)
+			schedule();
 		anti_pattern = ~pattern;
 		temp = addr[offset];
 		if (temp != anti_pattern) {
-- 
2.50.1
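
The rate limiting in this patch relies on an 8-bit counter wrapping back to zero after 256 increments, so that `if (!count++)` is true once per 256 iterations. The following is a minimal standalone sketch of that idiom, not U-Boot code: `fake_schedule()` and `count_schedule_calls()` are hypothetical names introduced only for illustration, and `uint8_t` stands in for U-Boot's `u8`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for schedule(); it only counts invocations. */
static unsigned long schedule_calls;

static void fake_schedule(void)
{
	schedule_calls++;
}

/*
 * Run 'loops' passes over 'num_words' words. The 8-bit counter is shared
 * across all passes, as in the patch, and fake_schedule() is called only
 * when the counter is zero before the post-increment, i.e. once for
 * every 256 iterations. Returns the number of calls made.
 */
static unsigned long count_schedule_calls(unsigned int loops, size_t num_words)
{
	uint8_t count = 0;	/* wraps from 255 back to 0 */
	unsigned int pass;
	size_t offset;

	schedule_calls = 0;
	for (pass = 0; pass < loops; pass++) {
		for (offset = 0; offset < num_words; offset++) {
			if (!count++)
				fake_schedule();
			/* the memory access under test would go here */
		}
	}
	return schedule_calls;
}
```

Under these assumptions, `count_schedule_calls(3, 4u << 20)` models the three loops over a 16MiB region of 4-byte words and yields 49152 calls, matching the figure in the commit message.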