Making a function call for every word written or read+written in these
loops is bad enough. But since the memory test must be run with dcache
disabled, each schedule() call, which traverses the linked list of
registered cyclic clients and accesses the 'struct cyclic_info' of
each to see if any is due for a callback, is quite expensive. On a
beagleboneblack, testing a modest 16MiB region takes 2.5 minutes:

  => dcache off
  => time mtest 0x81000000 0x82000000 0 1
  Testing 81000000 ... 82000000:
  Iteration:      1
  Tested 1 iteration(s) with 0 errors.

  time: 2 minutes, 28.946 seconds

There is really no need to call schedule() so frequently. It is
quite easy to limit the calls to one for every 256 words by using a
u8 counter, which wraps back to zero after 256 increments. With that,
the same test as above becomes about 37 times faster:

  => dcache off
  => time mtest 0x81000000 0x82000000 0 1
  Testing 81000000 ... 82000000:
  Iteration:      1
  Tested 1 iteration(s) with 0 errors.

  time: 4.052 seconds

Note that we are still making a total of

  3 loops * (4 * 2^20 words/loop) / (256 words/call) = 49152 calls

during those ~4000 milliseconds, so the schedule() calls are still
done less than 0.1ms apart on average.

These numbers are just for a beagleboneblack, and other boards may
have slower memory, but we are _two orders of magnitude_ away from
schedule() "only" being called at 100Hz, which would still be more
than enough to keep any watchdog happy.

Signed-off-by: Rasmus Villemoes <r...@prevas.dk>
---
 cmd/mem.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/cmd/mem.c b/cmd/mem.c
index 9a1cfa4534c..013e93f09df 100644
--- a/cmd/mem.c
+++ b/cmd/mem.c
@@ -730,6 +730,8 @@ static ulong mem_test_alt(vu_long *buf, ulong start_addr, ulong end_addr,
                0x00000055,     /* four non-adjacent bits */
                0xaaaaaaaa,     /* alternating 1/0 */
        };
+       /* Rate-limit schedule() calls to one for every 256 words. */
+       u8 count = 0;
 
        num_words = (end_addr - start_addr) / sizeof(vu_long);
 
@@ -885,7 +887,8 @@ static ulong mem_test_alt(vu_long *buf, ulong start_addr, ulong end_addr,
         * Fill memory with a known pattern.
         */
        for (pattern = 1, offset = 0; offset < num_words; pattern++, offset++) {
-               schedule();
+               if (!count++)
+                       schedule();
                addr[offset] = pattern;
        }
 
@@ -893,7 +896,8 @@ static ulong mem_test_alt(vu_long *buf, ulong start_addr, ulong end_addr,
         * Check each location and invert it for the second pass.
         */
        for (pattern = 1, offset = 0; offset < num_words; pattern++, offset++) {
-               schedule();
+               if (!count++)
+                       schedule();
                temp = addr[offset];
                if (temp != pattern) {
                        printf("\nFAILURE (read/write) @ 0x%.8lx:"
@@ -913,7 +917,8 @@ static ulong mem_test_alt(vu_long *buf, ulong start_addr, ulong end_addr,
         * Check each location for the inverted pattern and zero it.
         */
        for (pattern = 1, offset = 0; offset < num_words; pattern++, offset++) {
-               schedule();
+               if (!count++)
+                       schedule();
                anti_pattern = ~pattern;
                temp = addr[offset];
                if (temp != anti_pattern) {
-- 
2.50.1
