The first stagnation is probably due to L1 cache thrashing, since by the same calculation, the problem for n=50 takes up 50*50*8 bytes*3 = 60,000 bytes ~ 60 kB, which is more than the 32 kB L1 data cache found on many CPUs.
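For anyone who wants to redo the arithmetic for other sizes, here is a rough sketch in Python. It assumes three dense n-by-n matrices of 8-byte floats, as in the calculation above. The 32 KiB L1 data cache size is my assumption, not something measured on the machine in question, and the helper name `working_set_bytes` is made up for illustration.

```python
# Rough working-set estimate for three dense n-by-n matrices of 8-byte floats.
# ASSUMPTION: a 32 KiB L1 data cache; check the actual size on your CPU.
L1_DATA_CACHE_BYTES = 32 * 1024


def working_set_bytes(n, n_matrices=3, bytes_per_element=8):
    """Total bytes held by n_matrices dense n-by-n matrices (hypothetical helper)."""
    return n * n * bytes_per_element * n_matrices


if __name__ == "__main__":
    for n in (20, 36, 37, 50, 100):
        size = working_set_bytes(n)
        verdict = "fits in" if size <= L1_DATA_CACHE_BYTES else "exceeds"
        print(f"n={n:4d}: {size:7d} bytes, {verdict} the assumed L1 data cache")
```

Under that assumption the working set stops fitting in L1 somewhere around n=37, so n=50 is already well past it. This only counts the matrix storage; it says nothing about associativity or access pattern, which also affect thrashing.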
Thanks,

Jiahao Chen, PhD
Staff Research Scientist
MIT Computer Science and Artificial Intelligence Laboratory
