#7429: Unexplained performance boost with +RTS -h
 Aha - of course I was forgetting something: the GC will also eliminate the
 indirections, and that seems to explain the difference at least in

 In the original version, the indirections make the data structure large
 enough to exceed the L1 cache, so all the misses are capacity misses (on
 my machine anyway).  In `Main2.hs`, the difference is all due to having to
 traverse the extra indirections in the data structure.

 Mystery solved - and I don't think there's anything we can do here.  If it
 is important to make this case go fast, you might try to rewrite the
 program so that it creates the data structure strictly, which will
 eliminate the indirections at creation time.
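 To illustrate what "creating the data structure strictly" could look like,
 here is a minimal sketch. It is not taken from the programs attached to
 this ticket: the `LazyList` and `StrictList` types and the `build*`/`sum*`
 functions are hypothetical names made up for the example. The idea is that
 strict constructor fields are evaluated when the cell is built, so the
 fields point directly at values rather than at thunks that are later
 overwritten with indirections.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Lazy fields: a field may initially hold a thunk.  Once the thunk is
-- forced it is overwritten with an indirection to its value, and that
-- indirection stays in place until a GC removes it.
data LazyList = LNil | LCons Int LazyList

-- Strict fields: each field is evaluated when the constructor is applied,
-- so the cell refers directly to evaluated values from the start.
data StrictList = SNil | SCons !Int !StrictList

-- Builds the elements 2, 4 .. 2n lazily (hypothetical example data).
buildLazy :: Int -> LazyList
buildLazy n = go 1
  where
    go i
      | i > n     = LNil
      | otherwise = LCons (i * 2) (go (i + 1))

-- Builds the same elements, but every cell is fully evaluated on creation.
buildStrict :: Int -> StrictList
buildStrict n = go n SNil
  where
    go i acc
      | i <= 0    = acc
      | otherwise = go (i - 1) (SCons (i * 2) acc)

-- Traversals with a strict accumulator.
sumLazy :: LazyList -> Int
sumLazy = go 0
  where
    go !acc LNil         = acc
    go !acc (LCons x xs) = go (acc + x) xs

sumStrict :: StrictList -> Int
sumStrict = go 0
  where
    go !acc SNil         = acc
    go !acc (SCons x xs) = go (acc + x) xs

main :: IO ()
main = do
  print (sumLazy (buildLazy 1000))     -- prints 1001000
  print (sumStrict (buildStrict 1000)) -- prints 1001000
```

 Both versions compute the same result; only the heap layout during
 traversal may differ. Whether this helps in a given program depends on the
 optimisation level (GHC's strictness analysis can remove some thunks on
 its own) and on how the structure is used, so it is worth measuring.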

