On today's v0.5, the breaking point is between 6 and 7 indices. I did some more benchmarking to see whether your code also suffers from cache-line issues, since it iterates over the array in row-major order while Julia stores arrays in column-major order. Current Intel chips have a 32 KB, 8-way set-associative L1 data cache with 64-byte lines, and when I first read this post I thought your problem might be related to that. As far as I can see from the benchmarks, though, the effect of the heap allocations is so large that it swamps any cache-line issues.
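
For anyone who wants to look at the allocation effect in isolation, here is a minimal sketch (not the benchmark I ran). It fills the same 8-way array once with eight explicit subscripts and once through a single linear index, and reports the bytes allocated by each. The names `fill_subscripts!` and `fill_linear!` are made up for this illustration, and the numbers will depend on the Julia version.

```julia
# Hypothetical helper: fill with eight explicit subscripts.
# In a multi-variable `for`, the last variable is the innermost loop,
# so j1 varies fastest, which matches Julia's column-major layout.
function fill_subscripts!(a8)
    for j8 = 1:2, j7 = 1:2, j6 = 1:2, j5 = 1:2,
        j4 = 1:2, j3 = 1:2, j2 = 1:2, j1 = 1:2
        a8[j1,j2,j3,j4,j5,j6,j7,j8] = rand()
    end
    a8
end

# Hypothetical helper: fill the same storage through one linear index,
# so only the single-index setindex! method is ever called.
function fill_linear!(a)
    for k in eachindex(a)
        a[k] = rand()
    end
    a
end

a8 = zeros(2,2,2,2,2,2,2,2)

# Warm up both functions so compilation is not counted.
fill_subscripts!(a8)
fill_linear!(a8)

println("8 subscripts: ", @allocated(fill_subscripts!(a8)), " bytes")
println("linear index: ", @allocated(fill_linear!(a8)), " bytes")
```

If the first number is large and the second is near zero, the cost is coming from the many-subscript indexing call rather than from the memory access pattern.
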
On Friday, May 27, 2016 at 9:06:10 PM UTC-4, [email protected] wrote:
>
> Regarding Julia 0.4.5, I've discovered that arrays with many subscripts
> (apparently 8 is enough to cause trouble) unexpectedly cause heap
> allocation. See the code and timings below. This is in regard to a finite
> element code in which many-way arrays are used to represent multi-index
> tensors. Can anyone explain this?
>
> Thanks,
> Steve Vavasis
>
> julia> @time test_manyway.test2way(100000)
>   0.001836 seconds (6 allocations: 272 bytes)
> 49837.4971725032
>
> julia> @time test_manyway.test4way(100000)
>   0.008931 seconds (7 allocations: 432 bytes)
> 50050.05619989492
>
> julia> @time test_manyway.test8way(100000)
>  49.707042 seconds (385.40 M allocations: 11.103 GB, 4.16% gc time)
> 50062.84252292006
>
> module test_manyway
>
> function test2way(n)
>     a2 = zeros(2,2)
>     x = 0.0
>     for tr = 1 : n
>         for j1 = 1 : 2
>             for j2 = 1 : 2
>                 a2[j1,j2] = rand()
>             end
>         end
>         x += a2[1,1]
>     end
>     x
> end
>
> function test4way(n)
>     a4 = zeros(2,2,2,2)
>     x = 0.0
>     for tr = 1 : n
>         for j1 = 1 : 2
>             for j2 = 1 : 2
>                 for j3 = 1 : 2
>                     for j4 = 1 : 2
>                         a4[j1,j2,j3,j4] = rand()
>                     end
>                 end
>             end
>         end
>         x += a4[1,1,1,1]
>     end
>     x
> end
>
> function test8way(n)
>     a8 = zeros(2,2,2,2,2,2,2,2)
>     x = 0.0
>     for tr = 1 : n
>         for j1 = 1 : 2
>             for j2 = 1 : 2
>                 for j3 = 1 : 2
>                     for j4 = 1 : 2
>                         for j5 = 1 : 2
>                             for j6 = 1 : 2
>                                 for j7 = 1 : 2
>                                     for j8 = 1 : 2
>                                         a8[j1,j2,j3,j4,j5,j6,j7,j8] = rand()
>                                     end
>                                 end
>                             end
>                         end
>                     end
>                 end
>             end
>         end
>         x += a8[1,1,1,1,1,1,1,1]
>     end
>     x
> end
>
> end
