Title: Re: improving performance in access to multidimensional array elements
Alan,
 
(suppose this is relevant to the list too)
 
I re-read your email just now. I always try to allocate the full array (2-dimensional or n-dimensional) at the point of declaration. I had an incle-ing that this was the best method, especailly when filling the array, but I was not aware that it would speed up reading of the array significantly.
 
Remebering back to my High Performance computing lectures, I think that when the array is allocated "slice by slice" there are problems with cache misses. I remember something about accessing in stride in order to achinve maximum cache hits.
 
I suspect that when you specify bounds at creation the memory structure allocated is (x*y*bits per cell), I.E. the data is stored in one big chunk and accessed using a single index:
 
    memory_location = start_point + x_index + (y_index * y_size)
 
On the other hand if you are specifying the size of each strip (perhaps in a loop) then adding them to another 1-d array of arrays (pointers to arrays in fact) then the memory locations dont have to be, and rarely will be contiguous. Cache misses will occour each time you access a new 1-d strip. The memroy location of a cell is probably calculated using something like this:
 
    memory_location = array_of_start_points [y_index] + (x * x_size)
 
The penalty caused by this type of accessing would be greatest for arrays of large y and small x bounds. Conversly smally y and large x upper bounds would have a small cache miss penalty.
 
Im not sure if the retrieval of the start point (in memory) for that particuar row would be cached, but I guess this complexity could only cause slow-downs. I think this speed up trick chops this "dereferencing" stage out, therefore yeilding an improvement, but this improvement would obviously not be useful when the array's bounds are specified at creation.
 
I will use your app as a testbed to check my thoughts (these are only guesses) soon.
 
N
 
----- Original Message -----
Sent: Tuesday, February 04, 2003 8:00 PM
Subject: Re: improving performance in access to multidimensional array ele ments

Nathan-

Wanted to close the loop on your question about why certain matrix sizes/geometries are faster. Actually, I can't answer that question, but had other info I thought you might find useful. Attached is a new version of the original testing program. This one explicit calls garbage collection to prevent that from affecting the outcome, and tries out a couple extra techniques.

What it shows is that access to double dimension arrays really is only a small part of this issue. The much larger one is how you grab the original memory. The first test reserves a 2D matrix and then works with it by row, showing the improvement only of accessing by single instead of double dimension arrays. This gives very small improvements. The second test reserves all the required elements as one long row and then accesses them as needed to emulate a 2D matrix. Finally comes the original, which grabs memory a row at a time to make the 2D matrix.

For your geometry, the second approach is much better (ok, I'm still testing on doubles, not integers.) So, if you are willing to put up with a little more complication and less clarity in the code, this may be an option. You'll still want to try it out with your exact use.

Hope this is of some use,

Alan

<<MatrixCopyTest.java>>

C:\ >java MatrixCopyTest 400 400 1000

ebe 11.328

m row 11.312

improvement 0.14124293785310735 %

ebe takes 0.14144271570014144 % longer

c row 8.156

improvement 28.00141242937853 %

ebe takes 38.89161353604708 % longer

row 10.312

improvement 8.968926553672317 %

ebe takes 9.852598913886734 % longer


C:\ >java MatrixCopyTest 800 800 1000

ebe 30.515

m row 30.453

improvement 0.20317876454202852 %

ebe takes 0.20359242110793682 % longer

c row 18.797

improvement 38.40078649844339 %

ebe takes 62.33973506410597 % longer

row 26.031

improvement 14.694412583975094 %

ebe takes 17.225615612154737 % longer

Reply via email to