On Sunday, September 21, 2014 03:02:38 PM Hans W Borchers wrote: > If one gets the data as 1000x3 matrix, should one really first transpose > the > matrix and then apply a distance function on 3x1000? I don't think so.
It depends on the coming algorithm. 1000 points is not very much; the whole thing would fit into L1 cache, and so for 1000 points there's no incentive. But change that to 10^5 points, so the total computational cost is 10^10, and you could get an 5-10 fold speedup even with the cost of that transposition. When it comes to performance, cache is a huge consideration, and an O(N) reordering is totally worth it for an O(N^2) algorithm. Of course, if you're using the Euclidean distance, you can do it using BLAS matrix multiplication (with some loss in precision), and BLAS already worries about cache for you. So in that particular case, with that particular algorithm, it wouldn't be necessary. But basically any other metric (or if you are worried about that loss of precision and want to use the higher-precision "naive" algorithm), you'll want the points arranged along columns. --Tim
