Big thanks for your tips!

> (Though this will compute correlations between columns, not rows. And
> the distance is 1 - correlation.)

Since a matrix transpose is easy to obtain, there's no obstacle to adapting this to CorrDist().
I've also tried ArrayViews in my previous applications, but ArrayViews does not support sparse matrices. I will check Linda Hua's source code and try to implement views on sparse matrices soon.

--
REGARDS,
Todd Leo

On Monday, December 1, 2014 8:55:15 PM UTC+8, Milan Bouchet-Valat wrote:
>
> On Monday, December 1, 2014 at 02:50 +0000, SLiZn Liu wrote:
> > I have an n by m dense matrix, and each row is a vector
> > representing varying flows like stock price, and I'd like to find
> > out the two vectors which have the highest similarity using cor().
> > Hence, a nested for-loop was utilized to calculate the similarity
> > between each pair, and fill the similarity into an n by n adjacency
> > matrix.
> In that case you can simply use the Distances.jl package like this:
>     pairwise(CorrDist(), x)
>
> (Though this will compute correlations between columns, not rows. And
> the distance is 1 - correlation.)
>
> If you look at the code it uses by calling
>     edit(pairwise!)
> you'll see that it relies on array views to avoid creating copies of the
> columns.
>
> Regards
>
> > On Fri Nov 28 2014 at 8:49:51 PM Milan Bouchet-Valat wrote:
> > On Friday, November 28, 2014 at 10:21 +0000, SLiZn Liu wrote:
> > > I'm doing row-wise/col-wise calculation, isn't it inevitable to
> > > create row/col copies after iteratively extracting single elements?
> > No, I don't think so, though sometimes you'll want to extract a full
> > row/column to pass it to a standard function instead of writing all of
> > the computations by hand. That's where array views are very useful.
> >
> > But can you give more details about the calculation you need to do?
> >
> > Regards
> >
> > > I will consider taking a shot at option 1, ArrayViews, if this
> > > single-element-extraction comes to a dead end. Thanks, Milan!
> > >
> > > On Fri Nov 28 2014 at 6:00:07 PM Milan Bouchet-Valat wrote:
> > > On Friday, November 28, 2014 at 01:45 -0800, Todd Leo wrote:
> > > > Hi Fellows,
> > > >
> > > > Say I have a 1000 x 1000 matrix, and I'm going to do some
> > > > calculation in a nested for-loop, with each pair of rows/cols in
> > > > the matrix. But I suffered a heavy performance penalty in row/col
> > > > extraction. Here's my minimum reproducible example, which I hope
> > > > explains itself.
> > > >
> > > > A = rand(0.:0.01:1.,1000,1000)
> > > >
> > > > function test(x)
> > > >     for i in 1:1000, j in 1:1000
> > > >         x[:,i]
> > > >         x[:,j]
> > > >     end
> > > > end
> > > >
> > > > test(A) # warm up
> > > > gc()
> > > > @time test(A)
> > > > ## elapsed time: 13.28547939 seconds (16208000080 bytes allocated, 72.42% gc time)
> > > >
> > > > It takes 13 seconds, only extracting the rows/cols for the sake of
> > > > further calculations. I'm wondering if there's anything I could do
> > > > to improve the performance. Thanks in advance.
> > > This is because extracting a row/column creates a copy. Depending on
> > > what calculation you want to do on them, you can:
> > > - use array views (which will become the default when extracting
> > >   slices in 0.4): https://github.com/JuliaLang/ArrayViews.jl
> > > - manually write loops to go over the row and column so that you
> > >   only extract one individual element of the matrix at a time
> > >
> > > Regards
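
For anyone landing on this thread later, here is a minimal sketch of the row-wise variant discussed above, written against Julia 1.x (an assumption; the thread itself used 0.3). In Julia 1.x, slicing with x[:, i] still makes a copy, and views are requested explicitly with view or @views from Base, so the separate ArrayViews package is not needed here. The function name row_corr_dist is hypothetical and chosen only for illustration. It fills an n by n matrix with 1 - correlation for each pair of rows, without copying any row.

```julia
using Statistics  # standard library; provides cor

# Hypothetical helper: pairwise correlation distance (1 - correlation)
# between the ROWS of A. view(A, i, :) refers to row i without copying it.
function row_corr_dist(A::AbstractMatrix)
    n = size(A, 1)
    D = zeros(n, n)              # diagonal stays 0: each row has distance 0 to itself
    for i in 1:n, j in (i + 1):n
        d = 1 - cor(view(A, i, :), view(A, j, :))
        D[i, j] = d              # the distance is symmetric, so fill both halves
        D[j, i] = d
    end
    return D
end

A = rand(50, 20)                 # 50 rows, each a series of 20 observations
D = row_corr_dist(A)
```

Since cor on a matrix works column by column, 1 .- cor(A') should give the same result and is a handy cross-check. If Distances.jl is installed, I believe recent releases also accept pairwise(CorrDist(), A, dims=1) for rows, but that keyword was not available when this thread was written, so treat it as an assumption to verify against your installed version.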
