This is more on post-multiplying a matrix by a diagonal matrix (in six 
different ways).

The machine-comparison results from this little exercise can be explained by
three factors:

-- CPU speed
-- number of threads
-- BLAS library

Of the six methods, the first two use BLAS, the last four do not. For the
last four, times are determined largely by CPU speed.
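
For readers without the attachment, here is a minimal sketch of what the six
methods could look like. It is reconstructed from the method names in the
tables below and is not the code in matdiag.R. The sizes n and p, the names x
and d, and the list post_diag are assumptions made up for illustration.

```r
set.seed(1)
n <- 200; p <- 50                 # hypothetical sizes, not those of matdiag.R
x <- matrix(rnorm(n * p), n, p)   # the matrix to be post-multiplied
d <- runif(p)                     # diagonal of the diagonal matrix

post_diag <- list(
  # Methods 1 and 2 build diag(d) and call a BLAS matrix product.
  "matrix multiplication"     = function(x, d) x %*% diag(d),
  "tcrossprod"                = function(x, d) tcrossprod(x, diag(d)),
  # Methods 3 to 6 only scale columns and never touch BLAS.
  "transposition and reuse"   = function(x, d) t(t(x) * d),
  "elementwise after reshape" = function(x, d)
    x * matrix(d, nrow(x), ncol(x), byrow = TRUE),
  "columnwise sapply"         = function(x, d)
    sapply(seq_len(ncol(x)), function(j) x[, j] * d[j]),
  "for loop over columns"     = function(x, d) {
    for (j in seq_len(ncol(x))) x[, j] <- x[, j] * d[j]
    x
  }
)

# Run every method and check that all of them agree with the first one.
results <- lapply(post_diag, function(f) f(x, d))
agree <- vapply(results,
                function(r) isTRUE(all.equal(r, results[[1]])),
                logical(1))
print(agree)
```

Wrapping any of these calls in system.time() gives timings of the kind
reported in the tables.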

For the first two methods, there are two things going on. First, BLAS in the
Windows implementation of R is faster than the vecLib BLAS on OS X. Second, on
the Mac Pro with Nehalem, for reasons unknown (bug!!) the OS decides to use a
non-parallel, single-threaded BLAS. Running the same code, with the same OS and
the same R, on a MacBook or an iMac or a MacAir, does use multiple threads.
This is all under OS X 10.6.5, by the way, using R-develop from svn. Running
the same code in Parallels emulation, using Windows R on the MacPro, uses
multiple threads as well.
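
One rough way to see the threading difference is to compare CPU time with
wall-clock time. The sketch below assumes the three columns in the tables are
user, system and elapsed seconds, as printed by system.time(). The helper
cpu_to_elapsed and the matrix size are invented for illustration. A ratio well
above 1 suggests that BLAS ran on several threads, a ratio near 1 suggests a
single thread.

```r
# CPU time (user + system) divided by wall-clock time.
cpu_to_elapsed <- function(user, system, elapsed) (user + system) / elapsed

# "matrix multiplication" rows copied from the tables in this post,
# assuming the columns are user, system, elapsed.
reported <- rbind(
  iMac       = c(34.682, 1.474, 11.397),
  MacPro     = c(31.259, 1.814, 33.073),
  MacAir     = c(76.154, 3.294, 49.571),
  MacBookPro = c(58.875, 1.801, 18.095)
)
ratios <- cpu_to_elapsed(reported[, 1], reported[, 2], reported[, 3])
print(round(ratios, 2))

# The same check on the current machine (the size 300 is arbitrary).
set.seed(1)
a <- matrix(rnorm(300 * 300), 300, 300)
st <- system.time(for (i in 1:5) a %*% a)
live <- cpu_to_elapsed(st[["user.self"]], st[["sys.self"]], st[["elapsed"]])
print(live)
```

On the reported numbers the MacPro ratio is essentially 1, while the iMac and
the MacBook Pro come out above 3.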

I cannot possibly convey how annoying this is.


Attachment: matdiag.R

==========================================================================
2.66 GHz dual-core i5 iMac, with 4 GB DDR, R-2.12.0, 64 bit

Columns in all tables (presumably system.time() output, in seconds):
                             user  system elapsed
matrix multiplication      34.682   1.474  11.397 
tcrossprod                 33.461   1.264   9.993 
transposition and reuse    20.436   3.654  24.09 
elementwise after reshape   8.309   1.932  17.599 
columnwise sapply          22.692  18.242  42.006 
for loop over columns      29.723  15.832  45.556 
==========================================================================
2 x 2.94 GHz Quad-core MacPro, 16 GB DDR, R-2.12.0, 64 bit

matrix multiplication      31.259   1.814  33.073 
tcrossprod                 29.693   1.643  31.337 
transposition and reuse    24.188   4.931  29.120 
elementwise after reshape   9.592   2.464  12.056 
columnwise sapply          23.277  22.095  45.373 
for loop over columns      29.895  18.662  48.558 

==========================================================================
2 x 2.94 GHz Quad-core MacPro, 16 GB DDR, R-2.12.0, 64 bit (generic BLAS)

matrix multiplication      31.514   1.794  33.838 
tcrossprod                 29.929   1.614  31.545 
transposition and reuse    25.364   4.834  30.206 
elementwise after reshape   9.722   2.422  12.144 
columnwise sapply          23.516  22.074  45.593 
for loop over columns      29.982  18.632  48.617 
==========================================================================
2 x 2.94 GHz Quad-core MacPro, 16 GB DDR, R-2.11.0, 64 bit, run through R.app

matrix multiplication      31.916   3.673  35.384 
tcrossprod                  30.45   4.379  34.606 
transposition and reuse    24.822   7.928  32.476 
elementwise after reshape   9.995   3.673  13.559 
columnwise sapply          23.708   23.03  46.718 
for loop over columns      30.298  19.629  49.745 
==========================================================================
1.8 GHz dual-core MacAir, with 2 GB DDR

matrix multiplication      76.154   3.294  49.571 
tcrossprod                 77.653   3.166  53.702 
transposition and reuse    58.258   6.814  65.618 
elementwise after reshape  27.091   3.243  30.491 
columnwise sapply          48.919  42.725  91.991 
for loop over columns      57.727  39.142  97.188 
==========================================================================
Windows 7, Parallels emulation, MacPro, R-2.11.0 (64bit)
 
matrix multiplication        8.68    1.86   10.59 
tcrossprod                   4.0     1.84    5.85 
transposition and reuse     39.89    6.92   47.11 
elementwise after reshape   10.96    3.24   14.24 
columnwise sapply           26.47   10.28   36.9 
for loop over columns       34.04    9.75   43.86 
==========================================================================
Ubuntu, Parallels emulation, MacPro, R-2.10.1 (64bit)

matrix multiplication      25.206   3.072  28.802 
tcrossprod                 21.209   2.925  24.242 
transposition and reuse    32.51    9.564  42.17 
elementwise after reshape  10.669   4.368  15.074 
columnwise sapply          27.665  10.121  37.869 
for loop over columns      35.043  10.193  45.32
==========================================================================
Jeroen on Windows (32 bit):

matrix multiplication        9.18    3.09   12.34
tcrossprod                   6.27    3.52    9.83 
transposition and reuse     41.31    9.96   51.48
elementwise after reshape   18.89    5.06   24.01
columnwise sapply           36.25   15.48   51.9
for loop over columns        41.4    13.7   55.63
==========================================================================
Jeroen on a cheap Ubuntu instance with very limited memory:

matrix multiplication       12.86    9.28  22.573
tcrossprod                   6.26    9.31  15.668
transposition and reuse    138.39   28.53 169.99
elementwise after reshape   30.22   14.12  44.50
columnwise sapply           36.62   19.17  65.638
for loop over columns       53.72   12.27  66.349
==========================================================================
Masanao on PC

matrix multiplication        12.87   3.47   17.11 
tcrossprod                   10.06   4.38   14.92 
transposition and reuse      74.5   13.84   91.42 
elementwise after reshape    43.9    7.39   53.36 
columnwise sapply            86.5   26.92  124.61 
for loop over columns      103.58   25.59  138.88 
==========================================================================
Masanao, 2.66 GHz dual-core i7, 8 GB DDR, MacBook Pro
 
matrix multiplication         58.875   1.801  18.095 
tcrossprod                    57.423   1.53   16.607 
transposition and reuse       31.104   4.181  35.286 
elementwise after reshape      9.382   2.052  11.434 
columnwise sapply             23.14   20.577  43.718 
for loop over columns         29.789  17.516  47.307




===============================================================
     Jan de Leeuw, 11667 Steinhoff Rd, Frazier Park, CA 93225
     home 661-245-1725 mobile 661-231-5416 work 310-825-9550
     .mac: jdeleeuw +++  aim: deleeuwjan +++ skype: j_deleeuw
===============================================================
             I am I because my little dog knows me.
                                         Gertrude Stein

_______________________________________________
R-SIG-Mac mailing list
R-SIG-Mac@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-sig-mac
