Kieran,
the reference benchmarks have been calibrated against vecLib/Accelerate BLAS. If you use the reference BLAS instead, it can be a lot slower. You can switch between the reference BLAS and vecLib in R CRAN releases simply by switching the libRblas.dylib symlink (in $R_HOME/lib), e.g.:

ls -l /Library/Frameworks/R.framework/Resources/lib/libRblas*dylib
-rwxrwxr-x  1 root  admin  226288 Oct 31 14:41 /Library/Frameworks/R.framework/Resources/lib/libRblas.0.dylib
lrwxr-xr-x  1 root  admin      21 Nov  1 09:56 /Library/Frameworks/R.framework/Resources/lib/libRblas.dylib -> libRblas.vecLib.dylib
-rwxrwxr-x  1 root  admin  154368 Oct 31 14:41 /Library/Frameworks/R.framework/Resources/lib/libRblas.vecLib.dylib

(For recent R you'll need R 4.1.1 or higher)

Cheers,
Simon

PS: reminder to everyone, please test R 4.1.2 RC - these are the last few hours to report anything!


> On Nov 1, 2021, at 11:31 AM, Kieran Healy <kjhe...@gmail.com> wrote:
>
> Hello,
>
> Just out of interest, I ran benchmark-25.R from Simon’s repo, as I have
> access to an M1 Max. Are the *very* long times on cross-product, linear
> regression, and Matrix functions a consequence of the BLAS version?
>
> Kieran
>
>> sessionInfo()
> R version 4.1.1 (2021-08-10)
> Platform: aarch64-apple-darwin20 (64-bit)
> Running under: macOS Monterey 12.0.1
>
> Matrix products: default
> BLAS: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.0.dylib
> LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
>
>
>> source("R-benchmark-25.R")
> Loading required package: SuppDists
>
>
> R Benchmark 2.5
> ===============
> Number of times each test is run__________________________: 3
>
> I. Matrix calculation
> ---------------------
> Creation, transp., deformation of a 2500x2500 matrix (sec): 0.249666666666656
> 2400x2400 normal distributed random matrix ^1000____ (sec): 0.105000000000009
> Sorting of 7,000,000 random values__________________ (sec): 0.594666666666673
> 2800x2800 cross-product matrix (b = a' * a)_________ (sec): 13.3016666666667
> Linear regr. over a 3000x3000 matrix (c = a \ b')___ (sec): 6.27033333333334
> --------------------------------------------
> Trimmed geom. mean (2 extremes eliminated): 0.976431082297569
>
> II. Matrix functions
> --------------------
> FFT over 2,400,000 random values____________________ (sec): 0.0726666666666631
> Eigenvalues of a 640x640 random matrix______________ (sec): 0.425666666666672
> Determinant of a 2500x2500 random matrix____________ (sec): 1.73833333333333
> Cholesky decomposition of a 3000x3000 matrix________ (sec): 5.17333333333333
> Inverse of a 1600x1600 random matrix________________ (sec): 1.43099999999996
> --------------------------------------------
> Trimmed geom. mean (2 extremes eliminated): 1.01925013610031
>
> III. Programmation
> ------------------
> 3,500,000 Fibonacci numbers calculation (vector calc)(sec): 0.0950000000000273
> Creation of a 3000x3000 Hilbert matrix (matrix calc) (sec): 0.115333333333335
> Grand common divisors of 400,000 pairs (recursion)__ (sec): 0.0799999999999841
> Creation of a 500x500 Toeplitz matrix (loops)_______ (sec): 0.0173333333333593
> Escoufier's method on a 45x45 matrix (mixed)________ (sec): 0.152999999999963
> --------------------------------------------
> Trimmed geom. mean (2 extremes eliminated): 0.0957023962714685
>
>
> Total time for all 15 tests_________________________ (sec): 29.823
> Overall mean (sum of I, II and III trimmed means/3)_ (sec): 0.45668322781674
> --- End of test ---
>
> Warning messages:
> 1: In remove("a", "b") : object 'a' not found
> 2: In remove("a", "b") : object 'b' not found
>
>> On Oct 31, 2021, at 5:11 PM, Simon Urbanek <simon.urba...@r-project.org> wrote:
>>
>>
>> Tim,
>>
>> that is a great idea, those tests are really old. Just for the fun of it I
>> have run the tests on my old iMac, but with R 4.1.2 and they still work.
>> It's nice to see the huge speed improvements in loops and similar (see below
>> - recall the original tests were scaled to be around 1).
>>
>> I have added the page to the repo
>> https://github.com/R-macos/R-mac-dev
>> so I'd be happy to review PRs, but I'll probably want to re-do it first so
>> it is better organized for comparisons as we have to also accommodate M1 etc.
>>
>> Cheers,
>> Simon
>>
>> ---
>> iMac14,2 3.2GHz i5, macOS 10.4.6, R 4.1.2 vecLib/Accelerate BLAS
>>
>>
>> R Benchmark 2.5
>> ===============
>> Number of times each test is run__________________________: 3
>>
>> I. Matrix calculation
>> ---------------------
>> Creation, transp., deformation of a 2500x2500 matrix (sec): 0.829666666666667
>> 2400x2400 normal distributed random matrix ^1000____ (sec): 0.155333333333334
>> Sorting of 7,000,000 random values__________________ (sec): 0.638333333333334
>> 2800x2800 cross-product matrix (b = a' * a)_________ (sec): 0.242000000000001
>> Linear regr. over a 3000x3000 matrix (c = a \ b')___ (sec): 0.170999999999999
>> --------------------------------------------
>> Trimmed geom. mean (2 extremes eliminated): 0.29781941072597
>>
>> II. Matrix functions
>> --------------------
>> FFT over 2,400,000 random values____________________ (sec): 0.331333333333333
>> Eigenvalues of a 640x640 random matrix______________ (sec): 0.347000000000001
>> Determinant of a 2500x2500 random matrix____________ (sec): 0.207000000000001
>> Cholesky decomposition of a 3000x3000 matrix________ (sec): 0.254333333333334
>> Inverse of a 1600x1600 random matrix________________ (sec): 0.345666666666663
>> --------------------------------------------
>> Trimmed geom. mean (2 extremes eliminated): 0.307686639256803
>>
>> III. Programmation
>> ------------------
>> 3,500,000 Fibonacci numbers calculation (vector calc)(sec): 0.245
>> Creation of a 3000x3000 Hilbert matrix (matrix calc) (sec): 0.289666666666669
>> Grand common divisors of 400,000 pairs (recursion)__ (sec): 0.259333333333331
>> Creation of a 500x500 Toeplitz matrix (loops)_______ (sec): 0.0400000000000015
>> Escoufier's method on a 45x45 matrix (mixed)________ (sec): 0.263000000000005
>> --------------------------------------------
>> Trimmed geom. mean (2 extremes eliminated): 0.255658395143118
>>
>>
>> Total time for all 15 tests_________________________ (sec): 4.61866666666667
>> Overall mean (sum of I, II and III trimmed means/3)_ (sec): 0.286136920519432
>> --- End of test ---
>>
>>
>>
>>> On Nov 1, 2021, at 2:48 AM, Tim Bates <timothy.c.ba...@gmail.com> wrote:
>>>
>>> I wonder if this (2008/R 2.7) page could be updated with some current
>>> benchmark runs?
>>>
>>> Especially current Intel server chips, i9, and M1/+
>>>
>>> I'm guessing if Simon could help upload the resulting updated page, people
>>> here could contribute benchmark runs on different hardware.
>>>
>>>
>>> It would also be interesting to see different BLAS results.
>>>
>>> I wonder if either Intel or ARM chip "neural" cores (dot product engines?)
>>> or multi-core and GPU are being used in current R builds?
>>>
>>> tim
>>>

_______________________________________________
R-SIG-Mac mailing list
R-SIG-Mac@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mac