Hi Paul,

We've been through this process ourselves for the Revolution R Open project. 
There are a number of pitfalls to avoid, but you can take a look at how we 
achieved it in the build scripts at:

https://github.com/RevolutionAnalytics/RRO

There are also some very useful notes in the R Installation guide:
https://cran.r-project.org/doc/manuals/r-release/R-admin.html#BLAS 

Most packages do benefit from MKL (or any multi-threaded BLAS) to some degree, 
although the actual benefit depends on the R functions they call. Some packages 
(and some built-in R functions) don't call into BLAS endpoints, so you won't 
see benefits in all cases.

# David Smith

-- 
David M Smith <david...@microsoft.com>
R Community Lead, Revolution Analytics (a Microsoft company)  
Tel: +1 (312) 9205766 (Chicago IL, USA)
Twitter: @revodavid | Blog:  http://blog.revolutionanalytics.com
We are hiring engineers for Revolution R and Azure Machine Learning.

-----Original Message-----
From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Paul Johnson
Sent: Monday, November 23, 2015 09:28
To: R Devel List <r-devel@r-project.org>
Subject: [Rd] MKL Acceleration encouraging; need adjust package builds?

Dear R-devel:

The Cluster administrators at KU got enthusiastic about testing
R-3.2.2 with Intel MKL when I asked for some BLAS integration.  Below I forward 
a performance report, which is encouraging, and thought you would like to know 
the numbers.  Appears to my untrained eye there are some extraordinary speedups 
on Cholesky decomposition, determinants, and matrix inversion.

They had difficulty getting R to compile with R's shared BLAS (I don't know 
what went wrong there), so they went the other direction.

1. In his message to me, the technician says that I should consider adjusting 
the compilation flags on the packages that use BLAS.  Do you think that is 
needed?  If R is compiled with non-shared BLAS libraries, won't packages know 
where to look for BLAS headers?

2. If I need to do that, I wonder how to do it and which packages need 
attention.  My guess is the Eigen and Armadillo packages, and possibly the ones 
that depend on them: lme4, anything flowing through Rcpp.

Here's the build for some packages. Are they finding MKL BLAS?  How would I 
know?

* installing *source* package 'RcppArmadillo' ...
** package 'RcppArmadillo' successfully unpacked and MD5 sums checked
* checking LAPACK_LIBS: divide-and-conquer complex SVD available via system 
LAPACK
** libs
g++ -I/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/include
-I/usr/local/include
-I"/panfs/pfs.acf.ku.edu/crmda/tools/lib64/R/3.2/site-library/Rcpp/include"
 -I../inst/include -fpic  -O3 -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
-fstack-protector --param=ssp-buffer-size=4 -m64
-mtune=generic    -c RcppArmadillo.cpp -o RcppArmadillo.o
g++ -I/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/include
-I/usr/local/include
-I"/panfs/pfs.acf.ku.edu/crmda/tools/lib64/R/3.2/site-library/Rcpp/include"
 -I../inst/include -fpic  -O3 -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
-fstack-protector --param=ssp-buffer-size=4 -m64
-mtune=generic    -c RcppExports.cpp -o RcppExports.o
g++ -I/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/include
-I/usr/local/include
-I"/panfs/pfs.acf.ku.edu/crmda/tools/lib64/R/3.2/site-library/Rcpp/include"
 -I../inst/include -fpic  -O3 -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
-fstack-protector --param=ssp-buffer-size=4 -m64
-mtune=generic    -c fastLm.cpp -o fastLm.o
g++ -shared -L/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/lib
-L/usr/local/lib64 -o RcppArmadillo.so
 RcppArmadillo.o RcppExports.o fastLm.o 
-L/panfs/pfs.acf.ku.edu/cluster/6.2/intel/2015/mkl/lib/intel64
-Wl,--no-as-needed -lmkl_gf_lp64 -Wl,--start-group -lmkl_gnu_thread -lmkl_core 
-Wl,--end-group -fopenmp -ldl -lpthread -lm -lgfortran -lm 
-L/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/lib -lR
installing to 
/panfs/pfs.acf.ku.edu/crmda/tools/lib64/R/3.2/site-library/RcppArmadillo/libs
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (RcppArmadillo)

* installing *source* package 'RcppEigen' ...
** package 'RcppEigen' successfully unpacked and MD5 sums checked
** libs
g++ -I/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/include
-I/usr/local/include
-I"/panfs/pfs.acf.ku.edu/crmda/tools/lib64/R/3.2/site-library/Rcpp/include"
 -I../inst/include -fpic  -O3 -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
-fstack-protector --param=ssp-buffer-size=4 -m64
-mtune=generic    -c RcppEigen.cpp -o RcppEigen.o
g++ -I/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/include
-I/usr/local/include
-I"/panfs/pfs.acf.ku.edu/crmda/tools/lib64/R/3.2/site-library/Rcpp/include"
 -I../inst/include -fpic  -O3 -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
-fstack-protector --param=ssp-buffer-size=4 -m64
-mtune=generic    -c RcppExports.cpp -o RcppExports.o
g++ -I/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/include
-I/usr/local/include
-I"/panfs/pfs.acf.ku.edu/crmda/tools/lib64/R/3.2/site-library/Rcpp/include"
 -I../inst/include -fpic  -O3 -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
-fstack-protector --param=ssp-buffer-size=4 -m64
-mtune=generic    -c fastLm.cpp -o fastLm.o
g++ -shared -L/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/lib
-L/usr/local/lib64 -o RcppEigen.so
 RcppEigen.o RcppExports.o fastLm.o
-L/panfs/pfs.acf.ku.edu/cluster/6.2/intel/2015/mkl/lib/intel64
-Wl,--no-as-needed -lmkl_gf_lp64 -Wl,--start-group -lmkl_gnu_thread -lmkl_core 
-Wl,--end-group -fopenmp -ldl -lpthread -lm -lgfortran -lm 
-L/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/lib -lR
installing to 
/panfs/pfs.acf.ku.edu/crmda/tools/lib64/R/3.2/site-library/RcppEigen/libs
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (RcppEigen)

* installing *source* package 'MatrixModels' ...
** package 'MatrixModels' successfully unpacked and MD5 sums checked
** R
** preparing package for lazy loading
Creating a generic function for 'resid' from package 'stats' in package 
'MatrixModels'
Creating a generic function for 'fitted.values' from package 'stats'
in package 'MatrixModels'
Creating a generic function for 'coefficients' from package 'stats' in package 
'MatrixModels'
Creating a generic function for 'formula' from package 'stats' in package 
'MatrixModels'
Creating a generic function for 'coef' from package 'stats' in package 
'MatrixModels'
Creating a generic function for 'fitted' from package 'stats' in package 
'MatrixModels'
Creating a generic function for 'residuals' from package 'stats' in package 
'MatrixModels'
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (MatrixModels)
* installing *source* package 'quantreg' ...
** package 'quantreg' successfully unpacked and MD5 sums checked
** libs
gfortran   -fpic  -g -O2  -c akj.f -o akj.o
gfortran   -fpic  -g -O2  -c boot.f -o boot.o
gfortran   -fpic  -g -O2  -c brute.f -o brute.o
gcc -std=gnu99 -I/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/include
-I/usr/local/include    -fpic
-I/panfs/pfs.acf.ku.edu/cluster/system/pkg/R/curl7.45_install/include
-L/panfs/pfs.acf.ku.edu/cluster/6.2/R/3.2.2_mkl/lib64  -c chlfct.c -o chlfct.o
gfortran   -fpic  -g -O2  -c cholesky.f -o cholesky.o
gfortran   -fpic  -g -O2  -c combos.f -o combos.o
gfortran   -fpic  -g -O2  -c crq.f -o crq.o
gfortran   -fpic  -g -O2  -c crqfnb.f -o crqfnb.o
gfortran   -fpic  -g -O2  -c dsel05.f -o dsel05.o
gfortran   -fpic  -g -O2  -c etime.f -o etime.o
gfortran   -fpic  -g -O2  -c extract.f -o extract.o
gfortran   -fpic  -g -O2  -c idmin.f -o idmin.o
gfortran   -fpic  -g -O2  -c iswap.f -o iswap.o
gfortran   -fpic  -g -O2  -c kuantile.f -o kuantile.o
gcc -std=gnu99 -I/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/include
-I/usr/local/include    -fpic
-I/panfs/pfs.acf.ku.edu/cluster/system/pkg/R/curl7.45_install/include
-L/panfs/pfs.acf.ku.edu/cluster/6.2/R/3.2.2_mkl/lib64  -c mcmb.c -o mcmb.o
gfortran   -fpic  -g -O2  -c penalty.f -o penalty.o
gfortran   -fpic  -g -O2  -c powell.f -o powell.o
gfortran   -fpic  -g -O2  -c rls.f -o rls.o
gfortran   -fpic  -g -O2  -c rq0.f -o rq0.o
gfortran   -fpic  -g -O2  -c rq1.f -o rq1.o
gfortran   -fpic  -g -O2  -c rqbr.f -o rqbr.o
gfortran   -fpic  -g -O2  -c rqfn.f -o rqfn.o
gfortran   -fpic  -g -O2  -c rqfnb.f -o rqfnb.o
gfortran   -fpic  -g -O2  -c rqfnc.f -o rqfnc.o
gfortran   -fpic  -g -O2  -c rqs.f -o rqs.o
gfortran   -fpic  -g -O2  -c sparskit2.f -o sparskit2.o
gcc -std=gnu99 -I/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/include
-I/usr/local/include    -fpic
-I/panfs/pfs.acf.ku.edu/cluster/system/pkg/R/curl7.45_install/include
-L/panfs/pfs.acf.ku.edu/cluster/6.2/R/3.2.2_mkl/lib64  -c srqfn.c -o srqfn.o 
gcc -std=gnu99 -I/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/include
-I/usr/local/include    -fpic
-I/panfs/pfs.acf.ku.edu/cluster/system/pkg/R/curl7.45_install/include
-L/panfs/pfs.acf.ku.edu/cluster/6.2/R/3.2.2_mkl/lib64  -c srqfnc.c -o srqfnc.o
gfortran   -fpic  -g -O2  -c srtpai.f -o srtpai.o
gcc -std=gnu99 -shared -L/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/lib
-L/usr/local/lib64 -o quantreg.so
 akj.o boot.o brute.o chlfct.o cholesky.o combos.o crq.o crqfnb.o dsel05.o 
etime.o extract.o idmin.o iswap.o kuantile.o mcmb.o penalty.o powell.o rls.o 
rq0.o rq1.o rqbr.o rqfn.o rqfnb.o rqfnc.o rqs.o sparskit2.o srqfn.o srqfnc.o 
srtpai.o
-L/panfs/pfs.acf.ku.edu/cluster/6.2/intel/2015/mkl/lib/intel64
-Wl,--no-as-needed -lmkl_gf_lp64 -Wl,--start-group -lmkl_gnu_thread -lmkl_core 
-Wl,--end-group -fopenmp -ldl -lpthread -lm -lgfortran -lm -lgfortran -lm 
-L/tools/cluster/6.2/R/3.2.2_mkl/lib64/R/lib -lR
installing to 
/panfs/pfs.acf.ku.edu/crmda/tools/lib64/R/3.2/site-library/quantreg/libs
** R
** data
** demo
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (quantreg)
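
One way to approach the "are they finding MKL, and how would I know?" question 
is to look at the dynamic dependencies of an installed package's shared object. 
The link lines above pass -lmkl_gf_lp64, -lmkl_gnu_thread and -lmkl_core, so 
if those were resolved as shared libraries they should show up in `ldd` output. 
The following is only a sketch under that assumption: the helper names and the 
sample `ldd` text (including the /opt/intel path) are hypothetical, and it 
applies to Linux only.

```python
import subprocess

# Library name prefixes taken from the -l flags in the link lines above.
MKL_MARKERS = ("libmkl_gf_lp64", "libmkl_gnu_thread", "libmkl_core")


def mkl_dependencies(ldd_output):
    """Return the library names in `ldd` output that look like MKL libraries."""
    found = []
    for line in ldd_output.splitlines():
        fields = line.split()
        # The first field of an ldd line is the library name.
        if fields and any(marker in fields[0] for marker in MKL_MARKERS):
            found.append(fields[0])
    return found


def check_shared_object(path):
    """Run `ldd` on a shared object and list its MKL dependencies."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return mkl_dependencies(result.stdout)


if __name__ == "__main__":
    # Hypothetical ldd output for illustration; not taken from the KU cluster.
    sample = "\n".join([
        "\tlibmkl_gf_lp64.so => /opt/intel/mkl/lib/intel64/libmkl_gf_lp64.so (0x0)",
        "\tlibmkl_core.so => /opt/intel/mkl/lib/intel64/libmkl_core.so (0x0)",
        "\tlibm.so.6 => /lib64/libm.so.6 (0x0)",
    ])
    print(mkl_dependencies(sample))
```

To use it on a real installation, call check_shared_object() with the path to 
a package's .so file under its libs directory. An empty list would suggest the 
object is not dynamically linked against those MKL libraries, although it does 
not rule out a static link.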


pj



Hi PJ,

We're still running the benchmarks to quantify the performance increase.

The R benchmarks for the MKL version are promising. The performance increase 
varies from test to test, but using the MKL version does not degrade 
performance. You can expect a 2x to 10x performance increase depending on the 
matrix calculations you are performing. Here are the compilation arguments we 
used for compiling R with MKL:

--disable-BLAS-shlib
--with-blas="-L/panfs/pfs.acf.ku.edu/cluster/6.2/intel/2015/mkl/lib/intel64 
-Wl,--no-as-needed -lmkl_gf_lp64 -Wl,--start-group -lmkl_gnu_thread -lmkl_core 
-Wl,--end-group -fopenmp -ldl -lpthread -lm" --with-lapack

You may want to include these while recompiling R packages which use BLAS.


Here are the results of the benchmark for the standard R 3.2.2:

R Benchmark 2.5
===============
Number of times each test is run__________________________: 3

I. Matrix calculation
---------------------
Creation, transp., deformation of a 2500x2500 matrix (sec): 2.69466666666667
2400x2400 normal distributed random matrix ^1000____ (sec): 1.42433333333333 
Sorting of 7,000,000 random values__________________ (sec): 2.34466666666667
2800x2800 cross-product matrix (b = a' * a)_________ (sec): 33.187
Linear regr. over a 3000x3000 matrix (c = a \ b')___ (sec): 14.52
--------------------------------------------
Trimmed geom. mean (2 extremes eliminated): 4.51008013606039

II. Matrix functions
--------------------
FFT over 2,400,000 random values____________________ (sec): 1.203
Eigenvalues of a 640x640 random matrix______________ (sec): 1.60599999999999
Determinant of a 2500x2500 random matrix____________ (sec): 7.64266666666667
Cholesky decomposition of a 3000x3000 matrix________ (sec): 8.05900000000001
Inverse of a 1600x1600 random matrix________________ (sec): 8.64166666666667
--------------------------------------------
Trimmed geom. mean (2 extremes eliminated): 4.62477425061321

III. Programmation
------------------
3,500,000 Fibonacci numbers calculation (vector calc)(sec): 1.25633333333335
Creation of a 3000x3000 Hilbert matrix (matrix calc) (sec): 0.894999999999982
Grand common divisors of 400,000 pairs (recursion)__ (sec): 1.714
Creation of a 500x500 Toeplitz matrix (loops)_______ (sec): 1.4013333333333
Escoufier's method on a 45x45 matrix (mixed)________ (sec): 2.041
--------------------------------------------
Trimmed geom. mean (2 extremes eliminated): 1.44505946077978


Total time for all 15 tests_________________________ (sec): 88.6306666666667 
Overall mean (sum of I, II and III trimmed means/3)_ (sec): 3.11209972260597
--- End of test ---


Here are the results for the MKL version:

R Benchmark 2.5
===============
Number of times each test is run__________________________: 3

I. Matrix calculation
---------------------
Creation, transp., deformation of a 2500x2500 matrix (sec): 2.88466666666667
2400x2400 normal distributed random matrix ^1000____ (sec): 1.45933333333333 
Sorting of 7,000,000 random values__________________ (sec): 2.35166666666667
2800x2800 cross-product matrix (b = a' * a)_________ (sec): 3.37233333333333 
Linear regr. over a 3000x3000 matrix (c = a \ b')___ (sec): 1.68666666666666
--------------------------------------------
Trimmed geom. mean (2 extremes eliminated): 2.25337542617509

II. Matrix functions
--------------------
FFT over 2,400,000 random values____________________ (sec): 1.232
Eigenvalues of a 640x640 random matrix______________ (sec): 0.823333333333333
Determinant of a 2500x2500 random matrix____________ (sec): 1.752
Cholesky decomposition of a 3000x3000 matrix________ (sec): 1.417
Inverse of a 1600x1600 random matrix________________ (sec): 1.33833333333334
--------------------------------------------
Trimmed geom. mean (2 extremes eliminated): 1.32693082905282

III. Programmation
------------------
3,500,000 Fibonacci numbers calculation (vector calc)(sec): 1.28600000000001 
Creation of a 3000x3000 Hilbert matrix (matrix calc) (sec): 1.00833333333334 
Grand common divisors of 400,000 pairs (recursion)__ (sec): 1.82266666666666 
Creation of a 500x500 Toeplitz matrix (loops)_______ (sec): 1.40533333333334 
Escoufier's method on a 45x45 matrix (mixed)________ (sec): 1.91199999999998
--------------------------------------------
Trimmed geom. mean (2 extremes eliminated): 1.48790723568791


Total time for all 15 tests_________________________ (sec): 25.7516666666667 
Overall mean (sum of I, II and III trimmed means/3)_ (sec): 1.64469699141649
--- End of test ---
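
To put the "2x to 10x" estimate next to the numbers, the per-test speedup is 
simply the standard R time divided by the MKL time. The short sketch below does 
that arithmetic for the BLAS/LAPACK-heavy tests and for the overall total. The 
timings are copied from the two reports above; the dictionary layout and the 
speedup() helper are illustrative only.

```python
# (standard R 3.2.2 seconds, MKL seconds), copied from the two reports above.
timings = {
    "cross-product 2800x2800": (33.187, 3.37233333333333),
    "linear regression 3000x3000": (14.52, 1.68666666666666),
    "eigenvalues 640x640": (1.60599999999999, 0.823333333333333),
    "determinant 2500x2500": (7.64266666666667, 1.752),
    "Cholesky 3000x3000": (8.05900000000001, 1.417),
    "inverse 1600x1600": (8.64166666666667, 1.33833333333334),
    "total, all 15 tests": (88.6306666666667, 25.7516666666667),
}


def speedup(standard_sec, mkl_sec):
    """Ratio of standard-build time to MKL-build time (above 1 means MKL is faster)."""
    return standard_sec / mkl_sec


for name, (standard_sec, mkl_sec) in timings.items():
    print(f"{name:30s} {speedup(standard_sec, mkl_sec):5.1f}x")
```

Tests that are not listed here (sorting, FFT, and the "Programmation" group) 
show nearly identical times in both reports, which fits the point that only 
code calling into BLAS or LAPACK benefits.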



--
Paul E. Johnson
Professor, Political Science        Director
1541 Lilac Lane, Room 504      Center for Research Methods
University of Kansas                 University of Kansas
http://pj.freefaculty.org            http://crmda.ku.edu

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
