If you're affiliated with a university, Anaconda has free academic licenses that include MKL and their optimized builds.
Vlad On Mon Feb 24 09:22:07 2014, Javier Martínez-López wrote: > That is great, thanks! I do not have the mkl module (it isn't free, > right?) but with your script the calculation is approx. 10 times > faster than in R. Is there a way to increase performance using Cython, > BLAS and LAPACK? Could you possibly show some examples of how to do > it? > > Thank you very much again and cheers, > > Javier > > On Sat, Feb 22, 2014 at 12:20 AM, Sturla Molden <sturla.mol...@gmail.com> > wrote: >> On 21/02/14 14:07, Javier Martínez-López wrote: >>> Hello, >>> >>> I am quite new to python, so I am sorry if I am asking something too >>> simple: I am trying to speed up a raster processing with python and I >>> want to use the "sklearn.metrics.pairwise.pairwise_distances" module >>> with the mahalanobis distance metric and the " n_jobs=-1" in order to >>> speed up the process using all processors. However, when I run it, it >>> only uses one processor... while if I use the euclidean distance all >>> processors are used... where is the problem? am I doing something >>> wrong? >> >> >> On 21/02/14 14:07, Javier Martínez-López wrote:> Hello, >> > >> > I am quite new to python, so I am sorry if I am asking something too >> > simple: I am trying to speed up a raster processing with python and I >> > want to use the "sklearn.metrics.pairwise.pairwise_distances" module >> > with the mahalanobis distance metric and the " n_jobs=-1" in order to >> > speed up the process using all processors. However, when I run it, it >> > only uses one processor... while if I use the euclidean distance all >> > processors are used... where is the problem? am I doing something >> > wrong? >> > >> > Thank you and cheers, >> > >> > Javier >> >> I does not seem it is vectorized up with joblib. It is just deferred to >> scipy.spatial.distances.mahalanobis. >> >> Assuming X is ndata x nparam array, m is the mean vector and S is the >> covariance matrix, you can compute mahalanobis distances in parallel >> like so: >> >> import numpy as np >> import scipy >> from scipy.linalg import cholesky, solve_triangular >> from sklearn.externals.joblib import Parallel, delayed >> from multiprocessing import cpu_count >> >> def _schedule(n, nproc): >> """ guided scheduler """ >> start = 0 >> size = (n - start) // nproc >> while size > 100: >> yield slice(start,start+size) >> start += size >> size = (n - start) // nproc >> yield slice(start,n+1) >> return >> >> def _mahalanobis_distances(m, L, X): >> cX = X - m[np.newaxis,:] >> tmp = solve_triangular(L, cX.T, lower=True).T >> tmp **= 2 >> return np.sqrt(tmp.sum(axis=1)) >> >> def mahalanobis_distances(m, S, X, parallel=True): >> L = cholesky(S, lower=True) >> n = X.shape[0] >> if parallel: >> nproc = cpu_count() >> res = (Parallel(n_jobs=-1) >> (delayed(_mahalanobis_distances) >> (m, L, X[s,:]) >> for s in _schedule(n,nproc))) >> return np.hstack(res) >> else: >> return _mahalanobis_distances(m, L, X) >> >> >> # scipy.spatial.distance.mahalanobis for comparison >> >> from scipy.spatial import distance >> >> def _mahalanobis_distances_scipy(m, SI, X): >> n = X.shape[0] >> mahal = np.zeros(n) >> for i in xrange(X.shape[0]): >> x = X[i,:] >> mahal[i] = distance.mahalanobis(x,m,SI) >> return mahal >> >> def mahalanobis_distances_scipy(m, S, X, parallel=True): >> SI = np.linalg.inv(S) >> n = X.shape[0] >> if parallel: >> nproc = cpu_count() >> res = (Parallel(n_jobs=-1) >> (delayed(_mahalanobis_distances_scipy) >> (m, SI, X[s,:]) >> for s in _schedule(n,nproc))) >> return np.hstack(res) >> else: >> return _mahalanobis_distances_scipy(m, SI, X) >> >> >> >> if __name__ == "__main__": >> >> ## verify accuracy >> >> X = np.random.randn(300000*16).reshape((300000,16)) >> m = np.mean(X, axis=0) >> S = np.cov(X, rowvar=False) >> >> mahal_scipy = mahalanobis_distances_scipy(m, S, X) >> mahal_cholesky = mahalanobis_distances(m, S, X) >> >> print("Similar result to scipy.spatial.distance.mahalanobis: %s" % >> ('true' if np.allclose(mahal_scipy - mahal_cholesky,0) else >> 'false',)) >> >> mahal = (mahalanobis_distances(m, S, X, parallel=True) >> - mahalanobis_distances(m, S, X, parallel=False) >> + mahalanobis_distances_scipy(m, S, X, parallel=True) >> - mahalanobis_distances_scipy(m, S, X, parallel=False)) >> >> print("Similar results with and without parallel execution: %s" % >> ('true' if np.allclose(mahal,0) else 'false',)) >> >> # timing >> from time import clock >> >> print("Example timings:") >> >> t0 = clock() >> mahalanobis_distances(m, S, X, parallel=True) >> t1 = clock() >> print("cholesky, parallel: %f ms" % (1000*(t1-t0),)) >> >> t0 = clock() >> mahalanobis_distances(m, S, X, parallel=False) >> t1 = clock() >> print("cholesky, single process: %f ms" % (1000*(t1-t0),)) >> >> t0 = clock() >> mahalanobis_distances_scipy(m, S, X, parallel=True) >> t1 = clock() >> print("scipy, parallel: %f ms" % (1000*(t1-t0),)) >> >> t0 = clock() >> mahalanobis_distances_scipy(m, S, X, parallel=False) >> t1 = clock() >> print("scipy, single process: %f ms" % (1000*(t1-t0),)) >> >> >> This gives me: >> >> $ python mahal.py >> Similar result to scipy.spatial.distance.mahalanobis: true >> Similar results with and without parallel execution: true >> Example timings: >> cholesky, parallel: 122.138000 ms >> cholesky, single process: 159.878000 ms >> scipy, parallel: 308.829000 ms >> scipy, single process: 6744.384000 ms >> >> >> Actually, there is a strange interaction with MKL threading here: >> >> try: >> import mkl >> mkl.set_num_threads(1) >> except ImportError: >> pass >> >> Now I get these timings: >> >> $ python mahal.py >> Similar result to scipy.spatial.distance.mahalanobis: true >> Similar results with and without parallel execution: true >> Example timings: >> cholesky, parallel: 120.784000 ms >> cholesky, single process: 84.593000 ms >> scipy, parallel: 129.127000 ms >> scipy, single process: 6749.786000 ms >> >> >> The version that uses a single MKL thread and a single process is >> actually the fastest. >> >> I could probably get even better results using Cython, BLAS and LAPACK, >> but this can suffice for now. >> >> >> Sturla >> >> >> >> >> >> >> >> >> >> >> >> >> ------------------------------------------------------------------------------ >> Managing the Performance of Cloud-Based Applications >> Take advantage of what the Cloud has to offer - Avoid Common Pitfalls. >> Read the Whitepaper. >> http://pubads.g.doubleclick.net/gampad/clk?id=121054471&iu=/4140/ostg.clktrk >> _______________________________________________ >> Scikit-learn-general mailing list >> Scikit-learn-general@lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general > > ------------------------------------------------------------------------------ > Flow-based real-time traffic analytics software. Cisco certified tool. > Monitor traffic, SLAs, QoS, Medianet, WAAS etc. with NetFlow Analyzer > Customize your own dashboards, set traffic alerts and generate reports. > Network behavioral analysis & security monitoring. All-in-one tool. > http://pubads.g.doubleclick.net/gampad/clk?id=126839071&iu=/4140/ostg.clktrk > _______________________________________________ > Scikit-learn-general mailing list > Scikit-learn-general@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general ------------------------------------------------------------------------------ Flow-based real-time traffic analytics software. Cisco certified tool. Monitor traffic, SLAs, QoS, Medianet, WAAS etc. with NetFlow Analyzer Customize your own dashboards, set traffic alerts and generate reports. Network behavioral analysis & security monitoring. All-in-one tool. http://pubads.g.doubleclick.net/gampad/clk?id=126839071&iu=/4140/ostg.clktrk _______________________________________________ Scikit-learn-general mailing list Scikit-learn-general@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/scikit-learn-general