Thank you very much! Already working! However, contrasting with the results obtained by Sturla, I get the fastest result with cholesky, parallel method:
bash-4.1$ python mahalk.py Similar result to scipy.spatial.distance.mahalanobis: true Similar results with and without parallel execution: true Example timings: cholesky, parallel: 3450.000000 ms cholesky, single process: 4910.000000 ms scipy, parallel: 17040.000000 ms scipy, single process: 36950.000000 ms Thank you both again and cheers, Javier On Mon, Feb 24, 2014 at 9:38 AM, Vlad Niculae <zephy...@gmail.com> wrote: > If you're affiliated with a university, Anaconda has free academic > licenses that include MKL and their optimized builds. > > Vlad > > On Mon Feb 24 09:22:07 2014, Javier Martínez-López wrote: >> That is great, thanks! I do not have the mkl module (it isn't free, >> right?) but with your script the calculation is approx. 10 times >> faster than in R. Is there a way to increase performance using Cython, >> BLAS and LAPACK? Could you possibly show some examples of how to do >> it? >> >> Thank you very much again and cheers, >> >> Javier >> >> On Sat, Feb 22, 2014 at 12:20 AM, Sturla Molden <sturla.mol...@gmail.com> >> wrote: >>> On 21/02/14 14:07, Javier Martínez-López wrote: >>>> Hello, >>>> >>>> I am quite new to python, so I am sorry if I am asking something too >>>> simple: I am trying to speed up a raster processing with python and I >>>> want to use the "sklearn.metrics.pairwise.pairwise_distances" module >>>> with the mahalanobis distance metric and the " n_jobs=-1" in order to >>>> speed up the process using all processors. However, when I run it, it >>>> only uses one processor... while if I use the euclidean distance all >>>> processors are used... where is the problem? am I doing something >>>> wrong? >>> >>> >>> On 21/02/14 14:07, Javier Martínez-López wrote:> Hello, >>> > >>> > I am quite new to python, so I am sorry if I am asking something too >>> > simple: I am trying to speed up a raster processing with python and I >>> > want to use the "sklearn.metrics.pairwise.pairwise_distances" module >>> > with the mahalanobis distance metric and the " n_jobs=-1" in order to >>> > speed up the process using all processors. However, when I run it, it >>> > only uses one processor... while if I use the euclidean distance all >>> > processors are used... where is the problem? am I doing something >>> > wrong? >>> > >>> > Thank you and cheers, >>> > >>> > Javier >>> >>> I does not seem it is vectorized up with joblib. It is just deferred to >>> scipy.spatial.distances.mahalanobis. >>> >>> Assuming X is ndata x nparam array, m is the mean vector and S is the >>> covariance matrix, you can compute mahalanobis distances in parallel >>> like so: >>> >>> import numpy as np >>> import scipy >>> from scipy.linalg import cholesky, solve_triangular >>> from sklearn.externals.joblib import Parallel, delayed >>> from multiprocessing import cpu_count >>> >>> def _schedule(n, nproc): >>> """ guided scheduler """ >>> start = 0 >>> size = (n - start) // nproc >>> while size > 100: >>> yield slice(start,start+size) >>> start += size >>> size = (n - start) // nproc >>> yield slice(start,n+1) >>> return >>> >>> def _mahalanobis_distances(m, L, X): >>> cX = X - m[np.newaxis,:] >>> tmp = solve_triangular(L, cX.T, lower=True).T >>> tmp **= 2 >>> return np.sqrt(tmp.sum(axis=1)) >>> >>> def mahalanobis_distances(m, S, X, parallel=True): >>> L = cholesky(S, lower=True) >>> n = X.shape[0] >>> if parallel: >>> nproc = cpu_count() >>> res = (Parallel(n_jobs=-1) >>> (delayed(_mahalanobis_distances) >>> (m, L, X[s,:]) >>> for s in _schedule(n,nproc))) >>> return np.hstack(res) >>> else: >>> return _mahalanobis_distances(m, L, X) >>> >>> >>> # scipy.spatial.distance.mahalanobis for comparison >>> >>> from scipy.spatial import distance >>> >>> def _mahalanobis_distances_scipy(m, SI, X): >>> n = X.shape[0] >>> mahal = np.zeros(n) >>> for i in xrange(X.shape[0]): >>> x = X[i,:] >>> mahal[i] = distance.mahalanobis(x,m,SI) >>> return mahal >>> >>> def mahalanobis_distances_scipy(m, S, X, parallel=True): >>> SI = np.linalg.inv(S) >>> n = X.shape[0] >>> if parallel: >>> nproc = cpu_count() >>> res = (Parallel(n_jobs=-1) >>> (delayed(_mahalanobis_distances_scipy) >>> (m, SI, X[s,:]) >>> for s in _schedule(n,nproc))) >>> return np.hstack(res) >>> else: >>> return _mahalanobis_distances_scipy(m, SI, X) >>> >>> >>> >>> if __name__ == "__main__": >>> >>> ## verify accuracy >>> >>> X = np.random.randn(300000*16).reshape((300000,16)) >>> m = np.mean(X, axis=0) >>> S = np.cov(X, rowvar=False) >>> >>> mahal_scipy = mahalanobis_distances_scipy(m, S, X) >>> mahal_cholesky = mahalanobis_distances(m, S, X) >>> >>> print("Similar result to scipy.spatial.distance.mahalanobis: %s" % >>> ('true' if np.allclose(mahal_scipy - mahal_cholesky,0) else >>> 'false',)) >>> >>> mahal = (mahalanobis_distances(m, S, X, parallel=True) >>> - mahalanobis_distances(m, S, X, parallel=False) >>> + mahalanobis_distances_scipy(m, S, X, parallel=True) >>> - mahalanobis_distances_scipy(m, S, X, parallel=False)) >>> >>> print("Similar results with and without parallel execution: %s" % >>> ('true' if np.allclose(mahal,0) else 'false',)) >>> >>> # timing >>> from time import clock >>> >>> print("Example timings:") >>> >>> t0 = clock() >>> mahalanobis_distances(m, S, X, parallel=True) >>> t1 = clock() >>> print("cholesky, parallel: %f ms" % (1000*(t1-t0),)) >>> >>> t0 = clock() >>> mahalanobis_distances(m, S, X, parallel=False) >>> t1 = clock() >>> print("cholesky, single process: %f ms" % (1000*(t1-t0),)) >>> >>> t0 = clock() >>> mahalanobis_distances_scipy(m, S, X, parallel=True) >>> t1 = clock() >>> print("scipy, parallel: %f ms" % (1000*(t1-t0),)) >>> >>> t0 = clock() >>> mahalanobis_distances_scipy(m, S, X, parallel=False) >>> t1 = clock() >>> print("scipy, single process: %f ms" % (1000*(t1-t0),)) >>> >>> >>> This gives me: >>> >>> $ python mahal.py >>> Similar result to scipy.spatial.distance.mahalanobis: true >>> Similar results with and without parallel execution: true >>> Example timings: >>> cholesky, parallel: 122.138000 ms >>> cholesky, single process: 159.878000 ms >>> scipy, parallel: 308.829000 ms >>> scipy, single process: 6744.384000 ms >>> >>> >>> Actually, there is a strange interaction with MKL threading here: >>> >>> try: >>> import mkl >>> mkl.set_num_threads(1) >>> except ImportError: >>> pass >>> >>> Now I get these timings: >>> >>> $ python mahal.py >>> Similar result to scipy.spatial.distance.mahalanobis: true >>> Similar results with and without parallel execution: true >>> Example timings: >>> cholesky, parallel: 120.784000 ms >>> cholesky, single process: 84.593000 ms >>> scipy, parallel: 129.127000 ms >>> scipy, single process: 6749.786000 ms >>> >>> >>> The version that uses a single MKL thread and a single process is >>> actually the fastest. >>> >>> I could probably get even better results using Cython, BLAS and LAPACK, >>> but this can suffice for now. >>> >>> >>> Sturla >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> ------------------------------------------------------------------------------ >>> Managing the Performance of Cloud-Based Applications >>> Take advantage of what the Cloud has to offer - Avoid Common Pitfalls. >>> Read the Whitepaper. >>> http://pubads.g.doubleclick.net/gampad/clk?id=121054471&iu=/4140/ostg.clktrk >>> _______________________________________________ >>> Scikit-learn-general mailing list >>> Scikit-learn-general@lists.sourceforge.net >>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general >> >> ------------------------------------------------------------------------------ >> Flow-based real-time traffic analytics software. Cisco certified tool. >> Monitor traffic, SLAs, QoS, Medianet, WAAS etc. with NetFlow Analyzer >> Customize your own dashboards, set traffic alerts and generate reports. >> Network behavioral analysis & security monitoring. All-in-one tool. >> http://pubads.g.doubleclick.net/gampad/clk?id=126839071&iu=/4140/ostg.clktrk >> _______________________________________________ >> Scikit-learn-general mailing list >> Scikit-learn-general@lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general > > ------------------------------------------------------------------------------ > Flow-based real-time traffic analytics software. Cisco certified tool. > Monitor traffic, SLAs, QoS, Medianet, WAAS etc. with NetFlow Analyzer > Customize your own dashboards, set traffic alerts and generate reports. > Network behavioral analysis & security monitoring. All-in-one tool. > http://pubads.g.doubleclick.net/gampad/clk?id=126839071&iu=/4140/ostg.clktrk > _______________________________________________ > Scikit-learn-general mailing list > Scikit-learn-general@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general ------------------------------------------------------------------------------ Flow-based real-time traffic analytics software. Cisco certified tool. Monitor traffic, SLAs, QoS, Medianet, WAAS etc. with NetFlow Analyzer Customize your own dashboards, set traffic alerts and generate reports. Network behavioral analysis & security monitoring. All-in-one tool. http://pubads.g.doubleclick.net/gampad/clk?id=126839071&iu=/4140/ostg.clktrk _______________________________________________ Scikit-learn-general mailing list Scikit-learn-general@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/scikit-learn-general