If you're affiliated with a university, Anaconda has free academic 
licenses that include MKL and their optimized builds.

Vlad

On Mon Feb 24 09:22:07 2014, Javier Martínez-López wrote:
> That is great, thanks! I do not have the mkl module (it isn't free,
> right?) but with your script the calculation is approx. 10 times
> faster than in R. Is there a way to increase performance using Cython,
> BLAS and LAPACK? Could you possibly show some examples of how to do
> it?
>
> Thank you very much again and cheers,
>
> Javier
>
> On Sat, Feb 22, 2014 at 12:20 AM, Sturla Molden <sturla.mol...@gmail.com> 
> wrote:
>> On 21/02/14 14:07, Javier Martínez-López wrote:
>>> Hello,
>>>
>>> I am quite new to python, so I am sorry if I am asking something too
>>> simple: I am trying to speed up a raster processing with python and I
>>> want to use the "sklearn.metrics.pairwise.pairwise_distances" module
>>> with the mahalanobis distance metric and the " n_jobs=-1" in order to
>>> speed up the process using all processors. However, when I run it, it
>>> only uses one processor... while if I use the euclidean distance all
>>> processors are used... where is the problem? am I doing something
>>> wrong?
>>
>>
>> On 21/02/14 14:07, Javier Martínez-López wrote:> Hello,
>>   >
>>   > I am quite new to python, so I am sorry if I am asking something too
>>   > simple: I am trying to speed up a raster processing with python and I
>>   > want to use the "sklearn.metrics.pairwise.pairwise_distances" module
>>   > with the mahalanobis distance metric and the " n_jobs=-1" in order to
>>   > speed up the process using all processors. However, when I run it, it
>>   > only uses one processor... while if I use the euclidean distance all
>>   > processors are used... where is the problem? am I doing something
>>   > wrong?
>>   >
>>   > Thank you and cheers,
>>   >
>>   > Javier
>>
>> I does not seem it is vectorized up with joblib. It is just deferred to
>> scipy.spatial.distances.mahalanobis.
>>
>> Assuming X is ndata x nparam array, m is the mean vector and S is the
>> covariance matrix, you can compute mahalanobis distances in parallel
>> like so:
>>
>> import numpy as np
>> import scipy
>> from scipy.linalg import cholesky, solve_triangular
>> from sklearn.externals.joblib import Parallel, delayed
>> from multiprocessing import cpu_count
>>
>> def _schedule(n, nproc):
>>       """ guided scheduler """
>>       start = 0
>>       size = (n - start) // nproc
>>       while size > 100:
>>           yield slice(start,start+size)
>>           start += size
>>           size = (n - start) // nproc
>>       yield slice(start,n+1)
>>       return
>>
>> def _mahalanobis_distances(m, L, X):
>>       cX = X - m[np.newaxis,:]
>>       tmp = solve_triangular(L, cX.T, lower=True).T
>>       tmp **= 2
>>       return np.sqrt(tmp.sum(axis=1))
>>
>> def mahalanobis_distances(m, S, X, parallel=True):
>>       L = cholesky(S, lower=True)
>>       n = X.shape[0]
>>       if parallel:
>>           nproc = cpu_count()
>>           res = (Parallel(n_jobs=-1)
>>                  (delayed(_mahalanobis_distances)
>>                    (m, L, X[s,:])
>>                      for s in _schedule(n,nproc)))
>>           return np.hstack(res)
>>       else:
>>           return _mahalanobis_distances(m, L, X)
>>
>>
>> # scipy.spatial.distance.mahalanobis for comparison
>>
>> from scipy.spatial import distance
>>
>> def _mahalanobis_distances_scipy(m, SI, X):
>>       n = X.shape[0]
>>       mahal = np.zeros(n)
>>       for i in xrange(X.shape[0]):
>>           x = X[i,:]
>>           mahal[i] = distance.mahalanobis(x,m,SI)
>>       return mahal
>>
>> def mahalanobis_distances_scipy(m, S, X, parallel=True):
>>       SI = np.linalg.inv(S)
>>       n = X.shape[0]
>>       if parallel:
>>           nproc = cpu_count()
>>           res = (Parallel(n_jobs=-1)
>>                  (delayed(_mahalanobis_distances_scipy)
>>                   (m, SI, X[s,:])
>>                     for s in _schedule(n,nproc)))
>>           return np.hstack(res)
>>       else:
>>           return _mahalanobis_distances_scipy(m, SI, X)
>>
>>
>>
>> if __name__ == "__main__":
>>
>>       ## verify accuracy
>>
>>       X = np.random.randn(300000*16).reshape((300000,16))
>>       m = np.mean(X, axis=0)
>>       S = np.cov(X, rowvar=False)
>>
>>       mahal_scipy = mahalanobis_distances_scipy(m, S, X)
>>       mahal_cholesky = mahalanobis_distances(m, S, X)
>>
>>       print("Similar result to scipy.spatial.distance.mahalanobis: %s" %
>>          ('true' if np.allclose(mahal_scipy - mahal_cholesky,0) else
>> 'false',))
>>
>>       mahal = (mahalanobis_distances(m, S, X, parallel=True)
>>          - mahalanobis_distances(m, S, X, parallel=False)
>>          + mahalanobis_distances_scipy(m, S, X, parallel=True)
>>          - mahalanobis_distances_scipy(m, S, X, parallel=False))
>>
>>       print("Similar results with and without parallel execution: %s" %
>>          ('true' if np.allclose(mahal,0) else 'false',))
>>
>>       # timing
>>       from time import clock
>>
>>       print("Example timings:")
>>
>>       t0 = clock()
>>       mahalanobis_distances(m, S, X, parallel=True)
>>       t1 = clock()
>>       print("cholesky, parallel: %f ms" % (1000*(t1-t0),))
>>
>>       t0 = clock()
>>       mahalanobis_distances(m, S, X, parallel=False)
>>       t1 = clock()
>>       print("cholesky, single process: %f ms" % (1000*(t1-t0),))
>>
>>       t0 = clock()
>>       mahalanobis_distances_scipy(m, S, X, parallel=True)
>>       t1 = clock()
>>       print("scipy, parallel: %f ms" % (1000*(t1-t0),))
>>
>>       t0 = clock()
>>       mahalanobis_distances_scipy(m, S, X, parallel=False)
>>       t1 = clock()
>>       print("scipy, single process: %f ms" % (1000*(t1-t0),))
>>
>>
>> This gives me:
>>
>> $ python mahal.py
>> Similar result to scipy.spatial.distance.mahalanobis: true
>> Similar results with and without parallel execution: true
>> Example timings:
>> cholesky, parallel: 122.138000 ms
>> cholesky, single process: 159.878000 ms
>> scipy, parallel: 308.829000 ms
>> scipy, single process: 6744.384000 ms
>>
>>
>> Actually, there is a strange interaction with MKL threading here:
>>
>>       try:
>>          import mkl
>>          mkl.set_num_threads(1)
>>       except ImportError:
>>          pass
>>
>> Now I get these timings:
>>
>> $ python mahal.py
>> Similar result to scipy.spatial.distance.mahalanobis: true
>> Similar results with and without parallel execution: true
>> Example timings:
>> cholesky, parallel: 120.784000 ms
>> cholesky, single process: 84.593000 ms
>> scipy, parallel: 129.127000 ms
>> scipy, single process: 6749.786000 ms
>>
>>
>> The version that uses a single MKL thread and a single process is
>> actually the fastest.
>>
>> I could probably get even better results using Cython, BLAS and LAPACK,
>> but this can suffice for now.
>>
>>
>> Sturla
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Managing the Performance of Cloud-Based Applications
>> Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
>> Read the Whitepaper.
>> http://pubads.g.doubleclick.net/gampad/clk?id=121054471&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
> ------------------------------------------------------------------------------
> Flow-based real-time traffic analytics software. Cisco certified tool.
> Monitor traffic, SLAs, QoS, Medianet, WAAS etc. with NetFlow Analyzer
> Customize your own dashboards, set traffic alerts and generate reports.
> Network behavioral analysis & security monitoring. All-in-one tool.
> http://pubads.g.doubleclick.net/gampad/clk?id=126839071&iu=/4140/ostg.clktrk
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

------------------------------------------------------------------------------
Flow-based real-time traffic analytics software. Cisco certified tool.
Monitor traffic, SLAs, QoS, Medianet, WAAS etc. with NetFlow Analyzer
Customize your own dashboards, set traffic alerts and generate reports.
Network behavioral analysis & security monitoring. All-in-one tool.
http://pubads.g.doubleclick.net/gampad/clk?id=126839071&iu=/4140/ostg.clktrk
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

Reply via email to