On Fri, Oct 03, 2008 at 09:11:58PM +0000, Pauli Virtanen wrote:
> Fri, 03 Oct 2008 18:59:02 +0200, Gael Varoquaux wrote:
> > I am doing a calculation where one call to numpy.dot ends up taking
> > 90% of the time (the array is huge: (61373, 500)). Any chance I can
> > make this faster? I would believe BLAS/ATLAS would be behind this,
> > but from my quick analysis (ldd on numpy/core/multiarray.so) it
> > doesn't seem so. Have I done something stupid when building numpy
> > (disclaimer: I am on a system I don't know well --Mandriva--, so I
> > could very well have done something stupid)?

> AFAIK, multiarray.so is never linked against ATLAS. The accelerated
> dot implementation is in _dotblas.so, and can be toggled with
> alterdot/restoredot (but the ATLAS one should be active by default).

> >>> numpy.dot.__module__
> 'numpy.core._dotblas'

OK, thanks, that's useful info. I am not at work right now, and I can't
log in to the boxes at work, but I am pretty sure that
'numpy.dot.__module__' returned 'numpy.core.multiarray'; that's why I
tried an ldd on it.

On an Ubuntu box at home, using the off-the-shelf numpy package, I get
the same results, although the numpy package has an optional dependency
on atlas. I have 1.0.3 on the Ubuntu box and something more recent on
the work box -- not sure what, but I seem to remember I grabbed trunk.
Playing with restoredot/alterdot didn't change anything.

> Are your arrays appropriately contiguous? Numpy needs to copy the data
> if they are not; though I'm not sure if this could account for what
> you see.

I can't check that from home -- see the P.S. below for the quick check
I plan to run once I am back at the lab.

On Fri, Oct 03, 2008 at 11:14:08AM -0600, Charles R Harris wrote:
> What does np.__config__.show() show?

That's on my home box (where things are also quite slow), but it shows
nothing good:

In [4]: numpy.__config__.show()
blas_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib']
    language = f77
lapack_info:
    libraries = ['lapack']
    library_dirs = ['/usr/lib']
    language = f77
atlas_threads_info:
  NOT AVAILABLE
blas_opt_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_blas_threads_info:
  NOT AVAILABLE
lapack_opt_info:
    libraries = ['lapack', 'blas']
    library_dirs = ['/usr/lib']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
atlas_blas_info:
  NOT AVAILABLE
mkl_info:
  NOT AVAILABLE

This seems to tell me that numpy has been built without atlas. Hum,
maybe we need to work with the Debian guys to make sure that numpy is
available with atlas.

My home box is a 4-year-old AMD64 (single core) and it is slightly
quicker than the brand new 8-core, super-cool box we have at the lab
(Intel CPUs). I am quite puzzled. This is not the first time I see
this. Could it be because I am running a 64-bit distro? I haven't
checked if PAE is enabled on the box at the lab, but I definitely know
it is not 64 bit.

> What exactly are you multiplying?

Right now arrays from numpy.random.random, to do my tests, but my
original problem (very noisy matrices coming from neuroimaging data)
showed the same behavior.

> What is the original problem?

I am building a correlation matrix as a first step for a PCA. I have a
matrix M made of m=61373 rows giving time series of length n=500. I am
calculating X = np.dot(M.T, M) and doing an SVD on X.
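In rough outline, the relevant step looks something like this (a
simplified sketch using random data, as in my timing tests; the real M
comes out of the neuroimaging pipeline, the variable names are mine,
and I am assuming np.linalg.svd for the SVD, though the actual code may
use something else):

    import numpy as np

    m, n = 61373, 500
    M = np.random.random((m, n))   # stand-in for the real time-series matrix

    X = np.dot(M.T, M)             # (500, 500) correlation-like matrix; this is the slow call
    U, s, Vt = np.linalg.svd(X)    # decomposition of X for the PCA

Back-of-the-envelope, that dot is on the order of 61373*500*500, i.e.
roughly 1.5e10 multiply-adds, which is why I would hope an optimized
BLAS makes a real difference here.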
This is not code I have written, and I am slowly warming up to a new
field, so I might be missing some obvious points, or talking nonsense.

Cheers,

Gaël
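P.S. Pauli, regarding the contiguity question: I obviously cannot check
the real arrays from home, but I suppose something like the quick
sketch below should tell me what I need to know once I am back at the
lab (M here is just a stand-in for the real data array):

    import numpy as np

    M = np.random.random((61373, 500))    # stand-in for the real data array

    print np.dot.__module__               # 'numpy.core._dotblas' when the BLAS dot is in use
    print M.flags['C_CONTIGUOUS']         # True for a freshly created array
    print M.T.flags['C_CONTIGUOUS']       # False: the transpose is only a strided view
    # if an operand is not contiguous, np.dot may have to copy it before calling BLAS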