Thanks for the update. Just out of curiosity, why does Julia call a single-threaded LAPACK routine when in parallel processing mode? Is it impossible to have several processes calling multi-threaded LAPACK routines, or is it just less efficient than having several processes calling single-threaded LAPACK routines?
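
For anyone following along, here is a rough sketch of the single-process reproduction Andreas describes in the quoted message: force BLAS to one thread, then run the same factor / invert / factor sequence from my original post. It is written with current Julia 1.x names, which are my assumption of the equivalents and not what the thread used: `cholesky` for `cholfact`, `0.5 * I` for `diagm(.5 * ones(N))`, and `BLAS.set_num_threads` for `blas_set_num_threads`. The helper `build_X` and the seed are made up for illustration.

```julia
using LinearAlgebra
using Random

# Hypothetical helper: the matrix from the original post, a Gaussian
# kernel on random points plus 0.5 on the diagonal.
function build_X(N; seed = 1)
    Random.seed!(seed)
    x = 10 .* randn(N)
    X = [exp(-0.5 * (x[i] - x[j])^2) for i in 1:N, j in 1:N]
    return X + 0.5 * I
end

# Assumed modern equivalent of blas_set_num_threads(1).
BLAS.set_num_threads(1)

X = build_X(400)   # N = 400 was the smallest size reported to fail

# check = false returns the factorization object instead of throwing,
# so success can be inspected with issuccess.
C   = cholesky(Symmetric(X); check = false)
iC  = inv(C)       # inverse computed from the Cholesky factor
CiC = cholesky(Symmetric(iC); check = false)

println("first factorization ok:  ", issuccess(C))
println("second factorization ok: ", issuccess(CiC))
```

If the second line prints `false` only when the thread count is 1, that would point at the single-threaded code path and not at `addprocs` itself.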
-Thom

On Thursday, July 10, 2014 3:08:24 AM UTC-5, Andreas Noack wrote:
>
> I think the problem is in the single-threaded version of dpotri in
> OpenBLAS. When you add processes to Julia, OpenBLAS is called
> single-threaded and therefore you see the problem when using addprocs. I could
> reproduce the error by calling blas_set_num_threads(1). I have filed
> issues at the Julia and OpenBLAS github sites.
>
>
> 2014-07-10 3:51 GMT+02:00 Thomas Covert <[email protected]>:
>
>> Here are two additional pieces to the puzzle. tl;dr is that the
>> parallel version and serial version generate different cholesky factors,
>> and that conditional on those computed factors, a "serial" call to
>> cholfact(inv(C)) works fine on both computed factors, while a "parallel"
>> call doesn't work on either.
>>
>> 1) If I fix the random seed to be the same across runs, the non-parallel
>> version and the parallel version generate slightly different values of C.
>> The maximum absolute difference between them is on the order of 10e-15,
>> but almost all values in the upper left triangle are different from each
>> other.
>>
>> 2) Taking the above computations of C and calling CS the version computed
>> in the absence of addprocs() and CP the version computed with addprocs(), I
>> get another difference. If I have saved these matrices, open a fresh
>> instance of julia (no addprocs()), and read them in, both cholfact(inv(CS))
>> and cholfact(inv(CP)) work fine. If I do a fresh open, then addprocs(),
>> then read them in, NEITHER cholfact(inv(CS)) nor cholfact(inv(CP)) works,
>> and they both throw the same PosDefException number.
>>
>>
>> On Wednesday, July 9, 2014 5:50:07 PM UTC-5, Thomas Covert wrote:
>>>
>>> I have found cholfact to behave differently (erroneously?) under
>>> parallel processing contexts than under standard settings. What I mean by
>>> "parallel processing" is simply having previously called addprocs(). Here
>>> is some example code that I am running on my mid-2009 MacBook Pro using a
>>> somewhat recent brew of @staticfloat's homebrew distribution:
>>>
>>> addprocs(8)
>>> N = 1000
>>> x = 10 * randn(N)
>>> X = zeros(N,N)
>>>
>>> for i = 1:N
>>>     for j = 1:N
>>>         X[i,j] = exp(-.5 * (x[i]-x[j])^2)
>>>     end
>>> end
>>>
>>> X = X + diagm(.5 * ones(N))
>>>
>>> C = cholfact(X)
>>> iC = inv(C)
>>> CiC = cholfact(iC)
>>>
>>> I believe this code generates an X which is positive definite by
>>> construction.
>>>
>>> If I run this code as-is, I get the following error (or something
>>> similar, the PosDefException sometimes changes):
>>>
>>> ERROR: PosDefException(12)
>>>  in cholfact! at linalg/factorization.jl:36
>>>  in cholfact at linalg/factorization.jl:39
>>> while loading /Users/tcovert/path_to_code.jl, in expression starting on
>>> line 16
>>>
>>> However, if I comment out the "addprocs(8)" line, everything works fine.
>>> Also, for smaller values of N the problem goes away (N=100,200 is fine,
>>> N=400 is not). Here is my versioninfo() if that helps:
>>>
>>> julia> versioninfo()
>>> Julia Version 0.3.0-prerelease+3868
>>> Commit e7a9a7d* (2014-06-24 19:39 UTC)
>>> Platform Info:
>>>   System: Darwin (x86_64-apple-darwin13.2.0)
>>>   CPU: Intel(R) Core(TM)2 Duo CPU P8700 @ 2.53GHz
>>>   WORD_SIZE: 64
>>>   BLAS: libopenblas (USE64BITINT NO_AFFINITY)
>>>   LAPACK: libopenblas
>>>   LIBM: libopenlibm
>>>
>
>
> --
> Best regards
>
> Andreas Noack Jensen
>
