Thanks for the update.  Just out of curiosity, why does Julia call a 
single-threaded LAPACK routine when in parallel processing mode?  Is it the 
case that it's impossible to have several processes calling multi-threaded 
LAPACK routines, or just that it's less efficient than having several 
processes calling single-threaded LAPACK routines?

-Thom

On Thursday, July 10, 2014 3:08:24 AM UTC-5, Andreas Noack wrote:
>
> I think the problem is in the single-threaded version of dpotri in 
> OpenBLAS. When you add processes to Julia, OpenBLAS is called 
> single-threaded, and therefore you see the problem when using addprocs. I 
> could reproduce the error by calling blas_set_num_threads(1). I have filed 
> issues at the Julia and OpenBLAS GitHub sites.
>
>
> 2014-07-10 3:51 GMT+02:00 Thomas Covert:
>
>> Here are two additional pieces to the puzzle.  tl;dr is that the 
>> parallel version and serial version generate different Cholesky factors, 
>> and that conditional on those computed factors, a "serial" call to 
>> cholfact(inv(C)) works fine on both computed factors, while a "parallel" 
>> call doesn't work on either. 
>>
>> 1) If I fix the random seed to be the same across runs, the non-parallel 
>> version and the parallel version generate slightly different values of C. 
>>  The maximum absolute difference between them is on the order of 10e-15, 
>> but almost all values in the upper left triangle are different from each 
>> other.
>>
>> 2) Taking the above computations of C and calling CS the version computed 
>> in the absence of addprocs() and CP the version computed with addprocs(), I 
>> get another difference.  If I have saved these matrices, open a fresh 
>> instance of julia (no addprocs()), and read them in, both cholfact(inv(CS)) 
>> and cholfact(inv(CP)) work fine.  If I do a fresh open, then addprocs(), 
>> then read them in, NEITHER cholfact(inv(CS)) nor cholfact(inv(CP)) works, 
>> and they both throw the same PosDefException number.
>>
>>
>> On Wednesday, July 9, 2014 5:50:07 PM UTC-5, Thomas Covert wrote:
>>>
>>> I have found cholfact to behave differently (erroneously?) under 
>>> parallel processing contexts than under standard settings.  What I mean by 
>>> "parallel processing" is simply having previously called addprocs().  Here 
>>> is some example code that I am running on my mid-2009 MacBook Pro using a 
>>> somewhat recent brew of @staticfloat's homebrew distribution:
>>>
>>> addprocs(8)
>>>
>>> N = 1000
>>> x = 10 * randn(N)
>>> X = zeros(N,N)
>>>
>>> for i = 1:N
>>>     for j = 1:N
>>>         X[i,j] = exp(-.5 * (x[i]-x[j])^2)
>>>     end
>>> end
>>>
>>> X = X + diagm(.5 * ones(N))
>>>
>>> C = cholfact(X)
>>> iC = inv(C)
>>> CiC = cholfact(iC)
>>>
>>> I believe this code generates an X which is positive definite by 
>>> construction.
>>>
>>> If I run this code as-is, I get the following error (or something 
>>> similar; the PosDefException number sometimes changes):
>>>
>>> ERROR: PosDefException(12)
>>>  in cholfact! at linalg/factorization.jl:36
>>>  in cholfact at linalg/factorization.jl:39
>>> while loading /Users/tcovert/path_to_code.jl, in expression starting on line 16
>>>  
>>> However, if I comment out the "addprocs(8)" line, everything works fine. 
>>>  Also, for smaller values of N the problem goes away (N=100,200 is fine, 
>>> N=400 is not).  Here is my versioninfo() if that helps:
>>>
>>> julia> versioninfo()
>>> Julia Version 0.3.0-prerelease+3868
>>> Commit e7a9a7d* (2014-06-24 19:39 UTC)
>>> Platform Info:
>>>   System: Darwin (x86_64-apple-darwin13.2.0)
>>>   CPU: Intel(R) Core(TM)2 Duo CPU     P8700  @ 2.53GHz
>>>   WORD_SIZE: 64
>>>   BLAS: libopenblas (USE64BITINT NO_AFFINITY)
>>>   LAPACK: libopenblas
>>>   LIBM: libopenlibm
>>>
>>>
>
>
> -- 
> Med venlig hilsen
>
> Andreas Noack Jensen
>  
