On Wednesday, 30 March 2016 at 15:16 -0700, Johannes Wagner wrote:
> 
> 
> > On Wednesday, 30 March 2016 at 04:43 -0700, Johannes Wagner wrote:
> > > Sorry for not having expressed myself clearly. I meant that the latest
> > > version of Fedora (24 development) works fine. I always used the
> > > latest Julia nightly available from the nalimilan Copr repo. Right now
> > > that is 0.5.0-dev+3292, commit 9d527c5*, and all machines use
> > > LLVM: libLLVM-3.7.1 (ORCJIT, haswell)
> > >
> > > peakflops on all machines (hardware identical) is ~1.2..1.5e11.
> > >
> > > On Fedora 22 and 23, Julia 0.5 is ~50% slower than 0.4; only on Fedora
> > > 24 is Julia 0.5 faster than Julia 0.4.
> > Could you try to find a simple piece of code to reproduce the problem? In
> > particular, it would be useful to check whether this comes from 
> > OpenBLAS differences or whether it also happens with pure Julia code 
> > (typical operations which depend on BLAS are matrix multiplication, as 
> > well as most of linear algebra). Normally, 0.4 and 0.5 should use the 
> > same BLAS, but who knows... 
> Well, that's what I did, and the 3 simple calls inside the loop run at
> more or less the same speed. Only the whole loop seems slower. See my
> code sample from my answer of March 8th (the code gets faster by the same
> proportion when exp(im .* dotprods) is replaced by cis(dotprods)).
> So I don't know what else I can do...
Sorry, somehow I had missed that message. This indeed looks like a code
generation issue in Julia/LLVM.

> > Can you also confirm that all versioninfo() fields are the same for all 
> > three machines, both for 0.4 and 0.5? We must envision the possibility 
> > that the differences actually come from 0.4. 
> Oh, right! I just noticed that my Fedora 24 machine is an Ivy Bridge,
> which runs fast:
> 
> Julia Version 0.5.0-dev+3292
> Commit 9d527c5* (2016-03-28 06:55 UTC)
> Platform Info:
>   System: Linux (x86_64-redhat-linux)
>   CPU: Intel(R) Core(TM) i5-3550 CPU @ 3.30GHz
>   WORD_SIZE: 64
>   BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY Sandybridge)
>   LAPACK: libopenblasp.so.0
>   LIBM: libopenlibm
>   LLVM: libLLVM-3.7.1 (ORCJIT, ivybridge)
> 
> and the other ones, with Fedora 22/23, are Haswell, which run slow:
> 
> Julia Version 0.5.0-dev+3292
> Commit 9d527c5* (2016-03-28 06:55 UTC)
> Platform Info:
>   System: Linux (x86_64-redhat-linux)
>   CPU: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
>   WORD_SIZE: 64
>   BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY Haswell)
>   LAPACK: libopenblasp.so.0
>   LIBM: libopenlibm
>   LLVM: libLLVM-3.7.1 (ORCJIT, haswell)
> 
> I just booted Fedora 23 on the Ivy Bridge machine and it's also fast.
>
> Now if I use Julia 0.4.5 on both architectures:
> 
> Julia Version 0.4.5
> Commit 2ac304d* (2016-03-18 00:58 UTC)
> Platform Info:
>   System: Linux (x86_64-redhat-linux)
>   CPU: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
>   WORD_SIZE: 64
>   BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY Haswell)
>   LAPACK: libopenblasp.so.0
>   LIBM: libopenlibm
>   LLVM: libLLVM-3.3
> 
> and:
> 
> Julia Version 0.4.5
> Commit 2ac304d* (2016-03-18 00:58 UTC)
> Platform Info:
>   System: Linux (x86_64-redhat-linux)
>   CPU: Intel(R) Core(TM) i5-3550 CPU @ 3.30GHz
>   WORD_SIZE: 64
>   BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY Sandybridge)
>   LAPACK: libopenblasp.so.0
>   LIBM: libopenlibm
>   LLVM: libLLVM-3.3
> 
> There is no speed difference, apart from the ~10% or so from the
> faster Haswell machine. So could this perhaps be specific to the Haswell
> hardware target, triggered by the change from LLVM 3.3 to 3.7.1? Is there
> anything else I could provide?
This is certainly an interesting finding. Could you paste somewhere the
output of @code_native for your function on Sandybridge vs. Haswell,
for both 0.4 and 0.5?
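
For reference, here is a minimal sketch of what I mean. The function
phase_sum is only a hypothetical stand-in for your loop (the real one is in
your message of March 8th), so substitute your own function and arguments.
It is meant to be run at the REPL:

```julia
# Hypothetical stand-in for the loop under test; replace it with the real function.
function phase_sum(dotprods)
    s = zero(Complex{Float64})
    for x in dotprods
        s += exp(im * x)   # complex exponential of a purely imaginary argument
    end
    return s
end

dotprods = rand(1000)
phase_sum(dotprods)                # run once so the method is compiled
@code_native phase_sum(dotprods)   # print the machine code generated for this call
```

Doing this on each machine and with each Julia version gives four listings
that can be compared side by side.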

It would also be useful to check whether the same difference appears if
you use the generic binary tarballs from http://julialang.org/downloads.

Finally, do you get the same result if you remove the call to exp()
from the loop? (This is the only external function, so it shouldn't be
affected by changes in Julia.)
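
A rough sketch of that last check could look like the following. Both
functions are hypothetical stand-ins rather than your actual code, and the
array size is arbitrary:

```julia
# Hypothetical loop that calls exp() on every element.
function with_exp(v)
    s = zero(Complex{Float64})
    for x in v
        s += exp(im * x)
    end
    return s
end

# Same loop with the exp() call removed.
function without_exp(v)
    s = zero(Complex{Float64})
    for x in v
        s += im * x
    end
    return s
end

v = rand(10^6)
with_exp(v); without_exp(v)        # warm up so compilation is not timed
println("with exp:    ", @elapsed with_exp(v))
println("without exp: ", @elapsed without_exp(v))
```

If the gap between the Haswell and Ivy Bridge machines disappears in the
version without exp(), that call is the likely suspect. If it persists, the
difference more probably comes from the code generated for the loop itself.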


Regards


> Best, Johannes
> 
> >  Regards 
> 
> 
> > > On Wednesday, 16 March 2016 at 09:25 -0700, Johannes Wagner wrote:
> > > > just a little update. Tested some other fedoras: Fedora 22 with llvm  
> > > > 3.8 is also slow with julia 0.5, whereas a fedora 24 branch with llvm  
> > > > 3.7 is faster on julia 0.5 compared to julia 0.4, as it should be  
> > > > (speedup from inner loop parts translated into speedup to whole  
> > > > function).  
> > > >  
> > > > don't know if anyone cares about that... At least the latest version  
> > > > seems to work fine, hope it stays like this into the final fedora 24  
> > > What's the "latest version"? git built from source or RPM nightlies?  
> > > With which LLVM version for each?  
> > > 
> > > > If from the RPMs, I switched them to LLVM 3.8 for a few days, then
> > > > went back to 3.7 because of a build failure. So that might explain the
> > > difference. You can install the last version which built with LLVM 3.8  
> > > manually from here:  
> > > https://copr-be.cloud.fedoraproject.org/results/nalimilan/julia-nightlies/fedora-23-x86_64/00167549-julia/
> > >   
> > > 
> > > It would be interesting to compare it with the latest nightly with 3.7.  
> > > 
> > > 
> > > Regards  
> > > 
> > > 
> > > 
> > > > > > Hey guys,
> > > > > > I just experienced something weird. I have some code that runs fine
> > > > > > on 0.4.3. I then updated to 0.5-dev to test the new Arrays, ran the same
> > > > > > code, and noticed it got about 50% slower. Then I downgraded back
> > > > > > to 0.4.3 and ran the old code, but it remained slow. I noticed that while
> > > > > > reinstalling 0.4.3, openblas-threads didn't get installed along with
> > > > > > it. So I installed it manually, but nothing changed.
> > > > > > Does anyone have an idea what could be going on? LLVM on Fedora 23 is
> > > > > > 3.7
> > > > >  
> > > > > Cheers, Johannes  
> > > > >  
