Re: [julia-users] Article on `@simd`

2014-09-17 Thread Stefan Karpinski
Since the actual vectorization is done by LLVM and Julia just sets things up so that LLVM is more likely to be able to vectorize things, this is actually pretty hard to figure out. But it would be a good thing to do. Another idea for a paranoid / testing mode. Of course, the problem with that is th

Re: [julia-users] Article on `@simd`

2014-09-17 Thread Job van der Zwan
On Tuesday, 16 September 2014 19:14:16 UTC+2, Jacob Quinn wrote: > > Oops, the code I shared had a bug in it > Question: if the compiler didn't vectorise despite the @simd macro, is the user notified of this? If not, wouldn't that be useful debugging information, especially if the message indica

Re: [julia-users] Article on `@simd`

2014-09-16 Thread Arch Robison
Yes, the indirect store (a "scatter" in vectorizer parlance) will stop the current vectorizer. AVX-512 has the requisite scatter instruction, so when AVX-512 becomes available and LLVM is updated to use it, we should revisit this example. In the other example, the reduction into cent_sumsq[k]

Re: [julia-users] Article on `@simd`

2014-09-16 Thread Jacob Quinn
Oops, the code I shared had a bug in it, it should be:

    N, M = size(data)
    @inbounds for m = 1:M
        clust = assignments[m]
        @simd for n = data.colptr[m]:(data.colptr[m+1]-1)
            centroids[data.rowval[n],clust] += data.nzval[n]
        end
        centroid_counts[clust] += 1.0
    end

where the row inde
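For readers who want to run the corrected loop, here is a minimal sketch that wraps it in a function. It assumes a recent Julia (1.x) with the `SparseArrays` standard library, not the 0.3 release discussed in this thread. The function name `accumulate_centroids!` and the small test matrix are hypothetical, chosen only for illustration; the loop body is the one from the message above.

```julia
using SparseArrays

# Hypothetical wrapper around the loop from the message above.
# `data` is a sparse matrix stored column by column (CSC), and
# `assignments[m]` is the cluster index assigned to column m.
function accumulate_centroids!(centroids, centroid_counts,
                               data::SparseMatrixCSC, assignments)
    N, M = size(data)
    @inbounds for m = 1:M
        clust = assignments[m]
        # The store target depends on data.rowval[n], so this is an
        # indirect store (a "scatter").
        @simd for n = data.colptr[m]:(data.colptr[m+1]-1)
            centroids[data.rowval[n], clust] += data.nzval[n]
        end
        centroid_counts[clust] += 1.0
    end
    return centroids, centroid_counts
end

# Illustrative 3x2 sparse matrix: column 1 holds rows 1 and 3, column 2 holds row 2.
data = sparse([1, 3, 2], [1, 1, 2], [1.0, 2.0, 3.0], 3, 2)
assignments = [1, 2]
centroids, centroid_counts =
    accumulate_centroids!(zeros(3, 2), zeros(2), data, assignments)
```

Within one column of a CSC matrix the stored row indices are distinct, so the iterations of the inner loop write to different elements of `centroids`.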

Re: [julia-users] Article on `@simd`

2014-09-16 Thread Arch Robison
The "llvm.mem.parallel_loop_access" is an annotation on loads and stores that indicates they do not depend on other iterations. @simd causes them to be sprinkled throughout the loop when the LLVM IR is generated. The lack of "load <*n* x float>" indicates that the LLVM vectorizer gave up. I'm not
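The check described above (looking for vector types such as "<*n* x float>" in the LLVM IR) can be scripted. What follows is a rough sketch, assuming a recent Julia (1.x) where `code_llvm` accepts an `IO` argument and `occursin` is available; neither matches the 0.3 API of the time. The `axpy!` kernel is a hypothetical stand-in for the example named in the article, and the exact IR text depends on the Julia and LLVM versions and on the CPU.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Hypothetical kernel for illustration: y[i] += a * x[i].
function axpy!(a::Float32, x::Vector{Float32}, y::Vector{Float32})
    @inbounds @simd for i in eachindex(x, y)
        y[i] += a * x[i]
    end
    return y
end

# Capture the optimized LLVM IR for this method as a string.
ir = sprint(io -> code_llvm(io, axpy!,
                            (Float32, Vector{Float32}, Vector{Float32})))

# Vector types are printed as "<N x float>"; finding none suggests
# that the vectorizer gave up on the loop.
vectorized = occursin(r"<\d+ x float>", ir)
println(vectorized ? "vector types found in the IR" :
                     "no vector types found in the IR")
```

This is only a heuristic: it reports whether any vector type appears in the IR and does not say why a loop failed to vectorize.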

Re: [julia-users] Article on `@simd`

2014-09-16 Thread Jacob Quinn
Arch, I've been having too much fun diving into using @simd this morning. One thing I've noticed is that some @simd loops have the code_llvm output a little differently. For example, the following code: n, m = size(data) # sparse matrix @inbounds for col = 1:m clust = assignments[col]

Re: [julia-users] Article on `@simd`

2014-09-16 Thread Arch Robison
I concur that simultaneity is the key issue, not atomicity. I've revised the sentence, and exchanged some "concerns" for "pertains to". Thanks. - Arch On Mon, Sep 15, 2014 at 11:16 PM, Stefan Karpinski < stefan.karpin...@gmail.com> wrote: > Excellent article. When you describe how sets of vecto

Re: [julia-users] Article on `@simd`

2014-09-15 Thread Stefan Karpinski
Excellent article. When you describe how sets of vector instructions occur, "simultaneously" would seem to be more correct than "instantaneously". Stylistically, the concerns paragraph seems a bit overly concerned – perhaps one could be changed to a different word. > On Sep 15, 2014, at 11:39 PM,

Re: [julia-users] Article on `@simd`

2014-09-15 Thread David Smith
Great article! I used some of your tips on a complicated inner loop just now and immediately got a 3.6x speedup. On Monday, September 15, 2014 7:57:17 PM UTC-5, Arch Robison wrote: > > Thanks to all for taking time to report my errors. I missed @code_llvm's > introduction into Julia. It let

Re: [julia-users] Article on `@simd`

2014-09-15 Thread Arch Robison
Thanks to all for taking time to report my errors. I missed @code_llvm's introduction into Julia. It let me shorten that section slightly. Jacob is right about the row/column index. I tend to think of the leftmost subscript as the "runs down the column" index. When I'm working with Fortran/Juli

Re: [julia-users] Article on `@simd`

2014-09-15 Thread Jacob Quinn
- Under the first bullet point of "What is Vectorization?", "Writing you code" should be "Writing your code" - In the "Speedup Surprise" section, the last sentence says "The compiler than...", should be "The compiler then" - In the section "Inspecting Whether Code Vectorizes", you can actuall

Re: [julia-users] Article on `@simd`

2014-09-15 Thread Patrick O'Leary
Under "Inspecting Whether Code Vectorizes" code_llvm(axpy,(T1,T2,T2}) The next-to-last character should be a paren. This is a very informative article; thanks for putting it and the feature together!

Re: [julia-users] Article on `@simd`

2014-09-15 Thread Elliot Saba
It also looks as if there is some HTML that sneaked into your code. On line 11 of the fourth code blob:"end" -E On Mon, Sep 15, 2014 at 3:06 PM, Elliot Saba wrote: > Hey Arch, > > Looking at the second code blob, on line 08 you have "t4 = b+c", but I > think you mean "t4 = t2+t3". > -E > > On M

Re: [julia-users] Article on `@simd`

2014-09-15 Thread Elliot Saba
Hey Arch, Looking at the second code blob, on line 08 you have "t4 = b+c", but I think you mean "t4 = t2+t3". -E On Mon, Sep 15, 2014 at 2:39 PM, Arch Robison wrote: > I've posted an article on the @simd feature to > https://software.intel.com/en-us/articles/vectorization-in-julia . > @simd is

[julia-users] Article on `@simd`

2014-09-15 Thread Arch Robison
I've posted an article on the @simd feature to https://software.intel.com/en-us/articles/vectorization-in-julia . @simd is an experimental feature in Julia 0.3 that gives the compiler more latitude