On Wed, 20 Nov 2019 at 10:21, HAPPY Mahto
<cs17btech11...@iith.ac.in> wrote:
>> #pragma clang loop vectorize_assume_alignment(32)
>> for(int i = 0;i < n; i++){
>>     a[i] = b[i] + i*i;
>> }
>
> for this all-access inside the loop will be aligned to 32bit,
> ex IR
>>
>> for.cond:                              ; preds = %for.inc, %entry
>>   %5 = load i32, i32* %i, align 32, !llvm.access.group !2
>>   %6 = load i32, i32* %n, align 32, !llvm.access.group !2
>>   %cmp = icmp slt i32 %5, %6
>>   br i1 %cmp, label %for.body, label %for.end
>>
>> for.body:                              ; preds = %for.cond
>>   %7 = load i32, i32* %i, align 32, !llvm.access.group !2
>>   %8 = load i32, i32* %i, align 32, !llvm.access.group !2
>>   %idxprom = sext i32 %8 to i64
>>   %arrayidx = getelementptr inbounds i32, i32* %vla1, i64 %idxprom
>>   store i32 %7, i32* %arrayidx, align 32, !llvm.access.group !2
>>   br label %for.inc
>>
>> for.inc:                               ; preds = %for.body
>>   %9 = load i32, i32* %i, align 32, !llvm.access.group !2
>>   %inc = add nsw i32 %9, 1
>>   store i32 %inc, i32* %i, align 32, !llvm.access.group !2
>>   br label %for.cond, !llvm.loop !3
>
> You will not need to create pointers for every array(or operand you want to
> perform the operation on).

IMHO it is better if the programmer has to. It is not always obvious
which arrays are used in the loop. Also, the information can be used
by optimizations other than the vectorizer.

>>
>> void mult(float* x, int size, float factor){
>>   float* ax = (float*)__builtin_assume_aligned(x, 64);
>>   for (int i = 0; i < size; ++i)
>>     ax[i] *= factor;
>> }

https://godbolt.org/z/Fd6HMe

> the alignment is assumed whereas in #pragma it is set to the number specified.

Semantically, it is the same.

I wonder how you expect the assembly output to change? The
__builtin_assume_aligned will be picked up by the backend and result
in movaps being used instead of movups.

> it'll be easier, and having a pragma for doing this will help as it's
> provided in OMP and intel compilers.

This is a compiler-specific extension. It does not have an influence
on what other compilers do. Even with clang, if you try to do

#pragma clang loop vectorize_assume_alignment(32)
#pragma omp simd
for (int i = 0; i < size; ++i)

clang will silently swallow the vectorize_assume_alignment.

Michael
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
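
For illustration only, here is a minimal sketch of what the per-pointer
approach might look like for the a[i] = b[i] + i*i loop quoted at the
top. The function name add_squares and the locals aa and ab are
hypothetical, and the 64-byte alignment is simply carried over from the
mult example; nothing here is taken from the patch under review.

```c
/* Sketch: every array the loop touches gets its own explicit
   alignment promise, so it is visible which accesses are covered.
   add_squares, aa and ab are hypothetical names. */
void add_squares(int *a, const int *b, int n) {
  /* The caller must really pass 64-byte-aligned pointers;
     if the promise is false, the behavior is undefined. */
  int *aa = (int *)__builtin_assume_aligned(a, 64);
  const int *ab = (const int *)__builtin_assume_aligned(b, 64);

  for (int i = 0; i < n; ++i)
    aa[i] = ab[i] + i * i;
}
```

Callers would have to allocate both buffers with a matching alignment,
for example with C11 aligned_alloc(64, ...) or an aligned attribute on
the array declaration. Unlike a loop-level pragma, the promise is
attached to the pointer, so it stays available to passes other than the
vectorizer.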