On Tue, Aug 9, 2016 at 4:43 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Sat, Aug 6, 2016 at 9:20 PM, Andreas Schwab <sch...@linux-m68k.org> wrote:
>> On Mi, Jul 13 2016, "Bin.Cheng" <amker.ch...@gmail.com> wrote:
>>
>>> Patch re-tested/applied on trunk as r238301.
>>
>> This breaks gcc.dg/vect/vect-117.c on powerpc.
> Hi Andreas,
> Sorry for the inconvenience, I will have a look.

Looks like the patch exposed another latent issue in the vectorizer. Before the patch, the loop was vectorized under the alias check condition below:

  <bb 3>:
  _7 = (int[5] *) ivtmp.88_168;
  _20 = (unsigned int) n_15(D);
  _12 = _20 > 3;
  _180 = ivtmp.85_166 + 32;
  _181 = (ssizetype) _180;
  _182 = ivtmp.85_166 + 20;
  _183 = (ssizetype) _182;
  _32 = _181 <= _183;
  _184 = ivtmp.85_166 + 36;
  _185 = (ssizetype) _184;
  _38 = (ssizetype) ivtmp.85_166;
  _39 = _38 >= _185;
  _40 = _32 | _39;
  _41 = _12 & _40;
  if (_41 != 0)
    goto <bb 4>;
  else
    goto <bb 13>;

Note that the condition _40 = (_32 | _39) can never be true, so the vectorized loop is never executed in practice. The patch merely bypasses, at compile time, a condition that is already known to be false.

After investigation, I believe the root cause is in the handling of unaligned accesses: on powerpc the vectorizer tries to optimize the unaligned load with the dr_explicit_realign_optimized scheme. The loop itself should be vectorized successfully:

  static int a[N][N] = {{ 1, 2, 3, 4, 5},
                        { 6, 7, 8, 9,10},
                        {11,12,13,14,15},
                        {16,17,18,19,20},
                        {21,22,23,24,25}};

  volatile int foo;

  __attribute__ ((noinline))
  int main1 (int A[N][N], int n)
  {
    int i,j;

    /* vectorizable */
    for (i = 1; i < N; i++)
    {
      for (j = 0; j < n; j++)
      {
        A[i][j] = A[i-1][j] + A[i][j];
      }
    }

    return 0;
  }

But the vectorizer tries to realign A[i-1][j] using realign_load, so the memory address actually accessed differs from the original address in each iteration. Function vect_vfa_segment_size takes this into account and, as a result, generates an alias check condition that is always false.

IMO, the unaligned access support scheme should be improved in a way that allows more vectorization. In other words, it should fall back to dr_unaligned_supported to enable vectorization, rather than choose dr_explicit_realign_optimized, which blocks vectorization. I will file a PR for this.

Thanks,
bin
>
> Thanks,
> bin
>>
>> Andreas.
>>
>> --
>> Andreas Schwab, sch...@linux-m68k.org
>> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
>> "And now for something completely different."
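
P.S. As a sanity check on the claim that _40 = (_32 | _39) can never be true, here is a minimal C sketch that mirrors the arithmetic of the dump. It is not code from the vectorizer: the names alias_check_passes and self_check are hypothetical, 'base' stands in for ivtmp.85_166, and intptr_t stands in for ssizetype. It also assumes the address arithmetic does not wrap around.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: evaluates the two comparisons from the dump
   for a given base address.  'base' plays the role of ivtmp.85_166.  */
static int
alias_check_passes (intptr_t base)
{
  int c32 = (base + 32) <= (base + 20);  /* _32 = _181 <= _183 */
  int c39 = base >= (base + 36);         /* _39 = _38 >= _185 */
  return c32 | c39;                      /* _40 = _32 | _39 */
}

/* Hypothetical self-check over a few arbitrary base addresses, all
   chosen so that base + 36 does not overflow intptr_t.  */
static void
self_check (void)
{
  assert (!alias_check_passes (0));
  assert (!alias_check_passes (0x1000));
  assert (!alias_check_passes (0x7fff0000));
}
```

Both comparisons reduce to "32 <= 20" and "0 >= 36" once the common base is cancelled, which is why the guard fails for every base address and the scalar loop is always taken.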