Hi,

Timo Betcke <[email protected]> wrote:

> which I already suspected. I tried POCL_VECTORIZER_REMARKS=1 to
> activate vectorizer remarks. But it does not create any kind of

Yes, it doesn't work, even though pocl registers the LLVM options to
print debug info successfully. I haven't yet figured out why it's not
working.

> The question is what prevents the auto vectorizer from working at
> all. The code seems quite straight forward with very simple for-loops

It's possible to use POCL_DEBUG_LLVM_PASSES=1 and grep for lines
starting with "LV:" - this shows that the vectorizer is in fact running.
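
If grep isn't handy, the same filtering can be sketched in a few lines
of Python. The helper name lv_lines and the sample text are made up for
illustration; in practice you would feed it the debug output captured
from your own program run with POCL_DEBUG_LLVM_PASSES=1.

```python
def lv_lines(log_text):
    """Return the lines of a debug log that start with 'LV:'."""
    return [line for line in log_text.splitlines() if line.startswith("LV:")]


# Hypothetical excerpt; real pass output is much noisier than this.
sample = """\
some other pass output
LV: Found a loop: for.body.i
LV: We can vectorize this loop!
"""

for line in lv_lines(sample):
    print(line)
```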

I tried to compile "evaluate_regular" from your .cl file. It seems to
find two loops; the first (smaller) one is only vectorized when you
build with the "-cl-fast-relaxed-math" option.
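
One common reason a loop only vectorizes under relaxed math is a
floating-point reduction: a vectorized sum adds the terms in a
different order, and float addition is not associative, so the compiler
needs permission to reorder. Whether that is what happens in this
particular loop is an assumption. A minimal sketch of the effect, in
Python doubles rather than the kernel's 32-bit floats, with made-up
data:

```python
def sequential_sum(values):
    """Add in source order, as a scalar loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def two_lane_sum(values):
    """Mimic a width-2 vector reduction: one partial sum per lane,
    combined once at the end."""
    lanes = [0.0, 0.0]
    for i, v in enumerate(values):
        lanes[i % 2] += v
    return lanes[0] + lanes[1]


# Illustrative data chosen so that the rounding difference is visible.
data = [1e16, 1.0, -1e16, 1.0]
print(sequential_sum(data))  # 1.0
print(two_lane_sum(data))    # 2.0
```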

The second, longer loop prints this:

LV: Checking a loop in "_pocl_launcher_evaluate_regular" from /tmp/POCL_CACHE/tempfile-4b-64-39-6d-57.cl
LV: Loop hints: force=? width=0 unroll=0
LV: Found a loop: for.body.i
LV: Found an induction variable.
LV: We can vectorize this loop!
LV: The Smallest and Widest types: 32 / 32 bits.
LV: The Widest register is: 256 bits.

LV: Scalar loop costs: 20.
LV: Vector loop of width 2 costs: 38.
LV: Vector loop of width 4 costs: 37.
LV: Vector loop of width 8 costs: 36.
LV: Selecting VF: 1.
LV: Vectorization is possible but not beneficial.

... changing N_QUAD_POINTS doesn't seem to make any difference.
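
To make the selection in the log above explicit, here is a small sketch
that parses the cost lines and picks the cheapest factor. It assumes
the printed vector costs are already expressed per scalar iteration
(i.e. divided by the width), so they compare directly against the
scalar cost; that is a reading of the log, not something confirmed
here. The function names are made up for illustration.

```python
import re

# Cost lines copied from the debug output quoted above.
LOG = """\
LV: Scalar loop costs: 20.
LV: Vector loop of width 2 costs: 38.
LV: Vector loop of width 4 costs: 37.
LV: Vector loop of width 8 costs: 36.
"""


def parse_costs(log_text):
    """Map vectorization factor -> reported cost (VF 1 is the scalar loop)."""
    costs = {}
    scalar = re.search(r"LV: Scalar loop costs: (\d+)\.", log_text)
    if scalar:
        costs[1] = int(scalar.group(1))
    pattern = r"LV: Vector loop of width (\d+) costs: (\d+)\."
    for width, cost in re.findall(pattern, log_text):
        costs[int(width)] = int(cost)
    return costs


def select_vf(costs):
    """Pick the factor with the lowest reported cost.

    Assumption: the costs are directly comparable (already per scalar
    iteration). Ties go to the smaller factor.
    """
    return min(sorted(costs), key=lambda vf: costs[vf])


costs = parse_costs(LOG)
print(costs)
print("selected VF:", select_vf(costs))
```

Under that assumption every vector width is reported as more expensive
than the scalar loop, which matches "Selecting VF: 1".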

Cheers,
 -- mb

_______________________________________________
pocl-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pocl-devel
