Hi all,
the following talk from IWOCL'17 has some performance numbers for two
applications on Xeon (HSW) and Xeon Phi (KNL), comparing PoCL with other SDKs:
http://www.iwocl.org/wp-content/uploads/iwocl2017_matthias-noack-good-bad-ugly.pdf
The numbers for the comet simulation (slide 18, PDF page 53) show some weird
outliers every 32 work-items when increasing the overall number of
work-items (particles).
The HEOM code reveals a large performance gap in comparison with the
Intel SDK on KNL, while it looks quite competitive on HSW. This is
especially strange, as the Intel SDK has only AVX2 support, while PoCL
should be able to generate AVX-512 code using LLVM. I compiled PoCL and
LLVM on the target architectures, and made sure the results are actual
Haswell and KNL builds.
Is anyone interested in working on improving these issues?
Thanks,
Matthias
_______________________________________________
pocl-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pocl-devel