Dear All,
On my Kaby Lake laptop I have hand-vectorized the routine by processing
items in multiples of 8.
The performance of POCL is now within a factor of two of the Intel OpenCL
runtime with auto-vectorization.
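For anyone wondering what "processing items in multiples of 8" looks like, here is a rough sketch in plain C rather than the actual OpenCL kernel. The axpy-style operation, the function names and the use of single precision are all assumptions made for illustration and are not the routine discussed in this thread. In an OpenCL C kernel the same idea would more likely be written with a vector type such as float8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real routine: y[i] += a * x[i]. */
void axpy_scalar(size_t n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

/* Same computation, with the items grouped in blocks of 8. */
void axpy_blocked8(size_t n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);
    size_t n8 = n - (n % 8);   /* largest multiple of 8 not exceeding n */
    size_t i = 0;

    for (; i < n8; i += 8) {
        /* Fixed trip count of 8: a 256-bit AVX2 register holds 8 floats,
           so a compiler can map this inner loop to one vector operation. */
        for (size_t k = 0; k < 8; ++k)
            y[i + k] += a * x[i + k];
    }

    /* Scalar tail for the remaining n % 8 items. */
    for (; i < n; ++i)
        y[i] += a * x[i];
}
```

Both functions give the same result for any n; the blocked version only changes how the work is grouped, which is what makes it easy for the compiler to vectorize.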
So while not perfect, this is good enough for us, and when the Intel
runtime is available we can easily switch automatically to a
corresponding Intel-optimized routine.
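The automatic switch could be as simple as looking at the platform names. The sketch below assumes the names have already been collected, for example with clGetPlatformInfo and CL_PLATFORM_NAME, and that matching on the substring "Intel" is good enough. The enum, the function names and the matching rule are hypothetical and not what our code actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the two variants of the routine. */
enum kernel_variant { VARIANT_HAND_VECTORIZED, VARIANT_INTEL_OPTIMIZED };

/* Return the index of the first platform whose name contains "Intel",
   or -1 if there is none. */
int find_intel_platform(const char *const *names, size_t count)
{
    assert(names != NULL || count == 0);
    for (size_t i = 0; i < count; ++i)
        if (names[i] != NULL && strstr(names[i], "Intel") != NULL)
            return (int)i;
    return -1;
}

/* Prefer the Intel-optimized routine whenever an Intel platform exists,
   otherwise fall back to the hand-vectorized one. */
enum kernel_variant choose_variant(const char *const *names, size_t count)
{
    return find_intel_platform(names, count) >= 0
               ? VARIANT_INTEL_OPTIMIZED
               : VARIANT_HAND_VECTORIZED;
}
```

With only pocl installed this selects the hand-vectorized variant; as soon as a platform with "Intel" in its name shows up, the Intel variant is chosen instead.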
Best wishes
Timo
On 8 February 2018 at 23:15, Michal Babej <[email protected]> wrote:
> Hi,
>
> Nicholas Curtis <[email protected]> wrote:
>
> > Does LLVM have to be compiled in Debug mode in order for the output
> > to work?
>
> Not sure if it has to be Debug, but I've noticed that it doesn't work
> with all LLVM builds. It fails on my distro's LLVM, but works on my own
> LLVM build (RelWithDebInfo+Asserts).
>
> -- mb
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> pocl-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/pocl-devel
>
--
Dr. Timo Betcke
Reader in Mathematics
University College London
Department of Mathematics
E-Mail: [email protected]
Tel.: +44 (0) 20-3108-4068
Fax.: +44 (0) 20-7383-5519