Hi guys,
So I expect ViennaCL 1.6 to offer some really good performance on CPUs with
the OpenCL backend -- possibly 80% of OpenBLAS / MKL on a Core i7 4770, for
example. As the OpenCL kernel generator and the auto-tuner get better,
we can hope for further improvements.
This will create a huge gap with the fallback OpenMP version, which hardly
reaches 0.5 GFLOP/s. What would you think about extracting the assembly
output of the Intel OpenCL compiler? I'm not familiar *at all* with
assembly code. How would we handle multi-threading in such a setting?
Philippe
_______________________________________________
ViennaCL-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/viennacl-devel