I liked

http://arxiv.org/abs/1307.6209

which seems to show that modified ELLPACK can be good for
multicore as well as manycore. The sad part of course is that no
mention of parallel is made.

I still have to go over it in detail, but the performance numbers they report look reasonable. Some folks I met in Lausanne two weeks ago confirmed this as well. The only drawback is that their kernels rely on low-level intrinsics, which hurts both portability and productivity.
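
For reference, here is a minimal sketch of what a sliced-ELLPACK matrix-vector product could look like in plain C without intrinsics, leaving vectorization to the compiler. This is not the kernel from the paper: the struct layout, the names `sell_matrix` and `sell_spmv`, and the chunk height `C = 4` are assumptions made purely for illustration.

```c
#include <stddef.h>

/* Sliced ELLPACK storage (hypothetical layout, for illustration only):
 * rows are grouped into chunks of C rows, each chunk is zero-padded to
 * the length of its longest row and stored column-major, so that the C
 * rows of a chunk are adjacent in memory. */
enum { C = 4 }; /* chunk height, assumed to match the SIMD width */

typedef struct {
    size_t nrows;            /* number of matrix rows */
    size_t nchunks;          /* ceil(nrows / C) */
    const size_t *chunk_ptr; /* offset of each chunk in val/col, nchunks + 1 entries */
    const double *val;       /* nonzero values, zero-padded */
    const size_t *col;       /* column indices; padding must point at a valid column */
} sell_matrix;

/* Computes y = A * x. */
void sell_spmv(const sell_matrix *A, const double *x, double *y)
{
    for (size_t k = 0; k < A->nchunks; ++k) {
        const size_t base = A->chunk_ptr[k];
        /* padded row length of this chunk */
        const size_t len = (A->chunk_ptr[k + 1] - base) / C;
        double tmp[C] = {0.0};

        for (size_t j = 0; j < len; ++j) {
            /* unit stride in val/col across the rows of the chunk */
            for (size_t r = 0; r < C; ++r) {
                const size_t idx = base + j * C + r;
                tmp[r] += A->val[idx] * x[A->col[idx]];
            }
        }

        /* write back, skipping padding rows in the last chunk */
        for (size_t r = 0; r < C; ++r) {
            const size_t row = k * C + r;
            if (row < A->nrows)
                y[row] = tmp[r];
        }
    }
}
```

The innermost loop runs over the rows of a chunk with unit stride, which is the access pattern a compiler has a chance of vectorizing on its own. Whether that gets anywhere near hand-written intrinsics is exactly the open question.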

Best regards,
Karli
