Hi Philippe,

>     Since our generator is skeleton-based anyway, what about having a look
>     at the best performing kernels in RaijinCL and then extending the
>     current generator accordingly such that these kernels are covered as
>     well? I consider this to be *far* less painful than trying to merge in
>     RaijinCL - as you certainly know, it's not that easy to 'just interface
>     with a kernel generator', particularly if this is supposed to happen at
>     runtime and in a reliable way. Even just within ViennaCL this took us
>     (at least) three iterations to come up with a useful model in
>     practice...
>
>
>
> Yes, probably. Plus, we do not need all of RaijinCL's functionality
> (images, for example). I have contacted Rahul (the author of
> RaijinCL). I just want to make sure that RaijinCL gets the credit it
> deserves (3 TFLOP/s on an HD7970 is a lot!), and maybe we can combine
> our expertise to get even better performance :)

Sure, they should get full credit for their work.

On the other hand, I'm not sure whether we can use images for peak GEMM
performance, as this would hit us when trying to do reasonable things
inside algorithms. So maybe we already get the 'best' performance
under the constraint of not using images?

Best regards,
Karli

_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel
