Hi,

> Okay, so I am writing a little wrapper in order to allow for CBLAS
> (OpenBLAS, and possibly MKL, provided that the headers are compatible)
> linking. I started wondering, why not doing the same thing for CuBLAS?
Yes, CuBLAS and clAmdBlas both make sense, even if we only use them for
comparing the performance of our own kernels. :-) I'd like to be able to
select the 'worker' implementation at runtime, to get the best out of all
the options rather than nailing everything down at compile time.

> It seems like our efforts are getting more focused on dynamic code
> generation and auto-tuning, which is hardly applicable to CUDA at the
> moment. While having a header-only CUDA implementation is certainly a
> good thing for portability, I feel like our user base could benefit
> enormously from CuBLAS linking functionality (at least for BLAS levels
> 2 and 3, since expression trees probably allow us to beat CuBLAS on
> arithmetic operations). Is there any major obstacle to enabling this?

There is no major obstacle; it's mostly a matter of doing it. :-) A good
runtime selection layer needs a bit of thought, but I don't think it
requires a lot of code.

Best regards,
Karli

_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel
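For illustration, a minimal sketch of what such a runtime selection layer
could look like: a table of function pointers per BLAS kernel, filled in
by backend name at startup. All names here (`blas3_backend`,
`select_backend`, the simplified sgemm signature) are hypothetical and not
ViennaCL's actual API; the CBLAS/CuBLAS entries are only indicated as
comments, since wiring them up depends on how the libraries are linked.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical dispatch table for one BLAS-3 kernel (sgemm-like).
// Real code would hold one entry per kernel and precision.
struct blas3_backend
{
  // C = alpha * A * B + beta * C, row-major, no transposes, for brevity.
  void (*sgemm)(std::size_t m, std::size_t n, std::size_t k,
                float alpha, float const * A, float const * B,
                float beta, float * C);
};

// Reference implementation, always available (stands in for the
// library's own kernels).
inline void sgemm_reference(std::size_t m, std::size_t n, std::size_t k,
                            float alpha, float const * A, float const * B,
                            float beta, float * C)
{
  for (std::size_t i = 0; i < m; ++i)
    for (std::size_t j = 0; j < n; ++j)
    {
      float acc = 0;
      for (std::size_t l = 0; l < k; ++l)
        acc += A[i*k + l] * B[l*n + j];
      C[i*n + j] = alpha * acc + beta * C[i*n + j];
    }
}

// Runtime selection by name. CBLAS/CuBLAS wrappers would be registered
// here when the respective libraries are linked in.
inline blas3_backend select_backend(std::string const & name)
{
  if (name == "reference")
    return blas3_backend{ &sgemm_reference };
  // else if (name == "cblas")  return blas3_backend{ &sgemm_cblas };
  // else if (name == "cublas") return blas3_backend{ &sgemm_cublas };
  throw std::runtime_error("unknown BLAS backend: " + name);
}
```

Because the choice is a plain lookup, callers can pick the backend from a
config option or environment variable without recompiling, which is the
point of doing the selection at runtime rather than at compile time.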