Hi,

> Okay, so I am writing a little wrapper in order to allow for CBLAS
> (OpenBLAS, and possibly MKL, provided that the headers are compatible)
> linking.
> I started wondering, why not do the same thing for CuBLAS?

Yes, CuBLAS and clAmdBlas both make sense, even if we only use them for
comparing the performance of our kernels. :-) I'd like to be able to
select the 'worker' implementation at runtime to get the best out of all
the options rather than nailing everything down at compile time.


> It seems
> like our efforts are getting more focused on dynamic code generation and
> auto-tuning, which is hardly applicable to CUDA at the moment. While
> having a header-only CUDA implementation is certainly a good thing for
> portability, I feel like our user base could benefit enormously
> from CuBLAS linking functionality (at least for BLAS level 2 and 3,
> since expression trees probably allow us to beat CuBLAS on arithmetic
> operations). Is there any major obstacle to enabling this?

There is no major obstacle; it's mostly a matter of doing it. :-) A good
runtime selection layer needs some thought, but I don't think it requires
a lot of code.
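
Just to sketch what I have in mind for the runtime selection (all names
below are placeholders rather than actual ViennaCL API, and the real thing
would of course hook into our existing operation dispatch rather than live
in its own header):

  // Hypothetical sketch only: a tiny runtime-selectable 'worker' layer.
  #include <functional>
  #include <iostream>
  #include <vector>

  enum class blas_backend { builtin, cublas, clamdblas };

  // GEMV-like worker signature: y = A*x, A stored row-major in a flat vector
  using gemv_worker = std::function<void(std::vector<double> const & A,
                                         std::vector<double> const & x,
                                         std::vector<double>       & y)>;

  struct blas_dispatcher
  {
    blas_backend selected = blas_backend::builtin;
    gemv_worker  workers[3];          // one slot per backend

    void gemv(std::vector<double> const & A,
              std::vector<double> const & x,
              std::vector<double>       & y) const
    {
      workers[static_cast<int>(selected)](A, x, y);  // forward to the chosen worker
    }
  };

  // our own reference implementation; the generated/auto-tuned kernels and a
  // cuBLAS wrapper (e.g. around cublasDgemv()) would plug in the same way
  void builtin_gemv(std::vector<double> const & A,
                    std::vector<double> const & x,
                    std::vector<double>       & y)
  {
    for (std::size_t i = 0; i < y.size(); ++i)
    {
      y[i] = 0;
      for (std::size_t j = 0; j < x.size(); ++j)
        y[i] += A[i * x.size() + j] * x[j];
    }
  }

  int main()
  {
    blas_dispatcher dispatch;
    dispatch.workers[static_cast<int>(blas_backend::builtin)] = builtin_gemv;
    dispatch.selected = blas_backend::builtin;   // user-selectable at runtime

    std::vector<double> A = {1, 2, 3, 4};        // 2x2 matrix, row-major
    std::vector<double> x = {1, 1};
    std::vector<double> y(2);
    dispatch.gemv(A, x, y);
    std::cout << y[0] << " " << y[1] << std::endl;  // prints "3 7"
  }

With such a layer, adding CuBLAS for a particular operation would just mean
registering another worker and flipping the selector, and comparing the
backends against each other comes essentially for free.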

Best regards,
Karli

