Hi.

> A short update: I've implemented linkage to CBlas and CuBlas with
> dynamic selection.
> If activated through VIENNACL_WITH_CUBLAS, one can go back and forth
> between cublas and the original backend by doing:
>
> A.blas().gemm(NULL);
> A.blas().gemm(viennacl::backend::blas::cublas_functions<value_type>::gemm);
>
> (and similarly for cblas.)

Nice, thanks! I think we can shorten the second call to something like
  A.blas().gemm(viennacl::backend::cublas);
for convenience.
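
A rough sketch of how the short form could sit next to the long one, assuming a small tag type and an extra overload of the setter. Everything in the `sketch` namespace (`blas_table`, `cublas_tag`, the simplified `gemm_fn` signature, and the stand-in `cublas_functions`) is hypothetical and only illustrates the idea; it is not the actual ViennaCL interface and it does not call cuBLAS.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch; none of these names are ViennaCL's actual API.
namespace sketch {

// Assumed signature of a GEMM slot: C = A * B for row-major n x n matrices.
typedef void (*gemm_fn)(std::size_t n, const float* A, const float* B, float* C);

// Stand-in for a wrapper around a vendor routine such as cublasSgemm.
template <typename T>
struct cublas_functions;

template <>
struct cublas_functions<float> {
  static void gemm(std::size_t n, const float* A, const float* B, float* C) {
    assert(A != nullptr && B != nullptr && C != nullptr);
    for (std::size_t i = 0; i < n; ++i)
      for (std::size_t j = 0; j < n; ++j) {
        float acc = 0.0f;
        for (std::size_t k = 0; k < n; ++k)
          acc += A[i * n + k] * B[k * n + j];
        C[i * n + j] = acc;
      }
  }
};

// Empty tag type: its only job is to select an overload.
struct cublas_tag {};
const cublas_tag cublas = cublas_tag();

class blas_table {
 public:
  blas_table() : gemm_(nullptr) {}

  // Long form: install any compatible function; nullptr restores the
  // built-in implementation.
  void gemm(gemm_fn f) { gemm_ = f; }

  // Short form: the tag picks the matching wrapper.
  void gemm(cublas_tag) { gemm_ = &cublas_functions<float>::gemm; }

  // Currently installed function, or nullptr for the built-in one.
  gemm_fn gemm() const { return gemm_; }

 private:
  gemm_fn gemm_;
};

}  // namespace sketch
```

With this, `t.gemm(sketch::cublas)` and `t.gemm(&sketch::cublas_functions<float>::gemm)` install the same function, and `t.gemm(nullptr)` switches back. The tag overload costs nothing at runtime and keeps the long form available for user-supplied functions.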


> There is some trickery going on with transpositions and layout, but it
> works for every transpose/layout combination. One can also link A's blas
> to his own gemm function, provided a tiny wrapper (essentially to ensure
> signature compatibility)

Cool!
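
To make the wrapper idea concrete, here is a minimal sketch under assumed signatures. It adapts a column-major, BLAS-style user routine to a row-major slot by using the identity (A*B)^T = B^T * A^T: a row-major buffer read as column-major is the transpose, so swapping the operands and the dimensions gives the right result without copying. The names `user_gemm_colmajor`, `user_gemm_rowmajor` and `gemm_fn` are made up for illustration and are not the signatures used in ViennaCL.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch; the signatures below are assumptions for illustration.
namespace sketch {

// A user-supplied routine with a column-major, BLAS-like signature:
// C (m x n) = A (m x k) * B (k x n), with leading dimensions lda/ldb/ldc.
inline void user_gemm_colmajor(std::size_t m, std::size_t n, std::size_t k,
                               const float* A, std::size_t lda,
                               const float* B, std::size_t ldb,
                               float* C, std::size_t ldc) {
  for (std::size_t j = 0; j < n; ++j)
    for (std::size_t i = 0; i < m; ++i) {
      float acc = 0.0f;
      for (std::size_t l = 0; l < k; ++l)
        acc += A[i + l * lda] * B[l + j * ldb];
      C[i + j * ldc] = acc;
    }
}

// Signature assumed for the library's dispatch slot: dense row-major
// operands, C (m x n) = A (m x k) * B (k x n).
typedef void (*gemm_fn)(std::size_t m, std::size_t n, std::size_t k,
                        const float* A, const float* B, float* C);

// The "tiny wrapper": same signature as gemm_fn, forwards to the user
// routine. Read as column-major, the row-major buffers hold A^T, B^T and
// C^T, so we compute C^T = B^T * A^T by swapping operands and dimensions.
inline void user_gemm_rowmajor(std::size_t m, std::size_t n, std::size_t k,
                               const float* A, const float* B, float* C) {
  assert(A != nullptr && B != nullptr && C != nullptr);
  user_gemm_colmajor(n, m, k, B, n, A, k, C, n);
}

}  // namespace sketch
```

A pointer to `sketch::user_gemm_rowmajor` can then be stored wherever a `gemm_fn` is expected. Handling explicit transpose flags would need additional parameters, which are left out here.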

> Very good news: this allows viennacl to work very well on very
> recent NVidia hardware until our autotuning engine is fully operational.
> On my laptop, cublasSgemm is about 5 times faster than the current CUDA
> implementation, and 20% faster than the OpenCL kernel found by the
> autotuner (120GFLOPs vs 25GFLOPs vs 95GFLOPs). Also, linking with
> OpenBlas leads to a HUGE performance boost on the CPU (0.02GFLOP/s vs
> 70GFLOP/s)...!

For our native CUDA implementation it's probably only a matter of 
porting the results from the OpenCL tuner over. Unfortunately I don't 
see a good way of doing this with CUDA without a significant penalty on 
compilation times, because there is no concept of runtime kernel 
selection in CUDA so far. The performance difference for GEMM on our CPU
backend is not surprising: it was never subject to optimization ;-)



> A little question remains. For now, the behavior is really weird when
> one defines both VIENNACL_WITH_CBLAS and VIENNACL_WITH_CUBLAS. How to
> handle this? I am not very familiar with the multiple backends and I
> don't know to which extent they can be combined. Therefore, I see
> multiple options, but can't tell which one is better.
>
> 1 -> trigger a preprocessor error when both commands are defined together
> 2 -> slightly modify the API : A.cuda_blas(), A.host_blas(), A.cl_blas()
>
> I think that option 2 is better, considering that there is already
> cuda_handle(), opencl_handle(), cpu_handle() or something similar, if
> I'm correct. Any advice?

The reason why cuda_handle(), opencl_handle() and cpu_handle() exist
under different names is that they return different types (i.e. the
respective memory buffer). For the BLAS backends I don't want to have
different member names, because this gets annoying for users. For
example, if a user wants to cycle through the backends, e.g. for
benchmarking purposes, she would have to write

   if (my_constant == CUDA)
     A.cuda_blas()...
   else if (my_constant == HOST)
     A.host_blas()...
   else
     A.cl_blas()...

which makes the code longer than necessary. I suggest querying some
central registry where the backends are registered and then cycling
through them:

   SomeListType blas_list = viennacl::blas_implementations_available();
   for ( it = blas_list.begin(); ... )
   {
     A.blas(*it);
     do_something(A);
   }

I don't know whether .blas() is the best name for this, because in the 
future we might also have more non-BLAS operations such as sorting or 
FFT - maybe we use .operations() to better reflect the operations table?
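
A minimal, self-contained sketch of the registry idea, assuming each backend is described by a small struct holding a name and a GEMM function pointer. The types `blas_backend` and `matrix`, the functions `register_blas()` and `prod()`, and the two toy GEMM routines are all hypothetical; only the name blas_implementations_available() is taken from the snippet above, and its real return type is still open.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch; not the actual ViennaCL interface.
namespace sketch {

// Assumed slot signature: C = A * B for row-major n x n matrices.
typedef void (*gemm_fn)(std::size_t n, const float* A, const float* B, float* C);

// One entry of the operations table; more slots (FFT, sorting, ...) could
// be added here later.
struct blas_backend {
  std::string name;
  gemm_fn gemm;
};

// Two toy implementations standing in for e.g. the host backend and CBlas.
inline void naive_gemm(std::size_t n, const float* A, const float* B, float* C) {
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (std::size_t k = 0; k < n; ++k) acc += A[i * n + k] * B[k * n + j];
      C[i * n + j] = acc;
    }
}

inline void ikj_gemm(std::size_t n, const float* A, const float* B, float* C) {
  for (std::size_t i = 0; i < n * n; ++i) C[i] = 0.0f;
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t k = 0; k < n; ++k) {
      const float a = A[i * n + k];
      for (std::size_t j = 0; j < n; ++j) C[i * n + j] += a * B[k * n + j];
    }
}

// Central registry; the function-local static avoids initialization-order
// problems across translation units.
inline std::vector<blas_backend>& registry() {
  static std::vector<blas_backend> backends;
  return backends;
}

inline void register_blas(const std::string& name, gemm_fn f) {
  blas_backend b = {name, f};
  registry().push_back(b);
}

inline const std::vector<blas_backend>& blas_implementations_available() {
  return registry();
}

struct matrix {
  std::size_t n;
  std::vector<float> data;  // row-major n x n
  blas_backend ops;         // copy of the currently selected backend

  explicit matrix(std::size_t size) : n(size), data(size * size, 0.0f), ops() {}

  // A single member name for every backend.
  void blas(const blas_backend& b) { ops = b; }
};

// Dispatches through whatever backend A currently uses.
inline matrix prod(const matrix& A, const matrix& B) {
  assert(A.ops.gemm != nullptr && A.n == B.n);
  matrix C(A.n);
  A.ops.gemm(A.n, A.data.data(), B.data.data(), C.data.data());
  return C;
}

}  // namespace sketch
```

After registering the backends once, a benchmark can loop over `sketch::blas_implementations_available()`, call `A.blas(*it)` and time `sketch::prod(A, B)` without any per-backend branching. Renaming `blas` to `operations` would not change the structure.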

---

It seems to me that this is going in a very fruitful direction. Any
objections to pushing and extending this for the 1.6.0 release? 1.5.0 is
essentially done; I'm currently writing the last bits of documentation
and resolving some minor warnings on Visual Studio...

Best regards,
Karli


_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel
