Hey,
2013/12/15 Karl Rupp <r...@iue.tuwien.ac.at>
> Hi again,
>
>
>> While we're at it, let's discuss the dynamic dispatching mechanism we'd
>> ideally want. I see two options:
>>
>> (1) A global function pointer table. So, one could for example set:
>> viennacl::internal_blas::sgemv_ptr = &viennacl::cblas_wrapper;
>> where cblas_wrapper essentially checks for the stride in the non-leading
>> dimension and forwards to cblas if this stride is one. Of course, if the
>> current backend is different, cblas_wrapper is not defined, and
>> cublas_wrapper can be defined instead.
>>
>
> I'd prefer to have this function table per object or per memory backend
> rather than being global, otherwise this will sooner or later bite us in a
> multi-threaded setting. We (or a user) might want to use one implementation
> of a certain operation for smaller or skinny matrices and other
> implementations for larger/square matrices, in which case things are much
> easier if tied to the particular object.
>
>
I agree. However, it seems to me that setting the implementation for each
matrix would end up being tedious. One table per memory backend seems to
make sense conceptually to me, since the performance (and the portability)
of each BLAS implementation is determined by the underlying memory system.
If there is no objection, I think I will go for that neat solution.

Now, another question: how do we set the default? I think that a preprocessor
directive would be fine here. We already need the preprocessor's #ifdef to
select the includes (and some wrappers) anyway, so using it to initialize
that table seems reasonable to me (i.e. VIENNACL_WITH_CBLAS would enable some
internal definitions and would initialize the table).
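
A rough sketch of how such a per-backend table with a preprocessor-selected
default could look. Only VIENNACL_WITH_CBLAS and the name cblas_wrapper come
from this thread; the namespace, the types, the function signature and the
backend enum are hypothetical and are not existing ViennaCL API. Keeping the
table as a static local of an inline function is one possible way to stay
header-only without multiple-definition problems.

```cpp
#include <cassert>
#include <cstddef>

#ifdef VIENNACL_WITH_CBLAS
#include <cblas.h>
#endif

namespace sketch {  // hypothetical namespace, not part of ViennaCL

// y = A * x for a row-major (rows x cols) matrix A.
// lda:   distance between the starts of two consecutive rows
// inc_a: stride between consecutive entries within a row
typedef void (*sgemv_fn)(std::size_t rows, std::size_t cols,
                         const float* A, std::size_t lda, std::size_t inc_a,
                         const float* x, float* y);

// Plain fallback that copes with any stride.
inline void generic_sgemv(std::size_t rows, std::size_t cols,
                          const float* A, std::size_t lda, std::size_t inc_a,
                          const float* x, float* y) {
  for (std::size_t i = 0; i < rows; ++i) {
    float acc = 0.0f;
    for (std::size_t j = 0; j < cols; ++j)
      acc += A[i * lda + j * inc_a] * x[j];
    y[i] = acc;
  }
}

#ifdef VIENNACL_WITH_CBLAS
// Forwards to CBLAS only if the stride within a row is one.
inline void cblas_wrapper(std::size_t rows, std::size_t cols,
                          const float* A, std::size_t lda, std::size_t inc_a,
                          const float* x, float* y) {
  if (inc_a != 1) { generic_sgemv(rows, cols, A, lda, inc_a, x, y); return; }
  cblas_sgemv(CblasRowMajor, CblasNoTrans,
              static_cast<int>(rows), static_cast<int>(cols),
              1.0f, A, static_cast<int>(lda), x, 1, 0.0f, y, 1);
}
#endif

struct blas_table { sgemv_fn sgemv; };  // one slot per routine; only sgemv here

enum memory_backend { MAIN_MEMORY = 0, CUDA_MEMORY, OPENCL_MEMORY, BACKEND_COUNT };

// The preprocessor picks the default entry for host memory.
// Other backends start empty (0) in this sketch.
inline blas_table default_table(memory_backend b) {
  blas_table t;
  t.sgemv = 0;
  if (b == MAIN_MEMORY) {
#ifdef VIENNACL_WITH_CBLAS
    t.sgemv = &cblas_wrapper;
#else
    t.sgemv = &generic_sgemv;
#endif
  }
  return t;
}

// One table per memory backend, shared by all translation units.
inline blas_table& table_for(memory_backend b) {
  static blas_table tables[BACKEND_COUNT] = {
    default_table(MAIN_MEMORY), default_table(CUDA_MEMORY), default_table(OPENCL_MEMORY)
  };
  return tables[b];
}

// Dispatch through whatever is currently registered for the backend.
inline void sgemv(memory_backend b, std::size_t rows, std::size_t cols,
                  const float* A, std::size_t lda, std::size_t inc_a,
                  const float* x, float* y) {
  sgemv_fn f = table_for(b).sgemv;
  assert(f != 0 && "no sgemv registered for this backend");
  f(rows, cols, A, lda, inc_a, x, y);
}

}  // namespace sketch
```

A user who wants to benchmark their own routine would then write something
like sketch::table_for(sketch::MAIN_MEMORY).sgemv = &my_sgemv; before calling
sketch::sgemv(...). The table is still shared mutable state per backend, so
swapping entries while other threads are dispatching would need some form of
synchronization.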
Best regards,
Philippe
>
>> I like this solution a lot, since this allows one to mix multiple BLAS
>> implementations in the same program. This can be useful in some cases
>> (OpenBLAS is faster than MKL for BLAS3, but MKL is supposedly faster for
>> all the rest). HOWEVER, this requires linkage if we want to avoid
>> multiple definitions of that global pointer table.
>>
>
> That's another reason why it shouldn't be global ;-)
>
>
>> Since we now provide
>> a libviennacl.so, though, we could include the global table therein, and
>> one would link with it if he wants to use the additional
>> functionality. Plus, if one has his own BLAS function he wants to
>> benchmark against ours, for example, then this solution is very
>> convenient.
>>
>
> The shared library is available in addition to the header-only
> implementation, it's not compulsory. We might change that for ViennaCL
> 2.0.0, but 1.x.y will stay header-only.
>
>
>
>> (2) A template parameter. So that one would write:
>> viennacl::prod<CBlasBackend>(), similarly to how I did with UMinTL.
>> However, I am not very fond of this solution for ViennaCL, because it
>> will create a huge bloat in the code, since templates essentially need
>> to propagate, and it might screw up a bit the template deduction
>> mechanism of some compiler (since prod<> is already templated with the
>> underlying ScalarType...)
>>
>
> Same here, I consider this to be a wrong use of templates for the reasons
> you mentioned. Fortunately we don't have to worry about performance for
> something tiny like 3x3-matrices, so a bit of runtime logic is not an issue.
>
> Best regards,
> Karli
>
>
_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel