Hey,
> There is some trickery going on with transpositions and layout, but it
> works for every transpose/layout combination. One can also link A's blas
> to his own gemm function, provided a tiny wrapper (essentially to ensure
> signa…
Hi,
A short update: I've implemented linkage to CBLAS and CuBLAS with dynamic
selection.
If activated through VIENNACL_WITH_CUBLAS, one can go back and forth
between CuBLAS and the original backend by doing:

A.blas().gemm(NULL);
A.blas().gemm(viennacl::backend::blas::cublas_functions::gemm);

(a…
Hi,
2013/12/15 Karl Rupp:
> Yeah, it certainly is a bit tedious. Feel free to only do this for
> matrix-matrix multiplications for now; a full operation table is
> presumably too much of a refactoring for ViennaCL 1.x.y, but much
> better suited for 2.0.0.

Yes. It's actually a pretty complicated…
Hey,
> I agree. However, it seems to me that setting the implementation for
> each matrix would end up being tedious... One table per memory backend
> seems to make sense conceptually to me, since the performance (and the
> portability) of each blas implementation is determined by the underlying…
Hi,
> I've just realized that most BLAS implementations don't provide any way
> to do strided matrix accesses in the non-leading dimension...! Is this
> correct?

Yes, this is correct. It probably wasn't considered important enough back
then, when memory bandwidths were still high compared to…
Hey again,
While we're at it, let's discuss the dynamic dispatching mechanism we'd
ideally want. I see two options:

(1) A global function pointer table. So, one could for example set:
viennacl::internal_blas::sgemv_ptr = &viennacl::cblas_wrapper;
where cblas_wrapper essentially checks for the str…
Hello,
I've just realized that most BLAS implementations don't provide any way to
do strided matrix accesses in the non-leading dimension...! Is this correct?
I was hoping that we could have avoided such special cases, but it seems
like a couple of tests will need to be made.
Philippe
Hey,
> Okay. I'll probably do it statically at first, and I'll keep in mind
> that we want it dynamic at the end of the day (well, not at the end of
> today :D). Once everything works statically, I think we can discuss the
> details of the API we want.

Fine with me. This way we can first collect…
Hey,
Okay. I'll probably do it statically at first, and I'll keep in mind that
we want it dynamic at the end of the day (well, not at the end of today
:D). Once everything works statically, I think we can discuss the details
of the API we want.
Best regards,
Philippe
Hi,
> Okay, so I am writing a little wrapper in order to allow for CBLAS
> (OpenBLAS, and possibly MKL, provided that the headers are compatible)
> linking.
> I started wondering: why not do the same thing for CuBLAS?

Yes, CuBLAS and clAmdBlas both make sense, even if we only use it for
comp…
PS: Of course, since BLAS only covers floating-point arithmetic, the
current ViennaCL kernels would be used for integer types even under such
linkage.
Hello,
Okay, so I am writing a little wrapper in order to allow for CBLAS
(OpenBLAS, and possibly MKL, provided that the headers are compatible)
linking.
I started wondering: why not do the same thing for CuBLAS? It seems like
our efforts are getting more focused on dynamic code generation and…