Hey,

2014/1/22 Karl Rupp <r...@iue.tuwien.ac.at>

> Hey,
>
>
>> So today I went back to ViennaCL. I tried to move the equivalence
>> column&trans <=> row&notrans upwards in the dispatching mechanism but it
>> turns out to be impossible, because matrix<T,row_major> is not (and
>> should not be) convertible to matrix<T, column_major>, rendering the
>> underlying signature inappropriate...
>>
>
> yep, having the storage layout tag explicitly encoded in the type is
> something I'm not at all happy with. It is certainly useful in uBLAS, but
> it doesn't quite fly with what we want to do.
>
>
>
>> I am quite unsure how to handle this problem. I am thinking
>> about changing the internal signature to something more "low-level",
>>
>> void gemm(bool /*is_A_trans*/, bool /*is_B_trans*/,
>>           const vcl_size_t /*M*/, const vcl_size_t /*N*/,
>>           const vcl_size_t /*K*/, const T /*alpha*/,
>>           viennacl::backend::mem_handle const & /*A*/,
>>           const vcl_size_t /*A_internal_size1*/,
>>           const vcl_size_t /*A_internal_size2*/,
>>           const vcl_size_t /*A_start1*/, const vcl_size_t /*A_start2*/,
>>           const vcl_size_t /*A_inc1*/, const vcl_size_t /*A_inc2*/,
>>           viennacl::backend::mem_handle const & /*B*/,
>>           const vcl_size_t /*B_internal_size1*/,
>>           const vcl_size_t /*B_internal_size2*/,
>>           const vcl_size_t /*B_start1*/, const vcl_size_t /*B_start2*/,
>>           const vcl_size_t /*B_inc1*/, const vcl_size_t /*B_inc2*/,
>>           const T /*beta*/,
>>           viennacl::backend::mem_handle & /*C*/,
>>           const vcl_size_t /*C_internal_size1*/,
>>           const vcl_size_t /*C_internal_size2*/,
>>           const vcl_size_t /*C_start1*/, const vcl_size_t /*C_start2*/,
>>           const vcl_size_t /*C_inc1*/, const vcl_size_t /*C_inc2*/);
>>
>
> Yes, this is a good option. I think we can reduce the many arguments by
> wrapping the information into a common struct, e.g.
>
>  struct backend_matrix
>  {
>    viennacl::backend::mem_handle h_;
>    vcl_size_t internal_size_1_;
>    ...
>    bool is_trans_;
>  };
> and only pass
>  gemm(alpha, A, B, beta, C);
> rather than a lengthy list of parameters that would drive us crazy.
> Note that this is also more in line with the object-oriented C-interface in
> the shared libviennacl library.
>
> Anyhow, the simplification of gemm() and friends is similar to what I
> already had in mind for reducing compilation times and the code required
> for 'worker functions' in the respective backends. Right now we have
> function overloads for transposed/non-transposed matrix parameters in each
> of the backends, which is clunky and duplicates quite some code. Also, such
> a more BLAS-like interface makes it easier for us to hook in other
> libraries as backends :-)
>
>
>
>> While this solution is acceptable to me, I fear that it will introduce a
>> lack of harmony, considering that some other functions will keep their
>> current signatures, like
>>
>> template <typename NumericT, typename F, typename ScalarType1>
>> void am(matrix_base<NumericT, F> & mat1,
>>         matrix_base<NumericT, F> const & mat2,
>>         ScalarType1 const & alpha, vcl_size_t len_alpha,
>>         bool reciprocal_alpha, bool flip_sign_alpha)
>>
>
> We can (and probably should) migrate them in the same manner.
>
>
>
>> The only reasonable solution I see is to clearly separate in the code
>> the functions which could be linked with BLAS (and give them a lower
>> level signature), from the other ones. For example, putting them in two
>> separate files... is there any problem with doing this?
>>
>
> We should take this opportunity to further decouple the convenience API
> from the actual numerical kernels. This would also pave the way for
> ViennaCL 2.0.0, where I'd like the C++ API to be sitting entirely on top of
> the shared libviennacl (and/or other BLAS backends if enabled). Of course
> this won't be done within a few days, but I think this is a great long-term
> perspective :-)
>
> Does this make sense to you?
>

Yes it does! Ideally, ViennaCL would by default link to the integrated set
of numerical kernels (those of libviennacl, which would be generated
dynamically for the OpenCL backend), while allowing one to switch the
backend to MKL/OpenBLAS/cuBLAS/FunFunFunBLAS... The only "obstacle" is that
the set of kernels supported by ViennaCL is larger than the standard BLAS
interface.
I have been itching to do this, but I hesitated because it involves
significant changes. I'll start working on it in my "external-blas_linking"
folder.
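
To make the descriptor-based gemm(alpha, A, B, beta, C) from the quoted
discussion concrete, here is a minimal host-only sketch. Everything in it is
an assumption for illustration: the field names beyond those quoted above, the
use of a raw double pointer in place of viennacl::backend::mem_handle, and the
helpers rows(), cols(), entry() and as_row_major(). It is not ViennaCL's
actual interface. It shows how the equivalence column&trans <=> row&notrans
becomes a plain runtime rewrite of the descriptor once the layout is no longer
part of the type.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical descriptor; a raw pointer stands in for the memory handle.
struct backend_matrix
{
  double *    data;
  std::size_t size1, size2;                    // rows/cols as stored (before transposition)
  std::size_t internal_size1, internal_size2;  // padded rows/cols
  std::size_t start1, start2;                  // offsets of the sub-matrix
  std::size_t inc1, inc2;                      // strides of the sub-matrix
  bool        row_major;
  bool        is_trans;
};

inline std::size_t rows(backend_matrix const & A) { return A.is_trans ? A.size2 : A.size1; }
inline std::size_t cols(backend_matrix const & A) { return A.is_trans ? A.size1 : A.size2; }

// Element (i, j) of op(A), where op is the identity or the transpose.
inline double & entry(backend_matrix const & A, std::size_t i, std::size_t j)
{
  if (A.is_trans) std::swap(i, j);
  std::size_t r = A.start1 + i * A.inc1;
  std::size_t c = A.start2 + j * A.inc2;
  return A.row_major ? A.data[r * A.internal_size2 + c]
                     : A.data[c * A.internal_size1 + r];
}

// column-major & trans  <=>  row-major & not trans (and vice versa):
// the same memory is reinterpreted, no data is moved.
inline backend_matrix as_row_major(backend_matrix A)
{
  if (A.row_major) return A;
  std::swap(A.size1, A.size2);
  std::swap(A.internal_size1, A.internal_size2);
  std::swap(A.start1, A.start2);
  std::swap(A.inc1, A.inc2);
  A.row_major = true;
  A.is_trans  = !A.is_trans;
  return A;
}

// Reference implementation of C = alpha * op(A) * op(B) + beta * C.
inline void gemm(double alpha, backend_matrix const & A, backend_matrix const & B,
                 double beta, backend_matrix & C)
{
  assert(cols(A) == rows(B) && rows(C) == rows(A) && cols(C) == cols(B));
  for (std::size_t i = 0; i < rows(C); ++i)
    for (std::size_t j = 0; j < cols(C); ++j)
    {
      double acc = 0;
      for (std::size_t k = 0; k < cols(A); ++k)
        acc += entry(A, i, k) * entry(B, k, j);
      double & c = entry(C, i, j);
      c = alpha * acc + beta * c;
    }
}
```

With such a descriptor, a backend that only implements row-major kernels can
call as_row_major() on each operand before dispatching, instead of needing a
separate overload for every layout/transposition combination.
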
There is still a dilemma, however, that I would like to sort out if
possible: for OpenCL, we have a set of pre-generated kernel sources (for
compilation time reasons). However, because they are pre-generated, they
cannot easily be coupled to the generator, and their performance is not
optimal. We've seen that even for a simple axpy operation, the bandwidth
may vary greatly if the parameters are not properly tuned (for CPUs and AMD
GPUs in particular). Wouldn't it make sense to default everything to the
generator, and to offer a "VIENNACL_WITH_STATIC_OPENCL" flag for the
pre-generated kernels? In the end, this would make the shared libviennacl
an alternative to GATLAS... Plus, I'm not convinced that this would have a
huge impact on the C++ compilation time, since most of the workload only
occurs the first time the generator is instantiated.
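
The "pay only on first use" argument can be sketched as a small source cache.
This is a hedged illustration, not the ViennaCL generator: axpy_profile,
kernel_source_cache and the generated kernel text are hypothetical, and
VIENNACL_WITH_STATIC_OPENCL is used only in the sense proposed above (it
selects a fixed, untuned source instead of a generated one).

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical tuning parameters for an axpy kernel.
struct axpy_profile
{
  unsigned vector_width;     // 1, 2, 4, ...
  unsigned work_group_size;
};

class kernel_source_cache
{
public:
  // Returns the OpenCL source for the given profile, generating it only once.
  std::string const & axpy(axpy_profile const & p)
  {
    std::ostringstream key;
    key << "axpy_v" << p.vector_width << "_g" << p.work_group_size;
    std::map<std::string, std::string>::iterator it = cache_.find(key.str());
    if (it == cache_.end())
    {
      ++generated_;  // the generation cost is paid here, on first use only
      it = cache_.insert(std::make_pair(key.str(), generate(p))).first;
    }
    return it->second;
  }

  std::size_t generated() const { return generated_; }

private:
  static std::string generate(axpy_profile const & p)
  {
#ifdef VIENNACL_WITH_STATIC_OPENCL
    (void)p;  // static mode: one fixed scalar kernel, no tuning
    return "__kernel void axpy(float alpha, __global const float * x,"
           " __global float * y, unsigned int n)"
           " { for (unsigned int i = get_global_id(0); i < n;"
           " i += get_global_size(0)) y[i] += alpha * x[i]; }";
#else
    std::ostringstream type;
    type << "float";
    if (p.vector_width > 1) type << p.vector_width;  // e.g. float4
    std::ostringstream src;
    src << "__kernel __attribute__((reqd_work_group_size("
        << p.work_group_size << ",1,1)))\n"
        << "void axpy(float alpha, __global const " << type.str() << " * x,"
        << " __global " << type.str() << " * y, unsigned int n)\n"
        << "{ for (unsigned int i = get_global_id(0); i < n;"
        << " i += get_global_size(0)) y[i] += alpha * x[i]; }\n";
    return src.str();
#endif
  }

  std::map<std::string, std::string> cache_;
  std::size_t generated_ = 0;
};
```

Repeated requests for the same profile hit the cache, so the generator cost
scales with the number of distinct tuned kernels rather than with the number
of calls.
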

Best regards,
Philippe




> Best regards,
> Karli
>
>
>