Hello Karli,
Here is a problem I am facing:

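I have an expression (X - y*c').^2.rowwise().sum() (in Eigen-like notation),
i.e. for each row of X, the sum of squared differences from c. Here y is a
vector of ones and c is the current cluster point (vcl_Ones and currCluster in
the code below).
X is a row-major matrix with rows >> cols.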
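
To make the intended result unambiguous, here is a minimal plain C++ sketch of what the expression computes, written with explicit loops. The function name rowwise_squared_distance and the use of std::vector<double> are assumptions for illustration only; they are not taken from Eigen, ViennaCL, or my actual code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not part of Eigen or ViennaCL).
// X holds a rows x cols matrix in row-major order, c has cols entries.
// Returns s with s[i] = sum over j of (X(i,j) - c[j])^2.
std::vector<double> rowwise_squared_distance(const std::vector<double>& X,
                                             std::size_t rows,
                                             std::size_t cols,
                                             const std::vector<double>& c)
{
    std::vector<double> s(rows, 0.0);
    for (std::size_t i = 0; i < rows; ++i) {
        double acc = 0.0;
        for (std::size_t j = 0; j < cols; ++j) {
            // Row-major layout: element (i, j) lives at index i * cols + j.
            const double d = X[i * cols + j] - c[j];
            acc += d * d;
        }
        s[i] = acc;
    }
    return s;
}
```

With cols = 3 this is only a handful of arithmetic operations per row, so the whole operation is dominated by reading X once.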

I initialized all data following the procedure given in custom_context.cpp.
This is the code that pertains to ViennaCL:

`viennacl::matrix<ScalarType, viennacl::row_major> vcl_X(bufPoints(), temp.rows(), temp.cols());`
`viennacl::vector<ScalarType> vcl_Ones(bufOnes(), temp.rows());`
`viennacl::vector<ScalarType> vcl_Ones2(testOnes(), cols);`

`viennacl::vector<ScalarType> currCluster(testPoint(), cols);`
`viennacl::vector<ScalarType> vcl_s1 = viennacl::linalg::prod(viennacl::linalg::element_pow(vcl_X - viennacl::linalg::outer_prod(vcl_Ones, currCluster), 2.0), vcl_Ones2);`

The time taken to execute this operation is 1.54 (data size: 1936*1216
rows and 3 columns).
This time excludes the data transfer to the GPU.
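
For anyone reproducing the comparison, a measurement of this kind could be taken roughly as follows. This is only a sketch: time_seconds is a hypothetical helper, and the choice of std::chrono::steady_clock and of seconds as the unit are assumptions, not a description of how the numbers in this post were obtained.

```cpp
#include <chrono>

// Hypothetical helper: runs f once and returns the elapsed wall-clock
// time in seconds, measured with a monotonic clock.
template <typename F>
double time_seconds(F&& f)
{
    const auto start = std::chrono::steady_clock::now();
    f();  // the work must have fully completed before this call returns
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

The callable passed in would wrap only the expression being timed, so that setup and data transfer stay outside the measured interval.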

Now, if I implement the same operation using Eigen on the CPU (without any
optimization), the time reported is 0.04754!
The results obtained by both approaches are the same. So what could be wrong
here?

Am I missing something here?
Sumit
_______________________________________________
ViennaCL-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/viennacl-support
