Re: [ViennaCL-support] Program Optimization

Karl Rupp Wed, 08 Jun 2016 11:30:52 -0700

Hi Alan,

 > Thanks for your prompt response yesterday.


you're welcome!


> I have been using ViennaCL for a number of months and only joined the
> support group yesterday ! I therefore have a back catologue of queries
> that I hope you can assist with.
>
> _Background _
> 1. writing a bundle least squares adjustment for my masters.
> The matrices I am solving have the following properties.
> 1. Design matrices are large and very sparse
> 2. The Normal matrice (I have seen this called the Hessian) is also
> large and sparse but has a systematic structure (with correct parameter
> arrangement
> - One partition is comprised of 3x3 blocks down the diagonal
> - One partition is effectively a solid block of data
> - The matrix is symetrical (not sure that is the right term). The upper
> triangular data is mirrored in the lower triangular part.
>
> _Matrix Formation
> _
>  From the ViennaCL help I interpreted that there is a overhead
> associated with computing on the CPU and then copying element by element
> to the GPU.
> Therefore I am using Eigen matrices to form the design matrices (which
> are sparse).
>
> I continue using Eigen to multiple the matrices to get the Normal Matrix
> (or Hessian). I have found that multipling sparse matrices is very
> efficient even on the CPU so have kept uses Eigen.

Are you sure that you want to form the normal equations? This usually 
squares your condition number, amplifying the effect of round-off errors.

I recommend using the QR factorization for least squares problems. You 
can find an example in examples/tutorial/least-squares.cpp (it currently 
requires Boost, but I'm working on a Boost-free version).
Depending on the sparsity of your matrices, a sparse QR factorization 
might be the best option. This, however, is not implemented in ViennaCL, 
as sparse QR runs better on CPUs. Maybe Eigen offers a sparse QR 
factorization via an external package.


>
> _Normal Equation solution_
>
> I then do the solution using ViennaCL.
>
> _Question 1._
>
> I am using the Levenberg-Marquardt method to assist in the solution.
> This applies a damping term down the main diagonal which assists in
> convergence and minimizing iterations.
> - Currently I apply this damping term by multiplying element by element
> down the diagonal.
>
> Is there a more efficient way to do this in ViennaCL (i.e. multiplying a
> scalar along the diagonal)?

Well, you could rewrite this manipulation for a matrix A as
  A_dampled = A - alpha D
where D is the diagonal of A. For matrix-vector products you could no 
compute
  Ax - alpha D x
for a vector x, without modifying the original A. Maybe you need other 
operations, where you could use the same trick.


> _Question 2._
>
> The Normal equation is sparse but the resultant Inverse is dense.
> I solve using         Norm_inv = viennacl::linalg::solve(Norm, I,
> upper_tag());   // I is the identity matrix
>
> The Normal inverse is required as it contains information required for
> statistical analysis of the solution.
>
> Is it possible to solve a sparse matrix and generate a dense matrix
> solution ?

One usually avoids to explicitly set up the inverse of a sparse matrix, 
because memory requirements may be huge. Instead, one uses iterative 
solvers for the right hand side(s). In your case, you could try CG if 
your initial matrix had full column rank.

> _Question 3.
> _
> What is the suggested solver to use for a symmetric matrix with the
> block structure I describe above.
> upper_tag() or cg_tag() ?

I still suggest a sparse QR factorization instead of dealing with Normal 
equations :-)


> _Question 4.
> _
> I have looked at the reorder function. However I am not sure whether I
> need this as I can control the matrix structure by organizing the
> parameter order in the design matrix.
>
> Just for my own interest I found bandwidth-reduction.cpp on the website.
>
> I cut out these three lines which I believe are the main function calls
>
> std::vector<int> r = generate_random_reordering(n);
> std::vector< std::map<int, double> > matrix2 = reorder_matrix(matrix, r);
> r = viennacl::reorder
> <http://viennacl.sourceforge.net/doc/namespaceviennacl.html#af22338e452cee008bee5e38070ddc611>(matrix2,
> viennacl::gibbs_poole_stockmeyer_tag
> <http://viennacl.sourceforge.net/doc/structviennacl_1_1gibbs__poole__stockmeyer__tag.html>());
>
> I don't understand how this works (I haven't looked up the map class).
> A. Does the original Matrix get rearranged ?
> B. How is the resultant Vector used elsewhere.
> C. Do you need to unorder later on ?
>     - I need to multiple the Inverse by another matrix to generate the
> final solution.
>     - I would need to unorder the inverse or reorder this second matrix
> the same way.
>     - I also need to know how to find the correct solution elements.

Don't bother about the reorder routines. They minimize the bandwidth of 
the sparsity pattern (i.e. they bring nonzero entries closer to the 
diagonal) and thus increase cache utilization, but don't affect solver 
performance much. I expect that it won't make a difference for 
least-squares problems.


> _Question 5.
> _
> I am currently using a computer with a AMD Radeon HD 7600M Series
> graphics card.
> My understanding is that I can only use OpenCL for GPU operations.
>
> I am considering getting a machine with a NVIDIA graphics card as I
> understand that this enable using CUDA.
>
> Is CUDA the most mature and best option for GPU processing (specifically
> for ViennaCL).

CUDA and OpenCL are both fully supported in ViennaCL. There are some 
minor differences for historical reasons, but they will disappear soon.
It doesn't matter which one you use on NVIDIA GPUs, so you can for 
simplicity just stick with OpenCL and enjoy cross-platform portability :-)


> I notice that ViennaCL has an open for VIENNACL_WITH_OPENMP define.
> I am not sure what OpenMP actually does but should this be enabled and I
> assume OpenMP installed ?

OpenMP is for multithreading on CPUs:
  https://en.wikipedia.org/wiki/OpenMP
You can ignore it if you're looking for GPU acceleration.

Best regards,
Karli




------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are 
consuming the most bandwidth. Provides multi-vendor support for NetFlow, 
J-Flow, sFlow and other flows. Make informed decisions using capacity 
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
ViennaCL-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/viennacl-support

Re: [ViennaCL-support] Program Optimization

Reply via email to