Hi,

Besides the available optimised CPU-based linear algebra libraries, has Dolfin been profiled with a GPU-based BLAS, such as CUBLAS [1]? In this specific case, given the right hardware, switching to the GPU-based back end is trivial, as CUBLAS comes with the exact interface of standard BLAS.
Should we expect a significant boost in Dolfin performance, or are there other notable non-BLAS bottlenecks in Dolfin?

-Ali

[1] http://developer.download.nvidia.com/compute/cuda/1_0/CUBLAS_Library_1.0.pdf

_______________________________________________
DOLFIN-dev mailing list
[email protected]
http://www.fenics.org/mailman/listinfo/dolfin-dev
