On Tue, Nov 23, 2010 at 10:32, Clemens Domanig <clemens.domanig at uibk.ac.at> wrote:

> I'm writing a FEM-program and at the moment I try to parallelize the assembly
> of the stiffness-matrix. The idea would be if I for example use 4 threads I
> create 4 matrices and 4 rhs-vectors. Each thread fills its matrix and
> rhs-vector. At the end I add all matrices and vectors.
>

Adding sparse matrices together is a relatively expensive operation, usually more expensive than assembling the matrix you wanted in the first place.

> But the PETSc-manual says that PETSc is not thread-safe.
>

There is a minor issue of some logging functions that are not thread-safe. Wrapping locks around some of those operations would not be a huge deal, but it hasn't been done.

> Is that the reason why it doesn't work?
>

The matrix data structures cannot cheaply be mutated in a thread-safe way. We would need either fine-grained locks, which have more overhead and are very tricky when the user does not preallocate correctly (requiring "rollback" logic), or a coarse-grained lock around the whole of MatSetValues.

If you want to do multi-threaded assembly, you should just put your own lock around your call to MatSetValues. As long as your physics does some work (e.g. is a bit more than a linear constant-coefficient problem on affine elements), this can work fine for a few threads, usually 8-16 or so, before serialization of MatSetValues becomes a bottleneck.

Note that you can also use multiple MPI processes to keep your cores busy; there have been a few discussions on this list, and on petsc-dev, about the relative merits of threads versus processes. If you tell us a bit about your problem, we may be able to predict how each will perform. This helps to assess the importance of making PETSc thread-safe and making certain kernels perform well with threads, relative to other features.

Jed
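
For illustration, here is a minimal sketch of the coarse-grained locking pattern described above, using POSIX threads. It does not use PETSc: the dense array K and the helper add_values are hypothetical stand-ins for a Mat and for a MatSetValues call that adds into it, and the 1D mesh of two-node elements with a fixed element matrix is made up for the example. The point is only the structure: each thread computes its element matrices without holding the lock, and takes the lock just for the insertion.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define N_NODES     64               /* hypothetical 1D mesh */
#define N_ELEMS     (N_NODES - 1)    /* two-node elements */
#define MAX_THREADS 16

/* Stand-in for the global matrix; in a PETSc code this would be a Mat. */
static double K[N_NODES][N_NODES];
static pthread_mutex_t K_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for the insertion call (MatSetValues in PETSc).
 * Like that call, it is NOT thread-safe on its own. */
static void add_values(const int idx[2], const double ke[2][2])
{
    for (int i = 0; i < 2; i++) {
        assert(idx[i] >= 0 && idx[i] < N_NODES);
        for (int j = 0; j < 2; j++)
            K[idx[i]][idx[j]] += ke[i][j];
    }
}

typedef struct { int first, last; } ElemRange;

static void *assemble_range(void *arg)
{
    const ElemRange *r = arg;
    for (int e = r->first; e < r->last; e++) {
        /* Element computation runs unlocked, in parallel. A real code
         * would do quadrature here; this sketch uses a fixed matrix. */
        const double ke[2][2] = { { 1.0, -1.0 }, { -1.0, 1.0 } };
        const int idx[2] = { e, e + 1 };

        /* Only the insertion into the shared matrix is serialized. */
        pthread_mutex_lock(&K_lock);
        add_values(idx, ke);
        pthread_mutex_unlock(&K_lock);
    }
    return NULL;
}

/* Assemble K with nthreads threads; returns 0 on success, -1 on error. */
int assemble_threaded(int nthreads)
{
    pthread_t tid[MAX_THREADS];
    ElemRange range[MAX_THREADS];
    int created = 0, status = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;
    memset(K, 0, sizeof K);

    for (int t = 0; t < nthreads; t++) {
        /* Contiguous block of elements per thread. */
        range[t].first = t * N_ELEMS / nthreads;
        range[t].last  = (t + 1) * N_ELEMS / nthreads;
        if (pthread_create(&tid[t], NULL, assemble_range, &range[t]) != 0) {
            status = -1;
            break;
        }
        created++;
    }
    for (int t = 0; t < created; t++)
        pthread_join(tid[t], NULL);
    return status;
}
```

Calling assemble_threaded(4) fills K from four threads. The less work each element does outside the lock, the sooner the serialized insertion dominates, which is the bottleneck mentioned above.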
