On Aug 8, 2013, at 9:14 AM, Karl Rupp <[email protected]> wrote:

> Hi Michael,
> 
> > We have recently been trying to re-align our OpenMP fork
> 
>> 2) Nonzero-based thread partitioning:
>> Rather than evenly dividing the number of rows among threads, we can
>> partition the thread ownership ranges according to the number of
>> non-zeros in each row. This balances the work load between threads and
>> thus increases strong scalability due to optimised bandwidth
>> utilisation. In general, this optimisation should integrate well with
>> threadcomms, since it only changes the thread ownership ranges, but it
>> does require some structural changes since nnz is currently not passed
>> to PetscLayoutSetUp. Any thoughts on whether people regard such a scheme
>> as useful would be greatly appreciated.
> 
> This is a reasonable optimization, I used a similar strategy for sparse 
> matrices on the GPU. Others should comment on whether the interface change to 
> PetscLayoutSetUp is acceptable.
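
For reference, the nonzero-based partitioning proposed above can be sketched as follows. This is only an illustration under stated assumptions: the matrix is held in CSR form with a row-pointer array, and the function name partition_rows_by_nnz and its arguments are hypothetical, not part of PETSc or threadcomm.

```c
/* Hypothetical helper (not a PETSc function): split rows [0, nrows) into
 * nthreads contiguous ranges holding roughly equal numbers of nonzeros.
 *
 * row_ptr : CSR row-pointer array of length nrows + 1 (assumed layout)
 * start   : output array of length nthreads + 1; thread t owns rows
 *           [start[t], start[t + 1])
 */
static void partition_rows_by_nnz(int nrows, const int *row_ptr,
                                  int nthreads, int *start)
{
    long nnz = (long)row_ptr[nrows] - row_ptr[0];
    int  row = 0;

    start[0] = 0;
    for (int t = 1; t < nthreads; t++) {
        /* Cumulative nonzero count at which thread t should begin. */
        long target = row_ptr[0] + (nnz * t) / nthreads;

        /* Advance to the first row whose offset reaches the target.
         * Rows are never split, so a very dense row can leave a
         * neighbouring thread with an empty range. */
        while (row < nrows && row_ptr[row] < target) row++;
        start[t] = row;
    }
    start[nthreads] = nrows;
}
```

With per-row nonzero counts 8, 1, 1, 1, 1, 4 and two threads, this gives thread 0 row 0 and thread 1 rows 1 to 5, eight nonzeros each, whereas an even split by rows would give 10 and 6.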

    I don't think PetscLayoutSetUp() should be complicated in this fashion.
These kinds of non-trivial parallel partitioning decisions, which depend on the
mesh or graph of the problem, are made by DMs.

    Barry

> 
