Matthew Knepley <knep...@gmail.com> writes:

>> The matrix entries are multiplied by 2, that is, the number of processes
>> used to execute the code.
>>
>
> No. This was mostly intended for GPUs, where there is 1 process. If you
> want to use multiple MPI processes, then each process can only introduce
> some disjoint subset of the values. This is also how MatSetValues() works,
> but it might not be as obvious.

They need not be disjoint; the contributions just have to sum to the expected
values. This interface is very convenient for FE and FV methods. MatSetValues
with ADD_VALUES has similar semantics without the intermediate storage, but it
forces you to submit one element matrix at a time. It is the classic tradeoff
between parallelism granularity and memory use: MatSetValuesCOO is a clear win
on GPUs, and the choice is more nuanced on CPUs.
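
To make the summation semantics concrete, here is a minimal sketch in plain
Python. It is not PETSc code: assemble_coo, rank0, and rank1 are hypothetical
names, and the dictionary only mimics the rule that repeated (row, col)
entries are added together, whichever process submitted them.

```python
from collections import defaultdict


def assemble_coo(contributions):
    """Sum (row, col, value) triples from every rank; repeated entries add."""
    matrix = defaultdict(float)
    for rank_triples in contributions:
        for row, col, value in rank_triples:
            matrix[(row, col)] += value
    return dict(matrix)


# Two hypothetical ranks, each submitting one 1D linear element whose
# element matrix is [[1, -1], [-1, 1]]. The elements share node 1.
rank0 = [(0, 0, 1.0), (0, 1, -1.0), (1, 0, -1.0), (1, 1, 1.0)]
rank1 = [(1, 1, 1.0), (1, 2, -1.0), (2, 1, -1.0), (2, 2, 1.0)]

# Overlapping but correct: entry (1, 1) receives one contribution per rank.
overlapping = assemble_coo([rank0, rank1])

# Incorrect usage: both ranks submit the full set of triples, so every
# entry is counted once per rank and the matrix comes out doubled.
full = rank0 + rank1
doubled = assemble_coo([full, full])
```

In the first call the ranks overlap at entry (1, 1) and the result is still the
intended matrix. In the second call each rank submits everything, which
reproduces the factor of 2 reported above.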
