On Thu, Dec 31, 2020 at 5:23 PM Jed Brown <j...@jedbrown.org> wrote:

> Stefano Zampini <stefano.zamp...@gmail.com> writes:
>
> > You should swap fieldsplit and ASM
> >
> > -pc_type fieldsplit
> > -fieldsplit_0_pc_type asm
>
> Note that this incurs separate communication for each split. If you nest
> them the other way, there would be one heavy communication and then a bunch
> of local work. This latency impact may not matter as much when you're
> already launching GPU kernels for the local work, though the communication
> to/from device memory is also expensive.
>

This is one processor, but I don't know what you mean by 'the other way'. I
did try making one field with a 10D vector but DM died on that.

I am now thinking that I could assemble factorization matrices at the same
time as the operator matrix, on the GPU, and then stuff them into ASM, or
factor them first and then stuff them in if ASM cannot drive parallel GPU
factoring. I hope that we can come up with a coherent model, but I do want
to get this done in early April for SC.

Thanks,
