Barry Smith <[email protected]> writes:

> For the default PCMG using PCREDUNDANT on the coarse level with LU
>
> VecScatterBegin 2356 1.0 1.6390e+01 4.7 0.00e+00 0.0 5.9e+09 1.7e+01
> 0.0e+00 2 0 98 99 0 2 0 98 99 0 0
> VecScatterEnd 2356 1.0 4.1647e+02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 69 0 0 0 0 69 0 0 0 0 0

433/2356 = 0.18 seconds per scatter.

> so the gathering together of the vector values from all processes to the one
> process is killing performance (presumably it is just using the "default"
> VecScatter so sending individual messages to all processes. Terribly slow. If
> the VecScatter were smart enough to switch to an alltoall here it would
> actually help an enormous amount).
>
> ---------
>
> For the DMDAREPART the two sets of VecScatter is still killing you
>
> VecScatterBegin 2907 1.0 2.7453e-02 2.1 0.00e+00 0.0 9.2e+07 7.6e+02
> 0.0e+00 0 0 92 99 0 0 0 98 99 0 0
> VecScatterEnd 2907 1.0 1.8748e-01 3.3 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 2 0 0 0 0 2 0 0 0 0 0

73 µs per scatter

> VecScatterBegin 1393 3.0 3.2119e-02112.6 0.00e+00 0.0 5.9e+06 3.2e+01
> 0.0e+00 0 0 6 0 0 0 0 99 99 0 0
> VecScatterEnd 1393 3.0 3.2946e-01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 4 0 0 0 0 99 0 0 0 0 0

259 µs per scatter

These scatter costs don't seem that bad.

> Ideas on how to proceed? From anyone?

We actually have a lot of irregular/high-degree communication patterns in
DMPlex's use of SF. I think it would be valuable to add an analysis component
that builds a good mapping to the communication primitives. It's not clear to
me that it's needed here at the present scale.
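For anyone who wants to check the per-scatter figures quoted above, here is a minimal sketch of the arithmetic. It assumes each figure is the VecScatterBegin time plus the VecScatterEnd time (the maximum over ranks, as in the usual -log_view time column) divided by the call count. The labels in the dictionary and the helper name seconds_per_scatter are made up for illustration; only the numbers come from the logs.

```python
# Call counts and times (seconds) copied from the log lines quoted in this message.
# The labels are illustrative only; they do not appear in the logs.
runs = {
    "PCREDUNDANT coarse solve": {"count": 2356, "begin": 1.6390e+01, "end": 4.1647e+02},
    "DMDAREPART, first scatter": {"count": 2907, "begin": 2.7453e-02, "end": 1.8748e-01},
    "DMDAREPART, second scatter": {"count": 1393, "begin": 3.2119e-02, "end": 3.2946e-01},
}


def seconds_per_scatter(count, begin, end):
    """Return (Begin time + End time) / number of calls.

    Assumption: summing the two logged times gives the total scatter cost.
    """
    return (begin + end) / count


per_scatter = {name: seconds_per_scatter(**r) for name, r in runs.items()}

for name, t in per_scatter.items():
    print(f"{name}: {t:.3g} s per scatter")
```

Under that assumption the PCREDUNDANT scatter costs roughly three orders of magnitude more per call than either DMDAREPART scatter, which is consistent with the remark that the latter costs do not look bad.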
