    I'm not sure any of us have experience with Amesos:SuperLU, so I'm not sure
    anyone will know right away what the problem may be.

I was wondering whether, while writing the wrappers and testing them out, someone managed to figure out the requisite combination of installation-time options.

I don't recall who wrote the wrappers and when. You might have to do some git-archeology to find out.


    * What happens if you run the program with just two MPI jobs on one machine?
    In that case, you can watch what the two programs are doing by having 'top'
    run in a separate window.

I ran a job with a direct solver from the beginning on one node with 72 processes (see job107396-flops graph) and it is evident that all except a few cores do a similar amount of flops. I guess some of the processes are meant to "coordinate the work" and not do much actual computation.

Too many processes. Try to get it down to two.


    * If the answer to the last question is yes, then either Amesos or SuperLU is
    apparently copying the data of the linear system from all other processes to
    just one process that then solves the linear system. It might be useful to
    take a debugger, running with just two MPI processes, to step into the Amesos
    routines to see if you get to a place where that is happening, and then to
    read the code in that place to see what flags need to be set to make sure the
    solution really does happen in a distributed way.

One would probably need to debug the code while running on at least two nodes. I do not have much experience with debugging MPI code. I will try to learn more about this.

There's a video lecture on that topic :-)
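
For what it's worth, one commonly used way to get a debugger into a specific MPI process is to have each process print its PID and then wait until a debugger attaches. The following is only a minimal sketch of that idea, not something taken from deal.II or Trilinos: the function name `wait_for_debugger` and the environment variable `WAIT_FOR_DEBUGGER` are made up for illustration, and the code assumes a POSIX system.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <iostream>
#include <thread>
#include <unistd.h> // getpid(), gethostname(): POSIX only

// Hypothetical helper, not part of deal.II, Trilinos, or MPI.
// If the (made-up) environment variable WAIT_FOR_DEBUGGER is set, print the
// host name and PID of this process and spin until a debugger sets
// `keep_waiting` to false. Returns true if it waited, false otherwise.
bool wait_for_debugger()
{
  if (std::getenv("WAIT_FOR_DEBUGGER") == nullptr)
    return false; // normal runs are not affected

  char host[256] = "unknown";
  gethostname(host, sizeof(host) - 1); // last byte stays '\0'

  std::cerr << "PID " << getpid() << " on " << host
            << " is waiting for a debugger to attach" << std::endl;

  // 'volatile' so the compiler re-reads the flag on every iteration;
  // the debugger changes it from outside the program.
  volatile bool keep_waiting = true;
  while (keep_waiting)
    std::this_thread::sleep_for(std::chrono::seconds(1));

  return true;
}
```

The idea would be to call this once near the start of `main()`, start the program with two processes (for example `mpirun -np 2 ./program`, with the variable set in the environment), and attach to one of the printed PIDs with `gdb -p <PID>`. Inside gdb, move up to the frame of `wait_for_debugger`, type `set var keep_waiting = false`, set a breakpoint where the solver is called, and `continue`.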

Best
 W.


--
------------------------------------------------------------------------
Wolfgang Bangerth          email:                 [email protected]
                           www: http://www.math.colostate.edu/~bangerth/
