I suggest running the non-mumps case with -log_summary as well [to confirm that '-np 6' is actually used in both cases].
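A minimal sketch of that comparison (job-script fragment, not runnable outside the cluster; the solver options for the "working" Krylov run are assumed, since the thread only gives the MUMPS launch line):

```shell
# Working Krylov run vs. hanging MUMPS run, both with -log_summary.
# The -log_summary header reports the rank count actually in use, e.g.
# "... named node01 with 6 processors ..." -- it should match in both runs.
mpirun ./ex2 -ksp_type gmres -pc_type bjacobi -log_summary -m 100 -n 100
mpirun ./ex2 -ksp_type preonly -pc_type lu \
       -pc_factor_mat_solver_package mumps -log_summary -m 100 -n 100
```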
Secondly - you can try a 'release' version of openmpi or mpich and see if that works. [I don't see a mention of openmpi-1.9a on the website]

Also you can try -log_trace to see where it's hanging [or figure out how to run the code in a debugger on this cluster]. But that might not help in figuring out the solution to the hang.

Satish

On Wed, 25 Jun 2014, Matthew Knepley wrote:

> On Wed, Jun 25, 2014 at 7:09 AM, Gunnar Jansen <[email protected]> wrote:
>
> > You are right about the queuing system. The job is submitted with a PBS
> > script specifying the number of nodes/processors. On the cluster, petsc
> > is configured in a module environment which sets the appropriate flags
> > for compilers/rules etc.
> >
> > The exact same job script on the exact same nodes with a standard Krylov
> > method runs without any trouble on all processors (and also gives the
> > correct result).
> >
> > Therefore my suspicion is a missing flag in the mumps interface. Is this
> > maybe rather a topic for the mumps-dev team?
>
> I doubt this. The whole point of MPI is to shield code from these details.
> Can you first try this system with SuperLU_dist?
>
>   Thanks,
>
>      Matt
>
> > Best, Gunnar
> >
> > 2014-06-25 15:52 GMT+02:00 Dave May <[email protected]>:
> >
> >> This sounds weird.
> >>
> >> The launch line you provided doesn't include any information regarding
> >> how many processors to use (nodes / cores per node). I presume you are
> >> using a queuing system. My guess is that there could be an issue with
> >> either (i) your job script, (ii) the configuration of the job scheduler
> >> on the machine, or (iii) the MPI installation on the machine.
> >>
> >> Have you been able to successfully run other petsc (or any mpi) codes
> >> with the same launch options (2 nodes, 3 procs per node)?
> >>
> >> Cheers,
> >>   Dave
> >>
> >> On 25 June 2014 15:44, Gunnar Jansen <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> I try to solve a problem in parallel with MUMPS as the direct solver.
> >>> As long as I run the program on only 1 node with 6 processors,
> >>> everything works fine! But using 2 nodes with 3 processors each gets
> >>> mumps stuck in the factorization.
> >>>
> >>> For the purpose of testing I run ex2.c on a resolution of 100x100
> >>> (which is of course way too small for a direct solver in parallel).
> >>>
> >>> The code is run with:
> >>> mpirun ./ex2 -on_error_abort -pc_type lu -pc_factor_mat_solver_package
> >>> mumps -ksp_type preonly -log_summary -options_left -m 100 -n 100
> >>> -mat_mumps_icntl_4 3
> >>>
> >>> The petsc configuration I used is:
> >>> --prefix=/opt/Petsc/3.4.4.extended --with-mpi=yes
> >>> --with-mpi-dir=/opt/Openmpi/1.9a/ --with-debugging=no --download-mumps
> >>> --download-scalapack --download-parmetis --download-metis
> >>>
> >>> Is this common behavior? Or is there an error in the petsc
> >>> configuration I am using here?
> >>>
> >>> Best,
> >>> Gunnar
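Matt's SuperLU_DIST cross-check could look like the sketch below (a job-script fragment, not runnable outside the cluster; it assumes PETSc was configured with --download-superlu_dist, which the configure line above does not include):

```shell
# Same test problem as the original launch line, but with SuperLU_DIST as
# the sparse direct solver instead of MUMPS.
mpirun ./ex2 -on_error_abort -ksp_type preonly -pc_type lu \
       -pc_factor_mat_solver_package superlu_dist \
       -log_summary -options_left -m 100 -n 100
# If this run also hangs with 2 nodes x 3 procs, the MPI installation or job
# setup is the likely culprit; if it completes, suspicion shifts back to the
# MUMPS path.
```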
