Re: [deal.II] Deal.ii installation problem:- Step 40 runtime error
Hello Professor,

Thanks for the reply.

I had been struggling with this issue for five days. I raised a ticket on the XSEDE forum on 18 January. The technical team believed everything was fine on their side and advised me to reinstall with a different set of modules loaded. They also suspected that different MPI components were being mixed. I tried every combination of modules I could think of (over 20). When nothing worked, I finally decided to seek help from this forum. I hesitated at first because I was sure the problem was not with the library or its installation but with the system, and I needed some guidance to prove it.

However, I finally received an email yesterday (after posting this) saying that they had made a security update on the system last week, which caused the issue. After they corrected the change, everything works fine just using candi.

Thanks a lot for the help. I really appreciate it.

For completeness, here are the answers to the questions you asked:

1. The error always occurred at a different position when run with different numbers of processors on a single node.
2. Using multiple nodes gave the same error.

On Monday, January 22, 2018 at 11:09:41 PM UTC-5, Wolfgang Bangerth wrote:
> On 01/22/2018 08:48 AM, RAJAT ARORA wrote:
> >
> > Running with PETSc on 2 MPI rank(s)...
> > Cycle 0:
> >    Number of active cells:       1024
> >    Number of degrees of freedom: 4225
> >    Solved in 10 iterations.
> >
> > +---------------------------------------------+------------+------------+
> > | Total wallclock time elapsed since start    |     0.222s |            |
> > |                                             |            |            |
> > | Section                         | no. calls |  wall time | % of total |
> > +---------------------------------+-----------+------------+------------+
> > | assembly                        |         1 |     0.026s |        12% |
> > | output                          |         1 |    0.0448s |        20% |
> > | setup                           |         1 |    0.0599s |        27% |
> > | solve                           |         1 |    0.0176s |       7.9% |
> > +---------------------------------+-----------+------------+------------+
> >
> > Cycle 1:
> >    Number of active cells:       1960
> >    Number of degrees of freedom: 8421
> > r001.pvt.bridges.psc.edu.27927Assertion failure at
> > /nfs/site/home/phcvs2/gitrepo/ifs-all/Ofed_Delta/rpmbuild/BUILD/libpsm2-10.3.3/ptl_am/ptl.c:152:
> > nbytes == req->recv_msglen
> > r001.pvt.bridges.psc.edu.27927step-40: Reading from remote process' memory
> > failed. Disabling CMA support
> > [r001:27927] *** Process received signal ***
>
> These error messages suggest that the first cycle actually worked. So your MPI
> installation is not completely broken apparently.
>
> Is the error message reproducible? Is it always in the same place and with the
> same message? When you run two processes, are they on separate machines or on
> the same one?
>
> Best
>  W.
>
> --
> Wolfgang Bangerth          email: bang...@colostate.edu
>                            www: http://www.math.colostate.edu/~bangerth/

--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see https://groups.google.com/d/forum/dealii?hl=en
---
You received this message because you are subscribed to the Google Groups "deal.II User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dealii+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Re: [deal.II] Deal.ii installation problem:- Step 40 runtime error
On 01/22/2018 08:48 AM, RAJAT ARORA wrote:
> Running with PETSc on 2 MPI rank(s)...
> Cycle 0:
>    Number of active cells:       1024
>    Number of degrees of freedom: 4225
>    Solved in 10 iterations.
>
> +---------------------------------------------+------------+------------+
> | Total wallclock time elapsed since start    |     0.222s |            |
> |                                             |            |            |
> | Section                         | no. calls |  wall time | % of total |
> +---------------------------------+-----------+------------+------------+
> | assembly                        |         1 |     0.026s |        12% |
> | output                          |         1 |    0.0448s |        20% |
> | setup                           |         1 |    0.0599s |        27% |
> | solve                           |         1 |    0.0176s |       7.9% |
> +---------------------------------+-----------+------------+------------+
>
> Cycle 1:
>    Number of active cells:       1960
>    Number of degrees of freedom: 8421
> r001.pvt.bridges.psc.edu.27927Assertion failure at
> /nfs/site/home/phcvs2/gitrepo/ifs-all/Ofed_Delta/rpmbuild/BUILD/libpsm2-10.3.3/ptl_am/ptl.c:152:
> nbytes == req->recv_msglen
> r001.pvt.bridges.psc.edu.27927step-40: Reading from remote process' memory
> failed. Disabling CMA support
> [r001:27927] *** Process received signal ***

These error messages suggest that the first cycle actually worked. So your MPI installation is apparently not completely broken.

Is the error message reproducible? Is it always in the same place and with the same message? When you run two processes, are they on separate machines or on the same one?

Best
 W.

--
Wolfgang Bangerth          email: bange...@colostate.edu
                           www: http://www.math.colostate.edu/~bangerth/
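One way to answer the reproducibility questions above is to run the same command several times and record how far each run got. The following is only a minimal sketch: the helper names (`last_cycle`, `run_once`, `summarize`) and the choice of five repetitions are assumptions made for illustration, and the script relies on the program printing `Cycle N:` at the start of a line, as in the log quoted above.

```python
import re
import subprocess
from collections import Counter

# Matches the "Cycle N:" progress lines seen in the quoted step-40 output.
CYCLE_RE = re.compile(r"^Cycle (\d+):", re.MULTILINE)


def last_cycle(output):
    """Return the number of the last 'Cycle N:' line in output, or None."""
    matches = CYCLE_RE.findall(output)
    return int(matches[-1]) if matches else None


def run_once(cmd):
    """Run cmd once and return (exit code, last cycle reached)."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        return None, None  # the launcher is not available on this machine
    return proc.returncode, last_cycle(proc.stdout)


def summarize(results):
    """Count how often each (exit code, last cycle) pair occurred."""
    return Counter(results)


if __name__ == "__main__":
    cmd = ["mpirun", "-np", "2", "./step-40"]  # command from the original post
    results = [run_once(cmd) for _ in range(5)]  # 5 repetitions: arbitrary choice
    for (code, cycle), count in summarize(results).items():
        print(f"exit code {code}, last cycle {cycle}: {count} run(s)")
```

If every run reports the same exit code and last cycle, the failure is reproducible in the same place. A spread of different cycles points towards something timing-dependent in the communication layer.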
[deal.II] Deal.ii installation problem:- Step 40 runtime error
Hello all,

I recently installed deal.ii using candi, but although the installation finishes successfully, I am not able to run step-40 on more than 1 processor. On 1 processor it runs fine.

To install everything, I used candi and loaded the following modules:

1) psc_path/1.1
2) slurm/17.02.5
3) cmake/3.7.2
4) mpi/gcc_openmpi

I have tried several combinations of the modules over the past week, but none of them works.

Can someone please help me find out what is wrong with the installation? I can provide any relevant log file or command output if needed.

Any help will be appreciated.

Here is the error message when I run

mpirun -np 2 ./step-40

Running with PETSc on 2 MPI rank(s)...
Cycle 0:
   Number of active cells:       1024
   Number of degrees of freedom: 4225
   Solved in 10 iterations.

+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |     0.222s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| assembly                        |         1 |     0.026s |        12% |
| output                          |         1 |    0.0448s |        20% |
| setup                           |         1 |    0.0599s |        27% |
| solve                           |         1 |    0.0176s |       7.9% |
+---------------------------------+-----------+------------+------------+

Cycle 1:
   Number of active cells:       1960
   Number of degrees of freedom: 8421
r001.pvt.bridges.psc.edu.27927Assertion failure at /nfs/site/home/phcvs2/gitrepo/ifs-all/Ofed_Delta/rpmbuild/BUILD/libpsm2-10.3.3/ptl_am/ptl.c:152: nbytes == req->recv_msglen
r001.pvt.bridges.psc.edu.27927step-40: Reading from remote process' memory failed. Disabling CMA support
[r001:27927] *** Process received signal ***
[r001:27927] Signal: Aborted (6)
[r001:27927] Signal code: (-6)
[r001:27927] [ 0] /lib64/libpthread.so.0(+0xf370)[0x2b8ccfc72370]
[r001:27927] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8cd04bf1d7]
[r001:27927] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b8cd04c08c8]
[r001:27927] [ 3] /lib64/libpsm2.so.2(+0x14df8)[0x2b8cda72cdf8]
[r001:27927] [ 4] /lib64/libpsm2.so.2(+0xeb59)[0x2b8cda726b59]
[r001:27927] [ 5] /lib64/libpsm2.so.2(+0x8fcf)[0x2b8cda720fcf]
[r001:27927] [ 6] /lib64/libpsm2.so.2(+0x7b1a)[0x2b8cda71fb1a]
[r001:27927] [ 7] /lib64/libpsm2.so.2(+0x21012)[0x2b8cda739012]
[r001:27927] [ 8] /lib64/libpsm2.so.2(psm2_mq_ipeek2+0xa8)[0x2b8cda734198]
[r001:27927] [ 9] /usr/mpi/gcc/openmpi-1.10.4-hfi/lib64/openmpi/mca_mtl_psm2.so(ompi_mtl_psm2_progress+0x6b)[0x2b8cdc461d7b]
[r001:27927] [10] /usr/mpi/gcc/openmpi-1.10.4-hfi/lib64/libopen-pal.so.13(opal_progress+0x2a)[0x2b8cd2e381ea]
[r001:27927] [11] /usr/mpi/gcc/openmpi-1.10.4-hfi/lib64/libmpi.so.12(ompi_request_default_wait_all+0x225)[0x2b8ccf46f5f5]
[r001:27927] [12] /usr/mpi/gcc/openmpi-1.10.4-hfi/lib64/libmpi.so.12(PMPI_Waitall+0x9f)[0x2b8ccf49fdff]
[r001:27927] [13] /pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(hypre_MPI_Waitall+0x9)[0x2b8cc6c1c485]
[r001:27927] [14] /pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(hypre_ParCSRCommHandleDestroy+0x3f)[0x2b8cc6ba7664]
[r001:27927] [15] /pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(hypre_ParCSRMatrixExtractBExt+0x48)[0x2b8cc6bab3a7]
[r001:27927] [16] /pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(hypre_BoomerAMGBuildInterp+0x39b)[0x2b8cc6b649b5]
[r001:27927] [17] /pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(hypre_BoomerAMGSetup+0x4056)[0x2b8cc6b507f6]
[r001:27927] [18] /pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(HYPRE_BoomerAMGSetup+0x9)[0x2b8cc6b470d6]
[r001:27927] [19] /pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(+0x7ea066)[0x2b8cc67e8066]
[r001:27927] [20] /pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(PCSetUp+0x551)[0x2b8cc67efcbb]
[r001:27927] [21] /pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/deal.II-v8.5.1/lib/libdeal_II.g.so.8.5.1(_ZN6dealii13PETScWrappers21PreconditionBoomerAMG10initializeERKNS0_10MatrixBaseERKNS1_14AdditionalDataE+0x63)[0x2b8cc1bcf94d]
[r001:27927] [22] ./step-40(_ZN6Step4014LaplaceProblemILi2EE5solveEv+0x106)[0x42ca38]
[r001:27927] [23] ./step-40(_ZN6Step4014LaplaceProblemILi2EE3runEv+0x245)[0x42ead1]
[r001:27927] [24] ./step-40(main+0x3c)[0x41d07d]
[r001:27927] [25] /lib64/libc.so.6(__libc_start_main+0xf5)[0x2b8cd04abb35]
[r001:27927] [26] ./step-40[0x41cf29]
[r001:27927] *** End of error message ***
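A backtrace like the one above is easier to read once the frames are grouped by the library they belong to. The sketch below is an illustration only: the regular expression and the helper names (`frames`, `library_chain`) are assumptions, written against the frame format shown in this post, and the sample contains a handful of frames copied from it.

```python
import os
import re

# One frame looks like: "[ 3] /lib64/libpsm2.so.2(+0x14df8)[0x2b8cda72cdf8]"
FRAME_RE = re.compile(
    r"\[\s*(?P<index>\d+)\]\s+"    # frame number, e.g. "[ 3]" or "[12]"
    r"(?P<path>[^\s(\[]+)"         # path of the executable or shared library
    r"(?:\((?P<symbol>[^)]*)\))?"  # optional "(symbol+offset)"
    r"\[0x[0-9a-fA-F]+\]"          # return address
)


def frames(backtrace):
    """Yield (index, file name, symbol) per frame, in the order printed."""
    for m in FRAME_RE.finditer(backtrace):
        symbol = (m.group("symbol") or "").split("+")[0]
        yield int(m.group("index")), os.path.basename(m.group("path")), symbol


def library_chain(backtrace):
    """Collapse consecutive frames from the same file into one entry."""
    chain = []
    for _, lib, _ in frames(backtrace):
        if not chain or chain[-1] != lib:
            chain.append(lib)
    return chain


# A few frames copied from the backtrace in this post.
SAMPLE = """\
[r001:27927] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b8cd04c08c8]
[r001:27927] [ 3] /lib64/libpsm2.so.2(+0x14df8)[0x2b8cda72cdf8]
[r001:27927] [ 8] /lib64/libpsm2.so.2(psm2_mq_ipeek2+0xa8)[0x2b8cda734198]
[r001:27927] [12] /usr/mpi/gcc/openmpi-1.10.4-hfi/lib64/libmpi.so.12(PMPI_Waitall+0x9f)[0x2b8ccf49fdff]
[r001:27927] [13] /pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(hypre_MPI_Waitall+0x9)[0x2b8cc6c1c485]
[r001:27927] [24] ./step-40(main+0x3c)[0x41d07d]
[r001:27927] [26] ./step-40[0x41cf29]
"""

if __name__ == "__main__":
    # Innermost library first, i.e. where the abort was raised.
    print(" -> ".join(library_chain(SAMPLE)))
```

Read from the innermost frame outwards, the chain shows the abort being raised inside libpsm2, reached through Open MPI's `PMPI_Waitall`, which was called from hypre inside libpetsc. The deal.II and step-40 frames only appear further out.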