Re: [deal.II] Deal.ii installation problem:- Step 40 runtime error

2018-01-23 Thread RAJAT ARORA
Hello Professor,

Thanks for the reply. 
I have been struggling with this issue for 5 days now. I raised a ticket 
on the XSEDE forum on 18th Jan.
The technical team believed everything was fine on their side and advised 
me to reinstall with some different modules loaded. They also suspected 
that different MPI components were being mixed.

I actually tried every possible combination of modules (over 20). When 
nothing worked, I finally decided to seek help from this forum. I was 
hesitant at first because I was fairly sure the problem was not with the 
library or its installation but rather with the system; I just needed some 
guidance to confirm that.

However, I finally received an email yesterday (after posting this) saying 
that they had made a security update on the system last week which caused 
the issue. After they corrected that change, everything works fine using 
just candi.

Thanks a lot for the help. I really appreciate it.

I am still posting the answers to the questions you asked.

1. It always crashed at a different position when run with different 
processors on a single node.

2. The error also occurred when using multiple nodes.

On Monday, January 22, 2018 at 11:09:41 PM UTC-5, Wolfgang Bangerth wrote:
>
> On 01/22/2018 08:48 AM, RAJAT ARORA wrote: 
> > 
> > Running with PETSc on 2 MPI rank(s)... 
> > Cycle 0: 
> > Number of active cells:   1024 
> > Number of degrees of freedom: 4225 
> > Solved in 10 iterations. 
> > 
> > 
> > 
> > +---------------------------------------------+------------+------------+
> > | Total wallclock time elapsed since start    |     0.222s |            |
> > |                                             |            |            |
> > | Section                         | no. calls |  wall time | % of total |
> > +---------------------------------+-----------+------------+------------+
> > | assembly                        |         1 |     0.026s |        12% |
> > | output                          |         1 |    0.0448s |        20% |
> > | setup                           |         1 |    0.0599s |        27% |
> > | solve                           |         1 |    0.0176s |       7.9% |
> > +---------------------------------+-----------+------------+------------+
> >
> >
> > Cycle 1:
> > Number of active cells:   1960
> > Number of degrees of freedom: 8421
> > r001.pvt.bridges.psc.edu.27927Assertion failure at
> > /nfs/site/home/phcvs2/gitrepo/ifs-all/Ofed_Delta/rpmbuild/BUILD/libpsm2-10.3.3/ptl_am/ptl.c:152:
> > nbytes == req->recv_msglen
> > r001.pvt.bridges.psc.edu.27927step-40: Reading from remote process' memory
> > failed. Disabling CMA support
> > [r001:27927] *** Process received signal ***
>
> These error messages suggest that the first cycle actually worked. So your
> MPI installation is not completely broken apparently.
>
> Is the error message reproducible? Is it always in the same place and with
> the same message? When you run two processes, are they on separate machines
> or on the same one?
>
> Best 
>   W. 
>
> -- 
>  
> Wolfgang Bangerth  email: bang...@colostate.edu 
>  
> www: http://www.math.colostate.edu/~bangerth/ 
>
>

-- 
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see 
https://groups.google.com/d/forum/dealii?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"deal.II User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to dealii+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [deal.II] Deal.ii installation problem:- Step 40 runtime error

2018-01-22 Thread Wolfgang Bangerth

On 01/22/2018 08:48 AM, RAJAT ARORA wrote:


Running with PETSc on 2 MPI rank(s)...
Cycle 0:
Number of active cells:   1024
Number of degrees of freedom: 4225
Solved in 10 iterations.


+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |     0.222s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| assembly                        |         1 |     0.026s |        12% |
| output                          |         1 |    0.0448s |        20% |
| setup                           |         1 |    0.0599s |        27% |
| solve                           |         1 |    0.0176s |       7.9% |
+---------------------------------+-----------+------------+------------+


Cycle 1:
Number of active cells:   1960
Number of degrees of freedom: 8421
r001.pvt.bridges.psc.edu.27927Assertion failure at 
/nfs/site/home/phcvs2/gitrepo/ifs-all/Ofed_Delta/rpmbuild/BUILD/libpsm2-10.3.3/ptl_am/ptl.c:152: 
nbytes == req->recv_msglen
r001.pvt.bridges.psc.edu.27927step-40: Reading from remote process' memory 
failed. Disabling CMA support

[r001:27927] *** Process received signal ***


These error messages suggest that the first cycle actually worked. So your MPI 
installation is not completely broken apparently.


Is the error message reproducible? Is it always in the same place and with the 
same message? When you run two processes, are they on separate machines or on 
the same one?
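
If you want to check where the two processes end up, a minimal MPI program 
along the following lines prints the host name each rank runs on (just a 
sketch, independent of deal.II; the file name check_ranks.cc is an 
arbitrary choice):

// check_ranks.cc -- print which host each MPI rank runs on.
// Build with, e.g.:  mpicxx check_ranks.cc -o check_ranks
// Run with,   e.g.:  mpirun -np 2 ./check_ranks
#include <mpi.h>
#include <cstdio>

int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);

  int rank = 0, size = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // MPI_Get_processor_name reports the name of the node this rank runs on.
  char name[MPI_MAX_PROCESSOR_NAME];
  int len = 0;
  MPI_Get_processor_name(name, &len);

  std::printf("rank %d of %d runs on %s\n", rank, size, name);

  MPI_Finalize();
  return 0;
}

If both ranks report the same host, they communicate over the intra-node 
path, which is the path the "Disabling CMA support" message in your output 
refers to.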


Best
 W.

--

Wolfgang Bangerth  email: bange...@colostate.edu
   www: http://www.math.colostate.edu/~bangerth/



[deal.II] Deal.ii installation problem:- Step 40 runtime error

2018-01-22 Thread RAJAT ARORA
Hello all,

I recently installed deal.II using candi, but even though everything 
finishes successfully, I am not able to run step-40 on more than 1 
processor. With 1 processor it runs fine.


To install everything, I used candi and loaded the following modules:

  1) psc_path/1.1   2) slurm/17.02.5   3) cmake/3.7.2   4) mpi/gcc_openmpi


I have tried several combinations of the modules in the past week, but it 
is not working.
Can someone please help me figure out what is wrong with the installation?

I can provide any relevant log files or command output if needed. 
Any help will be appreciated.

Here is the error message when I do

mpirun -np 2 ./step-40


Running with PETSc on 2 MPI rank(s)...
Cycle 0:
   Number of active cells:   1024
   Number of degrees of freedom: 4225
   Solved in 10 iterations.


+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |     0.222s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| assembly                        |         1 |     0.026s |        12% |
| output                          |         1 |    0.0448s |        20% |
| setup                           |         1 |    0.0599s |        27% |
| solve                           |         1 |    0.0176s |       7.9% |
+---------------------------------+-----------+------------+------------+


Cycle 1:
   Number of active cells:   1960
   Number of degrees of freedom: 8421
r001.pvt.bridges.psc.edu.27927Assertion failure at 
/nfs/site/home/phcvs2/gitrepo/ifs-all/Ofed_Delta/rpmbuild/BUILD/libpsm2-10.3.3/ptl_am/ptl.c:152: 
nbytes == req->recv_msglen
r001.pvt.bridges.psc.edu.27927step-40: Reading from remote process' memory 
failed. Disabling CMA support
[r001:27927] *** Process received signal ***
[r001:27927] Signal: Aborted (6)
[r001:27927] Signal code:  (-6)
[r001:27927] [ 0] /lib64/libpthread.so.0(+0xf370)[0x2b8ccfc72370]
[r001:27927] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8cd04bf1d7]
[r001:27927] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b8cd04c08c8]
[r001:27927] [ 3] /lib64/libpsm2.so.2(+0x14df8)[0x2b8cda72cdf8]
[r001:27927] [ 4] /lib64/libpsm2.so.2(+0xeb59)[0x2b8cda726b59]
[r001:27927] [ 5] /lib64/libpsm2.so.2(+0x8fcf)[0x2b8cda720fcf]
[r001:27927] [ 6] /lib64/libpsm2.so.2(+0x7b1a)[0x2b8cda71fb1a]
[r001:27927] [ 7] /lib64/libpsm2.so.2(+0x21012)[0x2b8cda739012]
[r001:27927] [ 8] /lib64/libpsm2.so.2(psm2_mq_ipeek2+0xa8)[0x2b8cda734198]
[r001:27927] [ 9] 
/usr/mpi/gcc/openmpi-1.10.4-hfi/lib64/openmpi/mca_mtl_psm2.so(ompi_mtl_psm2_progress+0x6b)[0x2b8cdc461d7b]
[r001:27927] [10] 
/usr/mpi/gcc/openmpi-1.10.4-hfi/lib64/libopen-pal.so.13(opal_progress+0x2a)[0x2b8cd2e381ea]
[r001:27927] [11] 
/usr/mpi/gcc/openmpi-1.10.4-hfi/lib64/libmpi.so.12(ompi_request_default_wait_all+0x225)[0x2b8ccf46f5f5]
[r001:27927] [12] 
/usr/mpi/gcc/openmpi-1.10.4-hfi/lib64/libmpi.so.12(PMPI_Waitall+0x9f)[0x2b8ccf49fdff]
[r001:27927] [13] 
/pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(hypre_MPI_Waitall+0x9)[0x2b8cc6c1c485]
[r001:27927] [14] 
/pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(hypre_ParCSRCommHandleDestroy+0x3f)[0x2b8cc6ba7664]
[r001:27927] [15] 
/pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(hypre_ParCSRMatrixExtractBExt+0x48)[0x2b8cc6bab3a7]
[r001:27927] [16] 
/pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(hypre_BoomerAMGBuildInterp+0x39b)[0x2b8cc6b649b5]
[r001:27927] [17] 
/pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(hypre_BoomerAMGSetup+0x4056)[0x2b8cc6b507f6]
[r001:27927] [18] 
/pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(HYPRE_BoomerAMGSetup+0x9)[0x2b8cc6b470d6]
[r001:27927] [19] 
/pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(+0x7ea066)[0x2b8cc67e8066]
[r001:27927] [20] 
/pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/petsc-3.7.6/lib/libpetsc.so.3.7(PCSetUp+0x551)[0x2b8cc67efcbb]
[r001:27927] [21] 
/pylon5/ss4s82p/rarora/Code-Libraries/dealii85me-gnu4.9/deal.II-v8.5.1/lib/libdeal_II.g.so.8.5.1(_ZN6dealii13PETScWrappers21PreconditionBoomerAMG10initializeERKNS0_10MatrixBaseERKNS1_14AdditionalDataE+0x63)[0x2b8cc1bcf94d]
[r001:27927] [22] 
./step-40(_ZN6Step4014LaplaceProblemILi2EE5solveEv+0x106)[0x42ca38]
[r001:27927] [23] 
./step-40(_ZN6Step4014LaplaceProblemILi2EE3runEv+0x245)[0x42ead1]
[r001:27927] [24] ./step-40(main+0x3c)[0x41d07d]
[r001:27927] [25] /lib64/libc.so.6(__libc_start_main+0xf5)[0x2b8cd04abb35]
[r001:27927] [26] ./step-40[0x41cf29]
[r001:27927] *** End of error message ***
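
Since the backtrace dies inside MPI_Waitall (reached through hypre's 
BoomerAMG setup, which PETSc and deal.II call from step-40's solve()), a 
standalone test of the same non-blocking exchange pattern, independent of 
deal.II and PETSc, may help show whether the MPI/PSM2 layer itself is at 
fault. A minimal sketch (the file name waitall_test.cc and the message size 
are arbitrary choices):

// waitall_test.cc -- exchange a buffer between two ranks with
// non-blocking sends/receives and MPI_Waitall, the same communication
// pattern that fails inside hypre in the backtrace above.
// Build with, e.g.:  mpicxx waitall_test.cc -o waitall_test
// Run with,   e.g.:  mpirun -np 2 ./waitall_test
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);

  int rank = 0, size = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  if (size != 2)
    {
      if (rank == 0)
        std::printf("Please run this test with exactly 2 ranks.\n");
      MPI_Finalize();
      return 1;
    }

  const int other = 1 - rank;                 // the partner rank
  const int n     = 1 << 20;                  // about one million doubles
  std::vector<double> sendbuf(n, rank), recvbuf(n, -1.0);

  // Post the receive and the send, then wait for both -- this is the
  // MPI_Waitall call that aborts in the hypre backtrace.
  MPI_Request requests[2];
  MPI_Irecv(recvbuf.data(), n, MPI_DOUBLE, other, 0, MPI_COMM_WORLD,
            &requests[0]);
  MPI_Isend(sendbuf.data(), n, MPI_DOUBLE, other, 0, MPI_COMM_WORLD,
            &requests[1]);
  MPI_Waitall(2, requests, MPI_STATUSES_IGNORE);

  // Check that the partner's data actually arrived.
  const bool ok = (recvbuf.front() == other && recvbuf.back() == other);
  std::printf("rank %d: exchange %s\n", rank, ok ? "succeeded" : "FAILED");

  MPI_Finalize();
  return 0;
}

If this small test aborts with the same PSM2 assertion, the problem sits 
below deal.II and PETSc; if it runs cleanly on the same nodes, the library 
installation becomes the more likely suspect.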
