I tried Open MPI v1.3.3, but I got the same error. mdrun_mpi -multi works fine, but REMD still fails.
## error message ##
step 500, will finish Fri Aug 13 16:43:25 2010
[localhost:20171] *** Process received signal ***
[localhost:20172] *** Process received signal ***
[localhost:20172] Signal: Segmentation fault (11)
[localhost:20172] Signal code: Address not mapped (1)
[localhost:20172] Failing at address: (nil)
[localhost:20171] Signal: Segmentation fault (11)
[localhost:20171] Signal code: Address not mapped (1)
[localhost:20171] Failing at address: (nil)
[localhost:20172] [ 0] /lib64/libpthread.so.0 [0x36e9e0eb10]
[localhost:20172] [ 1] mdrun_mpi_d(replica_exchange+0x1136) [0x42a446]
[localhost:20172] [ 2] mdrun_mpi_d(do_md+0x48a8) [0x433ca8]
[localhost:20172] [ 3] mdrun_mpi_d(mdrunner+0x11f1) [0x42f181]
[localhost:20172] [ 4] mdrun_mpi_d(main+0x9f1) [0x438101]
[localhost:20172] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4) [0x36e921d994]
[localhost:20172] [ 6] mdrun_mpi_d [0x420359]
[localhost:20172] *** End of error message ***
[localhost:20171] [ 0] /lib64/libpthread.so.0 [0x36e9e0eb10]
[localhost:20171] [ 1] mdrun_mpi_d(replica_exchange+0x1136) [0x42a446]
[localhost:20171] [ 2] mdrun_mpi_d(do_md+0x48a8) [0x433ca8]
[localhost:20171] [ 3] mdrun_mpi_d(mdrunner+0x11f1) [0x42f181]
[localhost:20171] [ 4] mdrun_mpi_d(main+0x9f1) [0x438101]
[localhost:20171] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4) [0x36e921d994]
[localhost:20171] [ 6] mdrun_mpi_d [0x420359]
[localhost:20171] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 2 with PID 20172 on node localhost.localdomain exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

--------------------------------------------------
From: <[email protected]>
Sent: Wednesday, August 11, 2010 7:00 PM
To: <[email protected]>
Subject: gmx-users Digest, Vol 76, Issue 53

> Message: 1
> Date: Wed, 11 Aug 2010 11:23:54 +0200
> From: Berk Hess <[email protected]>
> Subject: RE: [gmx-users] Replica Exchange problem in gmx-4.5 beta3
> To: Discussion list for GROMACS users <[email protected]>
>
> Hi,
>
> This could be due to problems with MPICH.
> Could you please try openmpi and report back?
>
> Thanks,
>
> Berk
>
> From: [email protected]
> To: [email protected]
> Date: Wed, 11 Aug 2010 18:09:00 +0900
> Subject: [gmx-users] Replica Exchange problem in gmx-4.5 beta3
>
> Hello!
> I'm doing a simple REMD test with 4 replicas.
> Time step : 2 fs
> Exchange : every 500fs
>
> md_0.tpr md_1.tpr md_2.tpr md_3.tpr
>
> mpiexec(or mpirun) -np 4 mdrun_mpi_d -deffnm md_ -multi 4 -replex 200
> I got a error message.
>
> ##error##
> 1000000 steps, 2000.0 ps.
> step 600
> rank 3 in job 10 localhost.localdomain_50305 caused collective abort of all ranks
> exit status of rank 3: killed by signal 11
> rank 2 in job 10 localhost.localdomain_50305 caused collective abort of all ranks
> exit status of rank 2: killed by signal 11
>
> Using gmx4.0.7 , It works fine.
> Is this bug in gmx-4.5 beta ?
> In log files, no error message were found.
> gmx-4.5 beta3 was compiled with icc 11.0 and mpich2-1.2.1p1.
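
A side check on the run settings quoted above: -replex is given in MD steps, so the time between exchange attempts is the step count multiplied by the time step. The sketch below is only an illustration with hypothetical helper names (exchange_interval_fs, replex_for_interval); the numbers are the ones from the quoted report. It suggests that -replex 200 at 2 fs corresponds to 400 fs, while a 500 fs interval would need -replex 250. This does not explain the segmentation fault; it only checks the settings.

```python
# Sketch: relate an mdrun -replex value (in MD steps) to an exchange
# interval in femtoseconds. Helper names are hypothetical, not GROMACS API.

def exchange_interval_fs(replex_steps: int, dt_fs: float) -> float:
    """Time between replica-exchange attempts, in fs."""
    return replex_steps * dt_fs


def replex_for_interval(interval_fs: float, dt_fs: float) -> int:
    """Number of steps needed for a given exchange interval, in fs.

    Raises ValueError if the interval is not a whole number of steps.
    """
    steps = interval_fs / dt_fs
    if steps != int(steps):
        raise ValueError("interval is not a multiple of the time step")
    return int(steps)


if __name__ == "__main__":
    dt_fs = 2.0  # time step from the quoted report
    print(exchange_interval_fs(200, dt_fs))   # interval for -replex 200
    print(replex_for_interval(500.0, dt_fs))  # steps for a 500 fs interval
```

If the intended interval really is 500 fs, the command line and the stated exchange period do not agree, which may be worth fixing before comparing runs.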
--
gmx-users mailing list [email protected]
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at http://www.gromacs.org/search before posting!
Please don't post (un)subscribe requests to the list. Use the www interface or send it to [email protected].
Can't post? Read http://www.gromacs.org/mailing_lists/users.php

