In case anyone is interested: our nodes are finally working fine with Fedora Core 8, fftw-3.1.3, openmpi-1.3 and gromacs-4.0.3. Thank you very much for your time and effort!

cheers
Bernhard



Message: 2
Date: Thu, 15 Jan 2009 08:37:16 +0100
From: patrick fuchs <patrick.fu...@univ-paris-diderot.fr>
Subject: Re: Subject: Re: Re: [gmx-users] Gromacs 4 bug?
To: Discussion list for GROMACS users <gmx-users@gromacs.org>
Message-ID: <496ee7ac.6040...@univ-paris-diderot.fr>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi all,
Berk and I finally found that there is a problem with lam-7.1.4 under Fedora 9/Fedora 10. Initially I thought it affected only gromacs-4, but a PhD student in my lab reported identical hanging problems with gromacs-3.3, while under FC8 I had no problem at all on the same hardware.

So if you want to run gromacs-4 (or any version) under FC9/FC10, the fix I tested and that works is to use openmpi instead of lam-7.1.4 (I only tested the latest version, openmpi-1.2.8). I didn't test other versions of lam (7.0.?), but it seems that the developers advise switching to openmpi.

To the two other users (Bernhard and Antoine) who reported identical problems to the mailing list (see http://www.gromacs.org/pipermail/gmx-users/2008-December/038594.html and http://www.gromacs.org/pipermail/gmx-users/2008-December/038623.html): can you please check that it works on your hardware using openmpi?
Hope it helps,

Patrick
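
For anyone checking whether a run has stalled: the listing quoted further down compares the last modification times of the mdrun output files with the current date. A minimal sketch of that check in Python follows. The function names and the 10-minute threshold are assumptions made for illustration and are not part of GROMACS; the file names are the ones from the quoted listing.

```python
import os
import time


def newest_mtime(paths):
    """Return the most recent modification time (seconds since epoch) among paths."""
    return max(os.path.getmtime(p) for p in paths)


def looks_stalled(paths, max_idle_seconds, now=None):
    """Return True if none of the files was modified within max_idle_seconds."""
    if now is None:
        now = time.time()
    return (now - newest_mtime(paths)) > max_idle_seconds


if __name__ == "__main__":
    # Output files named as in the quoted listing; the threshold is an assumption
    # and should be longer than the interval between two writes of your run.
    outputs = [f for f in ("md.log", "traj.xtc", "ener.edr") if os.path.exists(f)]
    if outputs and looks_stalled(outputs, max_idle_seconds=600):
        print("no output written for more than 10 minutes, mdrun may be hanging")
```

This only reports that nothing has been written recently. It cannot tell a hang from a run that writes output rarely, so the threshold has to be chosen with the output frequency of the run in mind.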

Berk Hess wrote:
Hi,

We have for now concluded that this is probably an issue related to lam7.1.4.

There were a few other users with mdrun crashes/hangs.
What is the status of your problems?

Berk
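
The gdb backtraces quoted below can be compared rank by rank to see where each process is waiting. A minimal sketch in Python, assuming the output of gdb's "where" command for each rank has been saved as text: it reports the outermost MPI_* frame and the function that called it. The helper names are hypothetical and used only for illustration.

```python
import re

# Matches gdb frame lines such as "#5 0x000000000074d800 in MPI_Sendrecv ()".
FRAME = re.compile(r"#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\w+) \(")


def frames(backtrace):
    """Return the function names of a gdb 'where' listing, innermost first."""
    names = []
    for line in backtrace.splitlines():
        m = FRAME.search(line)
        if m:
            names.append(m.group(2))
    return names


def mpi_entry_point(backtrace):
    """Return (mpi_call, caller): the outermost MPI_* frame and its caller."""
    names = frames(backtrace)
    # Walk from the outermost frame (main) inwards until an MPI_* call is found.
    for i in range(len(names) - 1, -1, -1):
        if names[i].startswith("MPI_"):
            caller = names[i + 1] if i + 1 < len(names) else None
            return names[i], caller
    return None, None


if __name__ == "__main__":
    # Shortened excerpt of the rank 1 backtrace quoted in this thread.
    rank1 = """
    #3 0x000000000074a1e0 in _mpi_req_advance ()
    #4 0x000000000073ea90 in MPI_Wait ()
    #5 0x000000000074d800 in MPI_Sendrecv ()
    #6 0x00000000004aed44 in gmx_sum_qgrid_dd ()
    """
    print(mpi_entry_point(rank1))
```

Applied to the four backtraces below, this gives MPI_Sendrecv called from gmx_sum_qgrid_dd for every rank, which is the observation that all processes hang at the same MPI call.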


> Date: Tue, 13 Jan 2009 13:02:47 +0100
> From: patrick.fu...@univ-paris-diderot.fr
> To: gmx-users@gromacs.org
> Subject: Re: Subject: Re: Re: [gmx-users] Gromacs 4 bug?
>
> Hi Berk,
> it hangs after approximately 45000 steps (the system is a simple DLPC
> bilayer), and a cpt file was generated (but it was written at 09:48,
> before the run started to hang at 09:58):
> ---------
> [fu...@cumin 2]$ ls -ltrh
> [snip]
> -rw-r--r-- 1 fuchs dsimb 384K janv. 13 09:33 traj.trr
> -rw-r--r-- 1 fuchs dsimb 385K janv. 13 09:48 state.cpt
> -rw-r--r-- 1 fuchs dsimb 66K janv. 13 09:57 md.log
> -rw-r--r-- 1 fuchs dsimb 5,4M janv. 13 09:58 traj.xtc
> -rw-r--r-- 1 fuchs dsimb 92K janv. 13 09:58 ener.edr
> [fu...@cumin 2]$ date
> Tue Jan 13 10:16:22 CET 2009
> ---------
> The version of MPI is: LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University.
> So shall I send you the tpr and cpt files off list ?
> Ciao,
>
> Patrick
>
> Berk Hess wrote:
> > Hi,
> >
> > This is strange.
> > You run on 4 nodes and all processes hang at the same MPI call.
> > I see no reason why they should hang if they are all at the correct call.
> >
> > After how many steps does this happen?
> > If it is not much I can try to see if it also hangs on our system.
> > Otherwise, could you try to generate a checkpoint file with
> > which it hangs quickly?
> >
> > What version of MPI are you using?
> >
> > Berk
> >
> >
> > > Date: Tue, 13 Jan 2009 10:53:25 +0100
> > > From: patrick.fu...@univ-paris-diderot.fr
> > > To: gmx-users@gromacs.org
> > > Subject: Re: Subject: Re: Re: [gmx-users] Gromacs 4 bug?
> > >
> > > Hi Berk,
> > > I did a test on gromacs-4.0.2 under Fedora 10 (with fftw-3.0.1 and
> > > lam-7.1.4), using a slightly upgraded version of gcc compared to my
> > > previous post (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7)) on the same
> > > hardware but it still hangs (so both FC9 and FC10 give the same problem,
> > > while FC8 does not). Finally I could test mdrun_mpi in the debugger and
> > > here are the results of my tests. You were right, it seems that mdrun
> > > hangs at an MPI call, here are the outputs of each xterm:
> > >
> > > XTERM1
> > > ===================================================================
> > > GNU gdb Fedora (6.8-29.fc10)
> > > Copyright (C) 2008 Free Software Foundation, Inc.
> > > License GPLv3+: GNU GPL version 3 or later
> > > <http://gnu.org/licenses/gpl.html>
> > > This is free software: you are free to change and redistribute it.
> > > There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> > > and "show warranty" for details.
> > > This GDB was configured as "x86_64-redhat-linux-gnu"...
> > > (gdb) run
> > > Starting program: /usr/local/gromacs-4.0.2/bin/mdrun_mpi
> > > [Thread debugging using libthread_db enabled]
> > > [New Thread 0x12df30 (LWP 8285)]
> > > NNODES=4, MYRANK=0, HOSTNAME=cumin.dsimb.inserm.fr
> > > NODEID=0 argc=1
> > > :-) G R O M A C S (-:
> > >
> > > Giant Rising Ordinary Mutants for A Clerical Setup
> > >
> > > :-) VERSION 4.0.2 (-:
> > >
> > > [snip]
> > >
> > > starting mdrun 'Pure DLPC bilayer with 128 lipids and 3655 SPC water'
> > > 5000000 steps, 10000.0 ps.
> > > ^C
> > > Program received signal SIGINT, Interrupt.
> > > 0x0000003b978cc087 in sched_yield () from /lib64/libc.so.6
> > > Missing separate debuginfos, use: debuginfo-install
> > > e2fsprogs-libs-1.41.3-2.fc10.x86_64 glibc-2.9-3.x86_64
> > > libICE-1.0.4-4.fc10.x86_64 libSM-1.1.0-2.fc10.x86_64
> > > libX11-1.1.4-6.fc10.x86_64 libXau-1.0.4-1.fc10.x86_64
> > > libXdmcp-1.0.2-6.fc10.x86_64 libxcb-1.1.91-5.fc10.x86_64
> > > (gdb) where
> > > #0 0x0000003b978cc087 in sched_yield () from /lib64/libc.so.6
> > > #1 0x0000000000770c83 in lam_ssi_rpi_usysv_proc_read_env ()
> > > #2 0x0000000000784a39 in lam_ssi_rpi_usysv_advance_common ()
> > > #3 0x000000000074a1e0 in _mpi_req_advance ()
> > > #4 0x000000000073ced0 in lam_send ()
> > > #5 0x000000000075328e in MPI_Send ()
> > > #6 0x000000000074d7ec in MPI_Sendrecv ()
> > > #7 0x00000000004aebfd in gmx_sum_qgrid_dd ()
> > > #8 0x00000000004b40bb in gmx_pme_do ()
> > > #9 0x0000000000479a58 in do_force_lowlevel ()
> > > #10 0x00000000004d1d32 in do_force ()
> > > #11 0x00000000004214d2 in do_md ()
> > > #12 0x000000000041bea0 in mdrunner ()
> > > #13 0x0000000000422b94 in main ()
> > > (gdb)
> > > ===================================================================
> > >
> > >
> > > XTERM2
> > > ===================================================================
> > > (gdb) run
> > > Starting program: /usr/local/gromacs-4.0.2/bin/mdrun_mpi
> > > [Thread debugging using libthread_db enabled]
> > > [New Thread 0x12df30 (LWP 8294)]
> > > NNODES=4, MYRANK=1, HOSTNAME=cumin.dsimb.inserm.fr
> > > NODEID=1 argc=1
> > > ^C
> > > Program received signal SIGINT, Interrupt.
> > > 0x0000003b978cc087 in sched_yield () from /lib64/libc.so.6
> > > Missing separate debuginfos, use: debuginfo-install
> > > e2fsprogs-libs-1.41.3-2.fc10.x86_64 glibc-2.9-3.x86_64
> > > libICE-1.0.4-4.fc10.x86_64 libSM-1.1.0-2.fc10.x86_64
> > > libX11-1.1.4-6.fc10.x86_64 libXau-1.0.4-1.fc10.x86_64
> > > libXdmcp-1.0.2-6.fc10.x86_64 libxcb-1.1.91-5.fc10.x86_64
> > > (gdb) where
> > > #0 0x0000003b978cc087 in sched_yield () from /lib64/libc.so.6
> > > #1 0x0000000000770c83 in lam_ssi_rpi_usysv_proc_read_env ()
> > > #2 0x0000000000784a39 in lam_ssi_rpi_usysv_advance_common ()
> > > #3 0x000000000074a1e0 in _mpi_req_advance ()
> > > #4 0x000000000073ea90 in MPI_Wait ()
> > > #5 0x000000000074d800 in MPI_Sendrecv ()
> > > #6 0x00000000004aed44 in gmx_sum_qgrid_dd ()
> > > #7 0x00000000004b40bb in gmx_pme_do ()
> > > #8 0x0000000000479a58 in do_force_lowlevel ()
> > > #9 0x00000000004d1d32 in do_force ()
> > > #10 0x00000000004214d2 in do_md ()
> > > #11 0x000000000041bea0 in mdrunner ()
> > > #12 0x0000000000422b94 in main ()
> > > (gdb)
> > > ===================================================================
> > >
> > >
> > > XTERM3
> > > ===================================================================
> > > (gdb) run
> > > Starting program: /usr/local/gromacs-4.0.2/bin/mdrun_mpi
> > > [Thread debugging using libthread_db enabled]
> > > [New Thread 0x12df30 (LWP 8276)]
> > > NNODES=4, MYRANK=2, HOSTNAME=cumin.dsimb.inserm.fr
> > > NODEID=2 argc=1
> > > ^C
> > > Program received signal SIGINT, Interrupt.
> > > 0x0000000000770c70 in lam_ssi_rpi_usysv_proc_read_env ()
> > > Missing separate debuginfos, use: debuginfo-install
> > > e2fsprogs-libs-1.41.3-2.fc10.x86_64 glibc-2.9-3.x86_64
> > > libICE-1.0.4-4.fc10.x86_64 libSM-1.1.0-2.fc10.x86_64
> > > libX11-1.1.4-6.fc10.x86_64 libXau-1.0.4-1.fc10.x86_64
> > > libXdmcp-1.0.2-6.fc10.x86_64 libxcb-1.1.91-5.fc10.x86_64
> > > (gdb) where
> > > #0 0x0000000000770c70 in lam_ssi_rpi_usysv_proc_read_env ()
> > > #1 0x0000000000784a39 in lam_ssi_rpi_usysv_advance_common ()
> > > #2 0x000000000074a1e0 in _mpi_req_advance ()
> > > #3 0x000000000073ced0 in lam_send ()
> > > #4 0x000000000075328e in MPI_Send ()
> > > #5 0x000000000074d7ec in MPI_Sendrecv ()
> > > #6 0x00000000004aed44 in gmx_sum_qgrid_dd ()
> > > #7 0x00000000004b40bb in gmx_pme_do ()
> > > #8 0x0000000000479a58 in do_force_lowlevel ()
> > > #9 0x00000000004d1d32 in do_force ()
> > > #10 0x00000000004214d2 in do_md ()
> > > #11 0x000000000041bea0 in mdrunner ()
> > > #12 0x0000000000422b94 in main ()
> > > (gdb)
> > > ===================================================================
> > >
> > >
> > > XTERM4
> > > ===================================================================
> > > (gdb) run
> > > Starting program: /usr/local/gromacs-4.0.2/bin/mdrun_mpi
> > > [Thread debugging using libthread_db enabled]
> > > [New Thread 0x12df30 (LWP 8267)]
> > > NNODES=4, MYRANK=3, HOSTNAME=cumin.dsimb.inserm.fr
> > > NODEID=3 argc=1
> > > ^C
> > > Program received signal SIGINT, Interrupt.
> > > 0x0000000000770c70 in lam_ssi_rpi_usysv_proc_read_env ()
> > > Missing separate debuginfos, use: debuginfo-install
> > > e2fsprogs-libs-1.41.3-2.fc10.x86_64 glibc-2.9-3.x86_64
> > > libICE-1.0.4-4.fc10.x86_64 libSM-1.1.0-2.fc10.x86_64
> > > libX11-1.1.4-6.fc10.x86_64 libXau-1.0.4-1.fc10.x86_64
> > > libXdmcp-1.0.2-6.fc10.x86_64 libxcb-1.1.91-5.fc10.x86_64
> > > (gdb) where
> > > #0 0x0000000000770c70 in lam_ssi_rpi_usysv_proc_read_env ()
> > > #1 0x0000000000784a39 in lam_ssi_rpi_usysv_advance_common ()
> > > #2 0x000000000074a1e0 in _mpi_req_advance ()
> > > #3 0x000000000073ea90 in MPI_Wait ()
> > > #4 0x000000000074d800 in MPI_Sendrecv ()
> > > #5 0x00000000004aebfd in gmx_sum_qgrid_dd ()
> > > #6 0x00000000004b40bb in gmx_pme_do ()
> > > #7 0x0000000000479a58 in do_force_lowlevel ()
> > > #8 0x00000000004d1d32 in do_force ()
> > > #9 0x00000000004214d2 in do_md ()
> > > #10 0x000000000041bea0 in mdrunner ()
> > > #11 0x0000000000422b94 in main ()
> > > (gdb)
> > > ===================================================================
> > >
> > >
> > > Cheers,
> > >
> > > Patrick
> > >
> >
> >
> >
> > _______________________________________________
> > gmx-users mailing list gmx-users@gromacs.org
> > http://www.gromacs.org/mailman/listinfo/gmx-users
> > Please search the archive at http://www.gromacs.org/search before posting!
> > Please don't post (un)subscribe requests to the list. Use the
> > www interface or send it to gmx-users-requ...@gromacs.org.
> > Can't post? Read http://www.gromacs.org/mailing_lists/users.php
>
> --
> _________________________________________________________________
> !!!! new E-mail address: patrick.fu...@univ-paris-diderot.fr !!!!
> !!!! new postal address !!!
> Patrick FUCHS
> Equipe de Bioinformatique Genomique et Moleculaire
> INTS, INSERM UMR-S726, Université Paris Diderot,
> 6 rue Alexandre Cabanel, 75015 Paris
> Tel : +33 (0)1-44-49-30-57 - Fax : +33 (0)1-47-34-74-31
> Web Site: http://www.dsimb.inserm.fr/~fuchs
