On 02/10/10, Jennifer Williams <jennifer.willi...@ed.ac.uk> wrote:

Sorry for the delay in replying. I start the job using the following script file:

#$ -S /bin/bash
#$ -l h_rt=47:59:00
#$ -j y
#$ -pe mpich2 8
#$ -cwd
cd /home/jwillia4/GRO/gromacs-4.0.7/JJW_003/PH_TORUN
/home/jwillia4/GRO/bin/mpirun -np 8 /home/jwillia4/GRO/bin/mdrun_mpi -v -s md.tpr

The strange thing is that sometimes it works and the job runs to completion, and sometimes it crashes immediately with the ORTE error, so I know that the input files are not causing the problem. It seems entirely random.

That sounds like a dynamic linking problem. You may be able to make the GROMACS configure script link statically against your choice of MPI library with --enable-static or something similar, but only if static versions of the MPI libraries exist.
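
One way to test the linking theory is to look at which MPI shared libraries the binary resolves at run time. The sketch below assumes a Linux node with ldd available; the helper name mpi_libs is made up for illustration, and the mdrun_mpi path is the one from the job script above. The library names in the pattern are the usual Open MPI and MPICH ones, so adjust them if your build differs.

```shell
#!/bin/bash
# mpi_libs: hypothetical helper that keeps only the MPI-related lines of ldd output.
mpi_libs() {
    grep -E 'lib(mpi|open-rte|open-pal|mpich)' || true
}

# Path taken from the job script; adjust if yours differs.
MDRUN=/home/jwillia4/GRO/bin/mdrun_mpi

if [ -x "$MDRUN" ]; then
    # Run this inside the batch job so it reflects the compute node's environment.
    echo "Host: $(hostname)"
    ldd "$MDRUN" | mpi_libs
else
    echo "mdrun_mpi not found at $MDRUN"
fi
```

If the libmpi path printed in a failing job points somewhere other than your own install, or differs from what a working job prints, the wrong library is being picked up. A binary with MPI linked statically would print no such lines.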

Does it have to do with the -pe mpich2 8 line? I was previously using the Open MPI installed on the cluster for common use, but I have now downloaded everything into my home directory. The script has been adapted from the time when I didn't have my own Open MPI in my home directory. Perhaps it needs further alteration, but I don't know what.

Try things and see. We've no idea what your queueing flags are or should be doing, but involving two different MPI libraries is asking for trouble.
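
For what it's worth, a job script that keeps everything on a single Open MPI install might look roughly like the sketch below. Several things in it are assumptions rather than known facts about your cluster: that the install prefix is /home/jwillia4/GRO (inferred from the mpirun path), that the queue is Grid Engine (which sets NSLOTS to the number of granted slots), and that the mpich2 parallel environment only allocates slots. If that environment also starts MPICH2 daemons, your admins may have a different one intended for Open MPI. The variable names OMPI_PREFIX and NP are made up for illustration.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -l h_rt=47:59:00
#$ -j y
#$ -pe mpich2 8
#$ -cwd

# Assumed install prefix of the private Open MPI build (from the mpirun path).
OMPI_PREFIX=/home/jwillia4/GRO

# Put the private build first so that mpirun, its helper daemon and the
# shared libraries all come from the same installation.
export PATH="$OMPI_PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Grid Engine sets NSLOTS to the number of slots granted; fall back to 8.
NP="${NSLOTS:-8}"

# Log what will actually be used, to compare a good run with a failed one.
echo "mpirun: $(command -v mpirun || echo 'not found')"
echo "np: $NP"

if [ -x "$OMPI_PREFIX/bin/mpirun" ]; then
    "$OMPI_PREFIX/bin/mpirun" -np "$NP" "$OMPI_PREFIX/bin/mdrun_mpi" -v -s md.tpr
else
    echo "no mpirun under $OMPI_PREFIX/bin; nothing launched"
fi
```

Because of -cwd, this would need to be submitted from the directory that holds md.tpr. The two echo lines end up in the job output, so a failed run can be compared directly with a successful one.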
How would I go about checking whether MPI is running?

By running a test program. Either get a "Hello world" program from an MPI tutorial, or use one that ships with the library itself.
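
As a rough sketch of such a test, the script below writes a minimal MPI "Hello world" in C, builds it with the mpicc wrapper and runs it on 4 processes. The file name hello_mpi.c, the temporary directory and the process count are arbitrary choices for illustration, and it assumes that mpicc and mpirun from the same MPI install come first in your PATH.

```shell
#!/bin/bash
# Scratch directory for the test program (arbitrary location).
WORK=$(mktemp -d)
SRC="$WORK/hello_mpi.c"

# Minimal MPI program: each rank reports its rank, the total size and its host.
cat > "$SRC" <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);
    printf("Hello from rank %d of %d on %s\n", rank, size, host);
    MPI_Finalize();
    return 0;
}
EOF

if command -v mpicc >/dev/null 2>&1 && command -v mpirun >/dev/null 2>&1; then
    # Build and launch with the wrappers found first in PATH.
    if mpicc "$SRC" -o "$WORK/hello_mpi" && mpirun -np 4 "$WORK/hello_mpi"; then
        echo "MPI test passed"
    else
        echo "MPI test failed"
    fi
else
    echo "mpicc or mpirun not found in PATH"
fi
```

If this small program fails with the same orte_init error when submitted through the queue, the problem lies in the MPI setup rather than in GROMACS or the input files.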

Mark

If you spot anything suspicious in the above commands please let me know.

Thanks

Jenny


Quoting Chandan Choudhury <iitd...@gmail.com>:

>As Justin said, give the command line options for mdrun and also check that
>your MPI environment is running. It is better to run a parallel job and check
>its output.
>
>Chandan
>
>--
>Chandan kumar Choudhury
>NCL, Pune
>INDIA
>
>
>On Mon, Feb 8, 2010 at 8:02 PM, Justin A. Lemkul <jalem...@vt.edu> wrote:
>
>>
>>
>>Jennifer Williams wrote:
>>
>>>
>>>Dear All,
>>>
>>>I am having problems compiling gromacs 4.0.7 in parallel. I am following
>>>the Quick and Dirty Installation instructions on the gromacs webpage.
>>>I downloaded the versions of fftw, OpenMPI and gromacs-4.0.7 following
>>>these instructions.
>>>
>>>Everything seems to compile OK and I get all the serial executables
>>>including mdrun written to my bin directory and they seem to run fine.
>>>However when I try to run mdrun_mpi on 6 nodes I get the following:
>>>
>>>[vlxbig16:08666] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>>runtime/orte_init_stage1.c at line 182
>>>[vlxbig16:08667] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>>runtime/orte_init_stage1.c at line 182
>>>[vlxbig16:08700] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>>runtime/orte_init_stage1.c at line 182
>>>[vlxbig16:08670] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>>runtime/orte_init_stage1.c at line 182
>>>[vlxbig16:08681] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>>runtime/orte_init_stage1.c at line 182
>>>[vlxbig16:08659] [NO-NAME] ORTE_ERROR_LOG: Not found in file
>>>runtime/orte_init_stage1.c at line 182
>>>--------------------------------------------------------------------------
>>>It looks like orte_init failed for some reason; your parallel process is
>>>likely to abort.  There are many reasons that a parallel process can
>>>fail during orte_init; some of which are due to configuration or
>>>environment problems.  This failure appears to be an internal failure;
>>>here's some additional information (which may only be relevant to an
>>>Open MPI developer):
>>>
>>> orte_rml_base_select failed
>>> --> Returned value -13 instead of ORTE_SUCCESS
>>>
>>>
>>>Does anyone have any idea what is causing this? Computer support at my
>>>University is not sure.
>>>
>>>
>>How are you launching mdrun_mpi (command line)?
>>
>>-Justin
>>
>>
>>>Thanks
>>>
>>>
>>>
>>>
>>--
>>========================================
>>
>>Justin A. Lemkul
>>Ph.D. Candidate
>>ICTAS Doctoral Scholar
>>MILES-IGERT Trainee
>>Department of Biochemistry
>>Virginia Tech
>>Blacksburg, VA
>>jalemkul[at]vt.edu | (540) 231-9080
>>http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
>>
>>========================================
>>--
>>gmx-users mailing list    gmx-users@gromacs.org
>>http://lists.gromacs.org/mailman/listinfo/gmx-users
>>Please search the archive at http://www.gromacs.org/search before posting!
>>Please don't post (un)subscribe requests to the list. Use the www interface
>>or send it to gmx-users-requ...@gromacs.org.
>>Can't post? Read http://www.gromacs.org/mailing_lists/users.php
>>
>



Dr. Jennifer Williams
Institute for Materials and Processes
School of Engineering
University of Edinburgh
Sanderson Building
The King's Buildings
Mayfield Road
Edinburgh, EH9 3JL, United Kingdom
Phone: ++44 (0)131 650 4 861


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


--
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at http://www.gromacs.org/search before posting!
Please don't post (un)subscribe requests to the list. Use the www interface or send it to gmx-users-requ...@gromacs.org.
Can't post? Read http://www.gromacs.org/mailing_lists/users.php