Re: [gmx-users] compilation problems orte error
Sorry for the delay in replying.

I start the job using the following script file:

    #$ -S /bin/bash
    #$ -l h_rt=47:59:00
    #$ -j y
    #$ -pe mpich2 8
    #$ -cwd
    cd /home/jwillia4/GRO/gromacs-4.0.7/JJW_003/PH_TORUN
    /home/jwillia4/GRO/bin/mpirun -np 8 /home/jwillia4/GRO/bin/mdrun_mpi -v -s md.tpr

The strange thing is that sometimes it works and the job runs to completion, and sometimes it crashes immediately with the orte error, so I know that the input files are not causing the problem. It seems entirely random.

Does it have to do with the -pe mpich2 8 line? I was previously using the Open MPI installed on the cluster for common use, but I have now downloaded everything into my home directory. The script was adapted from the time when I didn't have my own Open MPI in my home directory. Perhaps it needs further alteration, but I don't know what.

How would I go about checking whether MPI is running?

If you spot anything suspicious in the above commands, please let me know.

Thanks
Jenny

Quoting Chandan Choudhury iitd...@gmail.com:

> As Justin said, give the command line options for mdrun and also check that your MPI environment is running. Better to run a parallel job and check its output.
>
> Chandan
>
> --
> Chandan kumar Choudhury
> NCL, Pune
> INDIA
>
> On Mon, Feb 8, 2010 at 8:02 PM, Justin A. Lemkul jalem...@vt.edu wrote:
>
>> Jennifer Williams wrote:
>>
>>> Dear All,
>>>
>>> I am having problems compiling gromacs 4.0.7 in parallel. I am following the Quick and Dirty Installation instructions on the gromacs webpage. I downloaded the versions of fftw, OpenMPI and gromacs-4.0.7 following these instructions.
>>>
>>> Everything seems to compile OK and I get all the serial executables, including mdrun, written to my bin directory, and they seem to run fine.
>>>
>>> However, when I try to run mdrun_mpi on 6 nodes I get the following:
>>>
>>> [vlxbig16:08666] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>>> [vlxbig16:08667] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>>> [vlxbig16:08700] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>>> [vlxbig16:08670] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>>> [vlxbig16:08681] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>>> [vlxbig16:08659] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>>> --
>>> It looks like orte_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during orte_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
>>>
>>> orte_rml_base_select failed
>>> -- Returned value -13 instead of ORTE_SUCCESS
>>>
>>> Does anyone have any idea what is causing this? Computer support at my University is not sure.
>>
>> How are you launching mdrun_mpi (command line)?
>>
>> -Justin
>>
>>> Thanks
>>
>> --
>> Justin A. Lemkul
>> Ph.D. Candidate
>> ICTAS Doctoral Scholar
>> MILES-IGERT Trainee
>> Department of Biochemistry
>> Virginia Tech
>> Blacksburg, VA
>> jalemkul[at]vt.edu | (540) 231-9080
>> http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
>>
>> --
>> gmx-users mailing list gmx-users@gromacs.org
>> http://lists.gromacs.org/mailman/listinfo/gmx-users
>> Please search the archive at http://www.gromacs.org/search before posting!
>> Please don't post (un)subscribe requests to the list. Use the www interface or send it to gmx-users-requ...@gromacs.org.
>> Can't post? Read http://www.gromacs.org/mailing_lists/users.php

Dr. Jennifer Williams
Institute for Materials and Processes
School of Engineering
University of Edinburgh
Sanderson Building
The King's Buildings
Mayfield Road
Edinburgh, EH9 3JL, United Kingdom
Phone: ++44 (0)131 650 4 861

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
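With a home-directory install, one thing that may be worth ruling out is the job picking up a different mpirun or different MPI libraries than the ones mdrun_mpi was built against. Below is a minimal sketch, not a confirmed fix, of putting the private install first in the environment before launching. The bin path comes from the script above; the lib subdirectory, the GRO_PREFIX variable, and the helper name use_private_mpi are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: put a private MPI install ahead of any system-wide one.
# Assumes the usual <prefix>/bin and <prefix>/lib layout (not confirmed in the thread).
GRO_PREFIX="${GRO_PREFIX:-/home/jwillia4/GRO}"

use_private_mpi() {
    local prefix="$1"
    export PATH="$prefix/bin:$PATH"
    # Append the old value only if it was non-empty, to avoid a trailing colon.
    export LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

use_private_mpi "$GRO_PREFIX"

# Report which launcher the shell would now pick up.
command -v mpirun || echo "mpirun not found on PATH"
```

If something like this were placed in the job script before the mpirun line, the path printed by the last command should be the one under the home directory; if it is not, the environment on the compute node differs from what the script expects.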
Re: [gmx-users] compilation problems orte error
On 02/10/10, Jennifer Williams jennifer.willi...@ed.ac.uk wrote:

> Sorry for the delay in replying. I start the job using the following script file:
>
>     #$ -S /bin/bash
>     #$ -l h_rt=47:59:00
>     #$ -j y
>     #$ -pe mpich2 8
>     #$ -cwd
>     cd /home/jwillia4/GRO/gromacs-4.0.7/JJW_003/PH_TORUN
>     /home/jwillia4/GRO/bin/mpirun -np 8 /home/jwillia4/GRO/bin/mdrun_mpi -v -s md.tpr
>
> The strange thing is that sometimes it works and the job runs to completion, and sometimes it crashes immediately with the orte error, so I know that the input files are not causing the problem. It seems entirely random.

That sounds like some kind of dynamic linking problem. You may be able to constrain the GROMACS configure program to link statically to your choice of MPI library with --enable-static or something - but only if static versions of the MPI libraries exist.

> Does it have to do with the -pe mpich2 8 line? I was previously using the Open MPI installed on the cluster for common use, but I have now downloaded everything into my home directory. The script was adapted from the time when I didn't have my own Open MPI in my home directory. Perhaps it needs further alteration, but I don't know what.

Try things and see. We've no idea what your queueing flags are or should be doing, but involving two different MPI libraries is asking for trouble.

> How would I go about checking whether MPI is running?

By running a test program. Either get a Hello world program from an MPI tutorial, or perhaps something available with the library itself.

Mark

> If you spot anything suspicious in the above commands, please let me know.
>
> Thanks
> Jenny
>
> Quoting Chandan Choudhury iitd...@gmail.com:
>
>> As Justin said, give the command line options for mdrun and also check that your MPI environment is running. Better to run a parallel job and check its output.
>>
>> Chandan
>>
>> --
>> Chandan kumar Choudhury
>> NCL, Pune
>> INDIA
>>
>> On Mon, Feb 8, 2010 at 8:02 PM, Justin A. Lemkul jalem...@vt.edu wrote:
>>
>>> Jennifer Williams wrote:
>>>
>>>> Dear All,
>>>>
>>>> I am having problems compiling gromacs 4.0.7 in parallel. I am following the Quick and Dirty Installation instructions on the gromacs webpage. I downloaded the versions of fftw, OpenMPI and gromacs-4.0.7 following these instructions.
>>>>
>>>> Everything seems to compile OK and I get all the serial executables, including mdrun, written to my bin directory, and they seem to run fine.
>>>>
>>>> However, when I try to run mdrun_mpi on 6 nodes I get the following:
>>>>
>>>> [vlxbig16:08666] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>>>> [vlxbig16:08667] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>>>> [vlxbig16:08700] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>>>> [vlxbig16:08670] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>>>> [vlxbig16:08681] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>>>> [vlxbig16:08659] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>>>> --
>>>> It looks like orte_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during orte_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
>>>>
>>>> orte_rml_base_select failed
>>>> -- Returned value -13 instead of ORTE_SUCCESS
>>>>
>>>> Does anyone have any idea what is causing this? Computer support at my University is not sure.
>>>
>>> How are you launching mdrun_mpi (command line)?
>>>
>>> -Justin
>>>
>>>> Thanks
>>>
>>> --
>>> Justin A. Lemkul
>>> Ph.D. Candidate
>>> ICTAS Doctoral Scholar
>>> MILES-IGERT Trainee
>>> Department of Biochemistry
>>> Virginia Tech
>>> Blacksburg, VA
>>> jalemkul[at]vt.edu | (540) 231-9080
>>> http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
>
> Dr. Jennifer Williams
> Institute for Materials and Processes
> School of Engineering
> University of Edinburgh
> Sanderson Building
> The King's Buildings
> Mayfield Road
> Edinburgh, EH9 3JL, United Kingdom
> Phone: ++44 (0)131 650 4 861
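As a rough illustration of the two checks suggested in this message (which MPI libraries the binary resolves at run time, and whether the launcher can start processes at all), here is a short sketch. The mdrun_mpi path is taken from the job script; the helper name filter_mpi_libs, the MDRUN variable, and the library name patterns (the usual Open MPI ones) are assumptions. Running hostname under mpirun only exercises the launcher, so it complements an MPI "Hello world" program rather than replacing it.

```shell
#!/bin/bash
# Keep only the MPI runtime libraries from ldd-style output read on stdin.
# The names matched are the usual Open MPI ones; adjust if yours differ.
filter_mpi_libs() {
    grep -E 'libmpi|libopen-rte|libopen-pal' || true
}

# Path taken from the job script in this thread; override MDRUN to test another binary.
MDRUN="${MDRUN:-/home/jwillia4/GRO/bin/mdrun_mpi}"

if [ -x "$MDRUN" ]; then
    # Shows which MPI installation the dynamic linker would actually load.
    ldd "$MDRUN" | filter_mpi_libs
else
    echo "skipping ldd check: $MDRUN not found"
fi

# Launch two copies of a trivial non-MPI program. This exercises the launcher
# only; a real MPI "hello world" is still needed to test MPI_Init itself.
if command -v mpirun >/dev/null 2>&1; then
    mpirun -np 2 hostname || echo "mpirun could not start the processes"
else
    echo "skipping launch test: mpirun not on PATH"
fi
```

If the ldd output lists libraries from the cluster-wide install rather than the home directory, or an empty list for a statically linked binary, that would be consistent with (though not proof of) the mixed-library situation described above. Running the same check inside a queued job would show what the compute node sees, which may differ from the login node.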
Re: [gmx-users] compilation problems orte error
Jennifer Williams wrote:

> Dear All,
>
> I am having problems compiling gromacs 4.0.7 in parallel. I am following the Quick and Dirty Installation instructions on the gromacs webpage. I downloaded the versions of fftw, OpenMPI and gromacs-4.0.7 following these instructions.
>
> Everything seems to compile OK and I get all the serial executables, including mdrun, written to my bin directory, and they seem to run fine.
>
> However, when I try to run mdrun_mpi on 6 nodes I get the following:
>
> [vlxbig16:08666] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
> [vlxbig16:08667] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
> [vlxbig16:08700] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
> [vlxbig16:08670] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
> [vlxbig16:08681] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
> [vlxbig16:08659] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
> --
> It looks like orte_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during orte_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
>
> orte_rml_base_select failed
> -- Returned value -13 instead of ORTE_SUCCESS
>
> Does anyone have any idea what is causing this? Computer support at my University is not sure.

How are you launching mdrun_mpi (command line)?

-Justin

> Thanks

--
Justin A. Lemkul
Ph.D. Candidate
ICTAS Doctoral Scholar
MILES-IGERT Trainee
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
Re: [gmx-users] compilation problems orte error
As Justin said, give the command line options for mdrun and also check that your MPI environment is running. Better to run a parallel job and check its output.

Chandan

--
Chandan kumar Choudhury
NCL, Pune
INDIA

On Mon, Feb 8, 2010 at 8:02 PM, Justin A. Lemkul jalem...@vt.edu wrote:

> Jennifer Williams wrote:
>
>> Dear All,
>>
>> I am having problems compiling gromacs 4.0.7 in parallel. I am following the Quick and Dirty Installation instructions on the gromacs webpage. I downloaded the versions of fftw, OpenMPI and gromacs-4.0.7 following these instructions.
>>
>> Everything seems to compile OK and I get all the serial executables, including mdrun, written to my bin directory, and they seem to run fine.
>>
>> However, when I try to run mdrun_mpi on 6 nodes I get the following:
>>
>> [vlxbig16:08666] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>> [vlxbig16:08667] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>> [vlxbig16:08700] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>> [vlxbig16:08670] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>> [vlxbig16:08681] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>> [vlxbig16:08659] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
>> --
>> It looks like orte_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during orte_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
>>
>> orte_rml_base_select failed
>> -- Returned value -13 instead of ORTE_SUCCESS
>>
>> Does anyone have any idea what is causing this? Computer support at my University is not sure.
>
> How are you launching mdrun_mpi (command line)?
>
> -Justin
>
>> Thanks
>
> --
> Justin A. Lemkul
> Ph.D. Candidate
> ICTAS Doctoral Scholar
> MILES-IGERT Trainee
> Department of Biochemistry
> Virginia Tech
> Blacksburg, VA
> jalemkul[at]vt.edu | (540) 231-9080
> http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin