[gmx-users] Can not open file: topol.tpr

Guillem Plasencia Thu, 27 Apr 2006 03:37:39 -0700

Hello listers,

This is my first try with Gromacs (3.3.1). I've installed LAM-MPI (7.1.2) and FFTW3 on my two dual Intel P4 CPU machines running Fedora Core 4 (kernel 2.6). That gives 4 physical CPUs, or 8 with hyperthreading on; I've already read in the mailing list archive that I should turn off hyperthreading until the Gromacs 4 release to improve performance.

Just to test the parallel processing, I downloaded and tried to run one of the benchmark tests (d.lzm).


I prepared it with:

grompp -f cutoff.mdp -c conf.gro -p topol.top -np 2

(Here I had to read the archives to avoid the temptation to include -nt 2, which gave me an error even with --enable-threads in the configure options.)



But when I tried to run it on my two nodes as a parallel task with:

mpirun n0,1 mdrun -s topol.tpr -np 2

I got the following output from mdrun:

NNODES=2, MYRANK=1, HOSTNAME=lead8
NNODES=2, MYRANK=0, HOSTNAME=lead7
NODEID=1 argc=5
NODEID=0 argc=5

CUT SOME MDRUN HELP INFO >>>

-------------------------------------------------------
Program mdrun, VERSION 3.3.1
Source code file: gmxfio.c, line: 706

Can not open file:
topol.tpr
-------------------------------------------------------

"I'm a Jerk" (F. Black)

Error on node 0, will try to stop all the nodes
Halting parallel program mdrun on CPU 0 out of 2

gcq#171: "I'm a Jerk" (F. Black)

-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code.  This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 27859 failed on node n1 (192.168.1.9) with exit status 1.
-----------------------------------------------------------------------------

You can see from lamnodes that node n1 is the originating node:

lamnodes

n0      lead7:2:
n1      192.168.1.9:2:origin,this_node

From ps -leaf | grep mdrun I can see that both processes have been started, but neither uses any CPU. So far, my guess is that if the originating node (n1) can't read the topol.tpr file, it can't distribute tasks among the nodes (which would cause the unknown error on node 0, the other node).
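
One way to test that guess is to check, on each node, whether topol.tpr is actually readable from the directory where mdrun starts. The following is only a minimal sketch: the check_tpr helper is hypothetical (it is not part of Gromacs or LAM-MPI), and it assumes the file keeps the name topol.tpr used in the mdrun command above.

```shell
# check_tpr DIR: report whether DIR/topol.tpr exists and is readable.
# Hypothetical helper for illustration only; not a Gromacs or LAM-MPI tool.
check_tpr() {
    dir="${1:-.}"                 # default to the current directory
    file="$dir/topol.tpr"
    if [ -r "$file" ]; then
        echo "$(uname -n): $file is readable"
        return 0
    else
        echo "$(uname -n): cannot read $file" >&2
        return 1
    fi
}
```

Running check_tpr "$PWD" by hand on each node, from the directory in which mdrun is launched, would show whether every node sees the file. If one node reports that it cannot read it, the run directory is probably not shared between the machines.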


Any ideas on what's happening? How do I solve it?

Thank you very much!

Guillem Plasencia
Spain.

P.S. I've read in the archives that there was some interest in knowing whether hyperthreading still causes poor load balancing in the Linux 2.6 kernel, which happens to be the kernel I'm running. I'd be pleased to test my nodes with HT both on and off, as soon as I solve this problem with the topol.tpr file.



