Dear all,
I am running a parallel pw.x calculation on a Linux cluster managed by SLURM.
Once I submit the job with sbatch filename.srm, the calculation starts running
and then stops with the following error:
"mpiexec_veredas5: cannot connect to local mpd (/tmp/mpd2.console_sushil); possible causes:
  1. no mpd is running on this host
  2. an mpd is running but was started without a "console" (-n option)
In case 1, you can start an mpd on this host with:
    mpd &
and you will be able to run jobs just on this host.
For more details on starting mpds on a set of hosts, see
the MPICH2 Installation Guide."
I saw a previous message about this error, posted in 2009, and followed what
Prof. Andrea did there: I created a file elie.mpd.hosts containing a single
line (localhosts), ran the command mpdboot -f ~/elie.mpd.hosts, and then ran
the sbatch command again. Within 3 seconds the job stops with the same error.
Can anyone help?
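For reference, the steps described above can be sketched as the shell commands below. This is only a sketch under stated assumptions: the file name, its single entry, and the job script name are taken from this message as written, while the mpdtrace check and the guards on whether the commands are installed are additions for illustration.

```shell
# Hosts file with the single line reported in this message (entry kept as written).
hosts_file="$HOME/elie.mpd.hosts"
printf '%s\n' "localhosts" > "$hosts_file"

# Start the mpd ring from the hosts file, if the MPICH2 mpd tools are on PATH.
if command -v mpdboot >/dev/null 2>&1; then
    mpdboot -f "$hosts_file"
    mpdtrace    # added check: lists the hosts that are part of the running ring
fi

# Resubmit the job, if SLURM is available on this machine.
if command -v sbatch >/dev/null 2>&1; then
    sbatch filename.srm
fi
```

One point that may be worth checking: mpdboot starts the ring on the hosts listed in the file, and these are not necessarily the node that SLURM assigns to the batch job.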
N.B.: veredas5 is the node on which I am executing the command, but I get the
same error whichever node I try.
Elie Moujaes
University of Nottingham
NG7 2RD
UK