Hi Jim, I'm puzzled:
On Wednesday 22 February 2006 16:12, Jim Summers wrote:
> -bash-3.00$ cat testrun.pbs.o24
> p0_28782: (60.473654) Procgroup:
> p0_28782: (60.473807) entry 0: node15.oscardomain 0 0
> /home/tmac2/clib/node-test tmac2
> p0_28782: (60.473846) entry 1: rhel4.ehpctc.intern 1 1
> /home/tmac2/clib/node-test tmac2
> p0_28782: (60.473868) entry 2: rhel4.ehpctc.intern 1 2
> /home/tmac2/clib/node-test tmac2
> p0_28782: (60.473888) entry 3: rhel4.ehpctc.intern 1 3
> /home/tmac2/clib/node-test tmac2
> p0_28782: (60.473909) entry 4: rhel4.ehpctc.intern 1 4
> /home/tmac2/clib/node-test tmac2
> p0_28782: (60.473930) entry 5: rhel4.ehpctc.intern 1 5
> /home/tmac2/clib/node-test tmac2
> p0_28782: p4_error: Could not gethostbyname for host
> rhel4.ehpctc.intern; may be invalid name

rhel4.ehpctc.intern is one of my machines. Where does this output come from?

Could you have a look at the file

/opt/mpich-ch_p4-gcc-1.2.7/share/machines.LINUX

It should contain a list of YOUR nodes.

Are you using the $PBS_NODEFILE variable in your qsub script? It points to
the list of nodes that the queuing system assigned to your job. In your
script you should use something like

mpirun -machinefile $PBS_NODEFILE your_executable

Otherwise you will not use the nodes assigned to you by the queuing system,
but the default nodes from the machines.LINUX file. That file should actually
contain YOUR nodes, but maybe something went wrong in the post_clients script
of mpich...

Hope it helps...

Best regards,
Erich
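P.S. In case it is useful, a qsub script that uses $PBS_NODEFILE could look roughly like the sketch below. Please treat it as an illustration only: the job name, the nodes=3:ppn=2 request and ./your_executable are placeholders I made up, not values from your setup, and count_slots is just a small helper for this example.

```shell
#!/bin/bash
#PBS -N node-test
#PBS -l nodes=3:ppn=2
# Assumption: the job name and the nodes/ppn request above are placeholders.

# Hypothetical program path; replace it with your own MPI executable.
EXE="${EXE:-./your_executable}"

# Count process slots: the node file has one line per assigned slot.
count_slots() {
    grep -c . "$1"
}

if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    # PBS starts the job in $HOME, so go back to the submission directory.
    cd "$PBS_O_WORKDIR" || exit 1
    echo "Nodes assigned by the queuing system:"
    sort -u "$PBS_NODEFILE"
    # Use the nodes of this job instead of the defaults in machines.LINUX.
    mpirun -np "$(count_slots "$PBS_NODEFILE")" -machinefile "$PBS_NODEFILE" "$EXE"
else
    echo "PBS_NODEFILE is not set; submit this script with qsub." >&2
fi
```

The list printed by sort -u should then show only your own node names. If a foreign host still appears there, the problem is on the queuing system side rather than in machines.LINUX.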
