Hi,
I have a question about suspending MPI jobs.
We are running Torque with Maui on a cluster.
It's running fine. The problem we have is with MPI jobs.
Suspension doesn't work on all processes:
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
791.rzcluster2       xxxxx    short    test2             26698     1  --     -- 02:30 S 01:33
   rzcl124/7+rzcl124/6+rzcl124/5+rzcl124/4+rzcl124/3+rzcl124/2+rzcl124/1+rzcl124/0
795.rzcluster2       xxxxx    batch    test              28098     1  --     --    --
[r...@rzcl124 tmp]# cat JJJ
4 T xxxxx 26698 16450 0 77 0 - 21014 - 14:20 ? 00:00:00 -bash
0 T xxxxx 26761 26698 0 77 0 - 15965 - 14:20 ? 00:00:00 /bin/bash /var/spool/pbs/mom_priv/jobs/791.rzcluster2.SC
0 T xxxxx 26773 26761 0 75 0 - 35915 - 14:20 ? 00:00:00 python2.4 /data/xxxxx/Dtesten/Drzcluster2/Dmpich2/mpich2-1.1.1/bin/mpirun -np 16 ./m1
0 S xxxxx 26790 1 51 75 0 - 5290 - 14:20 ? 00:00:30 ./m1
0 S xxxxx 26791 1 38 76 0 - 4265 - 14:20 ? 00:00:23 ./m1
0 S xxxxx 26792 1 38 75 0 - 4266 - 14:20 ? 00:00:23 ./m1
0 S xxxxx 26793 1 25 76 0 - 4265 - 14:20 ? 00:00:15 ./m1
0 S xxxxx 26794 1 25 76 0 - 4266 - 14:20 ? 00:00:15 ./m1
0 S xxxxx 26795 1 26 75 0 - 4266 - 14:20 ? 00:00:15 ./m1
0 S xxxxx 26796 1 25 75 0 - 4267 - 14:20 ? 00:00:15 ./m1
0 S xxxxx 26797 1 12 75 0 - 5290 - 14:20 ? 00:00:07 ./m1
0 S xxxxx 26798 1 12 75 0 - 5289 - 14:20 ? 00:00:07 ./m1
0 S xxxxx 26799 1 12 75 0 - 5290 - 14:20 ? 00:00:07 ./m1
0 S xxxxx 26800 1 13 75 0 - 5290 - 14:20 ? 00:00:07 ./m1
0 S xxxxx 26801 1 12 75 0 - 5289 - 14:20 ? 00:00:07 ./m1
0 S xxxxx 26802 1 13 75 0 - 5290 - 14:20 ? 00:00:07 ./m1
0 S xxxxx 26803 1 12 75 0 - 5289 - 14:20 ? 00:00:07 ./m1
0 S xxxxx 26804 1 12 75 0 - 5289 - 14:20 ? 00:00:07 ./m1
0 R xxxxx 26805 1 52 75 0 - 4320 - 14:20 ? 00:00:31 ./m1
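
In the listing above, the shell, the job script and mpirun are in state T, while the ./m1 processes have parent PID 1 and are still in state S or R. A minimal sketch of how such a listing could be checked automatically, assuming the column order shown above (flags, state, user, PID, PPID, ...); the function name unstopped_processes is made up for illustration:

```python
def unstopped_processes(ps_lines):
    """Return (pid, ppid, state) for every listed process not in state T.

    Assumes each process line starts with: flags, state, user, PID, PPID.
    Lines that do not fit this shape (e.g. wrapped command lines) are skipped.
    """
    result = []
    for line in ps_lines:
        fields = line.split()
        if len(fields) < 5 or not fields[3].isdigit() or not fields[4].isdigit():
            continue
        state, pid, ppid = fields[1], int(fields[3]), int(fields[4])
        if state != "T":
            result.append((pid, ppid, state))
    return result


# Three lines taken from the listing above.
sample = [
    "0 T xxxxx 26761 26698 0 77 0 - 15965 - 14:20 ? 00:00:00 /bin/bash /var/spool/pbs/mom_priv/jobs/791.rzcluster2.SC",
    "0 S xxxxx 26790 1 51 75 0 - 5290 - 14:20 ? 00:00:30 ./m1",
    "0 R xxxxx 26805 1 52 75 0 - 4320 - 14:20 ? 00:00:31 ./m1",
]

for pid, ppid, state in unstopped_processes(sample):
    print(pid, ppid, state)
```

Run against the full listing, this would report all sixteen ./m1 processes and none of the three stopped ones.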
After resuming the job, the processes don't start again either.
Is this a known problem?
Thank you
Alfred Wagner
Rechenzentrum der Universität Kiel
Ludewig-Meyn-Straße 4
24118 Kiel
0431-8804494 FAX:0431-8801523