-------- Original Message --------
Subject: Re: [Oscar-users] problem killing a running job
From: "David Gutierrez" <[EMAIL PROTECTED]>
Date: Fri, July 11, 2003 5:00 pm
To: <[EMAIL PROTECTED]>

I finally got rid of the job in qstat, but now, after rebooting everything, I am getting the following messages:

PBS Job Id: 109.oscarnode1.oscardomain
Job Name:   PBSMPIcpiTEST
Aborted by PBS Server
Job cannot be executed
See Administrator for help

I looked at the MOM logs in /var/spool/pbs/.... and there is nothing unusual there, but in the server log I got what is in the attached log file.

david

> I had the same problem and I had to go manually and clear the queue of
> the PBS system.
>
> If I remember correctly, I
>
> 1. stopped pbs on the master
> 2. went to the spool directory on the master and all the slave nodes
> (/var/spool/pbs/spool) and deleted everything left over. You may want
> to do a recursive grep in /var/spool/pbs for "91" to make sure there
> is nothing left there.
> 3. I then rebooted everything too :-)
>
> Good luck, Yannis
>
> On Fri, 11 Jul 2003, David Gutierrez wrote:
>
>> I have tried a lot of things but still nothing. Every time I do a qdel
>> JOBID I get a message like this:
>>
>> PBS JOB ID: JOBID
>> Job Name: JOB NAME
>> Job deleted at request of [EMAIL PROTECTED]
>>
>> I tried qsig on the job, then pbsnodes -o, then rebooting the nodes,
>> and then pbsnodes -c, but nothing.
>>
>> When I do qstat I see:
>>
>> Job id           Name             User             Time Use S Queue
>> ---------------- ---------------- ---------------- -------- - -----
>> 91.oscarnode1    PIPI             oscartst                0 R workq
>>
>> and when I do pbsnodes -a I get:
>>
>> oscarnode2.oscardomain
>>      state = free
>>      np = 2
>>      properties = all
>>      ntype = cluster
>>      jobs = 0/91.oscarnode1.oscardomain
>>
>> oscarnode3.oscardomain
>>      state = free
>>      np = 2
>>      properties = all
>>      ntype = cluster
>>      jobs = 0/91.oscarnode1.oscardomain
>>
>> oscarnode4.oscardomain
>>      state = free
>>      np = 2
>>      properties = all
>>      ntype = cluster
>>      jobs = 0/91.oscarnode1.oscardomain
>>
>> oscarnode5.oscardomain
>>      state = free
>>      np = 2
>>      properties = all
>>      ntype = cluster
>>
>> oscarnode6.oscardomain
>>      state = free
>>      np = 2
>>      properties = all
>>      ntype = cluster
>>
>> oscarnode7.oscardomain
>>      state = free
>>      np = 2
>>      properties = all
>>      ntype = cluster
>>
>> oscarnode8.oscardomain
>>      state = state-unknown,down
>>      np = 2
>>      properties = all
>>      ntype = cluster
>>
>> I am unable to run any other job on the cluster because of this.
>>
>> What can I do?
>>
>> David
>>
>> > Check to see if any of the nodes is down.
>> >
>> > --- David Gutierrez <[EMAIL PROTECTED]> wrote:
>> >> I tried qdel JOBID, and I received a message that says:
>> >>
>> >> PBS Job Id: 91.oscarnode1.oscardomain
>> >> Job Name: PIPI
>> >> Job deleted at request of [EMAIL PROTECTED]
>> >>
>> >> but the problem is that I still see this job as running every time
>> >> I run qstat, and the other jobs in the queue do not go to a run
>> >> state.
>> >>
>> >> david
>> >>
>> >> but the
>> >> > qdel JOBID?
>> >> >
>> >> > Jeremy
>> >> >
>> >> > At 10:26 PM 7/10/2003 -0400, David Gutierrez wrote:
>> >> >> hi:
>> >> >>
>> >> >> I have a problem killing a running job.
>> >> >>
>> >> >> I have tried qsig jobid, but nothing. It always appears as
>> >> >> running every time I do qstat.
>> >> >>
>> >> >> any idea?
>> >> >>
>> >> >> david
>> >> >>
>> >> >> -------------------------------------------------------
>> >> >> This SF.Net email sponsored by: Parasoft
>> >> >> Error proof Web apps, automate testing & more.
>> >> >> Download & eval WebKing and get a free book.
>> >> >> www.parasoft.com/bulletproofapps1
>> >> >> _______________________________________________
>> >> >> Oscar-users mailing list
>> >> >> [EMAIL PROTECTED]
>> >> >> https://lists.sourceforge.net/lists/listinfo/oscar-users
>> >>
>> >> --
>> >> Ing.David Gutierrez Diaz
>>
>> --
>> Ing.David Gutierrez Diaz
>
> ---------------------
> "Ama et fac quod vis"

--
Ing.David Gutierrez Diaz

Attachment: 20030712
Description: Binary data
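
For anyone hitting the same stuck job, here is a rough Python sketch of two checks from the thread: the recursive search of /var/spool/pbs that Yannis suggests, and a scan of `pbsnodes -a` output for nodes that still list the job. Nothing here comes from PBS itself. The function names `find_job_leftovers` and `nodes_holding_job` are made up for illustration, and the parser assumes the usual layout where the node name is unindented and its attributes are indented beneath it. It only reports what it finds and deletes nothing.

```python
import os
import re


def find_job_leftovers(spool_root, job_id):
    """Return sorted paths under spool_root whose name or contents mention job_id.

    Hypothetical helper: a plain substring search, like a recursive grep,
    so "91.oscarnode1" would also match "191.oscarnode1".
    """
    needle = job_id.encode()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(spool_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if job_id in name:
                hits.append(path)
                continue
            try:
                # Spool files are assumed small enough to read in one go.
                with open(path, "rb") as handle:
                    if needle in handle.read():
                        hits.append(path)
            except OSError:
                # Unreadable or vanished file: skip it rather than abort.
                continue
    return sorted(hits)


def nodes_holding_job(pbsnodes_text, job_id):
    """Return node names whose 'jobs =' line lists job_id.

    Assumes `pbsnodes -a` style text: unindented node name, indented attributes.
    """
    holders = []
    node = None
    for line in pbsnodes_text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            node = line.strip()  # an unindented line starts a new node record
        elif node and re.match(r"\s*jobs\s*=", line) and job_id in line:
            holders.append(node)
    return holders
```

One way to use it would be to stop PBS first, call `find_job_leftovers("/var/spool/pbs", "91.oscarnode1")` as root on the master and on each node, and review the list by hand before removing anything. Searching for the fuller job id rather than just "91" should cut down on false matches.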
