Thanks for the input. Now when I do an "squeue -tall" I get the following
output:

JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
   2774  all_part mysrun.s   kdsd03  CD       0:00      1 abc001

This would indicate that the job completed. However, the output file was not
created, which tells me that the job did not actually run.

The srun command in mysrun.sh is:

srun -N1 -o /home/kdsd03/oas/klurm/test.out /home/kdsd03/oas/klurm/test.sh input

The test.sh script basically echoes the input, so test.out should contain the
input.
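For reference, here is a minimal sketch of what test.sh might look like given
that description (the actual script contents are not shown in this thread, so
this is an assumption):

```shell
#!/bin/sh
# Hypothetical test.sh: just echo whatever arguments were passed in,
# so with "srun ... test.sh input", test.out should contain "input".
echo "$@"
```

If test.out is never created at all, the problem is upstream of this script.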

Now when I remove these abc nodes, everything seems to work fine.


On Fri, Feb 11, 2011 at 11:25 AM, Jette, Moe <[email protected]> wrote:

> squeue by default only shows running or pending jobs.
> Your job either completed or failed (error state).
> Try "squeue -tall" or "squeue --state=all" or "scontrol show job <jobid>"
>
> ________________________________________
> From: [email protected] [[email protected]] On
> Behalf Of Paul Thirumalai [[email protected]]
> Sent: Friday, February 11, 2011 10:45 AM
> To: [email protected]
> Subject: [slurm-dev] sbatch seems to have stopped working
>
> So I had a slurm setup that was working fine.
> I made the following configuration changes.
> 1. Added about 150 more nodes to the slurm setup
> 2. Added a new partition for these nodes
> 3. Added a 3rd logical partition that contains all the nodes
> 4. Changed SelectType to select/cons_res
> 5. Changed SelectTypeParameters to CR_Core_Memory.
>
> Now after I make the changes it seems as though sbatch does not work.
>
> When I submit a job using sbatch at the command line it says "Submitted
> batch job <jobid>"
> But when I do an squeue I don't see that job running.
>
> If I submit the same job using srun, it works fine.
>
> Any help would be appreciated.
>
> Thanks
> Paul
>
>
