On Thu, 15 Sep 2011 10:16:26 -0700, "Mark A. Grondona" <[email protected]> wrote:
>
> I'm not sure if this will be useful or not, but your use case reminded
> me of a project by Jim Garlick awhile back called "industrial strength
> pipes" (ISP). This project allows you to set up a chain of dependent
> tasks much like a UNIX pipeline, and it has some kind of support for
> spawning the tasks in the pipeline with srun(1). It might not exactly
> map to your use case, but I thought I'd mention it nonetheless.
I did mean to send the URL to ISP:

http://isp.sourceforge.net/report.pdf

mark

> Another project that this discussion reminded me of was a set of
> scripts I wrote awhile back to run a personal instance of SLURM
> as a SLURM job. When this nested SLURM instance was launched, it
> then appeared to commands running within the job that a full SLURM
> cluster of however many nodes were in the job was available. You
> could then submit multiple batch jobs to this nested instance (even
> another request for a nested SLURM).
>
> The solution was kind of kludgy though, and a proper implementation
> was never accepted into SLURM proper, so unfortunately no such
> support exists today.
>
> mark
>
> On Wed, 14 Sep 2011 15:09:47 -0700, Yuri D'Elia <[email protected]> wrote:
> > On Wed, 14 Sep 2011 10:44:36 -0700, Danny Auble wrote:
> > > Have you had a look at the HTC documentation?
> > >
> > > http://schedmd.com/slurmdocs/high_throughput.html
> >
> > Yes, I have. I was able to improve the scheduling speed by tuning the
> > configuration (before that, I couldn't even queue 65k jobs before
> > getting timeouts and abysmal performance). Meanwhile, I will update to
> > 2.2 to get larger job counts, but still that doesn't address all my
> > concerns. Please be patient :)
> >
> > > Without knowing what your real objective is it is hard to prescribe a
> > > real solution.
> > >
> > > From your description it seems strange you would have the script
> > > sbatch is calling call sbatch once again. What are you trying to
> > > accomplish there?
> > > Wouldn't it just be easier to run this script outside of an
> > > allocation?
> >
> > Ok, I will restate my problem in a more practical manner. Please ask if
> > there's any question or any idea on how to improve the behavior.
> >
> > I'm running bioinformatic batches of various kinds on genetic data.
> > A typical analysis will involve running a short batch (~10 minutes)
> > multiplied for each polymorphism we have (roughly 100k times in the
> > smallest case). A perfect candidate for distribution, since every
> > step in a single stage is independent.
> >
> > Analyses are usually multi-stage:
> >
> > - we run "stage 1" (first 100k jobs)
> > - collect and aggregate data (a single job depending on "stage 1")
> > - run "stage 2" using collected data (another 100k jobs)
> > - (repeat)
> >
> > Let's assume queuing ~200k jobs is not a problem with 2.2.
> >
> > First issue: "squeue" takes forever with more than 5000 jobs. If more
> > than one user is scheduling a workflow like this, it becomes
> > impossible to use at all. Managing the queue itself is also
> > impossible (e.g. killing just "stage 1"). I would like to group the
> > first 100k jobs under a single "id", so that I know that jobs 1-100k
> > belong to "stage 1".
> >
> > My impression from reading the docs is that I can create an
> > allocation and run "steps" to achieve this behavior. sbatch or salloc
> > is the easiest way, but since queuing that many jobs is also
> > time-consuming, running the queuing script on the queue itself seemed
> > a perfect solution (hence sbatch --jobid within sbatch). This method
> > (using salloc or sbatch) also seems to work fine if I put in a fat
> > "sleep" to keep the allocation alive.
> >
> > Also, consider that eventually I will need to queue jobs from within
> > a script anyway (the ending step of "stage 1" might be scheduling
> > "stage 2" itself).
> >
> > Second issue: job dependencies. If I can use a single job with steps,
> > I can easily make "stage 2" depend on a single id and schedule
> > everything "outside" of slurm. If this is not possible, then I need a
> > barrier (like "wait" in a script, as you suggested) so that as soon
> > as "stage 1" finishes I can schedule the next stages from within the
> > batch itself.
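The dependency barrier described above can also be expressed with sbatch's --dependency=afterok option, collecting the stage-1 job ids and handing them to the aggregation job. A minimal dry-run sketch (the script names stage1.sh/collect.sh and the three-job loop are illustrative; the "echo" prefix prints the commands instead of submitting them; --parsable, which makes sbatch print just the job id, is only available in newer releases, so older ones need to parse "Submitted batch job N"):

```shell
#!/bin/sh
# Dry-run sketch: SBATCH prints each command instead of submitting it.
# Drop the "echo" to submit for real.
SBATCH="echo sbatch"

submit_pipeline() {
    deps=""
    for i in 1 2 3; do              # stand-in for the ~100k stage-1 jobs
        # real use: jobid=$(sbatch --parsable stage1.sh "$i")
        jobid=$i                    # fake id for the dry run
        $SBATCH stage1.sh "$i"
        deps="$deps:$jobid"
    done
    # the aggregation job starts only after every listed job succeeds
    $SBATCH --dependency=afterok$deps collect.sh
}

submit_pipeline
```

This schedules everything up front, so the barrier lives in the scheduler rather than in a script that has to stay running; it does not, however, group the 100k jobs under one id, which is the part steps (or, in later SLURM releases, job arrays) would address.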
> > Right now, to work around these issues, I'm artificially limiting
> > the jobs by scheduling N/Z jobs, where each resulting job runs Z
> > steps sequentially. This limits parallelism, however. To work around
> > the dependency issues, I'm looping with a script around "squeue" to
> > see if a pre-determined stage has finished. Ugly, but having people
> > wait to schedule more jobs (and thus letting the machines idle) is
> > worse.
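For readers finding this thread in the archives, the two halves of that workaround can be sketched as shell functions. The task command, job name, chunk size, and poll interval are all illustrative; squeue's -h (--noheader) and -n (--name) filters are standard options:

```shell
#!/bin/sh
# (1) Chunking: instead of one job per task, each submitted job runs
#     Z consecutive tasks, cutting the queue from N jobs to N/Z.
run_chunk() {                       # usage: run_chunk <first_task> <Z>
    first=$1; z=$2
    i=0
    while [ "$i" -lt "$z" ]; do
        task=$((first + i))
        echo "running task $task"   # real use: ./analyze_snp "$task"
        i=$((i + 1))
    done
}

# (2) Polling barrier: block until no jobs named "$1" remain in the
#     queue, then the caller can submit the next stage.
wait_for_stage() {
    while [ -n "$(squeue -h -n "$1" 2>/dev/null)" ]; do
        sleep 60
    done
}

# usage: wait_for_stage stage1 && sbatch stage2.sh
```

Filtering squeue by job name is what makes "a pre-determined stage" detectable: every stage-1 job is submitted with the same --job-name, so an empty listing means the stage has drained.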
