It sounds like the second option (partition state on jobid or ...) would be
a great general solution.  Would people here be interested in a patch for
this?
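
For what it's worth, here is a rough sketch of the kind of change I have in mind. The helper name `build_job_dir` and the `hash.<digit>` layout are my own invention, not existing SLURM code; the real patch would touch the "/environment" path-building code in src/slurmctld/job_mgr.c:

```c
#include <stdio.h>

/* Sketch only: bucket job state directories by the last digit of the
 * job ID, so each of the ten hash.<digit> subdirectories holds roughly
 * a tenth of the job directories -- keeping each one well under the
 * ext3 limit of ~32k subdirectories per directory. */
static int build_job_dir(char *buf, size_t len,
                         const char *state_dir, unsigned int job_id)
{
    unsigned int bucket = job_id % 10;  /* last digit of the job ID */

    /* e.g. job 3258740 -> <state_dir>/hash.0/job.3258740 */
    return snprintf(buf, len, "%s/hash.%u/job.%u",
                    state_dir, bucket, job_id);
}
```

The ten hash.<digit> directories could be created once at slurmctld startup, so the per-job code only ever does a single mkdir inside the right bucket.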

Cheers
Clay

On Wed, May 30, 2012 at 1:03 PM, Moe Jette <[email protected]> wrote:

>
> Oddly enough, I ran across this problem just yesterday on an old
> CentOS distro.
> No great solutions, but here are some options:
> * Upgrade the OS
> * Modify SLURM to spread out the job directories into subdirectories,
> say using a subdirectory based upon the last digit of the job ID. This
> applies to code in only a couple of places, so it should be pretty
> simple (search for "/environment" in src/slurmctld/job_mgr.c)
> * Configure MaxJobCount=32000 in slurm.conf and force users to reduce the load
> * The directories are created only for batch jobs, so if you can run
> interactive jobs (srun/salloc) this limit would not apply
>
>
> Quoting Clay Teeter <[email protected]>:
>
> > Thanks for the quick response!  Given that our system is ext3 using a 2.6
> > kernel, is there anything that we can do to configure slurm not to create
> > 32K directories/jobs in /var/slurm/state/?
> >
> > Cheers,
> > Clay
> >
> > On Wed, May 30, 2012 at 10:56 AM, Moe Jette <[email protected]> wrote:
> >
> >>
> >> See:
> >> http://superuser.com/questions/298420/cannot-mkdir-too-many-links
> >>
> >> With Ubuntu 12.04 (Linux 3.2.0-24) the limit is at least 200k rather than
> >> 32k.
> >>
> >> Quoting Clay Teeter <[email protected]>:
> >>
> >> > Hi Group,
> >> >
> >> > Anyone know how I might troubleshoot this error message?
> >> >
> >> > [2012-05-15T19:34:27] _slurm_rpc_submit_batch_job: I/O error writing
> >> > script/environment to file
> >> > [2012-05-15T19:34:28] error: mkdir(/var/slurm/state/job.3258740) error Too many links
> >> >
> >> > Cheers,
> >> > Clay
> >> >
> >>
> >>
> >
>
>
