Re: [slurm-users] sacct end time for failed jobs

2019-03-06 Thread Paul Edmon
Odds are the new version won't help with that.  You will have to do some MySQL work to fix it then. -Paul Edmon- On 3/6/2019 1:23 PM, Brian Andrus wrote: I am running the latest and did that, but it didn't change anything. The jobs stay in the runaway state and no changes are made to the

Re: [slurm-users] sacct end time for failed jobs

2019-03-06 Thread Brian Andrus
I am running the latest and did that, but it didn't change anything. The jobs stay in the runaway state and no changes are made to the database. Using 18.08.2-1. Maybe try updating to 19.05.0-0pre1? Brian On 3/6/2019 10:06 AM, Paul Edmon wrote: A lot of this is automated in the new

Re: [slurm-users] sacct end time for failed jobs

2019-03-06 Thread Paul Edmon
A lot of this is automated in the new versions of Slurm.  You should just need to run: `sacctmgr show runawayjobs` It will then give you an option to clean them and Slurm will handle the rest.  If you add the `-i` option it will just clean them automatically. -Paul Edmon- On 3/6/2019 11:58 AM,
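
The two invocations mentioned in this message can be sketched as a small wrapper. This is only an illustration: the function name `runaway_cmd` and the `RUNAWAY_MODE` variable are hypothetical, and the script just prints the command when `sacctmgr` is not installed.

```shell
#!/bin/sh
# Build the sacctmgr command for listing or cleaning runaway jobs.
# runaway_cmd and RUNAWAY_MODE are hypothetical names used for illustration.
runaway_cmd() {
    # $1: "fix" adds -i, so sacctmgr commits the cleanup without prompting;
    #     any other value only lists the jobs and asks before fixing them.
    if [ "$1" = "fix" ]; then
        echo "sacctmgr -i show runawayjobs"
    else
        echo "sacctmgr show runawayjobs"
    fi
}

cmd=$(runaway_cmd "${RUNAWAY_MODE:-list}")
if command -v sacctmgr >/dev/null 2>&1; then
    # Word splitting of $cmd is intentional: it holds a command plus arguments.
    $cmd
else
    echo "sacctmgr not found; would run: $cmd"
fi
```

Running it with `RUNAWAY_MODE=fix` corresponds to the `-i` variant; it is probably worth running the listing form first to see which jobs would be touched.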

Re: [slurm-users] sacct end time for failed jobs

2019-03-06 Thread Cyrus Proctor
Hi Brian, Others probably have better suggestions before going the route I'm about to detail. If you do go this route, be warned, you definitely have the ability to irrevocably lose data or destroy your Slurm accounting database. Do so at your own risk. I got here with Google-fu after being

Re: [slurm-users] sacct end time for failed jobs

2019-03-06 Thread Brian Andrus
It shows several jobs that all have "Unknown" for end_time. Some are PENDING and some are RUNNING (none are truly in either state). It asked to fix them, which I did, but nothing seems to have changed. They still show up with that command and in reports. Brian On 3/5/2019 10:34 PM,

Re: [slurm-users] Priority access for a group of users

2019-03-06 Thread Michael Gutteridge
It is likely that your job still does not have enough priority to preempt the scavenge job. Have a look at the output of `sprio` to see the priority of those jobs and what factors are in play. It may be necessary to increase the partition priority or adjust some of the job priority factors to
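
A minimal sketch of the `sprio` check suggested above, comparing the priority factors of the two jobs side by side. The job IDs are placeholders, `priority_cmd` is a hypothetical helper, and the script only prints the command when `sprio` is not installed.

```shell
#!/bin/sh
# Build an sprio command that shows per-factor priorities for specific jobs.
# priority_cmd is a hypothetical helper; 1234 and 5678 are placeholder job IDs.
priority_cmd() {
    # $1: comma-separated job IDs
    # -l selects the long format, with one column per priority factor.
    printf 'sprio -l -j %s\n' "$1"
}

cmd=$(priority_cmd "1234,5678")
if command -v sprio >/dev/null 2>&1; then
    # Word splitting of $cmd is intentional: it holds a command plus arguments.
    $cmd
else
    echo "sprio not found; would run: $cmd"
fi
```

If the partition column dominates the totals, `scontrol show partition` should show the priority settings currently applied to each partition.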