Hi all,

Today a single user submitted 7000 jobs, and now squeue and scancel both fail with the error message "Insane message length". According to a previous topic on the slurm-devel list <https://groups.google.com/forum/#!searchin/slurm-devel/Insane$20message$20length|sort:relevance/slurm-devel/7gyGUEg3zWg/4cxCPzRMMc8J>, this happens because MAX_MSG_SIZE defines a total size of 16 MB (our Slurm version is 2.2.7), which these 7000 jobs exceed. I was not able to cancel even a single job with scancel. With sacct, however, I was able to retrieve the JobIDs of all the jobs in the queue.

My questions are:

1. If I stop the Slurm control daemon and then start it with the startclean option, will I lose all the jobs, or only the pending ones?
2. Is there a way to cancel all the pending jobs without also cancelling the running ones? I have 1000 jobs running at the moment and I would like to preserve them.
3. Would it be possible to stop slurmctld and then manually delete the pending jobs from /var/slurm-clustername?

Thanks in advance.
Juan Pancorbo
