No problem bumping it up. The limit was just there to help catch something
bad happening in the data handling (e.g. bad pointers or the like).
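
For illustration only, here is a minimal sketch of that kind of sanity check on a length-prefixed message. It is not the actual Slurm source: the 4-byte big-endian length header and the helper name check_msg_length are assumptions made for the example. The MAX_MSG_SIZE value and the error text are the ones quoted elsewhere in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohl */

/* Value quoted in this thread; raising it relaxes the check below. */
#define MAX_MSG_SIZE (16 * 1024 * 1024)

/*
 * Hypothetical helper (not a Slurm function): decode a 4-byte
 * network-order length header and reject implausible values before
 * any buffer is allocated for the message body.
 * Returns 0 and stores the length on success, -1 on an insane length.
 */
static int check_msg_length(const unsigned char header[4], uint32_t *len_out)
{
    uint32_t net_len;

    assert(header != NULL && len_out != NULL);

    /* Copy to avoid unaligned access, then convert to host byte order. */
    memcpy(&net_len, header, sizeof(net_len));
    uint32_t len = ntohl(net_len);

    /* A corrupted header (bad pointer, stray bytes) usually decodes to
     * a huge number, so an upper bound catches it cheaply. */
    if (len > MAX_MSG_SIZE) {
        fprintf(stderr, "error: Insane message length\n");
        return -1;
    }

    *len_out = len;
    return 0;
}
```

With a check of this shape, a legitimate reply that grows past the bound is rejected in the same way as a corrupted one, which would be consistent with the errors reported once the job count got large.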
________________________________________
From: [email protected] [[email protected]] On Behalf 
Of Josh England [[email protected]]
Sent: Thursday, April 28, 2011 3:58 PM
To: [email protected]
Subject: Re: [slurm-dev] Insane message length

Is there an obvious downside to bumping up the MAX_MSG_SIZE in
src/common/slurm_protocol_socket_implementation.c ?

It is currently defined as 16*1024*1024

-JE

On Thu, 2011-04-28 at 15:40 -0700, Josh England wrote:
> I've got a crapton of running jobs (maybe around 25000) on a cluster
> using slurm 2.2.4.  At some point it hit a bad threshold.  Many commands
> like squeue and scancel started complaining and the slurmctld is pegged
> at 100% CPU:
> scancel: error: slurm_receive_msg: Insane message length
> slurm_load_jobs error: Insane message length
>
> I can't cancel jobs to get back to a sane number.  Has anyone seen this
> before?
>
> -JE
>
>


