Is there an obvious downside to bumping up MAX_MSG_SIZE in
src/common/slurm_protocol_socket_implementation.c?

It is currently defined as 16*1024*1024 (16 MiB).
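
For context, here is a rough sketch of the kind of check that presumably
produces the "Insane message length" error. It is not copied from the
Slurm source. The 4-byte big-endian length header and the names
SKETCH_MAX_MSG_SIZE, sketch_decode_len and sketch_check_len are all
assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical cap mirroring the value quoted above; not Slurm's code. */
#define SKETCH_MAX_MSG_SIZE (16u * 1024u * 1024u)

/* Decode a 4-byte big-endian length header (an assumed wire format). */
static uint32_t sketch_decode_len(const unsigned char hdr[4])
{
    return ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
           ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];
}

/* Return 0 if the advertised length is acceptable, -1 if it is "insane". */
static int sketch_check_len(uint32_t msg_len)
{
    if (msg_len == 0 || msg_len > SKETCH_MAX_MSG_SIZE)
        return -1;
    return 0;
}
```

If the check works roughly like this, a reply that grows past the cap
(say, the job list for ~25000 jobs) would be rejected by the receiver
before any of it is read, which would match the symptoms quoted below.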

-JE

On Thu, 2011-04-28 at 15:40 -0700, Josh England wrote:
> I've got a crapton of running jobs (maybe around 25000) on a cluster
> using slurm 2.2.4.  At some point it hit a bad threshold.  Many commands
> like squeue and scancel started complaining and the slurmctld is pegged
> at 100% CPU:
> scancel: error: slurm_receive_msg: Insane message length
> slurm_load_jobs error: Insane message length
> 
> I can't cancel jobs to get back to a sane number.  Has anyone seen this
> before?
> 
> -JE
> 
> 

