No problem. The limit is just a sanity check to help deal with something bad happening in the data handling (e.g. bad pointers or the like).
________________________________________
From: [email protected] [[email protected]] On Behalf Of Josh England [[email protected]]
Sent: Thursday, April 28, 2011 3:58 PM
To: [email protected]
Subject: Re: [slurm-dev] Insane message length

Is there an obvious downside to bumping up the MAX_MSG_SIZE in
src/common/slurm_protocol_socket_implementation.c? It is currently
defined as 16*1024*1024.

-JE

On Thu, 2011-04-28 at 15:40 -0700, Josh England wrote:
> I've got a crapton of running jobs (maybe around 25000) on a cluster
> using slurm 2.2.4. At some point it hit a bad threshold. Many commands
> like squeue and scancel started complaining and the slurmctld is pegged
> at 100% CPU:
> scancel: error: slurm_receive_msg: Insane message length
> slurm_load_jobs error: Insane message length
>
> I can't cancel jobs to get back to a sane number. Has anyone seen this
> before?
>
> -JE
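
For anyone reading this thread later, here is a rough sketch of the kind of check being discussed: a receiver reads a length header and refuses anything above a fixed cap. This is not the actual SLURM code. The names SKETCH_MAX_MSG_SIZE, decode_len and check_msg_len are made up for illustration, and the 4-byte big-endian header is an assumption. Only the 16*1024*1024 value comes from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap for this sketch; same value as the 16*1024*1024 quoted above. */
#define SKETCH_MAX_MSG_SIZE (16u * 1024u * 1024u)

/* Decode a 4-byte big-endian length header (assumed wire format). */
static uint32_t decode_len(const unsigned char hdr[4])
{
    return ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
           ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];
}

/*
 * Return 0 if the advertised length is within the cap, -1 otherwise.
 * A real receiver would check this before allocating a buffer, so that a
 * corrupted header cannot trigger a huge allocation.
 */
static int check_msg_len(const unsigned char hdr[4], uint32_t max_size)
{
    assert(hdr != NULL);
    uint32_t len = decode_len(hdr);
    return (len > max_size) ? -1 : 0;
}
```

With a cap like this, a legitimate reply that simply grows past the limit is rejected the same way as a corrupted one. That would be consistent with squeue and scancel failing once the job count got large, though that is a guess based on the symptoms described, not something confirmed in this thread.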
