The message "we don't have this plugin type 101" means the jobinfo message has been formatted for the select/cons_res plugin, but that plugin is not loaded on the compute node.
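One way to check whether the compute node and the controller are configured with the same select plugin is to compare the SelectType line in each host's slurm.conf. The following is only a minimal sketch: the helper names `select_type` and `same_select_type` and the two sample configs are hypothetical, and it assumes you have already copied the text of both files to one place.

```python
def select_type(conf_text):
    """Return the SelectType value found in slurm.conf text, or None if unset."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        # slurm.conf parameter names are case-insensitive
        if key.strip().lower() == "selecttype":
            value = val.strip()  # the last occurrence in the text wins here
    return value


def same_select_type(controller_conf, node_conf):
    """True when both config texts name the same select plugin."""
    return select_type(controller_conf) == select_type(node_conf)


# Hypothetical sample configs, for illustration only.
controller_conf = """
# controller
SelectType=select/cons_res
"""
node_conf = """
# compute node with a stale copy
SelectType=select/linear
"""

print(select_type(controller_conf), select_type(node_conf))
print(same_select_type(controller_conf, node_conf))
```

Matching files alone do not show that a running slurmd has picked up the change; the daemon on the node may still need a restart.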
Have you recently changed your slurm.conf to switch the select type? If so, the controller may be using cons_res while the compute node was not restarted and is still using a different select type.

From: Roman Sirokov <[email protected]>
To: "slurm-dev" <[email protected]>
Date: 07/24/2013 12:27 AM
Subject: [slurm-dev] Jobs stuck in CG state and a bunch of plugin errors in slurmd.log

Hello,

We are having a problem with most of the jobs getting stuck in CG state. The problem occurs even with trivial runs such as hostname. The offending job does indeed exit, as it is not visible in the running process list after execution, but for some reason slurm fails to register that.

I've checked slurmd.log and it contains this error again and again. Quick googling has not revealed anything on this matter. Any ideas?

[2013-07-24T10:09:49] error: we don't have this plugin type 101
[2013-07-24T10:09:49] error: select_g_select_jobinfo_unpack: unpack error
[2013-07-24T10:09:49] error: Malformed RPC of type 6011 received
[2013-07-24T10:09:49] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2013-07-24T10:09:49] error: service_connection: slurm_receive_msg: Header lengths are longer than data received

Cheers,
Roman
