The message "we don't have this plugin type 101" means the jobinfo message 
has been formatted for the select/cons_res plugin, but that plugin is not 
loaded on the compute node.

Have you recently changed the select type in your slurm.conf? If so, the 
controller may be using cons_res while the compute node was not restarted 
and is still running a different select type.
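
A quick way to rule out a stale configuration file is to compare the SelectType line in the controller's slurm.conf with the one on the compute node. The following is only a rough sketch: the helper names (select_type, same_select_type) and the sample file contents are made up for illustration, and it assumes the usual Key=Value layout with "#" comments.

```python
import re

def select_type(conf_text):
    """Return the last SelectType value found in slurm.conf-style text, or None."""
    value = None
    for line in conf_text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Keys are matched case-insensitively here.
        m = re.match(r"SelectType\s*=\s*(\S+)", line, re.IGNORECASE)
        if m:
            value = m.group(1)
    return value

def same_select_type(controller_conf, node_conf):
    """True when both configuration texts name the same select plugin."""
    return select_type(controller_conf) == select_type(node_conf)

# Hypothetical file contents, for illustration only.
controller = "ClusterName=test\nSelectType=select/cons_res\n"
node = "ClusterName=test\nSelectType=select/linear  # old setting\n"

print(select_type(controller), select_type(node))
print("match" if same_select_type(controller, node) else "MISMATCH")
```

Note that this only compares the files. Even if they match, a slurmd that was started before the change keeps using the old plugin until it is restarted. "scontrol show config" reports the SelectType the controller is currently using.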




From:   Roman Sirokov <[email protected]>
To:     "slurm-dev" <[email protected]>
Date:   07/24/2013 12:27 AM
Subject:        [slurm-dev] Jobs stuck in CG state and a bunch of plugin errors in slurmd.log



Hello,
We are having a problem with most jobs getting stuck in the CG state. 
The problem occurs even with trivial runs such as hostname. The offending job 
does indeed exit, as it is not visible in the running process list after 
execution, but for some reason slurm fails to register that. I've checked 
slurmd.log and it contains these errors again and again. Quick googling has 
not revealed anything on this matter. Any ideas?

[2013-07-24T10:09:49] error: we don't have this plugin type 101
[2013-07-24T10:09:49] error: select_g_select_jobinfo_unpack: unpack error
[2013-07-24T10:09:49] error: Malformed RPC of type 6011 received
[2013-07-24T10:09:49] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2013-07-24T10:09:49] error: service_connection: slurm_receive_msg: Header lengths are longer than data received

Cheers,
Roman

