Dear list,

Twice this month I've had jobs stuck in completing state (CG). When I go to the compute node and check slurmd.log I see a message about "incompatible plugin version", for example:

[2022-08-16T03:36:25.823] [748139.batch] done with job
[2022-08-16T12:54:21.404] [748139.extern] plugin_load_from_file: Incompatible Slurm plugin /usr/lib64/slurm/hash_k12.so version (22.05.3 )
[2022-08-16T12:54:21.404] [748139.extern] error: Couldn't load specified plugin name for hash/k12: Incompatible plugin version
[2022-08-16T12:54:21.404] [748139.extern] error: cannot create hash context for K12
[2022-08-16T12:54:21.404] [748139.extern] error: slurm_send_node_msg: hash_g_compute: REQUEST_STEP_COMPLETE has error
[2022-08-16T12:54:21.404] [748139.extern] error: Rank 0 failed sending step completion message directly to slurmctld, retrying

For context, I did a minor upgrade of SLURM yesterday (22.05.2 to 22.05.3), so it's possible there is an incompatible version somewhere, but if I look up earlier in the log file I see that the running version of slurmd is correct and still prints that error right after startup:

[2022-08-15T22:27:59.865] slurmd version 22.05.3 started
[2022-08-15T22:27:59.867] slurmd started on Mon, 15 Aug 2022 22:27:59 +0300
[2022-08-15T22:27:59.869] CPUs=48 Boards=1 Sockets=2 Cores=24 Threads=1 Memory=386525 TmpDisk=71645 Uptime=2679297 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
[2022-08-16T02:36:10.020] [748139.batch] plugin_load_from_file: Incompatible Slurm plugin /usr/lib64/slurm/hash_k12.so version (22.05.3)
[2022-08-16T02:36:10.020] [748139.batch] error: Couldn't load specified plugin name for hash/k12: Incompatible plugin version
[2022-08-16T02:36:10.022] [748139.batch] error: cannot create hash context for K12

I'm running SLURM 22.05.3. The slurmctld is running on CentOS 7, and compute nodes are on CentOS Stream 8 (not sure if this matters?).

Thanks for any advice,

-- 
Alan Orth
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch
