[slurm-users] Question about Failed to unpack DBD_NODE_STATE message

2023-10-28 Thread mohammed shambakey
Hi

slurmdbd produces the following error in the log file:
error: CONN:X Failed to unpack DBD_NODE_STATE message

I tried restarting it many times, but the error keeps coming back. I also
rebooted the machine, but the error is still there.

Regards

-- 
Mohammed


Re: [slurm-users] RES: Change something in user's script using job_submit.lua plugin

2023-10-28 Thread Ole Holm Nielsen

Hi Paulo,

Maybe what you see is due to a bug then?  You might try to update Slurm 
to see if it has been fixed.


You should not use the Slurm RPMs from EPEL - I think offering these 
RPMs was a mistake.


Anyway you ought to upgrade to the latest Slurm 23.02.6 since a serious 
security issue was fixed a couple of weeks ago.  Older Slurm versions 
are all affected!  Perhaps this Wiki guide can help you upgrade to the 
latest RPM: https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_installation/


/Ole


On 27-10-2023 13:13, Paulo Jose Braga Estrela wrote:

Yes, the script is running and changing other fields like comment, partition, 
account is working fine. The only problem seems to be the script field of 
job_rec. I'm using Slurm 20.11.9 from EPEL repository for RHEL 8. Thank you for 
sharing your Wiki. I've accessed it before. It's really useful for HPC 
engineers.

Best regards,


PUBLIC
-----Original Message-----
From: slurm-users  On behalf of Ole Holm Nielsen
Sent: Friday, 27 October 2023 03:31
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Change something in user's script using 
job_submit.lua plugin

Hi Paulo,

Which Slurm version do you have, and did you set this in slurm.conf:
JobSubmitPlugins=lua ?

Perhaps you may find some useful information in this Wiki page:
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_configuration/#job-submit-plugins

/Ole


On 26-10-2023 19:07, Paulo Jose Braga Estrela wrote:

Is it possible to change something in a user’s sbatch script by using a
job_submit plugin? To be more specific, using the Lua job_submit plugin.

I’m trying to do the following in job_submit.lua when a user changes a
job’s partition to the “cloud” partition, but the script is executed
without modification.

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
  if job_desc.partition == "cloud" then
    slurm.log_info("slurm_job_modify: Bursting job %u from uid %u to the cloud...", job_rec.job_id, modify_uid)
    script = job_rec.script
    slurm.log_info("Script BEFORE change: %s", script)
    -- changing user command to another command
    script = string.gsub(script, "local command", "cloud command")
    slurm.log_info("Script AFTER change %s", script)
    -- The script variable is really changed
    job_rec.script = script
    slurm.log_info("Job RECORD SCRIPT %s", job_rec.script)
    -- The job record also got changed, but the EXECUTED script isn’t changed at all.
    -- It runs without modification.
  end
  return slurm.SUCCESS
end

*PAULO ESTRELA*
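
A side note on the substitution step in the function above: string.gsub treats its second argument as a Lua pattern and returns two values, the new string and the number of replacements made. The following is a minimal standalone sketch in plain Lua (no Slurm API involved) of a literal search-and-replace that also reports that count. The helper name replace_command and the sample script text are hypothetical, chosen only for illustration. It says nothing about whether slurmctld honors a change to job_rec.script.

```lua
-- Standalone sketch, plain Lua only. Nothing here calls the Slurm Lua API.
-- replace_command is a hypothetical helper name used for illustration.
local function replace_command(script, old, new)
  -- Escape punctuation in the search text so it is matched literally
  -- rather than interpreted as Lua pattern syntax.
  local pattern = old:gsub("%p", "%%%0")
  -- Escape '%' in the replacement text, where it would otherwise be
  -- read as the start of a capture reference.
  local repl = new:gsub("%%", "%%%%")
  -- gsub returns the rewritten string and the number of substitutions.
  local result, count = script:gsub(pattern, repl)
  return result, count
end

-- Hypothetical job script contents, for illustration only.
local script = "#!/bin/bash\nlocal command --input data\n"
local changed, n = replace_command(script, "local command", "cloud command")

print(n)        -- number of substitutions performed
print(changed)  -- script text after substitution
```

Checking the returned count before logging or using the result makes it easy to tell a substitution that never matched apart from one that matched but had no effect on the job that was run.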