You'll need this too:

https://github.com/SchedMD/slurm/commit/e25249684af250eb65e424eaf12ff4755c0d0af1.patch

Quoting Tommi T <tommi_...@yahoo.com>:


Hi,

I pulled your fix, but it does not seem to help:

[2014-03-11T16:54:40.587] _slurm_rpc_allocate_resources: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
[2014-03-11T16:54:41.982] _slurm_rpc_allocate_resources: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
[2014-03-11T16:54:41.982] error: () Error: from slurm_protocol_defs.c:434: (): Assertion (p[0] == 0x42) failed


Best Regards,
Tommi Tervo
CSC




On Tuesday, March 11, 2014 4:35 PM, Moe Jette <je...@schedmd.com> wrote:

I don't have time to test this right now, but I believe the commit below
will fix the problem by initializing a variable to NULL.

https://github.com/SchedMD/slurm/commit/e3363b95b0cedd4972c8c7b8dc87a1750f6bc3dd


Quoting Marco Passerini <marco.passer...@csc.fi>:

Hi,

I'm trying Slurm 14.03.0, and in particular the new feature which
allows job_submit.lua to print an error message to our customers'
shell upon job submission.
From what I understood from the code, one can print to the user's shell
by calling the function "log_user" with a string as an argument.
This seems to work for the first job submission, but after that Slurm
crashes.

Here's a snippet of my code where you can see how I call the 
log_user function:

################## job_submit.lua ###################################
function slurm_job_submit ( job_desc, part_list, submit_uid )
    setmetatable (job_desc, job_req_meta)
    local part_rec = _build_part_table (part_list)

    -- *** YOUR LOGIC GOES BELOW ***
    -- print (job_desc.num_tasks, job_desc.min_nodes, job_desc.max_nodes, job_desc.partition)

    -- Call function check_bq which checks the billing quota. If the quota is
    -- exceeded return with the status 2050.

    if not check_bq(submit_uid, job_desc.group_id) then
        log_user("Job aborted because you project is over quota")
        return 2050
    end
    return 0
end

function slurm_job_modify ( job_desc, job_rec, part_list, modify_uid )
    setmetatable (job_desc, job_req_meta)
    setmetatable (job_rec, job_rec_meta)

    -- *** YOUR LOGIC GOES BELOW ***

    return 0
end
#####################################################################
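
To try the rejection path in the snippet above without a running slurmctld, the handler can be exercised with stand-ins. This is only a minimal sketch under stated assumptions: the `log_user` stub, the `check_bq` body and the `over_quota` table are hypothetical placeholders (my real `check_bq` is not shown here), and the metatable setup is left out. It does not reproduce the crash; it only checks the return codes.

```lua
-- Stand-ins for what the plugin environment normally provides.
-- Everything below except the control flow of slurm_job_submit is an
-- assumption made for illustration.
local user_messages = {}

-- Hypothetical stub: collects messages instead of sending them to the
-- submitting user's shell.
function log_user(msg)
  user_messages[#user_messages + 1] = msg
end

-- Hypothetical quota table keyed by group id (made-up values).
local over_quota = { [1001] = true }

-- Hypothetical billing-quota check: returns true when the job may run.
function check_bq(uid, gid)
  return not over_quota[gid]
end

-- Same control flow as the submit handler above, minus the metatables.
function slurm_job_submit(job_desc, part_list, submit_uid)
  if not check_bq(submit_uid, job_desc.group_id) then
    log_user("Job aborted: project is over quota")
    return 2050
  end
  return 0
end

-- Exercise both paths: one group over quota, one within quota.
local rejected = slurm_job_submit({ group_id = 1001 }, {}, 500)
local accepted = slurm_job_submit({ group_id = 2002 }, {}, 500)
print(rejected, accepted, #user_messages)
```

Running it with a plain `lua` interpreter is enough to confirm that the Lua side returns 2050 only for the over-quota group, which suggests the problem is on the C side of the plugin rather than in the script logic.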


This is what happens at the user shell:

#####################################################################
$ srun -N1 hostname
srun: error: Job aborted because you project is over quota
srun: error: Unable to allocate resources: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)

$ srun -N1 hostname
srun: error: slurm_receive_msg: Zero Bytes were transmitted or received
srun: error: Unable to allocate resources: Zero Bytes were transmitted or received

$ srun -N1 hostname
** stuck ... **
#####################################################################


This is what is logged in the logs:

#################### /var/log/slurm/Slurmctld.log ###################
[2014-03-11T15:47:18.754] _slurm_rpc_allocate_resources: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
[2014-03-11T15:47:37.505] error: () Error: from job_submit_lua.c:207: (): Assertion (p[0] == 0x42) failed
#####################################################################


Could you help me solve the issue?

Thanks in advance,

Marco Passerini

