Hi,

I pulled your fix, but it does not seem to help:

[2014-03-11T16:54:40.587] _slurm_rpc_allocate_resources: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
[2014-03-11T16:54:41.982] _slurm_rpc_allocate_resources: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
[2014-03-11T16:54:41.982] error: () Error: from slurm_protocol_defs.c:434: (): Assertion (p[0] == 0x42) failed


Best Regards,
Tommi Tervo
CSC




On Tuesday, March 11, 2014 4:35 PM, Moe Jette <[email protected]> wrote:

I don't have time to test this right now, but I believe the commit below
will fix the problem by initializing a variable to NULL.

https://github.com/SchedMD/slurm/commit/e3363b95b0cedd4972c8c7b8dc87a1750f6bc3dd
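
To illustrate why NULL initialization matters for an assertion like
"p[0] == 0x42", here is a minimal sketch of a guarded allocator. It is not
Slurm's actual code and has not been checked against the Slurm source; the
names guarded_alloc, guarded_check, guarded_free and GUARD_MAGIC are
hypothetical, and only the value 0x42 is taken from the assertion text in
the log. The idea is that every buffer carries a magic word in a hidden
header, and the free routine verifies it before releasing the memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define GUARD_MAGIC 0x42 /* value taken from the assertion text in the log */

/* Hypothetical allocator: a two-word header (magic, size) precedes the data. */
static void *guarded_alloc(size_t size)
{
    size_t *p = malloc(2 * sizeof(size_t) + size);
    if (p == NULL)
        abort();
    p[0] = GUARD_MAGIC; /* stamp the header so the free routine can verify it */
    p[1] = size;
    return &p[2];       /* the caller only ever sees the data area */
}

/* Returns nonzero if the header in front of a live buffer carries the magic. */
static int guarded_check(const void *ptr)
{
    const size_t *p = (const size_t *)ptr - 2;
    return p[0] == GUARD_MAGIC;
}

/* Frees *ptr and resets it to NULL. A NULL pointer is a harmless no-op. */
static void guarded_free(void **ptr)
{
    if (*ptr == NULL)
        return;
    size_t *p = (size_t *)*ptr - 2;
    assert(p[0] == GUARD_MAGIC); /* fails for pointers the allocator never returned */
    p[0] = 0;                    /* clear the stamp before releasing the memory */
    free(p);
    *ptr = NULL;
}
```

With this pattern, a variable declared as "void *msg = NULL;" can be passed
to guarded_free() safely even if no message was ever stored in it. A variable
left uninitialized holds an arbitrary address, so the NULL test does not
catch it and the magic check reads whatever happens to sit in front of that
address, which is consistent with the assertion failure above.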


Quoting Marco Passerini <[email protected]>:

> Hi,
>
> I'm trying Slurm 14.03.0, and in particular the new feature that
> allows job_submit.lua to print an error message to our customers'
> shell upon job submission.
> From what I understood from the code, one can print to the user's shell
> by calling the function "log_user" with a string as an argument.
> This seems to work for the first job submission, but after that Slurm
> crashes.
>
> Here's a snippet of my code where you can see how I call the  
> log_user function:
>
> ################## job_submit.lua ###################################
> function slurm_job_submit ( job_desc, part_list, submit_uid )
>    setmetatable (job_desc, job_req_meta)
>    local part_rec = _build_part_table (part_list)
>
>    -- *** YOUR LOGIC GOES BELOW ***
>    -- print (job_desc.num_tasks, job_desc.min_nodes, job_desc.max_nodes, job_desc.partition)
>
>    -- Call function check_bq, which checks the billing quota. If the quota is
>    -- exceeded, tell the user why and return with the status 2050.
>
>    if (not check_bq(submit_uid, job_desc.group_id)) then
>       log_user("Job aborted because you project is over quota")
>       return 2050
>    end
>    return 0
> end
>
> function slurm_job_modify ( job_desc, job_rec, part_list, modify_uid )
>    setmetatable (job_desc, job_req_meta)
>    setmetatable (job_rec, job_rec_meta)
>
>    -- *** YOUR LOGIC GOES BELOW ***
>
>    return 0
> end
> #####################################################################
>
>
> This is what happens at the user shell:
>
> #####################################################################
> $ srun -N1 hostname
> srun: error: Job aborted because you project is over quota
> srun: error: Unable to allocate resources: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
>
> $ srun -N1 hostname
> srun: error: slurm_receive_msg: Zero Bytes were transmitted or received
> srun: error: Unable to allocate resources: Zero Bytes were transmitted or received
>
> $ srun -N1 hostname
> ** stuck ... **
> #####################################################################
>
>
> This is what is logged in the logs:
>
> #################### /var/log/slurm/Slurmctld.log ###################
> [2014-03-11T15:47:18.754] _slurm_rpc_allocate_resources: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
> [2014-03-11T15:47:37.505] error: () Error: from job_submit_lua.c:207: (): Assertion (p[0] == 0x42) failed
> #####################################################################
>
>
> Could you help me solve this issue?
>
> Thanks in advance,
>
> Marco Passerini
>
