Hi there,

After an upgrade from 14.03.08 to 14.03.10, we noticed that, according to 
sacct, our jobs never completed. They no longer showed up in squeue, as they 
had in fact completed. After some investigation, it seems the DBDs are 
rejecting the jobs with this error:

slurmdbd[10475]: error: as_mysql_step_complete: Not inputing this job, it has 
no submit time.

However, when we query such a job with sacct, it does have a submit time. 
Starting slurmdbd with -D -vvv gave us the following:

slurmdbd: error: We have more allocated time than is possible (17193870 > 
15206400) for cluster calculon(4224) from 2014-11-27T10:00:00 - 
2014-11-27T11:00:00
slurmdbd: error: We have more time than is possible 
(15206400+15163934+0)(30370334) > 15206400 for cluster calculon(4224) from 
2014-11-27T10:00:00 - 2014-11-27T11:00:00
slurmdbd: debug2: No need to roll cluster calculon this day 1417042800 <= 
1417042800
slurmdbd: debug2: No need to roll cluster calculon this month 1414796400 <= 
1414796400
slurmdbd: debug2: Got 1 rolled up
slurmdbd: debug2: Everything rolled up
slurmdbd: debug2: DBD_JOB_START: ELIGIBLE CALL ID:1270167 
NAME:order_handler_serverfront
slurmdbd: debug2: as_mysql_slurmdb_job_start() called
slurmdbd: debug2: DBD_JOB_START: ELIGIBLE CALL ID:1270168 
NAME:order_handler_serverfront
slurmdbd: debug2: as_mysql_slurmdb_job_start() called
slurmdbd: debug2: DBD_STEP_COMPLETE: ID:0.0 SUBMIT:0
slurmdbd: error: as_mysql_step_complete: Not inputing this job, it has no 
submit time.
slurmdbd: debug2: DBD_STEP_COMPLETE: ID:0.0 SUBMIT:0
slurmdbd: error: as_mysql_step_complete: Not inputing this job, it has no 
submit time.
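
For what it's worth, the limit in the first two errors looks like plain 
CPU-seconds arithmetic. Below is a minimal Python sketch of that check. It 
assumes the number in parentheses after the cluster name (4224) is the CPU 
count and that the window is the one hour shown in the log. The function name 
is made up for illustration and is not anything from Slurm itself.

```python
SECONDS_PER_HOUR = 3600


def hourly_capacity(cpu_count, seconds=SECONDS_PER_HOUR):
    """CPU-seconds available in the window (hypothetical helper)."""
    return cpu_count * seconds


# Values copied from the slurmdbd log above.
cpus = 4224                 # assumed to be the CPU count of cluster calculon
allocated = 17193870        # "allocated time" from the first error
second_sum = 15206400 + 15163934 + 0  # the three terms in the second error

capacity = hourly_capacity(cpus)
print(capacity)              # 15206400, the limit in both errors
print(allocated - capacity)  # 1987470 CPU-seconds over the limit
print(second_sum)            # 30370334, the total in the second error
```

If that assumption holds, the accounting for that hour holds more CPU time 
than the cluster could physically have delivered.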

Searching for this error yielded very little information. The DBDs were 
upgraded first, then the controllers, and lastly the compute nodes. After a few 
days of this, we tried downgrading to 14.03.08, but now we get exactly the same 
behavior there. Perhaps this is due to state files (or similar) having been 
"tainted" by the new version.

Are there any options in slurm.conf that can lead to this situation? I would 
appreciate any pointers; I've run out of ideas. Also, I believe the error 
message ought to say "inputting"?

Wbr
Andreas
