Hi,

We had a slurmdbd crash yesterday with the following log:
[2016-02-10T07:00:20.066] error: mysql_query failed: 1030 Got error 28 from storage engine
select job.job_db_inx, job.id_assoc, job.id_wckey, job.array_task_pending, job.time_eligible, job.time_start, job.time_end, job.time_suspended, job.cpus_req, job.id_resv, job.tres_alloc, SUM(step.consumed_energy) from "perceus-00_job_table" as job left outer join "perceus-00_step_table" as step on job.job_db_inx=step.job_db_inx and (step.id_step>=0) where (job.time_eligible && job.time_eligible < 1455116400 && (job.time_end >= 1455112800 || job.time_end = 0)) group by job.job_db_inx order by job.id_assoc, job.time_eligible

This was on 15.08.6.

We are also seeing a bunch of errors similar to the following:

[2016-02-10T06:00:22.249] error: We have more allocated time than is possible (108445785192 > 6307200) for cluster perceus-00(1752) from 2016-02-10T05:00:00 - 2016-02-10T06:00:00 tres 2
[2016-02-10T06:00:22.262] error: We have more time than is possible (6307200+745499+0)(7052699) > 6307200 for cluster perceus-00(1752) from 2016-02-10T05:00:00 - 2016-02-10T06:00:00 tres 2

I see a bug report, and it's marked as resolved in 15.08.3 (http://bugs.schedmd.com/show_bug.cgi?id=2068). How do we fix it?

Thanks,
Yong Qin
