Once you've fixed the end times, you'll need to manually update the
timestamps in the <cluster>_last_ran table to a point in time before the
start of the earliest job you fixed.  Then, on the next hour mark, slurmdbd
will start re-rolling up the past data to reflect the new reality you've set
in the database.
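If your schema matches recent slurmdbd MySQL layouts, resetting that rollup watermark might look roughly like the sketch below. The table name `myclust_last_ran_table` and the `hourly_rollup`/`daily_rollup`/`monthly_rollup` columns are assumptions on my part, not verified against your install; confirm them with `SHOW TABLES;` and `DESCRIBE <table>;` before running anything.

```sql
-- Sketch only: table and column names are assumed, verify them first.
-- Move the rollup watermarks back to a Unix timestamp before the start
-- of the earliest job you fixed, so slurmdbd re-rolls up from there.
UPDATE myclust_last_ran_table
   SET hourly_rollup  = UNIX_TIMESTAMP('2017-10-01 00:00:00'),
       daily_rollup   = UNIX_TIMESTAMP('2017-10-01 00:00:00'),
       monthly_rollup = UNIX_TIMESTAMP('2017-10-01 00:00:00');
```

Take a mysqldump backup (and ideally stop slurmdbd) before editing; once the timestamps are rolled back, the next rollup pass rebuilds the hourly/daily/monthly usage tables from the corrected job records.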

Unfortunately I'm away from a keyboard right now so I'm not 100% certain of
the table name.

On Oct 28, 2017 09:09, "Doug Meyer" <dameye...@gmail.com> wrote:

> Look up orphan jobs and lost.pl (quick script to find orphans) in
> https://groups.google.com/forum/#!forum/slurm-devel.
>
> Battling this myself right now.
>
> Thank you,
> Doug
>
> On Fri, Oct 27, 2017 at 9:00 PM, Bill Broadley <b...@cse.ucdavis.edu>
> wrote:
>
>>
>>
>> I noticed crazy high numbers in my reports, things like sreport user top:
>>
>> Top 10 Users 2017-10-20T00:00:00 - 2017-10-26T23:59:59 (604800 secs)
>> Use reported in Percentage of Total
>> --------------------------------------------------------------------------------
>>   Cluster     Login     Proper Name         Account        Used   Energy
>> --------- --------- --------------- --------------- ----------- --------
>>   MyClust   JoeUser   Joe User        jgrp           3710.15%    0.00%
>>
>> This was during a period when JoeUser hadn't submitted a single job.
>>
>> We have been through some Slurm upgrades and figured one of the schema
>> tweaks had confused things.  I looked in the Slurm accounting database and
>> found the job_table.  It contained 80,000 jobs with no end_time that
>> weren't actually running.  So I set end_time = begin time for those 80,000
>> jobs.  It didn't help the reports.
>>
>> I then tried deleting all 80,000 jobs from the job_table, and that didn't
>> help either.
>>
>> Is there a way to rebuild the accounting data from the information in the
>> job_table?
>>
>> Or any other suggestion for getting some sane numbers out?
>>
>
>
