One other thing I noticed is that the *_job_table has entries in tres_alloc and tres_req that seem to match the type ids in the tres_table, but there are no mem entries. For example, tres_table =

+---------------+---------+------+----------------+------+
| creation_time | deleted | id   | type           | name |
+---------------+---------+------+----------------+------+
|    1559250721 |       0 |    1 | cpu            |      |
|    1559250721 |       0 |    2 | mem            |      |
|    1559250721 |       0 |    3 | energy         |      |
|    1559250721 |       0 |    4 | node           |      |
|    1559250721 |       0 |    5 | billing        |      |
|    1559250721 |       0 |    6 | fs             | disk |
|    1559250721 |       0 |    7 | vmem           |      |
|    1559250721 |       0 |    8 | pages          |      |
|    1559250721 |       1 | 1000 | dynamic_offset |      |
+---------------+---------+------+----------------+------+

But none of the jobs populate a value for 2 (mem):

+--------+-------------+------------------------------------+
| id_job | tres_req    | tres_alloc                         |
+--------+-------------+------------------------------------+
|  19779 | 1=1,4=1,5=1 | 1=4,4=1,5=4                        |
|  19780 | 1=1,4=1,5=1 | 1=4,4=1,5=4                        |
|  19781 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
|  19782 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19783 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19784 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
|  19785 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19786 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19787 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
|  19788 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
+--------+-------------+------------------------------------+

Brian Andrus

On Mon, Sep 16, 2019 at 2:58 PM Brian Andrus <toomuc...@gmail.com> wrote:

> I have
> JobAcctGatherType = jobacct_gather/linux
>
> Brian
>
> On Mon, Sep 16, 2019 at 12:40 PM Antony Cleave <antony.cle...@gmail.com>
> wrote:
>
>> Just a quick thought.
>>
>> What is your slurm.conf setting for this?
>>
>> *JobAcctGatherType* is operating system dependent and controls what
>> mechanism is used to collect accounting information. Supported values are
>> *jobacct_gather/linux* (recommended), *jobacct_gather/cgroup* and
>> *jobacct_gather/none* (no information collected).
>>
>> Antony
>>
>>
>> On Mon, 16 Sep 2019, 14:07 Brian Andrus, <toomuc...@gmail.com> wrote:
>>
>>> Yep, the maxrss field is always blank.
>>>
>>> I just checked on a different cluster and have the same result. Jobs
>>> that completed last week even have nothing in that field.
>>>
>>> So how can I troubleshoot this? Is there a way to log the sql queries
>>> made by slurmdbd?
>>>
>>> Brian
>>>
>>> On 9/15/2019 10:29 PM, Christopher Samuel wrote:
>>> > On 9/15/19 4:17 PM, Brian Andrus wrote:
>>> >
>>> >> Are steps required to capture Max RSS?
>>> >
>>> > No, you should see a MaxRSS reported for the batch step, for instance:
>>> >
>>> > $ sacct -j $JOBID -o jobid,jobname,maxrss
>>> >
>>> > All the best,
>>> > Chris
>>>
>>>
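
In case it helps anyone reading the tables at the top of this message, here is a minimal Python sketch for decoding the tres_req/tres_alloc strings against the ids in tres_table. The id-to-type mapping is copied by hand from the table in this message; the names decode_tres and has_mem are made up for illustration and are not part of Slurm, and treating 18446744073709551614 (2**64 - 2) as an "unset" marker rather than a real energy reading is only an assumption.

```python
# id -> type mapping copied by hand from the tres_table rows in this thread
# (the deleted dynamic_offset row, id 1000, is left out).
TRES_TYPES = {
    1: "cpu",
    2: "mem",
    3: "energy",
    4: "node",
    5: "billing",
    6: "fs/disk",
    7: "vmem",
    8: "pages",
}

# 2**64 - 2 == 18446744073709551614, the value seen for id 3 (energy).
# ASSUMPTION: this is an "unset" sentinel, so it is reported as None.
ASSUMED_UNSET = 2**64 - 2


def decode_tres(tres: str) -> dict:
    """Turn an 'id=count,id=count' string into {type_name: count}."""
    decoded = {}
    for pair in tres.split(","):
        if not pair:
            continue  # tolerate empty strings and trailing commas
        tres_id, _, count = pair.partition("=")
        name = TRES_TYPES.get(int(tres_id), f"unknown({tres_id})")
        value = int(count)
        decoded[name] = None if value == ASSUMED_UNSET else value
    return decoded


def has_mem(tres: str) -> bool:
    """Hypothetical helper: True if the string carries an entry for id 2 (mem)."""
    return "mem" in decode_tres(tres)


if __name__ == "__main__":
    # tres_alloc of job 19781 from the table in this message
    print(decode_tres("1=4,3=18446744073709551614,4=1,5=4"))
    print(has_mem("1=4,3=18446744073709551614,4=1,5=4"))
```

Running has_mem over the tres_req and tres_alloc columns is a quick way to confirm whether any job at all has recorded a mem entry.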