Hi Jagga,

Some of those messages below ('adding column...') look like what you'd see
occasionally when upgrading slurm -- sometimes the update changes the database
schema. Do you still see those messages, or was it a once-off?

Please note as well that by default 'sacct' only shows info for recent jobs
(you can pass -S/--starttime to look further back). You probably want to read
the man page for 'sreport' for longer-term accounting info.


Also, I'd recommend using the SlurmDBD as an interface between slurm and your
database. It'll make your life easier in the future if you have multiple
clusters.

It would involve changing your slurm.conf to use something like this:

  AccountingStorageHost=myhost01
  AccountingStorageType=accounting_storage/slurmdbd

..and creating a slurmdbd.conf on 'myhost01'.
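
As a rough sketch only (the values below are guesses carried over from your
slurm.conf, and I'm assuming MySQL runs on the same host as the slurmdbd, so
adjust to suit), a minimal slurmdbd.conf might look something like:

  # slurmdbd.conf -- illustrative sketch, not a tested config
  AuthType=auth/munge
  DbdHost=myhost01
  # assumed: same user that runs slurmctld in your slurm.conf
  SlurmUser=lsfadmin
  StorageType=accounting_storage/mysql
  # assumed: the database lives on the slurmdbd host
  StorageHost=localhost
  # assumed: your existing db credentials
  StorageUser=simran
  StoragePass=slurm
  # assumed: the usual default database name
  StorageLoc=slurm_acct_db

The db user and password would then live in slurmdbd.conf rather than in
slurm.conf.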

More details here: http://slurm.schedmd.com/accounting.html

Paddy

On Sat, Jan 04, 2014 at 02:46:01PM -0800, Jagga Soorma wrote:

> Hello,
> 
> I am new to slurm and was trying to enable the accounting portion of slurm
> for better job tracking.  I was able to get things set up but seem to be
> missing the account field from the output, and I also get some db-related
> output when running the sacct command which won't go away:
> 
> ssfslurmd01:/etc/slurm # sacct
> sacct: adding column cluster after node_name in table cluster_event_table
> sacct: adding column period_start after state in table cluster_event_table
> sacct: adding column period_end after period_start in table cluster_event_table
> sacct: dropping column time_start from table cluster_event_table
> sacct: dropping column time_end from table cluster_event_table
> sacct: Renaming old tables with _old behind them.
> sacct: Converting old event table for amber, this may take some time, please do not restart.
> sacct: Converting old event table for cluster, this may take some time, please do not restart.
>        JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
> ------------ ---------- ---------- ---------- ---------- ---------- --------
> 45             hostname production                     1  COMPLETED      0:0
> 46                sleep production                     1  COMPLETED      0:0
> ssfslurmd01:/etc/slurm #
> 
> 
> My slurm.conf:
> 
> 
> ControlMachine=ssfslurmd01
> ControlAddr=10.36.245.23
> AuthType=auth/munge
> CacheGroups=0
> CryptoType=crypto/munge
> MpiDefault=none
> ProctrackType=proctrack/pgid
> ReturnToService=1
> SlurmctldPidFile=/var/run/slurmctld.pid
> SlurmctldPort=6817
> SlurmdPidFile=/var/run/slurmd.pid
> SlurmdPort=6818
> SlurmdSpoolDir=/tmp/slurmd
> SlurmUser=lsfadmin
> StateSaveLocation=/tmp
> SwitchType=switch/none
> TaskPlugin=task/none
> InactiveLimit=0
> KillWait=30
> MinJobAge=300
> SlurmctldTimeout=120
> SlurmdTimeout=300
> Waittime=0
> FastSchedule=1
> SchedulerType=sched/backfill
> SchedulerPort=7321
> SelectType=select/cons_res
> SelectTypeParameters=CR_CPU_Memory
> GresTypes=gpu
> AccountingStorageHost=127.0.0.1
> AccountingStoragePass=slurm
> AccountingStorageType=accounting_storage/mysql
> AccountingStorageUser=simran
> AccountingStoreJobComment=YES
> ClusterName=cluster
> JobCompType=jobcomp/none
> JobAcctGatherFrequency=30
> JobAcctGatherType=jobacct_gather/none
> SlurmctldDebug=3
> SlurmdDebug=3
> NodeName=ssfslurmc0[1] Procs=2 RealMemory=2006 State=UNKNOWN
> PartitionName=debug Nodes=ssfslurmc0[1] Default=NO MaxTime=INFINITE State=UP
> PartitionName=production Nodes=ssfslurmc0[1] Default=YES MaxTime=INFINITE State=UP
> 
> Thanks for your assistance with this.
> 
> Regards,
> -J

-- 
Paddy Doyle
Trinity Centre for High Performance Computing,
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
Phone: +353-1-896-3725
http://www.tchpc.tcd.ie/
