Hi,

The following is the error message I got from slurmdbd.log. I got this error after I tried to add my cluster (clustername=hpctesis) to slurmdbd using the command "sudo sacctmgr add cluster hpctesis".

[2016-05-22T10:04:33.047] error: We should have gotten a new id: Table 'slurm_acct_db.hpctesis_job_table' doesn't exist
[2016-05-22T10:04:33.047] error: couldn't add job 386 at job completion
[2016-05-22T10:04:33.047] DBD_JOB_COMPLETE: cluster not registered

Should I create a table named hpctesis_job_table manually? As far as I understand, Slurm should be able to do this by itself. Am I right? How can I solve this? I need help.

Thank you in advance,

Regards,

Husen

On Sat, May 21, 2016 at 7:31 PM, Husen R <[email protected]> wrote:

> Hi daniel,
>
> Thank you for your reply !
>
> The error regarding mysql socket has been solved.
> I forget to run slurmdbd daemon prior to running slurmctld daemon.
>
> however, I got this error message when I try to add cluster using sacctmgr
> command :
>
>
> ------------------------------------------------------------------------------------------------------
>
> $ sudo sacctmgr add cluster comeon
>
> Adding Cluster(s)
> Name = comeon
> Would you like to commit changes? (You have 30 seconds to decide)
> (N/y): y
> Database is busy or waiting for lock from other user.
>
> -----------------------------------------------------------------------------------------------------------
>
> How to fix this ?
> Thank you in advance.
>
> Regards,
>
>
> Husen
>
> On Sat, May 21, 2016 at 6:28 PM, Daniel Letai <[email protected]> wrote:
>
>>
>> Does the socket file exists?
>> What's in your /etc/my.cnf (or my.cnf.d/some other config file) under
>> [mysqld]?
>> [mysqld]
>> socket=/path/to/datadir/mysql/mysql.sock
>>
>> If a socket value doesn't exist, either create one, or create a link
>> between the actual socket file and /var/run/mysqld/mysqld.sock
>> BTW - either you have a typo in your mail, or your socket is
>> misconfigured - never saw mysqld.soc (without 'k' at end) as the name of
>> the socket, although it's certainly legal.
>>
>> Other option is that the mysql server is not running - did you start the
>> daemon?
>>
>> On 05/21/2016 01:45 PM, Husen R wrote:
>>
>>> Re: [slurm-dev] How to setup slurm database accounting feature
>>> I checked slurmctld.log, I got this error message. how to solve this ?
>>>
>>> [2016-05-21T17:37:40.589] error: mysql_real_connect failed: 2002 Can't
>>> connect to local MySQL server through socket '/var/run/mysqld/mysqld.soc$
>>> [2016-05-21T17:37:40.589] fatal: You haven't inited this storage yet.
>>>
>>> Thank you in advance
>>> oe
>>> Regards,
>>>
>>>
>>> Husen
>>>
>>> On Sat, May 21, 2016 at 3:16 PM, Husen R <[email protected] <mailto:
>>> [email protected]>> wrote:
>>>
>>> dear all,
>>>
>>> I tried to configure slurm accounting feature using database.
>>> I already read the instruction available in this page
>>> http://slurm.schedmd.com/accounting.html, but the accounting
>>> feature still not working.
>>> I got this error message when I try to execute sacct command :
>>>
>>> sacct: error: Problem talking to the database: Connection refused
>>>
>>> the following is my slurm.conf:
>>>
>>>
>>> ----------------------------------------------------------------------------------Slurm.conf----------------------------------------------------------------
>>>
>>> #
>>> # Sample /etc/slurm.conf for mcr.llnl.gov <http://mcr.llnl.gov>
>>>
>>> #
>>> ControlMachine=head-node
>>> ControlAddr=head-node
>>> #BackupController=mcrj
>>> #BackupAddr=emcrj
>>> #
>>> AuthType=auth/munge
>>> CheckpointType=checkpoint/blcr
>>> #Epilog=/usr/local/slurm/etc/epilog
>>> FastSchedule=1
>>> #JobCompLoc=/var/tmp/jette/slurm.job.log
>>> JobCompType=jobcomp/mysql
>>> #AccountingStorageType=accounting_storage/mysql
>>> AccountingStorageType=accounting_storage/slurmdbd
>>> AccountingStorageHost=localhost
>>> AccountingStoragePass=/var/run/munge/munge.socket.2
>>> ClusterName=comeon
>>> JobCompHost=head-node
>>> JobCompPass=password
>>> JobCompPort=3306
>>> JobCompUser=root
>>> JobCredentialPrivateKey=/usr/local/etc/slurm.key
>>> JobCredentialPublicCertificate=/usr/local/etc/slurm.cert
>>> MsgAggregationParams=WindowMsgs=2,WindowTime=100
>>> PluginDir=/usr/local/lib/slurm
>>> JobCheckpointDir=/mirror/source/cr
>>> #Prolog=/usr/local/slurm/etc/prolog
>>> MailProg=/usr/bin/mail
>>> SchedulerType=sched/backfill
>>> SelectType=select/linear
>>> SlurmUser=slurm
>>> SlurmctldLogFile=/var/tmp/slurmctld.log
>>> SlurmctldPort=7002
>>> SlurmctldTimeout=300
>>> SlurmdPort=7003
>>> SlurmdSpoolDir=/var/tmp/slurmd.spool
>>> SlurmdTimeout=300
>>> SlurmdLogFile=/var/tmp/slurmd.log
>>> StateSaveLocation=/var/tmp/slurm.state
>>> #SwitchType=switch/none
>>> TreeWidth=50
>>> #
>>> # Node Configurations
>>> #
>>> NodeName=DEFAULT CPUs=8 RealMemory=5949 TmpDisk=64000 State=UNKNOWN
>>> NodeName=head-node,compute-node,spare-node
>>> NodeAddr=head-node,compute-node,spare-node SocketsPerBoard=1
>>> CoresPerSocket=4 ThreadsPerCore=2
>>> #
>>> # Partition Configurations
>>> #
>>> PartitionName=DEFAULT State=UP
>>> PartitionName=comeon Nodes=head-node,compute-node,spare-node
>>> MaxTime=168:00:00 MaxNodes=32 Default=YES
>>>
>>>
>>> --------------------------------------------------------------------------------------------------------------------
>>>
>>> what is the difference between slurmdbd and mysql ?
>>> based on the information in this page,
>>> http://slurm.schedmd.com/accounting.html, slurmdbd has its own
>>> configuration file called slurmdbd.conf.
>>> is there any example of slurmdbd.conf file ? where should I store
>>> this file ? how do I setup slurm to read slurmdbd.conf file ?
>>>
>>> I have installed mysql. I also have created slurm_acct_db database.
>>> I need help.
>>>
>>> Thank you in advance
>>>
>>> regards,
>>>
>>>
>>> Husen
>>>
>
