Hi Andy,

I've applied the changes you suggested and am now running into connection issues, which I'm trying to solve.
Thanks for the help.

Regards,

2014-08-15 12:52 GMT-03:00 Andy Riebs <[email protected]>:

> Hi Erica,
>
> Two suggestions:
>
> 1. By convention, the fact that "MailProg" is commented out usually
> implies that it's the default. If you don't have /bin/mail, but you do have
> /usr/bin/mail, you could specify that. More likely, you'll need to install
> one of the classic mail packages, like "mailx".
> 2. I doubt it's a best practice, but we use /var/tmp for
> "StateSaveLocation," and thus bypass questions about who has write access
> to what.
>
> Andy
>
> On 08/15/2014 11:40 AM, Erica Riello wrote:
>
> Hi Andy,
>
> thanks for the advice.
>
> I've got a problem running the command you mentioned:
>
> scontrol show config | grep MailProg
> slurm_load_ctl_conf error: Unable to contact slurm controller (connect
> failure)
>
> There's a directory /var/spool/slurmd, which contains an empty file
> called cred_state.
>
> I'm using Slurm version 14.03.6, and I built it myself.
>
> The slurm.conf is copied below:
>
> # slurm.conf file generated by configurator easy.html.
> # Put this file on all nodes of your cluster.
> # See the slurm.conf man page for more information.
> #
> ControlMachine=erica-VirtualBox
> #ControlAddr=
> #
> #MailProg=/bin/mail
> MpiDefault=none
> #MpiParams=ports=#-#
> ProctrackType=proctrack/pgid
> ReturnToService=1
> SlurmctldPidFile=/var/run/slurmctld.pid
> #SlurmctldPort=6817
> SlurmdPidFile=/var/run/slurmd.pid
> #SlurmdPort=6818
> SlurmdSpoolDir=/var/spool/slurmd
> SlurmUser=slurm
> #SlurmdUser=root
> StateSaveLocation=/var/spool
> SwitchType=switch/none
> TaskPlugin=task/none
> #
> #
> # TIMERS
> #KillWait=30
> #MinJobAge=300
> #SlurmctldTimeout=120
> #SlurmdTimeout=300
> #
> #
> # SCHEDULING
> FastSchedule=1
> SchedulerType=sched/backfill
> #SchedulerPort=7321
> SelectType=select/linear
> #
> #
> # LOGGING AND ACCOUNTING
> AccountingStorageType=accounting_storage/none
> ClusterName=cluster
> #JobAcctGatherFrequency=30
> JobAcctGatherType=jobacct_gather/linux
> #SlurmctldDebug=3
> SlurmctldLogFile=/var/log/slurm/slurmctld
> #SlurmdDebug=3
> SlurmdLogFile=/var/log/slurm/slurmd
> #
> #
> # COMPUTE NODES
> NodeName=erica-VirtualBox CPUs=1 RealMemory=2002 Sockets=1
> CoresPerSocket=1 ThreadsPerCore=1 State=UNKNOWN
> PartitionName=particao1 Nodes=erica-VirtualBox Default=YES
> MaxTime=INFINITE State=UP
>
> Is there any clue as to what may be wrong?
>
> Regards,
>
>
> 2014-08-13 11:23 GMT-03:00 Andy Riebs <[email protected]>:
>
>> Hi Erica,
>>
>> You'll find much of this discussion takes place frequently, most recently
>> about a week ago.
>>
>> To get started,
>>
>> - It looks like Slurm can't find a mail program. Use
>> $ scontrol show config | grep MailProg
>> to see what program Slurm is looking for.
>> - You probably want a subdirectory in /var/spool, such as
>> /var/spool/slurm, for your state save location so that Slurm doesn't need
>> full root privs to write to it
>>
>> For further help from the people on this list, please include
>>
>> - What version of Slurm you are using
>> - Whether you built it yourself, or if it came from a pre-built
>> distribution
>> - A copy of your slurm.conf file (you might want to obscure specific
>> node names and other data that might be used to compromise your system)
>>
>> Also, as noted above, much of this is covered frequently; check the mail
>> archives for more detail. (BTW, this is generally true of open source
>> projects; most of them have frequent "Hey, I've just started using your
>> program, and I've run into a hurdle..." discussions. You gain immediate
>> credibility if you start your queries with "I've got a problem, and I can't
>> find it in the mail archive.")
>>
>> Regards,
>> Andy
>>
>> On 08/13/2014 09:56 AM, Erica Riello wrote:
>>
>> Hi all,
>>
>> I've installed slurm, and when I try to start slurmctld, I get these
>> errors:
>>
>> > slurmctld -D -vvvv
>> slurmctld: pidfile not locked, assuming no running daemon
>> slurmctld: error: Configured MailProg is invalid
>> slurmctld: error: Job accounting information gathered, but not stored
>> slurmctld: fatal: Incorrect permissions on state save loc: /var/spool
>>
>> Has anyone seen this before, or does anyone know what might be causing
>> these errors?
>>
>> Thanks in advance.
>>
>> Regards,
>>
>> --
>> ===============
>> Erica Riello
>> Computer Engineering Student PUC-Rio
>
> --
> ===============
> Erica Riello
> Computer Engineering Student PUC-Rio

--
===============
Erica Riello
Computer Engineering Student PUC-Rio
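
For anyone hitting the same two startup errors, the checks discussed in this thread can be roughly sketched as a small Python script. This is only an illustration and not how slurmctld itself validates its configuration. The function names `parse_slurm_conf` and `check_conf` are made up for this example. The fallback of /bin/mail for a commented-out MailProg follows the convention mentioned above, the fallback of root for an unset SlurmUser is an assumption, and the ownership test is a guess at what the "Incorrect permissions on state save loc" message is about.

```python
import os
import pwd


def parse_slurm_conf(text):
    """Collect the first Key=Value pair of each uncommented line.

    Keys are lowercased, since slurm.conf parameter names are
    case insensitive. Anything after a '#' is treated as a comment.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        first = line.split()[0]
        if "=" not in first:
            continue
        key, _, value = first.partition("=")
        settings[key.lower()] = value
    return settings


def check_conf(settings, default_mail_prog="/bin/mail"):
    """Return a list of human-readable problems (empty if none were found).

    Assumptions: MailProg falls back to default_mail_prog when unset,
    and SlurmUser falls back to root when unset.
    """
    problems = []

    # The mail program should exist and be executable.
    mail_prog = settings.get("mailprog", default_mail_prog)
    if not (os.path.isfile(mail_prog) and os.access(mail_prog, os.X_OK)):
        problems.append(f"MailProg {mail_prog} is missing or not executable")

    # The state save directory should exist and belong to SlurmUser.
    state_dir = settings.get("statesavelocation")
    user = settings.get("slurmuser", "root")
    if state_dir is None:
        problems.append("StateSaveLocation is not set")
    elif not os.path.isdir(state_dir):
        problems.append(f"StateSaveLocation {state_dir} is not a directory")
    else:
        try:
            uid = pwd.getpwnam(user).pw_uid
        except KeyError:
            problems.append(f"SlurmUser {user} does not exist")
        else:
            if os.stat(state_dir).st_uid != uid:
                problems.append(f"{state_dir} is not owned by {user}")
    return problems
```

To try it, read your slurm.conf into a string, pass it through `parse_slurm_conf`, and print whatever `check_conf` returns. With the configuration quoted above, /var/spool would most likely be reported as not owned by slurm, which matches the suggestion to use a dedicated subdirectory such as /var/spool/slurm.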
