[slurm-dev] Re: Failed to access munge.socket.2

2016-04-06 Thread Husen R
Hello Lachlan, Chris, Thank you for your reply. I don't know why "/usr/local" is prepended to the path. I tried to locate munge.socket.2 manually using the locate command, and the file indeed does not exist. The directory /usr/local/var/run/munge is empty. There is no munge directory in /var/run. I

[slurm-dev] Re: Website down?

2016-04-06 Thread Tim Wickberg
Our web host is experiencing some difficulties at the moment. I'm working with them to get things fixed now. I've mirrored a copy of the just-released slurm-15.08.10.tar.bz2 over here: http://download.schedmd.com/slurm/latest/ The documentation site (http://slurm.schedmd.com) has also been

[slurm-dev] Re: How does sbatch switch users?

2016-04-06 Thread Christopher Samuel
On 06/04/16 05:19, Michael Kit Gilbert wrote: > Would someone be able to point me to the code in Slurm that allows > sbatch to submit jobs as another user (using the --uid option)? I don't > see any use of the setuid() function for sbatch...is there some other > method employed? I don't believe

[slurm-dev] Re: Website down?

2016-04-06 Thread Tim Wickberg
Our web host is experiencing some difficulties at the moment. I'm working with them to get things fixed now. I've mirrored a copy of the just-released slurm-15.08.10.tar.bz2 over here: http://download.schedmd.com/slurm/latest/ cheers, - Tim On 04/06/2016 07:59 PM, Gene Soudlenkov wrote:

[slurm-dev] Re: Slurm squeue working but not pam module

2016-04-06 Thread Christopher Samuel
On 05/04/16 09:01, Mehdi Acheli wrote: > Everything in slurm is working fine. I can issue jobs and see the state > of the eight nodes as Idle. However, when I try to connect to a compute > node with a user, even if he has a job running on, I get rejected. The > log shows that the pam module is

[slurm-dev] Re: Failed to access munge.socket.2

2016-04-06 Thread Christopher Samuel
On 06/04/16 19:50, Husen R wrote: > however, when I tried to run sbatch I get the following error message: > > Failed to access "/usr/local/var/run/munge/munge.socket.2": No such file > or directory Is that path really correct? On our systems it's: /var/run/munge/munge.socket.2 Best of luck,
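The path question raised in this message can be checked mechanically. The following is a minimal sketch, assuming Python 3 is available on the node. The helper name `find_munge_socket` and the candidate list are illustrative only and are not part of MUNGE or Slurm; the two paths are simply the ones mentioned in this thread. It returns the first candidate that exists and is a Unix socket.

```python
import os
import stat

# Candidate locations taken from the messages in this thread (an assumption;
# adjust them to match the prefix your munge build was configured with).
CANDIDATE_SOCKETS = [
    "/var/run/munge/munge.socket.2",
    "/usr/local/var/run/munge/munge.socket.2",
]


def find_munge_socket(candidates=CANDIDATE_SOCKETS):
    """Return the first path that exists and is a Unix socket, else None."""
    for path in candidates:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            continue  # path is missing or its parent directory is unreadable
        if stat.S_ISSOCK(mode):
            return path
    return None


if __name__ == "__main__":
    found = find_munge_socket()
    print(found if found else "no munge socket found in the candidate paths")
```

If neither path is reported, the munged daemon is probably not running, or it was built with a different prefix than the client library that Slurm links against.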

[slurm-dev] RE: Failed to access munge.socket.2

2016-04-06 Thread Simpson Lachlan
Husen, I’ve never seen a socket file in /usr/local. I feel like it should be in vanilla /var/run? Because I had to build it, I discovered that I had to add a file (I use CentOS, YMMV), /usr/lib/tmpfiles.d/munge.conf, that looks like this: d /var/run/munge 0755 munge munge - Which meant that
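For readers unfamiliar with tmpfiles.d, the line quoted in the message above is a whitespace-separated record. The following is a minimal sketch, assuming the field order documented in tmpfiles.d(5) (type, path, mode, user, group, age) and assuming all six fields are present. `parse_tmpfiles_line` and `TmpfilesEntry` are hypothetical helpers for illustration, not a systemd API.

```python
from typing import NamedTuple


class TmpfilesEntry(NamedTuple):
    kind: str   # "d" means: create the directory if it does not exist
    path: str
    mode: str
    user: str
    group: str
    age: str    # "-" means: no age-based cleanup


def parse_tmpfiles_line(line: str) -> TmpfilesEntry:
    """Split one tmpfiles.d line into its first six fields."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 fields, got {len(fields)}")
    return TmpfilesEntry(*fields[:6])


entry = parse_tmpfiles_line("d /var/run/munge 0755 munge munge -")
print(entry.kind, entry.path, entry.mode, entry.user, entry.group, entry.age)
```

Read this way, the entry asks for /var/run/munge to be recreated at boot with mode 0755 and owned by munge:munge, which matters when /var/run lives on a tmpfs that is emptied on every reboot.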

[slurm-dev] Re: Website down?

2016-04-06 Thread Lachlan Musicman
The direct link to the download isn't working for me either - http://www.schedmd.com/download/latest/slurm-15.08.10.tar.bz2 Cheers L. -- The most dangerous phrase in the language is, "We've always done it this way." - Grace Hopper On 7 April 2016 at 09:57, Gene Soudlenkov

[slurm-dev] Re: Website down?

2016-04-06 Thread Gene Soudlenkov
It does look like there is a problem! Doesn't work for me either Gene On 07/04/16 11:51, Lachlan Musicman wrote: Website down? Is it just me, or is schedmd.com having issues at the moment? I'm getting intermittent responses, nothing from the downloads page cheers

[slurm-dev] Website down?

2016-04-06 Thread Lachlan Musicman
Is it just me, or is schedmd.com having issues at the moment? I'm getting intermittent responses, nothing from the downloads page cheers L. -- The most dangerous phrase in the language is, "We've always done it this way." - Grace Hopper

[slurm-dev] Re: Fix for need to restart slurmctld when adding user to accounting

2016-04-06 Thread Bill Broadley
On 04/06/2016 02:51 AM, James Oguya wrote: > I've also experienced the same problem with 15.08.x. I run both slurmdbd & > slurmctld on the same head node but I've explicitly configured slurm to use > a non-localhost IP address as the ControlAddr. > > - slurm.conf -> /etc/slurm/slurm.conf >

[slurm-dev] Slurm version 15.08.10 is now available

2016-04-06 Thread jette
Slurm version 15.08.10 is now available. It includes a fix for a race condition which can result in an invalid memory reference, likely causing the slurmctld daemon to crash. Other changes described below. Slurm downloads are available from http://www.schedmd.com/#repos * Changes in Slurm

[slurm-dev] Re: A couple questions about accounting

2016-04-06 Thread Michael Gutteridge
Hi > I was wondering why partition associations are optional when adding a > user to the accounting DB. I have not found any mention of what happens > if a user has no associations to any partition. My understanding is that this is simply an additional "place" where you can have a resource limit

[slurm-dev] A couple questions about accounting

2016-04-06 Thread Patrice Peterson
Hi list, I was wondering why partition associations are optional when adding a user to the accounting DB. I have not found any mention of what happens if a user has no associations to any partition. - Would the user be able to submit jobs to *all* partitions? - Would she not be able to submit

[slurm-dev] Re: Fair share priority stopped working

2016-04-06 Thread Nirmal Seenu
I tried modifying a lot of the values but the only thing that re-enabled the fairshare priority was addition of the following parameter: PriorityFlags=FAIR_TREE Nirmal On Mon, Apr 4, 2016 at 3:47 AM, Loris Bennett wrote: > > Hi Nirmal, > > Nirmal Seenu

[slurm-dev] Re: Fix for need to restart slurmctld when adding user to accounting

2016-04-06 Thread James Oguya
I've also experienced the same problem with 15.08.x. I run both slurmdbd & slurmctld on the same head node, but I've explicitly configured slurm to use a non-localhost IP address as the ControlAddr. - slurm.conf -> /etc/slurm/slurm.conf ControlMachine=hpc ControlAddr=192.168.5.3 - slurmdbd.conf ->

[slurm-dev] Failed to access munge.socket.2

2016-04-06 Thread Husen R
Hello everyone, I have installed slurm-15.08.9 successfully. However, when I tried to run sbatch, I got the following error message: Failed to access "/usr/local/var/run/munge/munge.socket.2": No such file or directory I tried to solve this problem by reinstalling munge and recreating munge.key.

[slurm-dev] Re: Fix for need to restart slurmctld when adding user to accounting

2016-04-06 Thread Christopher Samuel
On 31/03/16 16:04, Bill Broadley wrote: > So any sacctmgr change would trigger slurmdbd to try to talk to > slurmctld over 127.0.0.1 and fail. But restarting slurmctld would work. Yeah, we would never have noticed as we run a central slurmdbd on a different machine so they've always connected

[slurm-dev] Re: Slurm squeue working but not pam module

2016-04-06 Thread Diego Zuccato
Il 05/04/2016 01:00, Mehdi Acheli ha scritto: > Everything in slurm is working fine. I can issue jobs and see the state > of the eight nodes as Idle. However, when I try to connect to a compute > node with a user, even if he has a job running on, I get rejected. The IIUC, that's the correct