Re: [slurm-users] Running pyMPI on several nodes

2019-08-12 Thread Pär Lundö
July 2019 12:32 *To:* "Slurm User Community List" *Subject:* Re: [slurm-users] Running pyMPI on several nodes srun: error: Application launch failed: Invalid node name specified Hearns Law. All batch system problems are DNS problems. Seriously though - check out your name resolution both on
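
The advice to check name resolution can be sketched as a small script to run on the head node and on each compute node. This is only an illustration, not something posted in the thread: the function name `check_resolution` and the file name used below are hypothetical, and the script only reports whether each name resolves on the machine where it runs. It does not verify that the names match the ones in the Slurm configuration.

```python
import socket
import sys


def check_resolution(hostnames):
    """Map each hostname to a sorted list of resolved addresses, or None.

    Hypothetical helper for illustration; it only asks the local resolver.
    """
    results = {}
    for name in hostnames:
        try:
            infos = socket.getaddrinfo(name, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string. Deduplicate with a set.
            results[name] = sorted({info[4][0] for info in infos})
        except socket.gaierror:
            # The resolver could not map this name to any address.
            results[name] = None
    return results


if __name__ == "__main__":
    # Node names are taken from the command line; default to this host.
    nodes = sys.argv[1:] or [socket.gethostname()]
    for name, addrs in check_resolution(nodes).items():
        print(name, "->", ", ".join(addrs) if addrs else "DOES NOT RESOLVE")
```

Saved under a hypothetical name such as check_names.py, it could be run as `python3 check_names.py lxclient10 lxclient11` on every node involved, and the outputs compared.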

Re: [slurm-users] Running pyMPI on several nodes

2019-08-12 Thread Benson Muite
mented the proposed changes but still no luck. Best regards, Palle *From:* "slurm-users" *Sent:* 16 July 2019 12:32 *To:* "Slurm User Community List" *Subject:* Re: [slurm-users] Running pyMPI

Re: [slurm-users] Running pyMPI on several nodes

2019-08-12 Thread Pär Lundö
rds, Palle *From:* "slurm-users" *Sent:* 16 July 2019 12:32 *To:* "Slurm User Community List" *Subject:* Re: [slurm-users] Running pyMPI on several nodes srun: error: Application launch failed: Invalid node name specified Hearns Law. All batch system problems are DNS problems.

Re: [slurm-users] Running pyMPI on several nodes

2019-07-16 Thread Benson Muite
regards, Palle *From:* "slurm-users" *Sent:* 16 July 2019 12:32 *To:* "Slurm User Community List" *Subject:* Re: [slurm-users] Running pyMPI on several nodes srun: error: Application launch faile

Re: [slurm-users] Running pyMPI on several nodes

2019-07-16 Thread Pär Lundö
proposed changes but still no luck. Best regards, Palle From: "slurm-users" Sent: 16 July 2019 12:32 To: "Slurm User Community List" Subject: Re: [slurm-users] Running pyMPI on several nodes srun: error: Application launch failed: Invalid

Re: [slurm-users] Running pyMPI on several nodes

2019-07-16 Thread John Hearns
> Hi, > > Thank you so much for your quick responses! > It is much appreciated. > I don't have access to the cluster until next week, but I’ll be sure to > follow up on all of your suggestions and get back to you next week. > > Have a nice weekend! > Best regards > Pall

Re: [slurm-users] Running pyMPI on several nodes

2019-07-16 Thread Pär Lundö
----- *From:* "slurm-users" *Sent:* 12 July 2019 17:37 *To:* "Slurm User Community List" *Subject:* Re: [slurm-users] Running pyMPI on several nodes Par, by 'poking around' Chris means to use tools such as netstat and lsof. Also I would look at ps -eaf --fores

Re: [slurm-users] Running pyMPI on several nodes

2019-07-12 Thread Pär Lundö
slurm-users" Sent: 12 July 2019 17:37 To: "Slurm User Community List" Subject: Re: [slurm-users] Running pyMPI on several nodes Par, by 'poking around' Chris means to use tools such as netstat and lsof. Also I would look at ps -eaf --forest to make sure there are no

Re: [slurm-users] Running pyMPI on several nodes

2019-07-12 Thread John Hearns
Par, by 'poking around' Chris means to use tools such as netstat and lsof. Also I would look at ps -eaf --forest to make sure there are no 'orphaned' jobs sitting on that compute node. Having said that though, I have a dim memory of a classic PBSPro error message which says something about a netw

Re: [slurm-users] Running pyMPI on several nodes

2019-07-12 Thread Chris Samuel
On 12/7/19 7:39 am, Pär Lundö wrote: Presumably, the first 8 tasks originate from the first node (in this case the lxclient11), and the other node (lxclient10) responds as predicted. That looks right, it seems the other node has two processes fighting over the same socket and that's breakin

Re: [slurm-users] Running pyMPI on several nodes

2019-07-12 Thread Pär Lundö
urmd/lxclient10_83.0´: No such file or directory [2019-07-12T14:57:56.019][83.0] done with job " Best regards Palle From: "slurm-users" Sent: 12 July 2019 08:46 To: "Slurm User Community List" Subject: Re: [slurm-users] Running pyMPI

Re: [slurm-users] Running pyMPI on several nodes

2019-07-12 Thread Jeffrey Frey
Have you tried srun -N# -n# mpirun python3 Perhaps you have no MPI environment being set up for the processes? There was no "--mpi" flag in your "srun" command and we don't know if you have a default value for that or not. > On Jul 12, 2019, at 10:28 AM, Chris Samuel wrote:

Re: [slurm-users] Running pyMPI on several nodes

2019-07-12 Thread Chris Samuel
On 11/7/19 11:04 pm, Pär Lundö wrote: It works fine running on a single node (with "-N1" instead of "-N2"), but it is aborted or stopped when running on two nodes. What is the error you get? Does the same srun command but with "hostname" instead of Python work? -- Chris Samuel : http://www

Re: [slurm-users] Running pyMPI on several nodes

2019-07-12 Thread Mark Hahn
I am having trouble using or running a python-mpi program involving more than one node. The python-mpi program is very simple, do you think there's something unique about the python program? (also, you mean mpi4py, right?) Since authentication with Slurm is used via munge, do I need a passwordless S

Re: [slurm-users] Running pyMPI on several nodes

2019-07-11 Thread John Hearns
Please try something very simple such as a hello world program or srun -N2 -n8 hostname What is the error message which you have? On Fri, 12 Jul 2019 at 07:07, Pär Lundö wrote: > > Hi there Slurm-experts! > I am having trouble using or running a python-mpi program involving more than > one node. The
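
A possible middle step between `srun -N2 -n8 hostname` and the real MPI program is a Python script that uses no MPI at all and only prints what Slurm tells each task about itself. This is a sketch under assumptions, not something posted in the thread: the function name `describe_task` and the file name used below are hypothetical, and the script relies on the environment variables SLURM_PROCID, SLURM_NTASKS and SLURMD_NODENAME that srun normally sets for each task.

```python
import os
import socket


def describe_task(env=None):
    """Return a one-line description of the current Slurm task.

    Hypothetical helper for illustration. `env` defaults to os.environ;
    a plain dict can be passed instead for testing outside Slurm.
    """
    env = os.environ if env is None else env
    # "?" marks values that are missing, e.g. when run outside srun.
    rank = env.get("SLURM_PROCID", "?")
    ntasks = env.get("SLURM_NTASKS", "?")
    # Fall back to the local hostname if Slurm did not provide a node name.
    node = env.get("SLURMD_NODENAME", socket.gethostname())
    return f"task {rank} of {ntasks} on {node}"


if __name__ == "__main__":
    print(describe_task())
```

Saved under a hypothetical name such as task_info.py and launched with `srun -N2 -n8 python3 task_info.py`, it should print one line per task. If this works on both nodes while the MPI program fails, the problem is more likely in the MPI setup than in launching Python under Slurm.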

Re: [slurm-users] Running pyMPI on several nodes

2019-07-11 Thread John Hearns
My apologies. You do say that the Python program simply prints the rank - so it is a hello world program. On Fri, 12 Jul 2019 at 07:45, John Hearns wrote: > Please try something very simple such as a hello world program or > srun -N2 -n8 hostname > > What is the error message which you have? > > On

[slurm-users] Running pyMPI on several nodes

2019-07-11 Thread Pär Lundö
Hi there Slurm-experts! I am having trouble using or running a python-mpi program involving more than one node. The python-mpi program is very simple, it only lists the number of ranks that are available in its environment. I have a munge-daemon running prior to starting the slurm-service and the prog