Hi All,
Has anyone had issues with sshare segfaulting? Specifically with "sshare -l"?
Any suggestions on how to figure this one out? Maybe there is something obvious
I'm not seeing. This has been happening for many Slurm versions; I can't recall
when it started. For the last couple of versions I'v
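A natural first step (a generic debugging sketch, nothing Slurm-specific) is
to run it under gdb and grab a backtrace at the crash:

    gdb --args sshare -l
    (gdb) run
    ...wait for the SIGSEGV...
    (gdb) bt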
Hello,
the CPU vs. core vs. thread issue also confused me at the very
beginning. Although, in general, we do not encourage our users to make
use of hyperthreading, we have decided to leave it enabled in the BIOS
as there are some use cases that are known to benefit from
hyperthreading.
I think
Hi;
If you want to use threads as CPUs, you should set CR_CPU instead
of CR_Core.
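In slurm.conf that is the SelectTypeParameters setting. A minimal sketch,
assuming the cons_res select plugin (keep your existing SelectType if it
differs):

    SelectType=select/cons_res
    SelectTypeParameters=CR_CPU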
Regards;
Ahmet M.
Hi;
You can find the Definitions of Socket, Core, & Thread at:
https://slurm.schedmd.com/mc_support.html
Your status:
CPUs=COREs=Sockets*CoresPerSocket=1*4=4
Threads=COREs*ThreadsPerCore=4*2=8
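As a quick cross-check, running "slurmd -C" on the node prints the layout
Slurm auto-detects. Illustrative output for this box (exact fields vary by
version):

    $ slurmd -C
    NodeName=hpathuri-linux CPUs=8 Boards=1 SocketsPerBoard=1 \
        CoresPerSocket=4 ThreadsPerCore=2 RealMemory=15833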
Regards;
Ahmet M.
Hi,
Thank you so much for your quick responses!
It is much appreciated.
I don't have access to the cluster until next week, but I'll be sure to follow
up on all of your suggestions and get back to you next week.
Have a nice weekend!
Best regards
Palle
From: "slurm-u
Hi,
Here is my node information. I am confused by the terminology w.r.t. CPU vs.
core.
NodeName=hpathuri-linux CPUs=8 RealMemory=15833 Sockets=1 CoresPerSocket=4
ThreadsPerCore=2 State=UNKNOWN.
I am unable to schedule more than 4 tasks without oversubscribing even though
my configuration look
Pär, by 'poking around' Chris means to use tools such as netstat and lsof.
Also, I would look at ps -eaf --forest to make sure there are no 'orphaned'
jobs sitting on that compute node.
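For instance (a sketch; 6818 is slurmd's default port, adjust if your site
changed it):

    netstat -tlnp | grep slurmd   # which PID owns slurmd's listening socket?
    lsof -i :6818                 # anything else holding slurmd's port?
    ps -eaf --forest              # any orphaned job steps still hanging around?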
Having said that though, I have a dim memory of a classic PBSPro error
message which says something about a netw
Just a reminder that the Early registration for the 2019 Slurm User Group
meeting will end in 2 days.
You can register at https://slug19.eventbrite.com/
The meeting will be held on 17-18 September 2019 in Salt Lake City at the
University of Utah
- *Early registration*
- May 14 through J
On 12/7/19 7:39 am, Pär Lundö wrote:
Presumably, the first 8 tasks originate from the first node (in this
case lxclient11), and the other node (lxclient10) responds as
predicted.
That looks right; it seems the other node has two processes fighting
over the same socket and that's breakin
Hi,
Thank you for your response.
When I run it ("srun -N2 -n8 hostname") I get an error stating:
"srun: job step 83.0 aborted before step completely launched.
srun: error: task 0 launch failed: Unspecified error.
srun: error: task 1 launch failed: Unspecified error.
srun: error: task 2 launch
Have you tried
srun -N# -n# mpirun python3
Perhaps you have no MPI environment being set up for the processes? There was
no "--mpi" flag in your "srun" command and we don't know if you have a default
value for that or not.
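For example (the script name here is hypothetical, and pmi2 may or may not be
built into your Slurm):

    srun --mpi=list                               # plugin types this build supports
    srun --mpi=pmi2 -N2 -n8 python3 my_script.py  # then pass one explicitly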
> On Jul 12, 2019, at 10:28 AM, Chris Samuel wrote:
On 11/7/19 11:04 pm, Pär Lundö wrote:
It works fine running on a single node (with ”-N1” instead of ”-N2”), but
it is aborted or stopped when running on two nodes.
What is the error you get?
Does the same srun command but with "hostname" instead of Python work?
--
Chris Samuel : http://www
I am having trouble using or running a python-mpi program involving more than
one node. The python-mpi program is very simple,
do you think there's something unique about the python program?
(also, you mean mpi4py, right?)
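If it is mpi4py, a minimal multi-node test (an illustrative sketch, not the
poster's actual program) would be:

    # hello_mpi.py -- assumes mpi4py is installed against the cluster's MPI
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    print(f"rank {comm.Get_rank()} of {comm.Get_size()} "
          f"on {MPI.Get_processor_name()}")

If that fails across two nodes while plain "hostname" succeeds, the problem
is in the MPI stack rather than in Slurm itself.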
Since authentication in Slurm is handled via munge, do I need a passwordless
S
Dear all,
I have configured pam_slurm_adopt in our Slurm test environment by
following the corresponding documentation:
https://slurm.schedmd.com/pam_slurm_adopt.html
I've set 'PrologFlags=contain' in slurm.conf and also have task/cgroup
enabled along with task/affinity (i.e. 'TaskPlugin=task/affinity,task/cgroup').
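For reference, the pieces that setup touches look roughly like this (file
paths follow the pam_slurm_adopt docs; distros differ):

    # slurm.conf
    PrologFlags=contain
    TaskPlugin=task/affinity,task/cgroup

    # /etc/pam.d/sshd -- placement relative to other account modules matters
    account    required    pam_slurm_adopt.so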