Re: [Wien] slurm script

2024-06-18 Thread Peter Blaha
Just to make sure, I have two questions: - This script will allow running WIEN2k on slurm cluster without passwordless ssh login? No, I don't think so. Yes, this script should create the .machines file on the fly in CASE. In your example, probably 32 lines (I doubt that this will be very e

Re: [Wien] slurm script

2024-06-18 Thread pluto via Wien
Dear Prof. Blaha, Thank you for the quick answer. I will be trying to adapt the script with the help of our IT experts. Just to make sure, I have two questions: - This script will allow running WIEN2k on slurm cluster without passwordless ssh login? - Should this script generate .machines

Re: [Wien] slurm script

2024-06-17 Thread Peter Blaha
Below some comments: I have gotten the following error: [lplucin@iffslurm Au-bulk-test]$ more slurm-69063.out iffcluster0414: Using Ethernet for MPI communication. SBATCH: Command not found. It looks as if you have a line with SBATCH ... in your script instead of #SBATCH Probably t
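The failure described above (a directive written as SBATCH instead of #SBATCH, which the shell then tries to run as a command) can be caught before the job is submitted. A minimal sketch, assuming bash and a POSIX grep are available on the login node; the function name find_bad_sbatch is hypothetical and not part of WIEN2k or SLURM:

```shell
#!/bin/bash
# Print (with line numbers) every line that starts with a bare "SBATCH",
# i.e. a directive that is missing its leading "#".
# Exit status is 0 if such lines exist and 1 if none were found.
find_bad_sbatch() {
    grep -nE '^[[:space:]]*SBATCH([[:space:]]|$)' "$1"
}
```

A possible use is `find_bad_sbatch slurm.job && echo "fix the lines above before running sbatch"`, where slurm.job stands for whatever the submission script is called.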

[Wien] slurm script

2024-06-17 Thread pluto via Wien
Dear All, I am trying to set up the slurm submission without passwordless ssh. My parameters are: - only k-parallel and OMP (no mpi) - 8 cores per node (it is an older cluster) I have started with this script: http://www.wien2k.at/reg_user/faq/slurm.job and slightly adapted it (see the bottom

Re: [Wien] SLURM cluster issues

2024-04-26 Thread Straus, Daniel B
. Daniel Straus Assistant Professor Department of Chemistry Tulane University 5088 Percival Stern Hall 6400 Freret Street New Orleans, LA 70118 (504) 862-3585 http://straus.tulane.edu/ From: Peter Blaha Sent: Tuesday, April 16, 2024 1:58 AM To: wien@zeus.theochem.tuwien.ac.at Subject: Re: [Wien] SLURM

Re: [Wien] SLURM cluster issues

2024-04-15 Thread Peter Blaha
Hi, I am trying to set up WIEN2k ver 23.2 to run on a SLURM cluster. I have gotten it to work with SCALAPACK, running with a slurm batch submission script through w2web by following the examples. I have two issues. 1. Is it possible to make the “x dstart” button in the initialize web

[Wien] SLURM cluster issues

2024-04-15 Thread Straus, Daniel B
Hi, I am trying to set up WIEN2k ver 23.2 to run on a SLURM cluster. I have gotten it to work with SCALAPACK, running with a slurm batch submission script through w2web by following the examples. I have two issues. 1. Is it possible to make the "x dstart" button in the initialize web inte

[Wien] Slurm with omp/mpi

2023-12-30 Thread Laurence Marks
I am struggling a little to get stable performance for a large job using slurm. It seems that the default parameters are not good for Wien2k, so some tweaking is needed. Some questions/requests for comments: 1) It looks like #SBATCH --overlap is needed, and perhaps better "export SLURM_OVERLAP=1" in t

[Wien] Slurm + Intel impi

2023-12-12 Thread Laurence Marks
This may be too technical, but I thought I would ask as someone might have seen something similar. On a supercomputer using slurm/srun I am seeing irreproducible crashes, sometimes a SIGSEGV in lapw1_mpi/elpa, sometimes a bus error in lapw2_mpi. These are large calculations (Matrix size ~94K) using hybri

Re: [Wien] slurm mpi

2019-05-07 Thread Peter Blaha
So it seems to work now. The last messages are probably because you are using -it with a perfectly converged calculation. Remove (temporarily) the -it flag from the runsp_lapw and remember: -it may only be faster for surfaces and large cells. On 5/7/19 4:30 PM, webfin...@ukr.net wrot

Re: [Wien] slurm mpi

2019-05-07 Thread webfinder
Dear Prof. Blaha, I'm using intel mpi 2019.3.199 the scalapack and blacs libs are located in the intel compilers_and_libraries_2019.3.199 folder  OPTIONS file: current:FOPT:-O1 -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include current:FPOPT:-O1

Re: [Wien] slurm mpi

2019-05-07 Thread Gavin Abo
The "Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password)" comes up with different causes in a Google search.  One time, that error seemed to go away with a user by having them ssh into the nodes and fix the ssh file permissions following the webpage: https://serverfault.com/qu

Re: [Wien] slurm mpi

2019-05-07 Thread Peter Blaha
Not enough info. I briefly checked your wiki (I have no idea of French), but you seem to have Intelmpi (which I would recommend). What mpi are you loading? Did you load all modules also in the batch job? What scalapack? What blacs-library? Post your OPTION files from $WIENROOT and also the

Re: [Wien] slurm mpi

2019-05-07 Thread webfinder
Dear Prof. Blaha, Thank you! The description of the script for the cluster is here: https://redmine.mcia.univ-bordeaux.fr/projects/cluster-curta/wiki/Slurm (unfortunately it is in French and I'm not strong in cluster structures). Yes, the cluster uses a "module" system. I've used commands like "module load ...

Re: [Wien] slurm mpi

2019-05-07 Thread Peter Blaha
So it seems that your cluster forbids the use of ssh (even on assigned nodes). If this is the case, you MUST use USE_REMOTE=0, and with k-parallel mode you can use only one node (32 cores). For mpi I do not know. There should be some "userguide" (web-site, wiki, ...) for your cluster, where all

Re: [Wien] slurm mpi

2019-05-07 Thread webfinder
Dear Prof. Blaha, thank you for the explanation! Sorry, I should have put hostname in quotes. The script I used is based on the one in the WIEN-FAQ and produces .machines based on the nodes provided by slurm: for k-points: # 1:n270 1:n270 1:n270 1:n270 1:n270 granularity:1 extrafine:1 for mpi: # 1

Re: [Wien] slurm mpi

2019-05-06 Thread Peter Blaha
Setting USE_REMOTE=0 means that you do not use "ssh" in k-parallel mode. This has the following consequences: What you write for "hostname" in .machines is not important; only the number of lines counts. And it will spawn as many k-parallel jobs as you have lines (1:hostname), but they
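Since only the number of "1:hostname" lines matters in this mode, a k-parallel .machines file can be written from a job count alone. A rough sketch in bash, assuming the line format quoted elsewhere in this thread (1:n270, granularity:1, extrafine:1); the function name make_kpar_machines and the choice of 4 jobs are illustrative only:

```shell
#!/bin/bash
# Emit a k-parallel .machines file on stdout.
#   $1 = number of k-parallel jobs (one "1:<host>" line each)
#   $2 = host name to write; with USE_REMOTE=0 it is not used to start jobs
make_kpar_machines() {
    local njobs="$1"
    local host="${2:-localhost}"
    local i
    for ((i = 0; i < njobs; i++)); do
        printf '1:%s\n' "$host"
    done
    # Trailer lines as they appear in the .machines example in this thread.
    printf 'granularity:1\n'
    printf 'extrafine:1\n'
}

# Example: four k-parallel jobs, host name taken from the thread.
make_kpar_machines 4 n270
```

Inside a batch job the output would normally be redirected into the case directory, for example `make_kpar_machines 8 "$HOSTNAME" > .machines`.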

Re: [Wien] slurm mpi

2019-05-06 Thread Gavin Abo
WIEN2k 18.2 usersguide (pg. 237) has: USE_REMOTE [0|1] determines whether parallel jobs are run in background (on shared memory machines) or using ssh. Since you are utilizing ssh-copy-id for using ssh, you most likely need USE_REMOTE=1 [ https://www.mail-archive.com/wien@zeus.theochem.tuwie

[Wien] slurm mpi

2019-05-06 Thread webfinder
Dear wien2k users, wien2k_18.2 I'm trying to run a test task on a cluster with slurm batch system using mpi parallelization. In "parallel_options" USE_REMOTE=0, MPI_REMOTE=0. (during the siteconfig_lapw the slurm option was chosen) the k-point parallelization works well. But if I change the "slu

Re: [Wien] SLURM support "no ssh" for WIEN2k?

2015-11-12 Thread Laurence Marks
Dear All, As I am currently trying to get Wien2k running on Stampede (also SLURM), let me add a little clarification without disagreeing with anything Peter said. A typical workflow in Wien2k is (very simplified) an iterative loop controlled by csh scripts: 1) A single serial multithreaded or mp

Re: [Wien] SLURM support "no ssh" for WIEN2k?

2015-11-12 Thread Peter Blaha
Hi, WIEN2k has a usersguide, where the different parallelization modes are extensively described. On a cluster with a queuing system (like SLURM) it should not even be possible to access nodes (except the frontend) via ssh without using SLURM (on our SLURM machine ssh is possible only to nod

Re: [Wien] Slurm

2015-11-11 Thread Elias Assmann
On 11/11/2015 03:07 PM, Laurence Marks wrote: > And, at least in an interactive job, none of these work... > > Sigh. The man page of srun is also inconsistent with the actual srun > used Not sure if it will be better for your purposes, but what I use is scontrol show hostnames $SLURM_NODEL

Re: [Wien] Slurm

2015-11-11 Thread Laurence Marks
And, at least in an interactive job, none of these work... Sigh. The man page of srun is also inconsistent with the actual srun used On Wed, Nov 11, 2015 at 7:08 AM, Laurence Marks wrote: > Thanks. Does it produce a one line/entry list? > > I found a variant that might also work (need to te

Re: [Wien] Slurm

2015-11-11 Thread Laurence Marks
Thanks. Does it produce a one line/entry list? I found a variant that might also work (need to test) # Generate Machinefile for mpich such that hosts are in the same order as if run via srun srun -l /bin/hostname | sort -n | awk '{print $2}' > MACHINEFILE # Run using generated Machine file: mp
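To see what the sort/awk part of that pipeline does without a running job, it can be fed sample text. This is only a sketch: it assumes that srun -l prefixes each output line with the task number and a colon ("0: host"), and the host names and the function name order_hosts_by_task are made up for illustration:

```shell
#!/bin/bash
# Read lines of the form "<task>: <host>" on stdin, order them by task
# number (numerically, so task 10 sorts after task 2), and print the hosts.
order_hosts_by_task() {
    sort -n | awk '{print $2}'
}

# Simulated "srun -l /bin/hostname" output arriving out of order.
printf '2: n271\n0: n270\n1: n270\n' | order_hosts_by_task
```

The result is one host name per line, in task order, which answers the "one line/entry" question for this variant.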

Re: [Wien] Slurm

2015-11-10 Thread Peter Blaha
The commands srun hostname -s >slurm.hosts set proclist=`cat slurm.hosts|sort` within a slurm-job give you a list of your hosts. Am 11.11.2015 um 01:04 schrieb Laurence Marks: Does anyone know the "machines" format for slurm? I want to expand Machines2W so it can use it (& I can use a slurm ba
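The csh commands above yield one host name per task. As a possible follow-up step in bash, assuming the same one-name-per-line input, the list can be collapsed into a task count per node, which helps when deciding how many .machines lines each node should receive. The function name count_tasks_per_host and the sample host names are hypothetical:

```shell
#!/bin/bash
# Read host names (one per line) on stdin and print "<host> <count>"
# for each distinct host, sorted by host name.
count_tasks_per_host() {
    sort | uniq -c | awk '{print $2, $1}'
}

# Simulated content of slurm.hosts for a 3-task job on two nodes.
printf 'n271\nn270\nn270\n' | count_tasks_per_host
```

Within a real job the input would come from the file written by srun, for example `count_tasks_per_host < slurm.hosts`.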

[Wien] Slurm

2015-11-10 Thread Laurence Marks
Does anyone know the "machines" format for slurm? I want to expand Machines2W so it can use it (& I can use a slurm based system).