Re: [Wien] lapwso_mpi error

2016-12-08 Thread Peter Blaha

What kind of job is it that lapw0_mpi runs for 9800 seconds ???

Is there any speedup when using 40 instead of 20 cores ?

Your error is in lapw1_mpi, not in lapwso_mpi ???

No idea about your software, but I doubt that it is wien2k.

On 08.12.2016 at 16:56, Md. Fhokrul Islam wrote:

Hi Prof Blaha,

I am trying to run an MPI job on 2 nodes, each with 20 cores, but the job
crashes with the following error messages. I have tried both USE_REMOTE 0 and
USE_REMOTE 1 in the parallel_options file, but it didn't make much of a
difference. Our system administrator told me it is probably not a hardware
issue and suggested that I contact Wien2k. So could you please let me know if
I need to make any changes to the MPI settings and recompile WIEN2k.

By the way, the same job runs fine if I use only 1 node with 20 cores.

Error message:

case.dayfile

   cycle 1 (Thu Dec  8 15:44:06 CET 2016)  (100/99 to go)


  lapw0 -p(15:44:06) starting parallel lapw0 at Thu Dec  8

15:44:07 CET 2016
 .machine0 : 40 processors
9872.562u 20.276s 8:20.46 1976.7%   0+0k 220752+386840io 332pf+0w

  lapw1  -up -p-c (15:52:27) starting parallel lapw1 at

Thu Dec  8 15:52:27 CET 2016
->  starting parallel LAPW1 jobs at Thu Dec  8 15:52:27 CET 2016
running LAPW1 in parallel mode (using .machines)
1 number_of_parallel_jobs
 au039 au039 au039 au039 au039 au039 au039 au039 au039 au039 au039
au039 au039 au039 au039 au039 au039 au039 au039 au039 au042 au042 au042
au042 au042 au042 au042 au042 au042 au042 au042 au042 au042 au042 au042
au042 au042 au042 au042 au042(1)
--
MPI_ABORT was invoked on rank 8 in communicator MPI_COMM_WORLD
with errorcode -726817712.


Output error file:

 LAPW0 END
w2k_dispatch_signal(): received: Terminated
w2k_dispatch_signal(): received: Terminated
forrtl: Interrupted system call
w2k_dispatch_signal(): received: Terminated
w2k_dispatch_signal(): received: Terminated


Thanks,
Fhokrul
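
For comparison, a possible .machines layout for one MPI job spanning the two
nodes named in the dayfile above (a sketch of the usual WIEN2k syntax rather
than the poster's actual file; the first line defines a single 40-core MPI
k-point job, the last line lets lapw0 use all 40 cores as well):

   1:au039:20 au042:20
   granularity:1
   extrafine:1
   lapw0: au039:20 au042:20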






--
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/staff/tc_group_e.php
--


Re: [Wien] lapwso_mpi error

2016-11-17 Thread Gavin Abo
So you are using the ifort version with the unformatted file read bug.
Based on the Intel page at the link in the previous post below, did you
try recompiling lapwso_mpi with -O0, or reverting to one of the ifort
versions that Intel mentioned, to see whether that fixes the problem?
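
A minimal sketch of how such a test could look (the directory name, the
FOPT/FPOPT variables and the "para" target are assumptions about the WIEN2k
build layout; alternatively, just lower the optimization in siteconfig_lapw
and recompile from there):

   # Hypothetical: rebuild only lapwso/lapwso_mpi with -O0 to test the
   # ifort 16.0.3.210 unformatted-read bug.
   cd $WIENROOT/SRC_lapwso
   # edit the Makefile and change the Fortran optimization flag on the
   # FOPT/FPOPT lines from -O1/-O2 to -O0, then rebuild:
   make clean
   make            # serial lapwso
   make para       # MPI lapwso_mpi, if this target exists in your Makefile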


On 11/14/2016 8:34 AM, Md. Fhokrul Islam wrote:


Hi Gavin,


Thanks for your suggestion. Yes, I am using version 16.0.3.210 of ifort.
Debugging such a big file with 'od' seems to be difficult, but I will try
with a smaller system and see if I get the same error.



Fhokrul




*From:* Wien <wien-boun...@zeus.theochem.tuwien.ac.at> on behalf of 
Gavin Abo <gs...@crimson.ua.edu>

*Sent:* Sunday, November 13, 2016 11:40 PM
*To:* A Mailing list for WIEN2k users
*Subject:* Re: [Wien] lapwso_mpi error
Ok, I agree that it is likely not due to the setup of the scratch
directory.


What version of ifort was used?  If you happened to use 16.0.3.210, 
maybe it is caused by an ifort bug [ 
https://software.intel.com/en-us/articles/read-failure-unformatted-file-io-psxe-16-update-3 
].



Perhaps you can use the Linux "od" command to try to troubleshoot and
identify what the data mismatch is between the writing and reading of
the 3Mn.vectordn_1 file, similar to what is described on the web pages at:


https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/269993

https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/270436

https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/268503


Though, it might be harder to diagnose with the large 3Mn.vectordn_1,
which looks to be about 12 GB.  So you may want to set up an MPI SO
calculation that creates a smaller case.vectordn_1 for that.


On 11/13/2016 7:30 AM, Md. Fhokrul Islam wrote:


Hi Gavin,


   In my .bashrc, SCRATCH is defined as $SCRATCH = ./, so if I use the
command echo $SCRATCH, it always returns ./


For large jobs, I use a local temporary directory that is associated with
each node in our system and is given by $SNIC_TMP.  This temporary directory
is created on the fly, so I set $SCRATCH = $SNIC_TMP in my job submission
script.  As I said, this setup works fine if I do MPI calculations without
spin-orbit, and I get converged results.  But if I submit the job after
initializing with spin-orbit, it crashes at lapwso.

So I think the problem is probably not due to the scratch directory setup;
it is something to do with the MPI version of LAPWSO.



Thanks for your comment.


Fhokrul
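
For reference, a minimal job-script sketch of the setup described above
(assuming a SLURM batch system, as the slurmtmp paths elsewhere in this thread
suggest; the resource values are placeholders and the runsp_lapw flags follow
the WIEN2k user's guide):

   #!/bin/bash
   #SBATCH -N 1                # placeholders: adjust nodes/cores/time
   #SBATCH -n 16
   #SBATCH -t 48:00:00

   # node-local scratch created on the fly by the batch system;
   # $SNIC_TMP is provided by the cluster environment described above
   export SCRATCH=$SNIC_TMP

   # spin-polarized SCF with spin-orbit coupling, MPI/k-point parallel
   runsp_lapw -p -so -ec 0.0001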



Re: [Wien] lapwso_mpi error

2016-11-16 Thread Peter Blaha

Try to run   "x lapw2 -p -dn"   with these vectors.

My suspicion is still that the memory (RAM) of just one node is not
sufficient for such big calculations.


Another test: reduce Emax in case.in1 from the default (5.0) back to
1.5 and rerun lapw1.
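
A possible sequence for these two checks (sketch only; the -up/-dn/-c flags
are the ones already used in the dayfiles of this thread, and the exact line
holding Emax in case.in1 may differ in your setup):

   # 1) check that the vectors can be read back at all
   ls -l $SCRATCH/*vector*
   x lapw2 -p -up -c
   x lapw2 -p -dn -c           # this should read test.vectordn_1

   # 2) lower Emax in case.in1 (case.in1c for complex cases) from 5.0 to 1.5,
   #    then regenerate smaller vectors
   x lapw1 -p -up -c
   x lapw1 -p -dn -c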


On 11/16/2016 03:29 PM, Md. Fhokrul Islam wrote:

Hi Prof Blaha,


I have now tried the MPI version of the spin-orbit calculation for a few
different systems.  For the smallest system, like a GaAs unit cell, I don't
have any problem, but for large systems containing more than 100 atoms the
job crashes at LAPWSO in the 1st cycle.

The error message is similar in all cases:

 LAPW0 END
 LAPW1 END
 LAPW1 END
forrtl: severe (24): end-of-file during read, unit 9, file
/lunarc/nobackup/users/eishfh/WIEN2k/test/Sb2Te3/test/./test.vectordn_1
Image  PCRoutineLine
 Source
lapwso_mpi 004634E3  Unknown   Unknown  Unknown
lapwso_mpi 0047F3C4  Unknown   Unknown  Unknown
lapwso_mpi 0042BA1F  kptin_ 56  kptin.F
lapwso_mpi 00431566  MAIN__523  lapwso.F
lapwso_mpi 0040B3EE  Unknown   Unknown  Unknown
libc.so.6  2B4D5A25CB15  Unknown   Unknown  Unknown
lapwso_mpi 0040B2E9  Unknown   Unknown  Unknown


The size of the vector files after LAPW1:

-rw-r--r-- 1 eishfh kalmar 7429247822 Nov 14 19:02 test.vectordn_1
-rw-r--r-- 1 eishfh kalmar  10800 Nov 14 19:03 test.vectorsodn_1
-rw-r--r-- 1 eishfh kalmar  10800 Nov 14 19:03 test.vectorsoup_1
-rw-r--r-- 1 eishfh kalmar 7661483838 Nov 14 18:21 test.vectorup_1

Are the sizes of the up and down vector files supposed to be the same?
Is there any way I can check whether these unformatted files are
corrupted or incomplete?

Since the MPI version works for a small system, I am not sure where the
problem is: in the system where I am running the job, or in the MPI
version of LAPWSO?  I think disk space is not an issue; otherwise I would
get an error related to that.

I would appreciate it if you could suggest something.

Thanks,
Fhokrul





*From:* Wien <wien-boun...@zeus.theochem.tuwien.ac.at> on behalf of
Peter Blaha <pbl...@theochem.tuwien.ac.at>
*Sent:* Friday, November 11, 2016 7:12 PM
*To:* A Mailing list for WIEN2k users
*Subject:* Re: [Wien] lapwso_mpi error


> I have repeated the calculation as you suggested. I have used the current
> work directory as SCRATCH but I got the same error. I don't see
> anything wrong with lapw1.

You have to send us detailed error messages.

It cannot be true that your SCRATCH is the working directory, when an
error points to  /local/slurmtmp.287632/3Mn.vectordn_1

What is your error now ?

Do you see this missing file:
/local/slurmtmp.287632/3Mn.vectordn_1

When doing    ll *vector*
in the "correct" directory, what length do these files have ?

PS: There was a bug report for non-square processor grids (20=4*5) and
RLOs. Did you fix that?
Eventually try 16 cores only.

On 11.11.2016 at 16:01, Md. Fhokrul Islam wrote:

/local/slurmtmp.287632/3Mn.vectordn_1


--
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/staff/tc_group_e.php
--

Re: [Wien] lapwso_mpi error

2016-11-14 Thread Md. Fhokrul Islam
Hi Gavin,


Thanks for your suggestion. Yes, I am using version 16.0.3.210 of ifort.
Debugging such a big file with 'od' seems to be difficult, but I will try
with a smaller system and see if I get the same error.



Fhokrul



From: Wien <wien-boun...@zeus.theochem.tuwien.ac.at> on behalf of Gavin Abo 
<gs...@crimson.ua.edu>
Sent: Sunday, November 13, 2016 11:40 PM
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] lapwso_mpi error

Ok, I agree that it is likely not due to the setup of the scratch directory.

What version of ifort was used?  If you happened to use 16.0.3.210, maybe it is 
caused by an ifort bug [ 
https://software.intel.com/en-us/articles/read-failure-unformatted-file-io-psxe-16-update-3
 ].


Perhaps you can use the Linux "od" command to try to troubleshoot and identify
what the data mismatch is between the writing and reading of the 3Mn.vectordn_1
file, similar to what is described on the web pages at:

https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/269993

https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/270436

https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/268503


Though, it might be harder to diagnose with the large 3Mn.vectordn_1, which
looks to be about 12 GB.  So you may want to set up an MPI SO calculation that
creates a smaller case.vectordn_1 for that.

On 11/13/2016 7:30 AM, Md. Fhokrul Islam wrote:

Hi Gavin,


   In my .bashrc, SCRATCH is defined as $SCRATCH = ./, so if I use the command
echo $SCRATCH, it always returns ./


For large jobs, I use a local temporary directory that is associated with
each node in our system and is given by $SNIC_TMP.  This temporary directory
is created on the fly, so I set $SCRATCH = $SNIC_TMP in my job submission
script.  As I said, this setup works fine if I do MPI calculations without
spin-orbit, and I get converged results.  But if I submit the job after
initializing with spin-orbit, it crashes at lapwso.

So I think the problem is probably not due to the scratch directory setup;
it is something to do with the MPI version of LAPWSO.



Thanks for your comment.


Fhokrul


Re: [Wien] lapwso_mpi error

2016-11-13 Thread Gavin Abo
Ok, I agree that it is likely not due to the setup of the scratch
directory.


What version of ifort was used?  If you happened to use 16.0.3.210, 
maybe it is caused by an ifort bug [ 
https://software.intel.com/en-us/articles/read-failure-unformatted-file-io-psxe-16-update-3 
].


Perhaps you can use the Linux "od" command to try to troubleshoot and
identify what the data mismatch is between the writing and reading of
the 3Mn.vectordn_1 file, similar to what is described on the web pages at:


https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/269993
https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/270436
https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/268503

Though, it might be harder to diagnose with the large 3Mn.vectordn_1,
which looks to be about 12 GB.  So you may want to set up an MPI SO
calculation that creates a smaller case.vectordn_1 for that.
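
A minimal sketch of such a check (the file name is taken from this case; the
comment about 4-byte record markers assumes ifort's default unformatted
sequential format):

   VEC=./3Mn.vectordn_1            # adjust to your scratch directory

   ls -l "$VEC"                    # a truncated write shows up in the length first

   # dump the first and last bytes with decimal offsets; with ifort's default
   # unformatted sequential format every record is framed by 4-byte length
   # markers, so zeros or garbage near the end point to an incomplete write
   od -A d -t x1 "$VEC" | head -n 4
   od -A d -t x1 "$VEC" | tail -n 4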


On 11/13/2016 7:30 AM, Md. Fhokrul Islam wrote:


Hi Gavin,


   In my .bashrc, SCRATCH is defined as $SCRATCH = ./, so if I use the
command echo $SCRATCH, it always returns ./


For large jobs, I use a local temporary directory that is associated with
each node in our system and is given by $SNIC_TMP.  This temporary directory
is created on the fly, so I set $SCRATCH = $SNIC_TMP in my job submission
script.  As I said, this setup works fine if I do MPI calculations without
spin-orbit, and I get converged results.  But if I submit the job after
initializing with spin-orbit, it crashes at lapwso.

So I think the problem is probably not due to the scratch directory setup;
it is something to do with the MPI version of LAPWSO.



Thanks for your comment.


Fhokrul



Re: [Wien] lapwso_mpi error

2016-11-12 Thread Gavin Abo

If you use the terminal command: echo $SCRATCH

Does it return:

./

Looks like there might still be a problem with how SCRATCH is defined or 
how "./" is resolved by your system.


In the error message, you can see:

/lunarc/nobackup/users/eishfh/WIEN2k/GaAs_ZB/David_project/3Mn001/ALL/test-so/3Mn/./3Mn.vectordn_1

The "./" may be the cause of the problem, because I would expect the
path to be:

/lunarc/nobackup/users/eishfh/WIEN2k/GaAs_ZB/David_project/3Mn001/ALL/test-so/3Mn/3Mn.vectordn_1
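
One way to rule this out is to make SCRATCH an absolute path before the run
(a sketch, assuming a bash job script and the $SNIC_TMP variable mentioned
earlier in this thread):

   export SCRATCH=$SNIC_TMP        # node-local scratch, absolute path
   # or, to keep the vectors in the working directory without the "./":
   export SCRATCH=$PWD
   echo "SCRATCH resolves to: $SCRATCH"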

On 11/12/2016 5:33 PM, Md. Fhokrul Islam wrote:


Hi Prof. Blaha,


   I wasn't aware of the bug, but I will check the updates. I have repeated
the calculation with 16 cores (square processor grid) as you suggested, but I
still got the same error.  As before, the job crashes at lapwso.  I don't see
any missing file, as you can see from the list of vector files.


-rw-r--r--. 1 eishfh kalmar 12427583862 Nov 12 10:04 3Mn.vectordn_1

-rw-r--r--. 1 eishfh kalmar   77760 Nov 12 10:26 3Mn.vectorsodn_1

-rw-r--r--. 1 eishfh kalmar   77760 Nov 12 10:26 3Mn.vectorsoup_1

-rw-r--r--. 1 eishfh kalmar 12428559726 Nov 12 04:17 3Mn.vectorup_1


Here are the dayfile and output error files. These are the only error 
messages I got.



case.dayfile:


cycle 1 (Sat Nov 12 01:21:39 CET 2016)  (100/99 to go)


>   lapw0 -p(01:21:39) starting parallel lapw0 at Sat Nov 12 
01:21:39 CET 2016


 .machine0 : 16 processors

14031.329u 15.362s 14:40.87 1594.6% 0+0k 90152+1974560io 175pf+0w

>   lapw1  -up -p   -c  (01:36:20) starting parallel lapw1 at Sat Nov 
12 01:36:20 CET 2016


-> starting parallel LAPW1 jobs at Sat Nov 12 01:36:20 CET 2016

running LAPW1 in parallel mode (using .machines)

1 number_of_parallel_jobs

  au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 
au188 au188 au188 au188 au188(1) 121331.481u 33186.223s 2:41:04.62 
1598.7%   0+0k 0+29485672io 118pf+0w


Summary of lapw1para:

au188 k=0 user=0  wallclock=0

121367.583u 33215.702s 2:41:06.83 1599.1%   0+0k 288+29487024io 
121pf+0w


>   lapw1  -dn -p   -c  (04:17:27) starting parallel lapw1 at Sat Nov 
12 04:17:27 CET 2016


-> starting parallel LAPW1 jobs at Sat Nov 12 04:17:27 CET 2016

running LAPW1 in parallel mode (using .machines.help)

1 number_of_parallel_jobs

  au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 
au188 au188 au188 au188 au188(1) 233187.228u 100041.449s 5:47:30.00 
1598.2%  0+0k 5832+35169304io 116pf+0w


Summary of lapw1para:

au188 k=0 user=0  wallclock=0

233263.580u 100102.639s 5:47:31.69 1598.7%  0+0k 6296+35170640io 
118pf+0w


>   lapwso -up  -p -c   (10:04:59) running LAPWSO in parallel mode

** LAPWSO crashed!

1233.319u 23.612s 21:29.72 97.4%0+0k 13064+7712io 17pf+0w

error: command 
/lunarc/nobackup/users/eishfh/SRC/Wien2k14.2-iomkl/lapwsopara -up -c 
lapwso.def   failed



>   stop error

---

lapwso.error file:

** Error in Parallel LAPWSO

** Error in Parallel LAPWSO


---

output error file:

 LAPW0 END

 LAPW1 END

 LAPW1 END

forrtl: severe (39): error during read, unit 9, file 
/lunarc/nobackup/users/eishfh/WIEN2k/GaAs_ZB/David_project/3Mn001/ALL/test-so/3Mn/./3Mn.vectordn_1


Image PCRoutineLine Source

lapwso_mpi 004634E3  Unknown   Unknown Unknown

lapwso_mpi 0047F3C4  Unknown   Unknown Unknown

lapwso_mpi 0042BA1F  kptin_ 56 kptin.F

lapwso_mpi 00431566  MAIN__523 
lapwso.F


lapwso_mpi 0040B3EE  Unknown   Unknown Unknown

libc.so.6 2BA34EDECB15  Unknown   Unknown Unknown

lapwso_mpi 0040B2E9  Unknown   Unknown Unknown


---


Thanks,
Fhokrul


Re: [Wien] lapwso_mpi error

2016-11-11 Thread Peter Blaha
> I have repeated the calculation as you suggested. I have used the current
> work directory as SCRATCH but I got the same error. I don't see
> anything wrong with lapw1.

You have to send us detailed error messages.

It cannot be true that your SCRATCH is the working directory, when an 
error points to  /local/slurmtmp.287632/3Mn.vectordn_1


What is your error now ?

Do you see this missing file:
/local/slurmtmp.287632/3Mn.vectordn_1

When doing    ll *vector*
in the "correct" directory, what length do these files have ?

PS: There was a bug report for non-square processor grids (20=4*5) and
RLOs. Did you fix that?

Eventually try 16 cores only.

On 11.11.2016 at 16:01, Md. Fhokrul Islam wrote:

/local/slurmtmp.287632/3Mn.vectordn_1


--
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/staff/tc_group_e.php
--


Re: [Wien] lapwso_mpi error

2016-11-11 Thread Md. Fhokrul Islam
Hi Prof. Blaha,


I have repeated the calculation as you suggested. I have used the current work
directory as SCRATCH but I got the same error.  I don't see anything wrong with
lapw1.  The vector and eigenvalue files are there for both up and dn spin, and
the length of the error files is non-zero only for lapwso.error.  I also didn't
get any error message about disk space in the output error file, so I am not
sure what is causing the error.

Please let me know if I need to check anything else.


Thanks,

Fhokrul



From: Wien <wien-boun...@zeus.theochem.tuwien.ac.at> on behalf of Peter Blaha 
<pbl...@theochem.tuwien.ac.at>
Sent: Friday, November 11, 2016 6:34 AM
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] lapwso_mpi error

At first I would have guessed that you ran out of memory (you need more
cores for 300-atom cells). However, the error message points to the fact
that lapw1 already had a problem. Disk space ?? Or the scratch file
system was changed on your batch job ...

Try to repeat it with lapw1/lapwso in the same batch job.

On 11.11.2016 at 01:50, Md. Fhokrul Islam wrote:
> Hi Prof. Blaha and Wien2k users,
>
>
> I am trying to run a spin-orbit calculation for an impurity problem
> with a surface supercell containing 360 atoms. lapw1 worked fine but
> lapwso crashed with the following error message.
>
> Could you please let me know how to fix it.
>
>
>
> 1. case.dayfile:
>
>
>>   lapwso -up  -p -c   (00:53:15) running LAPWSO in parallel mode
> **  LAPWSO crashed!
> 1228.960u 24.221s 21:17.83 98.0%0+0k 7280+7712io 19pf+0w
> error: command
> /lunarc/nobackup/users/eishfh/SRC/Wien2k14.2-iomkl/lapwsopara -up -c
> lapwso.def   failed
>
>>   stop error
>
>
> 2. lapwso.error:
>
>
> **  Error in Parallel LAPWSO
> **  Error in Parallel LAPWSO
>
>
> 3. output error file:
>
>
> forrtl: severe (24): end-of-file during read, unit 9, file
> /local/slurmtmp.287632/3Mn.vectordn_1
> Image  PCRoutineLine
>  Source
> lapwso_mpi 004634E3  Unknown   Unknown  Unknown
> lapwso_mpi 0047F3C4  Unknown   Unknown  Unknown
> lapwso_mpi 0042BA1F  kptin_ 56  kptin.F
> lapwso_mpi 00431566  MAIN__523  lapwso.F
> lapwso_mpi 0040B3EE  Unknown   Unknown  Unknown
> libc.so.6  2B4243E6BB15  Unknown   Unknown  Unknown
> lapwso_mpi 0040B2E9  Unknown   Unknown  Unknown
>
>
>
> Thanks,
> Fhokrul
>
>

--
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/staff/tc_group_e.php
--

Re: [Wien] lapwso_mpi error

2016-11-10 Thread Peter Blaha
At first I would have guessed that you ran out of memory (you need more
cores for 300-atom cells). However, the error message points to the fact
that lapw1 already had a problem. Disk space ?? Or the scratch file
system was changed on your batch job ...

Try to repeat it with lapw1/lapwso in the same batch job.
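
A minimal sketch of such a combined run (the x-script flags are the ones that
already appear in the dayfiles above; everything else is a placeholder to
adapt to your scheduler):

   # regenerate the vectors and run lapwso within the same batch job,
   # so that both steps see the same SCRATCH directory
   export SCRATCH=$PWD             # or the node-local scratch, if it persists
   x lapw1 -up -p -c
   x lapw1 -dn -p -c
   x lapwso -up -p -c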

On 11.11.2016 at 01:50, Md. Fhokrul Islam wrote:

Hi Prof. Blaha and Wien2k users,


I am trying to run a spin-orbit calculation for an impurity problem with a
surface supercell containing 360 atoms. lapw1 worked fine but lapwso crashed
with the following error message.

Could you please let me know how to fix it.



1. case.dayfile:



  lapwso -up  -p -c   (00:53:15) running LAPWSO in parallel mode

**  LAPWSO crashed!
1228.960u 24.221s 21:17.83 98.0%0+0k 7280+7712io 19pf+0w
error: command
/lunarc/nobackup/users/eishfh/SRC/Wien2k14.2-iomkl/lapwsopara -up -c
lapwso.def   failed


  stop error



2. lapwso.error:


**  Error in Parallel LAPWSO
**  Error in Parallel LAPWSO


3. output error file:


forrtl: severe (24): end-of-file during read, unit 9, file
/local/slurmtmp.287632/3Mn.vectordn_1
Image  PCRoutineLine
 Source
lapwso_mpi 004634E3  Unknown   Unknown  Unknown
lapwso_mpi 0047F3C4  Unknown   Unknown  Unknown
lapwso_mpi 0042BA1F  kptin_ 56  kptin.F
lapwso_mpi 00431566  MAIN__523  lapwso.F
lapwso_mpi 0040B3EE  Unknown   Unknown  Unknown
libc.so.6  2B4243E6BB15  Unknown   Unknown  Unknown
lapwso_mpi 0040B2E9  Unknown   Unknown  Unknown



Thanks,
Fhokrul





--
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/staff/tc_group_e.php
--
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html