[Wien] band character plotting with eece
Dear Wien2k mailing list,

is this enough for a spaghetti plot with band character for onsite hybrids?

x lapw1 (-up/-dn) -orb -band
x lapw2 (-up/-dn) -qtl -band

Or do I also need the -eece -orb switches for lapw2?

Best regards
Pavel
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at: http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
Re: [Wien] Few questions about onsite hybrids and so
Dear prof. Blaha,

wow, thanks a lot for such a detailed answer. This was very helpful indeed.

Best regards
Pavel

"Hi,

Here are my comments. Most of them are similar to what Laurence said.

> I'm trying to calculate a band structure of Tb3Ga5O12 magneto-optical
> crystal (cubic Ia-3d, 80 atoms). While I consider myself quite
>
> Luckily I'm not shooting completely blind as I have some high-quality
> optical data where we can see some (very weak but also quite sharp and
> hence noticeable f-f transitions in the band gap so I have some idea
> how the Tb f states at least should look like). Significant optical
> absorption start around 4eV but below that I see some very weak
> electronic transitions in the 0.2-0.9eV range, around 2.5 and 3.5eV
> (reportedly between f states located in the band gap). So I expect at
> least three bunches of f states in band gap one occupied and the others
> unoccupied.

Unfortunately, I don't believe that these optical f-f transitions can be described by DFT. These are crystal-field split multiplet excitations, which are usually not accessible by DFT.

PS: Optical transitions create an electron-hole pair, and excitonic (correlation) effects can be very large. XPS creates a free electron and a hole, and although this is also not a ground state, it is usually better described by ground-state DFT.

From your chemical formula one expects Tb3+, i.e. a fully occupied spin-up 4f band and a single occupied 4f electron in spin-dn. Of course, PBE gives a metal and the 4f-dn states are pinned at EF. An orbital potential can split these states and single out a single 4f electron/atom. However, with orbital potentials one can in many cases obtain several different occupied orbitals, depending on the starting density matrix. In other words, your solution may not be the ground state, but a metastable state. Therefore I'd first do GGA+SO, and "hope" that this gives a somewhat larger occupancy of the "correct" 4f orbital.
When you then calculate the density matrix from this solution, you may end up in the lowest-energy orbitally ordered state. Eventually, you could also start from different density matrices, see which solutions you converge to, and compare total energies (these manipulations are simpler in DFT+U than in EECE).

RMTs: Since we cannot use HDLOs for orbital potentials, too large spheres are not good. However (in particular for 3d systems), small spheres mean that only 80-90% of the d-charge is inside the sphere and thus gets shifted by the orbital potential. Thus one needs a larger U (or alpha) to get similar results with smaller RMTs. For later 4f atoms, however, the 4f states are very localized (in Tb with RMT=2.0, 97% of the 4f charge is inside the sphere, see case.outputst). My personal choice would be RMT = 2.1 to 2.2.

Relaxation: Yes, you can safely relax the O atoms when SO is switched off for them and the heavy atoms are fixed in case.inM. If this is just a powder X-ray structure, the O positions could be quite wrong.

Most 4f systems would be antiferromagnets, but with a very low Neel temperature, which means that the energy difference between AFM and FM ordering is very small. These are local moments and they do not care too much how their neighbors are polarized.

> Regarding the f electron correction I opted for onsite hybrid and
> initialized it with init_orb_lapw -eece.
> UG says that its better to use LDA for the exchange potential so I
> copied case.in0 to case.in0eece_lapw where I replaced "XC_PBE" on the
> first line with "EX_PBE VX_LDA EC_PBE VC_PBE".

This is a misunderstanding. I'd use PBE in case.in0, since the Ga/O states should be much better described by PBE. However, for the double counting correction LDA is numerically preferred, and the UG says: "This is possible by copying case.in0 to case.in0eece_lda and specify VX_LDA". Note: it is case.in0eece_lda, not case.in0eece_lapw.

EECE vs DFT+U is a matter of taste.
EECE has one adjustable parameter, DFT+U has one or two (U and J). For 4f systems the "effective U" (J=0) is often not justified, since the intra-atomic J may be important. It may have quite some influence on the orbital magnetic moment. Anyway, both are approximations, and for a proper gap you may need mBJ+U (or mBJ+EECE) with a smaller U (alpha).

> The onsite hybrid calculation converged fine, I get a nice splitting of
> the f states (albeit a bit too much maybe).
> The other options would be +U obviously, I went for the hybrid because
> it felt more rigorous, but I would also appreciate comments if someone
> has maybe better experience with +U?
>
> Next step was to initialize spin-orbit interaction with init_so_lapw. I
> started with the default 001 but I want to also try other directions
> later and compare. I opted for no relativistic LOs (no support in
> optics) and enabled it only for Tb and Ga. symetso created a new
> structure (most notable I have more Tb inequivalent positions) and than
> I manually fixed case.inso case.indm
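
The file-naming point in the reply above (case.in0eece_lda, with VX_LDA on the first line) can be sketched as a small shell helper. This is only a minimal sketch under assumptions: make_in0eece_lda is a hypothetical function, not a WIEN2k tool, "case" stands for the real case name, and the mock case.in0 used for the demonstration contains only the "XC_PBE" keyword mentioned in the thread. The replacement string is the one quoted in the original question.

```shell
# Sketch: derive case.in0eece_lda from case.in0, switching only the
# exchange potential to LDA on the first line.
# make_in0eece_lda is a hypothetical helper, not part of WIEN2k.
make_in0eece_lda() {
    case_name="$1"
    # Edit line 1 only and write the result to a new file; case.in0 is untouched.
    sed '1s/XC_PBE/EX_PBE VX_LDA EC_PBE VC_PBE/' \
        "$case_name.in0" > "$case_name.in0eece_lda"
}

# Demonstration on a mock file (assumption: a real case.in0 has more content).
workdir="$(mktemp -d)"
printf 'TOT  XC_PBE\n(mock remainder of case.in0)\n' > "$workdir/case.in0"
make_in0eece_lda "$workdir/case"
head -n 1 "$workdir/case.in0eece_lda"
```

Writing to a separate file keeps PBE in case.in0 itself, as recommended above. As noted elsewhere in the thread, it is worth checking afterwards that the run script does not overwrite the new file.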
Re: [Wien] Few questions about onsite hybrids and so
On Mon, 2024-02-12 at 20:57 +0800, Laurence Marks wrote:
> With an RMT for Tb of 2.43 the O2p will leak into the Tb sphere. I
> used 2.02. You may want to use -ecut .995 or similar rather than a
> fixed energy.

Will try, thanks.

> If your Ga & Tb positions are fixed then I guess -so might work in
> MSR1a, I have never tried.
>
> N.B., I meant x-ray or neutron positions, the latter might be better
> for the O. In my opinion you should not use peaks in spectra or band
> gaps as these are excited state properties, and -eece is ground
> state. That said, optimizing the hybrid fraction for positions gave
> decent gaps for a few other cases as well. Never published as I have
> no explanation.

I fully agree that comparing a band structure to optical spectra (and optical band gaps) is tricky (unless one can also do BSE). On the other hand, I have some good experience with XPS valence band measurements. For example, I previously observed good agreement between the positions of some occupied defect states in the band gap as calculated with (full hybrid) DFT and as observed by valence band XPS.

Anyway, thanks again for all the suggestions. I'll also check whether I can get good enough O positions from XRD to compare with the relaxed positions as a function of the hybrid fraction...

Best regards
Pavel
Re: [Wien] Few questions about onsite hybrids and so
Dear prof. Marks,

thanks a lot for your comments, just some follow-up (I was not sure whether by "you can ask me offline" you meant private email, but hopefully this can also be interesting for the list):

On Mon, 2024-02-12 at 18:56 +0800, Laurence Marks wrote:
> Many comments/responses:
> a) You can do both forces and volume optimization with -eece, but not
> with -so.

Thanks for the clarification, this is very helpful. As I said, I was under the impression that I can relax the O positions with SO as long as I don't turn on SO for O ("UG 5.2.18 init_so_lapw: Since forces are not correct for atoms with SO, it can be very useful to suppress SO for light atoms (eg. the O-atoms in UO2 ), because then one can optimize the O-positions.")

> b) For 4f what you did with case.in0eece is right, but check that it
> does not get overwritten. I had to edit an overwrite out of my
> runeece.

Will check, thanks.

> c) Expect the addition of -so to change things quite a lot -- and
> very little! The nett change in the energy will be very small, and
> you may want to think about the spin-ordering temperatures. Is your
> compound ferromagnetic, antiferromagnetic or what?

Honestly, I have no idea. Right now I have a ferromagnetic ordering, but this is something I want to take a closer look at as well. It could be quite complex, and it is maybe even questionable whether, within the limitations of the collinear model, I can end up with results that are relevant for the room-temperature optical measurements my colleagues are doing...

> d) People will tell you to use +U which will put the 4f electrons
> really low. My recommendation is to ignore them. As you noted they
> are in the valence regime.

Noted.

> e) One way to fit the hybrid fraction is to get the best fit
> (approximately) to the x-ray positions. This turned out for me to be
> very reasonable.

Just to double check, by "X-ray positions" do you mean refined atomic positions from XRD, or the positions of the Tb states in the XPS valence band spectrum?
XPS is something I definitely have on my TODO list.

> f) Beware too large RMTs. If you have these for the metal atoms then
> you get the tails of the O 2p states within those RMTs and that can
> give you artifacts.

To be honest, I have no feeling for what counts as too large RMTs in this regard. I have 2.43 for Tb, 1.82 for Ga and 1.65 for O (these are almost touching spheres). How big a decrease would you recommend, 5-10%?

> If you have other questions you can ask me offline if you want. You
> may want to look at DOI: 10.1103/PhysRevMaterials.2.025001,
> 10.1016/j.ultramic.2018.12.005, 10.1103/PhysRevMaterials.5.125002,
> 10.1021/acs.inorgchem.2c04107 Note that the XPS is dominated
> (cross-sections) by the 4f, and in TbScO3 that are at the Fermi edge
> (if it is Tb3+, Tb4+ will be simpler).

This is a very exhaustive list, thanks. I will definitely read through it.

Best regards
Pavel

> On Mon, Feb 12, 2024 at 6:15 PM Pavel Ondračka wrote:
> > Dear Wien2k mailing list,
> >
> > I'm trying to calculate a band structure of Tb3Ga5O12 magneto-optical
> > crystal (cubic Ia-3d, 80 atoms). While I consider myself quite
> > experienced Wien2k user, I've always managed to stay away from f block
> > elements, so my experience here is none. So besides the few questions I
> > have I'll also try to somehow summarize what I did, please correct me
> > if something was not OK.
> >
> > Luckily I'm not shooting completely blind as I have some high-quality
> > optical data where we can see some (very weak but also quite sharp and
> > hence noticeable f-f transitions in the band gap so I have some idea
> > how the Tb f states at least should look like). Significant optical
> > absorption start around 4eV but below that I see some very weak
> > electronic transitions in the 0.2-0.9eV range, around 2.5 and 3.5eV
> > (reportedly between f states located in the band gap).
[Wien] Few questions about onsite hybrids and so
Dear Wien2k mailing list,

I'm trying to calculate the band structure of the Tb3Ga5O12 magneto-optical crystal (cubic Ia-3d, 80 atoms). While I consider myself a quite experienced Wien2k user, I've always managed to stay away from f-block elements, so my experience here is none. So besides the few questions I have, I'll also try to summarize what I did; please correct me if something was not OK.

Luckily I'm not shooting completely blind, as I have some high-quality optical data where we can see some (very weak but also quite sharp and hence noticeable) f-f transitions in the band gap, so I have some idea what the Tb f states should at least look like. Significant optical absorption starts around 4 eV, but below that I see some very weak electronic transitions in the 0.2-0.9 eV range and around 2.5 and 3.5 eV (reportedly between f states located in the band gap). So I expect at least three bunches of f states in the band gap, one occupied and the others unoccupied.

I've started with spin-polarized PBE. I'm reasonably sure the structure file is OK, albeit probably not much relaxed (but I was hoping I could find the equilibrium volume and do the relaxation at a later point). I did not opt for HDLOs even though the Tb sphere is quite big (2.43), since I would also like to try to get a few momentum matrix elements later with optics, but I've increased lmax to 14 and lvnsmax to 8 (lapw2 GMAX 16, fft factor 3 and 4x4x4 k-grid).

The initial runsp went fine, but the band structure is far from OK: I get only a single bunch of f states clumped together in the band gap (some of them are occupied, so it's metallic), but experimentally I should get an insulator (although the difference between the unoccupied and occupied f states in the band gap is only maybe 0.2 eV).

Regarding the f electron correction, I opted for the onsite hybrid and initialized it with init_orb_lapw -eece.
The UG says that it's better to use LDA for the exchange potential, so I copied case.in0 to case.in0eece_lapw, where I replaced "XC_PBE" on the first line with "EX_PBE VX_LDA EC_PBE VC_PBE". The onsite hybrid calculation converged fine and I get a nice splitting of the f states (albeit maybe a bit too much). The other option would obviously be +U. I went for the hybrid because it felt more rigorous, but I would also appreciate comments if someone has better experience with +U.

The next step was to initialize the spin-orbit interaction with init_so_lapw. I started with the default 001 direction, but I also want to try other directions later and compare. I opted for no relativistic LOs (no support in optics) and enabled SO only for Tb and Ga. symetso created a new structure (most notably, I have more inequivalent Tb positions), and then I manually fixed case.inso, case.indm and case.inorb, as the init_so script warned me. I also guessed I should fix case.ineece (that seemed straightforward), but then I thought I should also fix case.in2eece. Reading the UG gives the impression that case.in2eece is a normal case.in2 with an extra EECE on the first line and then the optional 3a and 3b lines. In the case.in2eece created automatically with init_orb_lapw -eece, the 3a and 3b lines looked like:

1
1 1 3

However, reading the UG, this actually seems wrong? The UG says (Section 7.9, page 166) that the format for the optional 3b line is just two values: jatom rho, l rho. So I wonder whether the UG is wrong or whether I'm actually applying the hybrid correction to p instead of f.

Also, is there anything else I should fix manually after initializing SO on top of EECE? Or should I do it the other way around (first SO and then EECE)?
The reasoning for doing EECE first was that I get a metal with plain PBE and an insulator with the onsite hybrid, so I thought it might be easier to converge if I start SO from an insulator (but I still use TEMP smearing, just to be sure I don't end up with convergence problems if I get a metal during the convergence, as the expected unoccupied-occupied f-f distance is very small).

I was also considering mBJ later, just to get some feeling for how the conduction band would shift, but I'm not sure whether this would work on top of EECE and SO.

One last question is regarding the forces. From reading the UG I understood that it should be OK to relax the oxygen positions with onsite hybrid and SO (as long as I don't have SO or EECE enabled for the O atoms). Is this correct? So will just switching to MSR1a and running a normal runsp -so -eece work, or are some other fixes needed?

Best regards
Pavel
Re: [Wien] Speeding up calculations in parallel mode
Dear Victor,

at 54 atoms, you should have enough k-points to run k-parallel, which is probably going to be the fastest option. So run lapw1 + lapw2 k-parallel and the rest OpenMP-parallelized. An example .machines file for your 8 cores could look like this:

1:localhost
1:localhost
1:localhost
1:localhost
1:localhost
1:localhost
1:localhost
1:localhost
omp_global:8
omp_lapw1:1
omp_lapw2:1

Additionally, double check that you are indeed using the correct libraries: go to your WIENROOT folder and run "ldd lapw1". You will see a list of the libraries it is linked against; make sure that "libopenblas.so" and "libmvec.so" are among them. Also do the same for lapw0 and double check that it is compiled with OpenMP (it links to libgomp and to an OpenMP-parallel openblas; I'm not familiar with Ubuntu, but it should be the library that comes with the libopenblas-openmp package).

That should hopefully do the trick. In general, you can also test the serial Wien2k benchmark. With a single thread it should run in around 20 s on your CPU, and it should scale quite reasonably up to maybe 4-6 threads with OMP_NUM_THREADS, so that is also something you can check.

Best regards
Pavel

On Tue, 2023-08-22 at 12:30 +0300, Victor Zenou wrote:
> Dear Wien2k users! I’m investigating a 54 tungsten atoms supercell,
> with 1 helium atom and 1 hydrogen atom (primitive cell) at different
> interstitial sites. It takes ~ 46 hr per calculation cycle, and
> half of it (~23 hr) in parallel mode. The Wien2k version 23.2 was
> installed on Ubuntu 22.04.2 LTS. using gfortran and I set
> OMP_NUM_THREADS to 1, and used 2 parallel_jobs in the current work.
> The computer is build from i7-10700 processor @ 2.90GHz (8 cores; 16
> Threads), 32 GB memory and 500 GB SSD.
> In the past using the same computer, it took me ~ 14 hr per cycle
> for the same calculations, meaning 2-4 times faster than today.
> The wien2k version was 21.1, but I can’t remember if the calculations
> were done in parallel, probably yes (I think the number of parallel
> jobs was chosen automatically), and I think I set OMP_NUM_THREADS
> to 4, but again I’m not sure.
> How can I speed up my calculations using the same computer?
> Best regards, Victor
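
The .machines layout suggested in the reply above can also be generated by a short script. This is a minimal sketch, assuming a single multi-core workstation: write_machines is a hypothetical helper (not a WIEN2k tool), and it writes to a temporary file so that nothing in a real case directory is overwritten. Only the line formats (1:localhost, omp_global, omp_lapw1, omp_lapw2) are taken from the message.

```shell
# Sketch: write a k-parallel .machines file for one multi-core machine.
# write_machines is a hypothetical helper, not part of WIEN2k.
write_machines() {
    ncores="$1"
    outfile="$2"
    : > "$outfile"                       # start from an empty file
    i=0
    while [ "$i" -lt "$ncores" ]; do
        echo "1:localhost" >> "$outfile" # one k-parallel job per core
        i=$((i + 1))
    done
    {
        echo "omp_global:$ncores"        # OpenMP threads for the other programs
        echo "omp_lapw1:1"               # lapw1 runs k-parallel, one thread per job
        echo "omp_lapw2:1"               # same for lapw2
    } >> "$outfile"
}

machines_file="$(mktemp)"
write_machines 8 "$machines_file"
cat "$machines_file"
```

After inspecting the result, the file would be copied to the case directory under the name .machines.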
Re: [Wien] FFTW compiling
Besides what Gavin mentioned, even if you get the correct fftw-openmpi (or elpa-mpi) package, Red Hat distros don't install MPI stuff into /usr/lib64 by default. The reason is that more than one MPI runtime can be installed (like MPICH vs OpenMPI) and the MPI libraries are MPI-runtime specific. Thus the MPI libraries for a specific MPI runtime tend to be in specific directories like /usr/lib64/openmpi/lib/, and this path will only get added to LD_LIBRARY_PATH if you load the corresponding MPI module. Additionally, the module files are also compiler and MPI specific, so they also reside in custom directories, like /usr/lib64/gfortran/modules/openmpi/

Now the main issue is that these custom locations are just not possible to set up correctly with siteconfig; the script just doesn't give you that much freedom, as this breaks some expectations it has. So manual editing of the Makefiles is needed. In general, one really needs to know what one is doing to link Wien2k against system libraries on Red Hat distros.

So yeah, I've been there and I don't recommend it. In that regard the download source/configure/make/make install combo is usually simpler. This only concerns MPI though; if you can live with k-point + OpenMP parallelization only, then linking against the system OpenBLAS and FFTW is quite simple.

Best regards
Pavel

On Sun, 2023-06-04 at 13:31 +, Ilias Miroslav, doc. RNDr., PhD. wrote:
> Hello,
>
> ad:
> https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg18799.html
>
> I have CentOS 7, but the elpa rpm packages do not contain the
> libfftw3_mpi.a file, only /usr/lib64/libfftw3_threads.a
>
> Can the libfftw3_threads.a file be instead of libfftw3_mpi.a ?
>
> Miro
>
>
> PS: List of all fftw library files:
> ls /usr/lib64/libfftw*
> /usr/lib64/libfftw.a /usr/lib64/libfftw3_omp.so.3.3.2*
> /usr/lib64/libfftw3f_omp.so.3@ /usr/lib64/libfftw3l_omp.so@
> /usr/lib64/libfftw.so@ /usr/lib64/libfftw3_threads.a
> /usr/lib64/libfftw3f_omp.so.3.3.2*
> /usr/lib64/libfftw3l_omp.so.3@
> /usr/lib64/libfftw.so.2@ /usr/lib64/libfftw3_threads.so@
> /usr/lib64/libfftw3f_threads.a
> /usr/lib64/libfftw3l_omp.so.3.3.2*
> /usr/lib64/libfftw.so.2.0.7* /usr/lib64/libfftw3_threads.so.3@
> /usr/lib64/libfftw3f_threads.so@
> /usr/lib64/libfftw3l_threads.a
> /usr/lib64/libfftw3.a /usr/lib64/libfftw3_threads.so.3.3.2*
> /usr/lib64/libfftw3f_threads.so.3@
> /usr/lib64/libfftw3l_threads.so@
> /usr/lib64/libfftw3.so@ /usr/lib64/libfftw3f.a
> /usr/lib64/libfftw3f_threads.so.3.3.2*
> /usr/lib64/libfftw3l_threads.so.3@
> /usr/lib64/libfftw3.so.3@ /usr/lib64/libfftw3f.so@
> /usr/lib64/libfftw3l.a
> /usr/lib64/libfftw3l_threads.so.3.3.2*
> /usr/lib64/libfftw3.so.3.3.2* /usr/lib64/libfftw3f.so.3@
> /usr/lib64/libfftw3l.so@
> /usr/lib64/libfftw_threads.a
> /usr/lib64/libfftw3_omp.a /usr/lib64/libfftw3f.so.3.3.2*
> /usr/lib64/libfftw3l.so.3@ /usr/lib64/libfftw_threads.so@
> /usr/lib64/libfftw3_omp.so@ /usr/lib64/libfftw3f_omp.a
> /usr/lib64/libfftw3l.so.3.3.2*
> /usr/lib64/libfftw_threads.so.2@
> /usr/lib64/libfftw3_omp.so.3@ /usr/lib64/libfftw3f_omp.so@
> /usr/lib64/libfftw3l_omp.a
> /usr/lib64/libfftw_threads.so.2.0.7*
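
Since the MPI-enabled FFTW library may live outside /usr/lib64, as described in the reply above, one way to check is to search the MPI-specific library directories directly. The following is only a sketch: find_fftw_mpi is a hypothetical helper, and the directory list in the last line is an assumption based on the /usr/lib64/openmpi/lib/ path mentioned in the message (the MPICH path is a guess at the analogous location).

```shell
# Sketch: look for libfftw3_mpi in a list of candidate library directories.
# find_fftw_mpi is a hypothetical helper; directories that do not exist are skipped.
find_fftw_mpi() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # Only look at files directly inside the directory.
        find "$dir" -maxdepth 1 -name 'libfftw3_mpi*' -print
    done
}

# Candidate locations (assumed); prints nothing if no MPI FFTW is installed there.
find_fftw_mpi /usr/lib64 /usr/lib64/openmpi/lib /usr/lib64/mpich/lib
```

An empty result for all directories would suggest that the MPI variant of FFTW is not installed at all, which matches the situation in the quoted listing, where only the _omp and _threads variants appear.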
Re: [Wien] compilation of Wien2k with GNU OpenMPI, but with MKL library ?
I believe it should be possible; at least the MKL Link Line Advisor
https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-link-line-advisor.html
definitely allows selecting the GNU compiler and OpenMPI. But yeah, it might be more painful than going either the fully Intel® oneAPI way or the GNU compilers + OpenBLAS + OpenMPI way.

Best regards
Pavel

On Mon, 2023-06-05 at 12:11 +0200, Peter Blaha wrote:
> As far as I know, you cannot mix libraries compiled with ifort or
> with GNU compilers. At least in previous times, the objects would
> have one or 2 "_" in their reference and it would not fit together.
> Maybe there are some options to fix this, but I do not know.
>
> My recommendations is therefore: choose either Intel or GNU
> compilers.
>
> For Intel you have to compile FFTW3 and ELPA yourself (see also the
> instructions in the UG, these are always only 3 commands and it is
> not so difficult) and can use the mkl for the rest.
>
> For GNU you can use the Openblas and the corresponding Linux packages
> (if they exist) or you compile yourself with GNU. I don't know (but
> doubt) if you can link the mkl-blas,... with GNU, but you don't need
> mkl, because openblas is (almost) as good as mkl and "GNU-scalapack"
> comes with Linux.
>
> When using Intel, you can use either Intelmpi or Openmpi, but the
> name of the mkl blacs-library is different for the 2 mpi versions.
>
>
> On 05.06.2023 at 10:45, Ilias Miroslav, doc. RNDr., PhD. wrote:
> >
> > Ad:
> > https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg22466.html
> >
> > Dear Professor Blaha,
> >
> > thanks for your answer. So to get Wien2k compiled with intel
> > compilers, one needs FFTW3 and ELPA compiled with Intel compilers.
> >
> > Now the question is : if I use OpenMPI with FFTW3 and ELPA
> > libraries compiled with GNU compilers, will the MKL libraries -
> > blas,lapack, plus scalapack and blacs work, right ?
> >
> > Best, Miro
Re: [Wien] Wien2k access outside local area network
Hello Chithra,

it all depends on your network settings; in theory all you need is a public IP and open ports. But this forum is not the right place to ask. Check with your local area network admin (who will surely also talk you through the potential security risks).

Best regards
Pavel

On Fri, 2023-04-14 at 15:00 +0530, Chithra M Mathew wrote:
> Sir
> I have installed Wien2k software on my local area network Desktop.
> Can I access my Wien2k software from outside Local area network. Is
> it possible. please help me.
Re: [Wien] benchmark test with i9-12900k
Hi Sandeep,

> I have a query regarding this.
> While performing serial or parallel calculations, on increasing omp
> from 1 to 8 , %age use of cpu's does not increase in the same scale
> (omp=2, 170to 180% , omp=4 ,300 to 330% omp=8 only 500 to 550%).
> is something wrong in configuring or compiling the softwares or due
> to some limitations in hardware.
> Any suggestions?

There are several factors. One is the threading support in the BLAS/LAPACK libraries, and another is the deficiencies of the Wien2k OpenMP parallelization. Hardware also comes into play, mostly in the general sense that the lower your memory bandwidth, the earlier you will see the speedup flatten with more threads.

If you look at the lapw1 output, you can see how the total time is mostly divided into three parts, for example:

TIME HAMILT (CPU) = 2.8, HNS = 2.9, HORB = 0.0, DIAG =17.3, SYNC = 0.0
TIME HAMILT (WALL) = 0.7, HNS = 0.8, HORB = 0.0, DIAG = 4.7, SYNC = 0.0

The scaling of the DIAG part mostly depends on how your libraries scale (MKL does quite OK, but don't expect miracles).

The HAMILT scaling depends on the explicit Wien2k parallelization. That one also doesn't scale too well past 4-6 cores. The reason is that I was mostly learning OpenMP when I wrote it, and I just went for the simplest "omp parallel for" solution, probably at too high a level (also because ifort's support for newer OpenMP versions with more advanced constructs was not so good at that time). I think there could still be some speedup if this were rewritten so that the parallelization happens at a different level, maybe more like the way it is parallelized with MPI, so that it fits better in the caches and could thus cope better with the memory bandwidth limits when scaling to more cores.

HNS has no explicit threading at all, and IIRC the library-level threading didn't help much for the BLAS/LAPACK calls there.
This could also be improved by rewriting it to be more parallelization-friendly (possibly again mirroring how the MPI version does it, which scales fine IIRC), but I'm not an algebra expert, so I haven't even tried.

So yeah, there is no easy way to improve this, unless you know a bit about OpenMP and want to try it yourself (BTW prof. Blaha was always very welcoming to contributions even though I'm not part of the Wien2k team :-) ).

Best regards
Pavel
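
The two TIME lines quoted in the reply above can be turned into a rough per-part estimate of how many cores were kept busy, by dividing the CPU time by the wall time. This is only a sketch under assumptions: busy_cores is a hypothetical helper, the field positions are taken from the two example lines in the message, and a wall time of zero is not handled.

```shell
# Sketch: divide CPU time by wall time for HAMILT, HNS and DIAG.
# busy_cores is a hypothetical helper; it reads the two TIME lines on stdin.
busy_cores() {
    # Turn "=" and "," into spaces so every label and number is its own field.
    sed 's/[=,]/ /g' | awk '
        $3 == "(CPU)"  { c_ham = $4; c_hns = $6; c_diag = $10 }
        $3 == "(WALL)" {
            printf "HAMILT %.1f\n", c_ham  / $4
            printf "HNS %.1f\n",    c_hns  / $6
            printf "DIAG %.1f\n",   c_diag / $10
        }'
}

# The example timing lines from the message above.
timing_example='TIME HAMILT (CPU) = 2.8, HNS = 2.9, HORB = 0.0, DIAG =17.3, SYNC = 0.0
TIME HAMILT (WALL) = 0.7, HNS = 0.8, HORB = 0.0, DIAG = 4.7, SYNC = 0.0'
result="$(printf '%s\n' "$timing_example" | busy_cores)"
echo "$result"
```

On a real run the input would come from the lapw1 output instead, for example via grep 'TIME HAMILT' on the output file. Note that CPU/WALL measures how many threads were busy on average, which is not the same as the speedup relative to a single-threaded run.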
Re: [Wien] How to find the exact value of infinte epsilon?
Dear Atefe Marasi,

to add on top of what Xavier said: like him, I'm assuming that by "infinite epsilon, i.e., dielectric constant at high frequency" you mean the electronic part of the dielectric tensor, in other words the part coming from electronic excitations. That you can get with the optic, joint and kram commands.

Regarding the band gap, if you have an insulator, the mBJ potential should be a reasonable starting choice. The band gap values are OK, and even though the band dispersion and thus the momentum matrix elements are not so good, it does not matter that much, as the standard optic calculation neglects excitons anyway. Moreover, even if you don't get the proper shape of the imaginary part of the dielectric function, in my experience this does not matter that much if you are interested in the real part of the dielectric function at E=0 (or at least reasonably far below the band gap). I have used mBJ+optic quite a lot for refractive index calculations in the visible range for materials with band gaps as low as 3 eV, and even there it worked quite OK.

What is very important is to increase the maximum energy (emax) in lapw1, optic and joint, to be sure you include enough states to properly account for the high-energy processes. Otherwise you will get some underestimation of the real part of the dielectric function at low energies.

Best regards
Pavel

On Mon, 2021-11-29 at 11:46 +0100, xavier rocquefelte wrote:
> Dear Atefe Marasi,
> Infinite epsilon means that you extrapolate the epsilon value for the
> zero of energy.
> You must plot the real part of the dielectric function to properly
> estimate this value.
> You must be careful because if you have a band gap and a bad
> description of the gap value ... the estimation will be not good at
> all.
> Regards
> Xavier
>
> On 29/11/2021 11:38, Atefe Marasi wrote:
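
The remark above that a too small emax leads to an underestimated real part at low energies can be made explicit with the Kramers-Kronig relation, which links the real part of the dielectric function to an integral over the imaginary part. The notation below (ε1, ε2, E_max) is introduced here for illustration and is not taken from the thread; it is a sketch of the textbook relation, not of the exact numerical procedure used by kram.

```latex
% Kramers-Kronig relation (principal-value integral):
\[
  \varepsilon_1(\omega) = 1 + \frac{2}{\pi}\,\mathcal{P}\!\int_0^{\infty}
    \frac{\omega'\,\varepsilon_2(\omega')}{\omega'^2 - \omega^2}\,d\omega'
\]
% Static (electronic) limit, i.e. the value extrapolated to zero energy:
\[
  \varepsilon_\infty \approx \varepsilon_1(0)
    = 1 + \frac{2}{\pi}\int_0^{\infty}
      \frac{\varepsilon_2(\omega')}{\omega'}\,d\omega'
\]
% If transitions are only included up to a cut-off E_max, the integral is
% truncated. Because \varepsilon_2 \ge 0, the neglected tail is non-negative:
\[
  1 + \frac{2}{\pi}\int_0^{E_{\max}}
      \frac{\varepsilon_2(\omega')}{\omega'}\,d\omega'
  \;\le\; \varepsilon_1(0)
\]
```

The 1/ω' weight means that high-energy transitions contribute less than those near the gap, but they never contribute negatively, so the truncated value approaches the converged one from below as E_max is increased.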
Re: [Wien] Compiling Wien2k 21.1 on Ubuntu 20.04 with gfortran
Hi Gavin,

I think steps 1-5 (more or less the first 6 pages) can be simplified to something like "sudo apt install libxc-dev openblas-openmp-dev libscalapack-openmpi-dev openmpi-common libopenmpi-dev libfftw3-dev libfftw3-mpi-dev ...". (I just googled the package names as I'm not an Ubuntu user, but I think you get the point.) IMO there is no point in compiling libraries on your own, especially if you don't use any extra flags or optimizations. I think everything relevant is in the Ubuntu repos (maybe except ELPA?).

I've been linking with system libraries (even the MPI ones like scalapack and ELPA) in Fedora for ages, with no issues and no noticeable performance loss. Everything important for performance (like openblas, fftw or elpa) has optimized kernels for multiple architectures and proper runtime selection anyway.

Best regards
Pavel

On Fri, 2021-11-26 at 01:44 -0700, Gavin Abo wrote:
> I'm using Ubuntu 20.04 LTS also but with a patched WIEN2k 21.1 that
> was compiled with gfortran and OpenBLAS. The WIEN2k 21.1 bug fixes
> (patches) I got from the past posts in the mailing list. A list of
> the url links to those posts are in the README file at [1].
> I also recently encountered a SIGSEGV segmentation fault (core
> dumped) runtime error when running lapw1 even though OpenBLAS 0.3.18
> compiled successfully. I try to use the latest stable release of
> OpenBLAS, which is currently 0.3.18 [2]. However, in my case: My
> system has an AMD processor that targets Barcelona, and as it turns
> out, I found an OpenBLAS issue report at [3]. There it describes how
> OpenBLAS 0.3.15 works but OpenBLAS 0.3.16, 0.3.17, and 0.3.18 crashes
> for a processor that has a Barcelona target where the fix won't be
> available until a future 0.3.19 release. As a workaround until
> 0.3.19 becomes available, I found that I could use the current
> OpenBLAS development version (0.3.18.dev) to have the fix.
> I have not tried the compile settings at [4].
I'm using just a > 'basic' set of compile settings for being able to do serial, k-point > parallel, or mpi parallel with WIEN2k. By 'basic', I mean I using > non-optimized flags as I haven't went through the GNU documentation > [5] to optimize all the flags for my specific processor. > Should you be interested in the details on how I installed WIEN2k > 21.1 for my system. I have made it available at [6], which you will > probably find to be very similar to the older install described at > [7]. > [1] https://github.com/gsabo/WIEN2k-Patches/tree/master/21.1 > [2] https://github.com/xianyi/OpenBLAS/releases > [3] https://github.com/xianyi/OpenBLAS/issues/3421 > [4]https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg21482.html > [5] https://gcc.gnu.org/wiki/GFortran > [6]https://github.com/gsabo/WIEN2k-Docs/blob/main/WIEN2k21.1_Install_with_gfortran.pdf > [7]https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg21134.html > > Kind Regards, > Gavin > WIEN2k user > > On 11/24/2021 3:09 AM, David Holec wrote: > > > > > Hi Pavel, > > > > Many thanks for your insights. As you know, I am not an expert on > > how to compile codes, for me, this is sadly a trial and error > > adventure. > > > > I tried to compile it against the openblas library, but although > > the compilation ends without any errors, I get a segmentation fault > > when running lapw1 (on the test case > > from http://www.wien2k.at/reg_user/benchmark/). The current setting > > are: > > > > L Linker Flags: $(FOPT) -L/usr/lib/x86_64-linux- > > gnu/openblas64-openmp > > R R_LIBS (LAPACK+BLAS): /usr/lib/x86_64-linux- > > gnu/openblas64-openmp/libopenblas64.so.0 -lpthread > > > > (The rest is as I wrote in my first email.) 
Here is the list of > > linked libraries: > > $ ldd lapw1 > > linux-vdso.so.1 (0x7ffea57d6000) > > libopenblas64.so.0 => /lib/x86_64-linux- > > gnu/libopenblas64.so.0 (0x14fe2b2e5000) > > libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 > > (0x14fe2b2c2000) > > libgfortran.so.5 => /lib/x86_64-linux-gnu/libgfortran.so.5 > > (0x14fe2affa000) > > libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 > > (0x14fe2aeab000) > > libmvec.so.1 => /lib/x86_64-linux-gnu/libmvec.so.1 > > (0x14fe2ae7f000) > > libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 > > (0x14fe2ae3d000) > > libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 > > (0x14fe2ac49000) > > /lib64/ld-linux-x86-64.so.2 (0x14fe2d4d3000) > > libquadmath.so.0 => /lib/x86_64-linux-gnu/libquadmath.so.0 > > (0x14fe2abff000) > > libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 > > (0x14fe2abe4000) > > libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 > > (0x14fe2abde000) > > > > And here is the stacking fault (it doesn't tell me anything): > > $x lapw1 > > > > Program received signal SIGSEGV: Segmentation fault - invalid > > memory reference. > > > >
Re: [Wien] Compiling Wien2k 21.1 on Ubuntu 20.04 with gfortran
Hi David,

this is as good as it gets, I guess. HAMILT was sped up by ~30%, but this is not so noticeable, because HAMILT scales much better with the number of threads than the rest (so at 4 threads it is already running much faster than the other parts, and further improvements are less visible). The speedup is more relevant in the k-point parallel scenarios.

Anyway, best of luck
Pavel

On Thu, 2021-11-25 at 09:58 +0100, David Holec wrote:
> Hi Pavel,
>
> I have now added -DHAVE_LIBMVEC to the compiler options as you suggested (and removed it from the preprocessor flags).
>
>     O   Compiler options:        -ffree-form -O2 -ftree-vectorize -march=native -ffree-line-length-none -ffpe-summary=none -DHAVE_LIBMVEC
>     P   Preprocessor flags       -DParallel
>
> Here are the results of the test case:
>
> $ x lapw1
> STOP LAPW1 END
> 109.094u 1.126s 0:29.41 374.7% 0+0k 0+37864io 0pf+0w
> $ grep HORB *output1*
> test_case.output1: TIME HAMILT (CPU) = 11.9, HNS = 15.2, HORB = 0.0, DIAG = 82.4, SYNC = 0.0
> test_case.output1: TIME HAMILT (WALL) = 3.1, HNS = 4.4, HORB = 0.0, DIAG = 21.2, SYNC = 0.0
>
> David
>
> ---
> Dr David Holec
> Computational Materials Science group
> Department of Materials Science
> Montanuniversität Leoben
>
> Franz-Josef-Strasse 18, A-8700 Leoben, Austria
> tel. +43-(0)3842-4024211
> fax. +43-(0)3842-4024202
> materials.unileoben.ac.at
> cms.unileoben.ac.at
>
> WHERE RESEARCH MEETS FUTURE
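
For comparing such timing lines between builds, a small Python sketch may help. It only assumes the `TIME HAMILT (CPU|WALL) = ..., HNS = ..., DIAG = ...` layout quoted in this thread; the function names `parse_timing_line` and `relative_speedup` are made up for illustration and are not part of WIEN2k.

```python
import re

# One "NAME (CPU|WALL) = value" or "NAME = value" field of the timing line.
_FIELD = re.compile(r"([A-Z]+)\s*(?:\((CPU|WALL)\))?\s*=\s*([0-9.]+)")


def parse_timing_line(line):
    """Return (kind, {name: seconds}) for one 'TIME HAMILT ...' line."""
    fields = _FIELD.findall(line)
    if not fields or fields[0][0] != "HAMILT":
        raise ValueError("not a TIME HAMILT line")
    kind = fields[0][1]  # "CPU" or "WALL"
    return kind, {name: float(value) for name, _, value in fields}


def relative_speedup(before, after):
    """Fraction of the original time that was saved (0.3 means 30% faster)."""
    return 1.0 - after / before


if __name__ == "__main__":
    line = ("test_case.output1: TIME HAMILT (WALL) = 3.1, HNS = 4.4, "
            "HORB = 0.0, DIAG = 21.2, SYNC = 0.0")
    kind, times = parse_timing_line(line)
    print(kind, times)
    # 4.6 s was the HAMILT wall time reported before -DHAVE_LIBMVEC was added.
    print("HAMILT saved:", round(relative_speedup(4.6, times["HAMILT"]), 2))
```

Comparing the HAMILT and DIAG entries separately keeps compiler-flag effects apart from BLAS/LAPACK effects, which is the distinction made elsewhere in this thread.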
Re: [Wien] Compiling Wien2k 21.1 on Ubuntu 20.04 with gfortran
On Wed, 2021-11-24 at 16:51 +0100, Peter Blaha wrote:
> Just for information: the -DHAVE_LIBMVEC is a preprocessor option (like -DINTEL_VML for ifort) and will speedup the HAMILT part due to a vectorization of cosine/sine functions.

Sorry for not being specific enough: -DHAVE_LIBMVEC should go into the Wien2k compiler options, i.e. add it to the "-ffree-form -O2 -ftree-vectorize -march=native -ffree-line-length-none -ffpe-summary=none" line. If you add it together with -DParallel (as I now see in the other email), it will be available only for the MPI builds.

> As far as I remember, it is available only with more recent gfortran/openblas versions, therefore not yet a "default" gfortran option.

This is actually a glibc feature (an alternative to Intel VML), introduced with glibc 2.22, released in mid 2015. It is not on by default because at the time I wrote that code not all distros had it. Nowadays there are still enterprise distros like old RHEL, CentOS or similar that use an older glibc (however, this is mostly HPC, where one would compile with ifort anyway). All supported desktop distros like Fedora, Ubuntu, openSUSE, etc. have it now, so it should be safe to add it to the gfortran/OpenBLAS flags by default in the next release.

Best regards
Pavel

> > Hi Pavel,
> >
> > I don't think that that compiler flag has been used:
> > $ find . -name "Makefile" -exec grep "DHAVE_LIBMVEC" {} \;
> > yields nothing in my Wien2k source directory.
> >
> > David
> >
> > On 24.11.2021 at 14:43, Pavel Ondračka wrote:
> > > Dear David,
> > >
> > > nice, ~30 seconds instead of ~150 :-)
> > > BTW is this already with "-DHAVE_LIBMVEC" in compiler options?
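
To see whether a given machine meets the glibc 2.22 requirement mentioned above before adding -DHAVE_LIBMVEC, something like the following Python sketch could be used. It relies only on the standard-library call `platform.libc_ver()`; the helper name `glibc_has_libmvec` is hypothetical, and the check says nothing about whether the compiler actually vectorizes the calls.

```python
import platform


def glibc_has_libmvec(version):
    """True if a glibc version string such as '2.31' is at least 2.22."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor) >= (2, 22)


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print("glibc", version, "new enough for libmvec:", glibc_has_libmvec(version))
    else:
        # Detection can fail, e.g. on non-glibc systems.
        print("could not detect glibc; check 'ldd --version' manually")
```

On a system that passes the check, `ldd lapw1` should also list libmvec.so.1 after a rebuild, as it does in the ldd outputs posted in this thread.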
Re: [Wien] Compiling Wien2k 21.1 on Ubuntu 20.04 with gfortran
Dear David,

nice, ~30 seconds instead of ~150 :-) BTW is this already with "-DHAVE_LIBMVEC" in compiler options?

For your real workflow you might also experiment with the number of threads vs. the number of k-points run in parallel (it seems you are already running at 4 threads for the test_case, which has only 1 k-point). For small cases with lots of k-points I would expect k-point parallelization to be the best.

Now that you link with the OpenMP-threaded OpenBLAS, it is important that the number of k-points run in parallel times the number of threads allowed for lapw1/lapw2 does not exceed the total number of cores. It seems your environment already has OMP_NUM_THREADS set to 4 (at least), so you need to set the OpenMP threading explicitly for Wien2k. Specifically, try this .machines file with run_lapw -p for your standard use case (k-point parallel in lapw1/lapw2 + OpenMP elsewhere, assuming the Intel(R) Xeon(R) CPU W3550 has 4 physical cores):

    1:localhost
    1:localhost
    1:localhost
    1:localhost
    omp_global:4
    omp_lapw1:1
    omp_lapw2:1

or alternatively (two k-points and two threads in lapw1 and lapw2):

    1:localhost
    1:localhost
    omp_global:4
    omp_lapw1:2
    omp_lapw2:2

Best regards
Pavel

On Wed, 2021-11-24 at 13:55 +0100, David Holec wrote:
> Dear Pavel,
>
> Many thanks again for your patience and guidance. With the libopenblas-openmp-dev package it seems to work well!
>
> $ ldd lapw1
>     linux-vdso.so.1 (0x7ffca83d8000)
>     libopenblas.so.0 => /lib/x86_64-linux-gnu/libopenblas.so.0 (0x14563f924000)
>     libgfortran.so.5 => /lib/x86_64-linux-gnu/libgfortran.so.5 (0x14563f65c000)
>     libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x14563f50d000)
>     libmvec.so.1 => /lib/x86_64-linux-gnu/libmvec.so.1 (0x14563f4e1000)
>     libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x14563f49f000)
>     libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x14563f47c000)
>     libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x14563f288000)
>     /lib64/ld-linux-x86-64.so.2 (0x145641b2d000)
>     libquadmath.so.0 => /lib/x86_64-linux-gnu/libquadmath.so.0 (0x14563f23e000)
>     libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x14563f223000)
>     libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x14563f21d000)
>
> and these options in siteconfig:
>     L   Linker Flags:            $(FOPT) -L/usr/lib/x86_64-linux-gnu/openblas-openmp
>     R   R_LIBS (LAPACK+BLAS):    -lopenblas
>
> I also get better timings now (though the TIME HAMILT are slightly longer, but overall improvement):
> $ ../x lapw1
> STOP LAPW1 END
> 119.400u 1.937s 0:32.53 372.9% 0+0k 0+37864io 0pf+0w
> $ grep HORB *output1*
> test_case.output1: TIME HAMILT (CPU) = 17.3, HNS = 18.4, HORB = 0.0, DIAG = 85.0, SYNC = 0.0
> test_case.output1: TIME HAMILT (WALL) = 4.6, HNS = 5.2, HORB = 0.0, DIAG = 22.0, SYNC = 0.0
>
> Thanks for your help,
> David
>
> On Wed, 24 Nov 2021 at 12:39, Pavel Ondračka wrote:
> > Hi David,
> >
> > well, it is hard to say without the debug info why OpenBLAS crashes. My guess is that you link with the 64-bit interface; try to install the standard one (openblas-openmp-devel) and replace openblas64-openmp with openblas-openmp everywhere in your config. Also remove the -lpthread (just to be safe, in theory it should not matter); it is not needed with OpenMP. If it still crashes, please recompile with debug info enabled (add -g to compiler options) and send me the x lapw1 output via PM.
> >
> > BTW my response was mostly motivated by me suspecting you actually link against slow netlib BLAS (which turned out to be the case) and I wanted to warn others in case someone in the future would be using your settings as a reference :-)
> >
> > Best regards
> > Pavel
> >
> > On Wed, 2021-11-24 at 11:09 +0100, David Holec wrote:
> > > Hi Pavel,
> > >
> > > Many thanks for your insights. As you know, I am not an expert on how to compile codes, for
Re: [Wien] Compiling Wien2k 21.1 on Ubuntu 20.04 with gfortran
> $ lscpu
> Architecture:        x86_64
> CPU op-mode(s):      32-bit, 64-bit
> Byte Order:          Little Endian
> Vendor ID:           GenuineIntel
> CPU family:          6
> Model:               26
> Model name:          Intel(R) Xeon(R) CPU W3550 @ 3.07GHz
>
> ---
> Dr David Holec
Re: [Wien] Compiling Wien2k 21.1 on Ubuntu 20.04 with gfortran
Hi David,

as you said it works for you, so feel free to ignore this, but I have some further tips if you are interested. Ubuntu switches between the different BLAS and LAPACK implementations using the "alternatives" mechanism, so it's difficult to say whether you actually link with the correct one.

"ldd lapw1" in WIENROOT should show which one is actually linked. What you want to have is the OpenMP OpenBLAS

    /usr/lib/x86_64-linux-gnu/openblas-openmp/libblas.so
    /usr/lib/x86_64-linux-gnu/openblas-openmp/liblapack.so

or alternatively

    /usr/lib/x86_64-linux-gnu/openblas-openmp/libopenblas.so

It looks like you linked with the pthread one. This is not a problem when running with a single thread, but at higher thread counts it might lead to oversubscription and slowdowns, as the pthreaded OpenBLAS doesn't respect the OMP_NUM_THREADS set by Wien2k. So I would recommend relinking with the OpenMP OpenBLAS. BTW it is usually safer to link with OpenBLAS explicitly using -lopenblas instead of -llapack -lblas, to be sure you don't accidentally link the netlib one (libopenblas is just the libblas and liblapack provided by OpenBLAS merged together).

In general, an easy way to check performance is to run the serial test_case from http://www.wien2k.at/reg_user/benchmark/. On modern CPUs (at least AVX2) the runtime should be around 15-25 seconds with a single thread.

I see a total runtime of ~18 seconds on Fedora 35 with gfortran 11.2.1, OpenBLAS and an AMD Ryzen 9 3900X 12-Core Processor. Also look for the following line in test_case.output1; this is what I have:

    TIME HAMILT (WALL) = 2.2, HNS = 1.7, HORB = 0.0, DIAG = 14.0, SYNC = 0.0

The time in HAMILT mostly depends on your compiler and vectorization settings, while DIAG is 99% LAPACK/BLAS related, so this can help with the diagnostics if things are slow.

You might also get an extra speedup of the HAMILT part by adding "-DHAVE_LIBMVEC" to the compiler options.

Best regards
Pavel

On Tue, 2021-11-23 at 11:07 +0100, David Holec wrote:
> Dear all,
>
> I have just spent some time making Wien2k run on my single machine running Ubuntu 20.04 with gfortran/gcc. Since I am not an expert, it was a trial and error, but it seems that I found a working combination (sadly, the default parameters didn't work for me). Maybe this will help someone. Here are the settings that did the job for me:
>
>     M   OpenMP switch:           -fopenmp
>     O   Compiler options:        -ffree-form -O2 -ftree-vectorize -march=native -ffree-line-length-none -ffpe-summary=none
>     L   Linker Flags:            $(FOPT) -L/usr/lib/x86_64-linux-gnu
>     P   Preprocessor flags       '-DParallel'
>     R   R_LIBS (LAPACK+BLAS):    -lblas -llapack -lpthread
>     F   FFTW options:            -DFFTW3 -DFFTW_OMP -I/usr/include
>         FFTW-LIBS:               -L/usr/lib/x86_64-linux-gnu -lfftw3 -lfftw3_omp
>
> where the FFTW options were:
>
>     R   FFTWROOT:        /usr/
>     V   FFTW_VERSION:    FFTW3
>     L   FFTW_LIB:        lib/x86_64-linux-gnu
>     N   FFTW_LIBNAME:    fftw3
>
> Compiler versions:
> $ gcc --version
> gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> $ gfortran --version
> GNU Fortran (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
>
> And I used the generic lapack and openblas packages provided by the Ubuntu repos:
> liblapack-dev/focal,now 3.9.0-1build1 amd64 [installed]
> liblapack3/focal,now 3.9.0-1build1 amd64 [installed,automatic]
> liblapack64-3/focal,now 3.9.0-1build1 amd64 [installed,automatic]
> liblapack64-dev/focal,now 3.9.0-1build1 amd64 [installed]
>
> libblas-dev/focal,now 3.9.0-1build1 amd64 [installed]
> libblas3/focal,now 3.9.0-1build1 amd64 [installed,automatic]
> libblas64-3/focal,now 3.9.0-1build1 amd64 [installed,automatic]
> libblas64-dev/focal,now 3.9.0-1build1 amd64 [installed,automatic]
>
> libopenblas64-0/focal-updates,now 0.3.8+ds-1ubuntu0.20.04.1 amd64 [installed]
> libopenblas64-0-openmp/focal-updates,now 0.3.8+ds-1ubuntu0.20.04.1 amd64 [installed]
> libopenblas64-0-pthread/focal-updates,now 0.3.8+ds-1ubuntu0.20.04.1 amd64 [installed,automatic]
>
> (I am not totally sure if I need all the libraries above, but certainly, with these, the compilation seems to work and I am able to run SCF cycles & Telnes calculations without errors :-)
>
> All the best,
> David
Re: [Wien] generalized regular k-point grids
Hi,

> Yes, WIEN2k is ready to accept any k-grid when using smearing methods, however, you need to supply the proper weights.

Great :-)

> I do not know what symmetry your cell has, but I can see only weights of 1, 2 and 4 ? Is this correct ?
> It means you have only 4 symmetry operations for this cell ??

Well, the posted k-point list was just the first few lines. The maximum weight in my case is actually 8, and that is consistent with the 8 symmetry operations I have (orthorhombic Cmcm).

> PS: Have you tried TEMP instead of TEMPS ? As far as I understand, TEMP corrects towards zero Kelvin and should be compatible with TETRA. And of course with a very small smearing parameter, TEMPS should go towards TEMP --> TETRA (if the k-mesh is good enough).

I did not, but I was consistent, so I believe it should not matter?

> PPS: Be aware of different coordinate systems for different lattices ! VASP and WIEN2k may eventually ?? specify the coordinates in different coordinates (cartesian vs. fractions of (non-orthogonal) rec. lattice vectors.

I'll check this, thanks for the pointer. It looks like this might be the case and I might need some rescaling. Looking at the Userguide: "We use cartesian coordinates in units of 2π/a, 2π/b, 2π/c for P, C, F and B cubic, tetragonal and orthorhombic lattices, but internal coordinates for H and monoclinic/triclinic lattices". I'll check the VASP KPOINTS format...

Best regards
Pavel

> Peter Blaha
[Wien] generalized regular k-point grids
Dear Wien2k mailing list,

Is Wien2k ready for a general k-point grid or is some part of the code assuming regular grid?

I was reading some papers about how the generalized regular k-point grids have better efficiency over the standard Monkhorst-Pack ones... For example this paper has also an implementation: https://msg.byu.edu/docs/papers/autoGR.pdf

It generates a k-point list in VASP KPOINTS format:

    0. 0. 0. 1
    -0.1667 0.1667 0. 2
    -0. 0. 0. 2
    0.5000 -0.5000 0. 1
    0.0625 -0.0208 0.035714285714 4
    -0.10416667 0.1458 0.035714285714 4
    -0.2708 0.3125 0.035714285714 4
    .

I just do the stupid thing and convert it to the .klist format by multiplying with 1e9 and applying the proper formatting, i.e.:

    1 0 0 010 1.0
    2-1 1 010 2.0
    3-3 3 010 2.0
    4 5-5 010 1.0
    5 6250 -2083 3571428510 4.0
    6-10416 14583 3571428510 4.0
    7-27083 31250 3571428510 4.0
    .

Now everything seems to run OK at the first glance (lapw2 crashes with TETRA ofc, but TEMPS seems to be OK), but the energies are not so close (I would expect that at a very large number of k-points it should give the same results as the standard Wien2k MP grid); there is a difference of maybe 2 mRy/atom. So I guess there is still something somewhere missing for this to work?

Best regards
Pavel
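
As an alternative to multiplying decimal fractions by 1e9, the integer triple and a common divisor can be recovered as exact rationals. The Python sketch below is only an illustration under assumptions: the function name `to_integer_kpoint` and the `max_denominator` bound are made up, it does not write the fixed-column case.klist layout (see the user's guide for that), and it does not address the coordinate-system question (Cartesian vs. internal coordinates) raised in the reply.

```python
from fractions import Fraction
from math import gcd


def to_integer_kpoint(frac_coords, max_denominator=1000):
    """Return (ix, iy, iz, idiv) such that k = (ix, iy, iz) / idiv.

    Each coordinate is snapped to the nearest rational whose denominator
    does not exceed max_denominator, then all three are put over their
    least common denominator.
    """
    fracs = [Fraction(c).limit_denominator(max_denominator) for c in frac_coords]
    idiv = 1
    for f in fracs:
        idiv = idiv * f.denominator // gcd(idiv, f.denominator)  # running lcm
    return tuple(int(f * idiv) for f in fracs) + (idiv,)


if __name__ == "__main__":
    # Example values in the style of a KPOINTS line: 1/16, -1/48, 1/28
    print(to_integer_kpoint((0.0625, -0.0208333, 0.035714285714)))
```

With a per-point divisor like this, decimals that were truncated in the input (for example 0.0208333 for 1/48) no longer carry their rounding error into the integer coordinates, provided the true denominators stay below the chosen bound.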
Re: [Wien] install WIEN2k_21.1 on Ubuntu 21.04
Dear Viktor,

I don't think that the Intel® oneAPI Base Toolkit actually includes the Fortran compiler, at least not according to:
https://software.intel.com/content/www/us/en/develop/tools/oneapi/all-toolkits.html#hpc-kit

I think you need the Intel® oneAPI HPC Toolkit to get ifort. BTW in general such questions are better suited for the Intel support forums...

Best regards
Pavel

On Tue, 2021-10-12 at 14:12 +0300, Victor Zenou wrote:
> Dear all
>
> I'm trying to install WIEN2k_21.1 on Ubuntu 21.04.
>
> I updated and installed a few system packages:
>
> sudo apt update
> sudo apt install build-essential gcc-multilib rpm default-jre-headless python tcsh gnuplot
> sudo apt install autoconf libtool ghostscript octave
>
> I downloaded and installed "Intel® oneAPI Base Toolkit", which is supposed to have the ifort compiler and mkl. I also added a line to bashrc:
>
> source /opt/intel/oneapi/setvars.sh
>
> The command "ifort -v" gives "Command 'ifort' not found,..."
>
> ./siteconfig_lapw gives the following message:
>
> It seems you do not have the intel fortran compiler in your path.
>
> So it seems that my intel compiler either does not exist or is not in the path.
>
> How can I check that?
>
> Any suggestions on how to proceed?
>
> Best regards, victor Zenou
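
To answer the "how can I check that" part, a minimal Python sketch that reports which Fortran compilers are visible on the current PATH is shown below. It uses only `shutil.which` from the standard library; the function name `find_compilers` and the default list of names are illustrative choices. It has to be run from a shell in which /opt/intel/oneapi/setvars.sh has already been sourced, otherwise an installed compiler will still be reported as missing.

```python
import shutil


def find_compilers(names=("ifort", "ifx", "gfortran")):
    """Map each compiler name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_compilers().items():
        print(f"{name:10s} {path if path else 'not found on PATH'}")
```

If ifort is reported as missing even after sourcing setvars.sh, the compiler is most likely not installed at all, which matches the Base Toolkit vs. HPC Toolkit explanation above.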
Re: [Wien] segmentation fault in lapwso
BTW I did the Valgrind run and there is nothing there. (I don't have the affected MKL, but with either OpenBLAS or the Netlib LAPACK/BLAS there are no Valgrind errors at all in the Wien2k code, just some harmless leaked memory.) So yeah, this confirms it is definitely MKL.

Pavel

On Thu, 2021-08-19 at 06:56 -0500, Laurence Marks wrote:
> A suggestion: check your mkl version, as there is a mkl bug that was recently fixed, see
> https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Problem-with-LAPACK-subroutine-ZHEEVR-input-array-quot-isuppz/td-p/1150816
> _
> Professor Laurence Marks
> "Research is to see what everybody else has seen, and to think what nobody else has thought", Albert Szent-Györgyi
> www.numis.northwestern.edu
>
> On Thu, Aug 19, 2021, 06:45 Peter Blaha wrote:
> > I'm still on vacations, so cannot test myself.
> >
> > However, I experienced such problems before. It has to do with multithreading (1 thread works always fine) and the mkl routine zheevr.
> >
> > In my case I could fix the problem by enlarging the workspace beyond what the routine calculates itself. (see comment in hmsec on line 841).
> >
> > Right below, the workspace was enlarged by a factor 10, which fixed my problem. But I can easily envision that it might not be enough in some other cases.
> >
> > An alternative is to switch back to zheevx (commented in the code).
> >
> > Peter Blaha
Re: [Wien] segmentation fault in lapwso
Right, I think it is quite clear that deallocate is failing because the memory was corrupted at some earlier point; the only other reason for it to crash would be that the array was never allocated at all, which does not seem to be the case here.

The question is what corrupted the memory and, even stranger, why it works if we disable MKL multithreading. It could indeed be that we are doing something wrong. I can imagine the memory being corrupted in some BLAS call if the number of columns/rows passed to that call is larger than the actual size of the matrix (the multithreading would then influence the final value of the corrupted memory, and depending on that value the deallocate either fails or passes). This should be possible to diagnose with valgrind, as suggested.

Luis, can you upload the testcase somewhere, or recompile with debuginfo as suggested by Laurence earlier, run "valgrind --track-origins=yes lapwso lapwso.def" and send the output? Just be warned, there is a massive slowdown with valgrind (up to 100x) and the logfile can get very large.

Best regards
Pavel

On Wed, 2021-08-18 at 12:10 -0500, Laurence Marks wrote:
> Correction, I was looking at an older modules.F. It looks like it should be
>
>     DEALLOCATE(vect,stat=IV) ; if(IV .ne. 0)write(*,*)IV
>
> On Wed, Aug 18, 2021 at 11:23 AM Laurence Marks wrote:
> > I do wonder about this. I suggest editing modules.F and changing lines 118 and 119 to
> >
> >     DEALLOCATE(en,stat=Ien) ; if(Ien .ne. 0)write(*,*)'Err en ',Ien
> >     DEALLOCATE(vnorm,stat=Ivn) ; if(Ivn .ne. 0)write(*,*)'Err vnorm ',Ivn
> >
> > There is every chance that the bug is not in those lines, but somewhere completely different. SIGSEGV often means that the code has been overwritten, for instance arrays going out of bounds.
> >
> > You can also recompile with -g (don't change other options) added, and/or -C. Sometimes this is better. Or use other things like debuggers or valgrind.
> >
> > On Wed, Aug 18, 2021 at 10:47 AM Pavel Ondračka wrote:
> > > I'm CCing the list back as the crash has now been diagnosed as a likely MKL problem, see below for more details.
> > >
> > > > > So just to be clear, explicitly setting OMP_STACKSIZE=1g does not help to solve the issue?
> > > >
> > > > Right! OMP_STACKSIZE=1g with OMP_NUM_THREADS=4 does not solve the problem!
> > > >
> > > > > The problem is that the OpenMP code in lapwso is very simple, so I'm having problems seeing how it could be causing the problems.
> > > > >
> > > > > Could you also try to see what happens if run with:
> > > > > OMP_NUM_THREADS=1
> > > > > MKL_NUM_THREADS=4
> > > >
> > > > It does not work with these values, but I checked and it works reverting them:
> > > > OMP_NUM_THREADS=4
> > > > MKL_NUM_THREADS=1
> > >
> > > This was very helpful and IMO points to a problem with MKL instead of Wien2k.
> > >
> > > Unfortunately, setting MKL_NUM_THREADS=1 globally will reduce the OpenMP performance, mostly in lapw1 but also in other places. So if you want to keep the OpenMP parallelism at the BLAS/LAPACK level, you have to either find some MKL version that works (if you do, please report it here), link with OpenBLAS (using it just for lapwso is enough), or create a simple wrapper that sets MKL_NUM_THREADS=1 just for lapwso, i.e., rename the lapwso binary in WIENROOT to lapwso_bin and create a new lapwso file there with:
> > >
> > >     #!/bin/bash
> > >     MKL_NUM_THREADS=1 lapwso_bin "$@"
> > >
> > > and make it executable with chmod +x lapwso.
> > >
> > > Or maybe MKL has a non-OpenMP version which you could link just with lapwso while using the standard one elsewhere, but I don't know, I mostly use OpenBLAS. If you need some further help, let me know.
> > >
> > > Reporting the issue to Intel would also be nice, however
Re: [Wien] segmentation fault in lapwso
I'm CCing the list back as the crash has now been diagnosed as a likely MKL problem, see below for more details.

> > So just to be clear, explicitly setting OMP_STACKSIZE=1g does not help to solve the issue?
>
> Right! OMP_STACKSIZE=1g with OMP_NUM_THREADS=4 does not solve the problem!
>
> > The problem is that the OpenMP code in lapwso is very simple, so I'm having problems seeing how it could be causing the problems.
> >
> > Could you also try to see what happens if run with:
> > OMP_NUM_THREADS=1
> > MKL_NUM_THREADS=4
>
> It does not work with these values, but I checked and it works reverting them:
> OMP_NUM_THREADS=4
> MKL_NUM_THREADS=1

This was very helpful and IMO points to a problem with MKL instead of Wien2k.

Unfortunately, setting MKL_NUM_THREADS=1 globally will reduce the OpenMP performance, mostly in lapw1 but also in other places. So if you want to keep the OpenMP parallelism at the BLAS/LAPACK level, you have to either find some MKL version that works (if you do, please report it here), link with OpenBLAS (using it just for lapwso is enough), or create a simple wrapper that sets MKL_NUM_THREADS=1 just for lapwso, i.e., rename the lapwso binary in WIENROOT to lapwso_bin and create a new lapwso file there with:

    #!/bin/bash
    MKL_NUM_THREADS=1 lapwso_bin "$@"

and make it executable with chmod +x lapwso.

Or maybe MKL has a non-OpenMP version which you could link just with lapwso while using the standard one elsewhere, but I don't know, I mostly use OpenBLAS. If you need some further help, let me know.

Reporting the issue to Intel would also be nice, however I never had any real luck there, and it is also a bit problematic as you can't provide a testcase due to Wien2k being proprietary code...

Best regards
Pavel

> > This should disable the Wien2k-specific OpenMP parallelism but still keep the rest of the parallelism at the BLAS/LAPACK level.
>
> So, perhaps, the problem is related to MKL!
>
> > Another option is that something is going wrong before lapwso and the lapwso crash is just the symptom. What happens if you run everything up to lapwso without OpenMP (OMP_NUM_THREADS=1) and then enable it just for lapwso?
>
> If I run lapw0 and lapw1 with OMP_NUM_THREADS=4 and then change it to 1 just before lapwso, it works.
> If I do the opposite, starting with OMP_NUM_THREADS=1 and then change it to 4 just before lapwso, it does not work.
> So I believe that the problem is really at lapwso.
>
> If you need more information, please, let me know!
> All the best,
> Luis
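The thread-count experiments described in this message can be scripted. The following is only a minimal sketch: the helper name try_thread_combos and the chosen thread counts (1 and 4) are illustrative assumptions, and the command to test is passed in by the caller.

```shell
#!/bin/sh
# Run a command once per OMP/MKL thread combination and report which ones fail.
# The helper name and the thread counts are illustrative assumptions.
try_thread_combos() {   # args: command [args...]
    for combo in "1 1" "4 1" "1 4" "4 4"; do
        omp=${combo% *}   # part before the space
        mkl=${combo#* }   # part after the space
        if OMP_NUM_THREADS="$omp" MKL_NUM_THREADS="$mkl" "$@" >/dev/null 2>&1; then
            status=ok
        else
            status=FAIL
        fi
        echo "OMP=$omp MKL=$mkl $status"
    done
}

# Usage (not run here), from a case directory with a prepared lapwso.def:
#   try_thread_combos lapwso lapwso.def
```

A crash that shows up only for MKL_NUM_THREADS greater than 1 would match the pattern reported above and suggests looking at the BLAS/LAPACK library before the Wien2k OpenMP code.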
Re: [Wien] segmentation fault in lapwso
Dear Luis,

one very easy thing to try would be to set the environment variable OMP_STACKSIZE to something large like "1g", i.e., "export OMP_STACKSIZE=1g" before run_lapw. A small OpenMP stack size caused issues for us previously, so this could be the case here as well. The only explicit OMP loop in hsocalc.F allocates all private variables on the stack and a few of them are arrays, so it is plausible that this is the cause.

To prof. Blaha: from a very brief visual inspection of the OpenMP code in lapwso, I believe there could be another small issue with combined MPI and OpenMP. At lines hsocalc.F:159 and hsocalc.F:160 the variables ibf_local and ibi_local should probably be private. This should not be the cause of the problems reported here though, as it would only influence lapwso_mpi. The rest seems OK (at first glance).

Best regards
Pavel

On Tue, 2021-08-17 at 18:18 -0300, Luis Ogando wrote:
> Dear Wien2k Community,
> Greetings!
> This message is only to inform that I also had a segmentation fault problem with lapwso and Wien2k-21.
> It was a very strange case. After a converged SCF cycle with mBJ and SO, I could not run "run_lapw -NI -so ...". In this case, I always got the following error after lapwso:
>
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC            Routine            Line     Source
> lapwso             0046A0EA      Unknown            Unknown  Unknown
> libpthread-2.28.s  1530B217B730  Unknown            Unknown  Unknown
> libiomp5.so        1530B1D132FB  Unknown            Unknown  Unknown
> libiomp5.so        1530B1D13049  Unknown            Unknown  Unknown
> libiomp5.so        1530B1D14B59  Unknown            Unknown  Unknown
> libiomp5.so        1530B1D161E8  Unknown            Unknown  Unknown
> libiomp5.so        1530B1D0C926  Unknown            Unknown  Unknown
> lapwso             0049CA86      Unknown            Unknown  Unknown
> lapwso             0040D77F      hmsout_mp_finit_h  119      modules.F
> lapwso             0042B94E      MAIN__             622      lapwso.F
> lapwso             00404D22      Unknown            Unknown  Unknown
> libc-2.28.so       1530A3E3609B  __libc_start_main  Unknown  Unknown
> lapwso             00404C2A      Unknown            Unknown  Unknown
> 0.167u 0.051s 0:00.10 210.0% 0+0k 0+1976io 0pf+0w
> error: command /home/ogando/Wien/Wien21/lapwso lapwso.def failed
>
> The solution was to change OMP_NUM_THREADS from 4 to 1.
> I checked and it also worked with OMP_NUM_THREADS equal to 2 but not 3.
> If someone is interested in the compilation options or any other information, please ask.
> All the best,
> Luis
>
> Em qui., 10 de jun. de 2021 às 08:17, Fecher, Gerhard escreveu:
> > Dear all,
> > while running a -so calculation I hit a segmentation fault in lapwso (see below) with the latest version Wien2k21.1 that does NOT appear in 19.2.
> > (appeared for two different systems in fresh directories)
> >
> > Did someone experience the same, or did I miss a report and may be not up to date?
> >
> > I used all settings the same (mostly default values), and the same compilers and options (Intel OneAPI 2021 2.0 and Parallel Studio XE 2017.4.056) for both versions, 21.1 and 19.2
> >
> > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > Image              PC            Routine  Line     Source
> > lapwso             0046CE0A      Unknown  Unknown  Unknown
> > libpthread-2.22.s  2AFBCC6DAB10  Unknown  Unknown  Unknown
> > libiomp5.so        2AFBCCF2C8E8  Unknown  Unknown  Unknown
> > lapwso             0049F7A6      Unknown  Unknown  Unknown
> > lapwso             00421E9E      hmsec_   926      hmsec.F
> >
> > line 926 is; deallocate(meigve)
> > indeed, if this is the correct line at all.
> >
> > indeed in 21.2 (I have seen that hmsec.F is different in 19.2)
> >
> > Thanks for any suggestions that help
> >
> > Gerhard
> >
> > DEEP THOUGHT in D. Adams; Hitchhikers Guide to the Galaxy:
> > "I think the problem, to be quite honest with you,
> > is that you have never actually known what the question is."
> >
> > Dr. Gerhard H. Fecher
> > Institut of Physics
> > Johannes Gutenberg - University
> > 55099 Mainz
Re: [Wien] Installing WIEN2k, w2web not running
On Wed, 2021-04-14 at 06:12 +, delamora wrote:
> Thank you Pavel
>
> I do a search
> dnf search perl-Sys-Hostname
> and I get;
> perl-Sys-Hostname-Long.noarch : Try every conceivable way to get full hostname
> I will try it

I'm not sure about the perl-Sys-Hostname-Long.noarch package. On my Fedora 33 I have the perl-Sys-Hostname.x86_64 package, and I have no idea why you can't find it. BTW, another way to determine the correct package is to search for the missing file itself (from your log this is */Sys/Hostname.pm) with: "dnf provides */Sys/Hostname.pm"

> By the way, is Pavel the same as Pablo but in your language? It seems that no.

I believe it is :-)

Best regards
Pavel
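The "search for the missing file" approach generalizes to any Perl module named in a "Can't locate ... in @INC" error. A minimal sketch, where the helper names perl_module_glob and perl_has_module are made up for illustration:

```shell
#!/bin/sh
# Turn a Perl module name (e.g. Sys::Hostname) into the file glob that
# "dnf provides" expects (e.g. */Sys/Hostname.pm). Helper names are hypothetical.
perl_module_glob() {   # args: Module::Name
    printf '*/%s.pm\n' "$(printf '%s' "$1" | sed 's|::|/|g')"
}

# Succeed if the local perl can already load the module.
perl_has_module() {    # args: Module::Name
    perl -M"$1" -e 1 2>/dev/null
}

# Usage (not run here):
#   perl_has_module Sys::Hostname || dnf provides "$(perl_module_glob Sys::Hostname)"
```

The glob is kept in quotes so that the shell passes it to dnf unexpanded.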
Re: [Wien] Installing WIEN2k, w2web not running
Hi Pablo,

there is a difference between the package providing the hostname utility and the missing Perl package providing the Sys::Hostname module. The proper package should be perl-Sys-Hostname.

Best regards
Pavel

On Tue, 2021-04-13 at 23:21 +, delamora wrote:
> I have a comment at the end
>
> Dear WIEN2k community,
> I am installing the WIEN2k package on a new computer, all seems to run well, except for
> w2web
>
> I run w2web
> it says;
> Can't locate Sys/Hostname.pm in @INC (you may need to install the Sys::Hostname module) (@INC contains: /usr/local/lib64/perl5/5.32 /usr/local/share/perl5/5.32 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5) at /home/Programas/WIEN2k-19.1/w2web line 3.
> BEGIN failed--compilation aborted at /home/Programas/WIEN2k-19.1/w2web line 3.
>
> I searched for "Sys::Hostname";
> dnf search Sys::Hostname
> => does not exist
>
> I searched for "Hostname";
> dnf search Hostname
> and I found it, so I try to install it
> => dnf install hostname.x86_64
> and I get as an answer
> => "package already installed"
>
> ==
> I just want to comment that /home/Programas/WIEN2k-19.1/SRC_w2web/w2web gives;
> -
> #!/usr/bin/perl
>
> use Sys::Hostname;
> --
> So I do not know why w2web is not running with Fedora 33
>
> Pablo
Re: [Wien] consistent RKmax and sphere size settings
Thank you Laurence,

I was a bit worried because the FAQ you linked also says: "Of course you should use identical Mg+O spheres for MgO and Mg(OH)2 for consistency", so I was not 100% sure whether keeping the same maximum K-vector Kmax is enough.

Should I also increase lmax and lvns somehow for the larger spheres? Or would you keep them the same for the small and large N spheres?

Best regards
Pavel

On Wed, 2021-04-07 at 15:33 -0500, Laurence Marks wrote:
> Have a look at http://www.wien2k.at/reg_user/faq/rkmax.html. If (say) with an RMT for the N of 1.6 a RKMAX of 6.5 is good enough, then when you reduce the RMT to 1.3 you can reduce the RKMAX to 6.5*1.3/1.6 = 5.28. This will not give you precisely the same relative convergence, but is close.
>
> Another way is to say that an RKMAX of 7 is "OK" for RMTs of 2.0, an RKMAX of 3 for RMTs of 0.5, then interpolate using a straight line. This is similar.
>
> On Wed, Apr 7, 2021 at 3:24 PM Pavel Ondračka wrote:
> > Dear Wien2k mailing list,
> >
> > I have a series of TiN and TiON amorphous-like structures where I have some large differences in sphere sizes for N atoms. In most of the structures the smallest N sphere is around 1.6-1.7, however in some I have a few N atoms with 1.3 (the structures should be OK, this much smaller size is due to some rare local configuration which would correspond to something like an N split interstitial in a crystalline structure).
> >
> > My goal is to calculate core electron binding energies of N1s levels of many atoms in the structures (at least 200 core-hole calculations) and I need to be consistent over different structures in the series.
> >
> > So usually I would just check what is the smallest N sphere size in the whole set, force it for all N atoms in all structures and then use the identical RKmax for all structures, just to be sure I'm consistent. This is unfortunately not very efficient with respect to the calculation speed as I have quite large cells (around 150 atoms). Is there another way to save some CPU time and keep the consistency?
> >
> > I was for example thinking whether I can somehow force two different N sphere sizes (one for the N split interstitial, of which I usually have just one in the whole cell, and one larger for the rest of the N atoms). Then I would have a consistent sphere size for the rest of the N atoms in the series and I could change the RKmax to keep the same largest K-vector, which should be enough to guarantee consistency for all N atoms except the split interstitials (but I don't care that much about them). However, as far as I understand this is not possible?
> >
> > Any ideas would be appreciated.
> >
> > Best regards
> > Pavel Ondracka
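Both rules of thumb from the quoted reply are one-line calculations. The sketch below reproduces them with awk; the function names scale_rkmax and interp_rkmax are made up here, and the anchor points (RMT 0.5 with RKmax 3, RMT 2.0 with RKmax 7) are the ones given in the reply.

```shell
#!/bin/sh
# Scale RKmax with the smallest sphere radius so that Kmax = RKmax / RMT
# stays constant (first recipe in the reply). Function names are hypothetical.
scale_rkmax() {    # args: rkmax_ref rmt_ref rmt_new
    LC_ALL=C awk -v rk="$1" -v r0="$2" -v r1="$3" \
        'BEGIN { printf "%.2f\n", rk * r1 / r0 }'
}

# Straight-line interpolation between (RMT=0.5, RKmax=3) and (RMT=2.0, RKmax=7)
# (second recipe in the reply).
interp_rkmax() {   # args: rmt
    LC_ALL=C awk -v r="$1" \
        'BEGIN { printf "%.2f\n", 3 + (r - 0.5) * (7 - 3) / (2.0 - 0.5) }'
}

# Usage:
#   scale_rkmax 6.5 1.6 1.3    # RKmax 6.5 at RMT 1.6, rescaled to RMT 1.3
#   interp_rkmax 1.3
```

The first call reproduces the 5.28 quoted in the reply; the two recipes give similar but not identical values, as the reply notes.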
[Wien] consistent RKmax and sphere size settings
Dear Wien2k mailing list,

I have a series of TiN and TiON amorphous-like structures where I have some large differences in sphere sizes for N atoms. In most of the structures the smallest N sphere is around 1.6-1.7, however in some I have a few N atoms with 1.3 (the structures should be OK, this much smaller size is due to some rare local configuration which would correspond to something like an N split interstitial in a crystalline structure).

My goal is to calculate core electron binding energies of N1s levels of many atoms in the structures (at least 200 core-hole calculations) and I need to be consistent over different structures in the series.

So usually I would just check what is the smallest N sphere size in the whole set, force it for all N atoms in all structures and then use the identical RKmax for all structures, just to be sure I'm consistent. This is unfortunately not very efficient with respect to the calculation speed as I have quite large cells (around 150 atoms). Is there another way to save some CPU time and keep the consistency?

I was for example thinking whether I can somehow force two different N sphere sizes (one for the N split interstitial, of which I usually have just one in the whole cell, and one larger for the rest of the N atoms). Then I would have a consistent sphere size for the rest of the N atoms in the series and I could change the RKmax to keep the same largest K-vector, which should be enough to guarantee consistency for all N atoms except the split interstitials (but I don't care that much about them). However, as far as I understand this is not possible?

Any ideas would be appreciated.

Best regards
Pavel Ondracka
Re: [Wien] convergence criteria in scf file?
On Wed, 2020-11-04 at 07:30 +, Tran, Fabien wrote:
> Is the grep of :ENE, :DIS and :FOR not useful enough?

Right, this is what one would expect, but just to be 100% certain: is the SCF cycle stopped when the change in :ENE, :DIS and (the maximum change?) in :FOR over the last (two?) iterations is below the value specified by the -ec, -cc or -fc argument?

BTW, what about the criteria themselves? New Wien2k versions print the run_lapw command and its arguments in :LABEL4 (so I can see which specific values were passed to -ec, -cc and -fc), but the files I'm looking at come from some older version where this was not printed...

Best regards
Pavel
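The suggested grep can be turned into a quick convergence check. This is a minimal sketch under a stated assumption: the number of interest is the last whitespace-separated field on each :ENE or :DIS line of the scf file. The function names are made up, and the code does not claim to reproduce the exact stopping test used by run_lapw.

```shell
#!/bin/sh
# Print the last N values of a labelled quantity from a WIEN2k scf file.
# Assumption: the value is the last whitespace-separated field of the line.
scf_last_values() {     # args: label file [count]
    grep "^$1" "$2" | awk '{ print $NF }' | tail -n "${3:-2}"
}

# Absolute difference between the last two :ENE values (in Ry).
scf_energy_change() {   # args: file
    scf_last_values ':ENE' "$1" 2 |
        LC_ALL=C awk 'NR == 1 { a = $1 }
                      NR == 2 { d = $1 - a; if (d < 0) d = -d; printf "%.8f\n", d }'
}

# Usage (not run here):
#   scf_energy_change case.scf
#   scf_last_values ':DIS' case.scf 3
```

Comparing these numbers with typical -ec and -cc values gives at least a rough idea of how well an old calculation was converged when the dayfile is gone.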
[Wien] convergence criteria in scf file?
Dear Wien2k mailing list,

I'm looking at some old saved calculations for which I don't have the dayfile (but I vaguely remember some convergence troubles). So I'm now looking at the scf file and wondering whether it is possible to tell from it what the final convergence of energy, charge and forces was (and also which convergence criteria were passed to run_lapw).

Best regards
Pavel
Re: [Wien] finding density of states for individual bands
The problematic part is that while joint claims it can give you a DOS for just one band (with switch 2), this is actually not the DOS of a single band but of a single band index. The two are the same only if there is no band crossing (the difference will be obvious if you plot the energy band structure with x spaghetti with and without running x irrep before).

Best regards
Pavel

On Sun, 2020-10-11 at 05:58 +, Lee, Yongbin [A LAB] wrote:
> I guess you can do it with "joint".
> Check *.injoint which is at page 170 in UG.
>
> Yongbin
>
> From: Wien on behalf of Joseph Ross
> Sent: Saturday, October 10, 2020 4:40 PM
> To: wien@zeus.theochem.tuwien.ac.at
> Subject: [Wien] finding density of states for individual bands
>
> We have a semimetallic system which has an indirect overlap of some rather convoluted bands at Ef. In order to better understand the holes vs. electrons in this system we would like to find the density of states (and partial densities if possible) associated with individual bands, rather than the total. From my understanding & reading through the users guide, I think this is not a feature included in wien2k. However if we are overlooking something, or if there is a separate package that we could use to extract this type of information, we would be interested to know. Any suggestions on this are welcome.
> -Joe Ross
> -
> Joseph H. Ross Jr.
> Professor
> Department of Physics and Astronomy
> Texas A&M University
> 4242 TAMU
> College Station TX 77843-4242
> 979 845 3842 / 448 MPHY
> jhr...@tamu.edu / http://faculty.physics.tamu.edu/ross
> -
[Wien] NOMAD and Wien2k
Dear Wien2k mailing list,

I'm experimenting with the NOMAD database (nomad-lab.eu) and since I remembered an old post from prof. Blaha on this topic, I thought I would ask here for user experience, because so far it's not really working that well for me.

First of all, the most annoying thing is that NOMAD detects two "mainfiles" per directory, specifically the scf and scf0 files, but as far as I can see it can't parse anything useful from the scf0 file (not even the potential). The main downside is that it thinks there are two calculations while there is in fact just one, and it therefore creates a lot of useless entries.

Another thing which I'm not sure how to approach is what to do when the scf file is missing. I often do the main scf loop, then save_lapw to another directory, and after that I generate a new denser k-grid (for DOS or optics) and just run lapw1, lapw2 and then tetra (or optic) and so on. In this case I don't have the scf file in the directory with the DOS or optic calculations, but I would still like to upload it. The upload somehow works because the scf0 file from the old run stays there, so at least NOMAD recognizes that there are some Wien2k data, but it really can't parse anything from the scf0 file (and in general I think that using the scf0 file is a bug). I can make it detect some metadata by artificially creating a new fake scf file, combining the old and new scfxxx files with cat, so that NOMAD can at least detect the composition, potential and some other basic things, but this is clearly not optimal.

In general the NOMAD Wien2k parser cannot detect pretty much anything beyond maybe the structure information, and I will try to report all this to the NOMAD developers to improve the parser, but I would be curious whether someone here has had a better experience or can share some tricks for making it work better for Wien2k with the current state of NOMAD.

Best regards
Pavel
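The cat workaround mentioned above could look roughly like the following. This is a sketch under assumptions: the piece names (case.scf0, case.scf1, case.scfso, case.scf2, case.scfc, case.scfm) and their order are assumed, spin-polarized up/dn variants are not handled, and the function name rebuild_scf is made up. It refuses to overwrite an existing scf file.

```shell
#!/bin/sh
# Rebuild a stand-in <case>.scf by concatenating the per-program scf pieces
# that exist in a directory. File list and order are assumptions; missing
# pieces are skipped. An existing <case>.scf is never overwritten.
rebuild_scf() {   # args: case_name [directory]
    dir="${2:-.}"
    out="$dir/$1.scf"
    if [ -e "$out" ]; then
        echo "$out already exists, not overwriting" >&2
        return 1
    fi
    : > "$out"
    for ext in scf0 scf1 scfso scf2 scfc scfm; do
        [ -f "$dir/$1.$ext" ] && cat "$dir/$1.$ext" >> "$out"
    done
    return 0
}

# Usage (not run here), inside the DOS/optics directory of case "mycase":
#   rebuild_scf mycase .
```

The result is only a fake scf file for the parser's benefit and does not correspond to a real SCF iteration.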
Re: [Wien] parallel instalation of Wien2k: elpa, fftw
OK, thanks for the explanation, I was not aware of this. Therefore please ignore my previous emails, as they are completely wrong. Sorry for misleading the original poster.

Best regards
Pavel

On Mon, 2020-09-14 at 07:30 -0500, Laurence Marks wrote:
> Linkers will by default (99.99% confidence) add ".so" to a name for dynamic; if that is not present they will add ".a". Hence use of -lfftw3 will pickup libfftw3.so or libfftw3.a.
>
> _
> Professor Laurence Marks
> "Research is to see what everybody else has seen, and to think what nobody else has thought", Albert Szent-Gyorgi
> www.numis.northwestern.edu
>
> On Mon, Sep 14, 2020, 07:26 Pavel Ondračka wrote:
> > On Mon, 2020-09-14 at 06:08 -0600, Gavin Abo wrote:
> > > See that "./configure --enable-mpi" was used.
> > >
> > > Of note, sometimes -gcc-sys is needed:
> > >
> > > https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg18664.html
> >
> > Out of interest I went through the link and I don't see how the linker can find/use the FFTW libraries. The FFTW_LIBS and FFTW_PLIBS clearly specify to link with the dynamic libraries:
> >
> > FFTW_OPT : -DFFTW3 -I/home/username/fftw3/include
> > FFTW_LIBS : -L/home/username/fftw3/lib -lfftw3
> > FFTW_PLIBS : -lfftw3_mpi
> >
> > however the directory with FFTW contains only the static libraries:
> >
> > /home/username/fftw3/lib:
> > total 2108
> > drwxr-xr-x 3 username username    4096 May 27 22:57 cmake
> > -rw-r--r-- 1 username username 1933432 May 27 22:57 libfftw3.a
> > -rwxr-xr-x 1 username username     893 May 27 22:57 libfftw3.la
> > -rw-r--r-- 1 username username  201232 May 27 22:57 libfftw3_mpi.a
> > -rwxr-xr-x 1 username username     939 May 27 22:57 libfftw3_mpi.la
> > drwxr-xr-x 2 username username    4096 May 27 22:57 pkgconfig
> >
> > IMO this could work only under two lucky circumstances: either one has another libfftw3.so and libfftw3_mpi.so somewhere in the system path, or it in fact doesn't link with the FFTW libs in /home/username/fftw3/lib but with the FFTW-compatible interface inside the MKL
> >
> > > On 9/13/2020 2:53 AM, Ilias Miroslav, doc. RNDr., PhD. wrote:
> > > > Hello,
> > > >
> > > > to profit from parallelization, one has to install ELPA and the parallel version fftw library.
> > > >
> > > > For fftw library I used ./configure --enable-mpi, but Wien2k installator says "!!! WARNING: No MPI version of the FFTW library found!" But my installed fftw-3.3.8/lib contains also libfftw3_mpi.a .
> > > >
> > > > Any clues what is wrong ? Maybe it would be better to have defined environmental variables from ELPA, fftw ?
> > > >
> > > > Miro
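The lookup rule described in the quoted reply (shared library first, static archive as fallback) can be illustrated for a single directory. This is a deliberately simplified sketch with a made-up function name; a real linker searches several directories in order and is also affected by options such as -static.

```shell
#!/bin/sh
# Report which file "-L<dir> -l<name>" would pick up within ONE directory,
# following the rule quoted above: lib<name>.so first, then lib<name>.a.
# Simplified illustration only; the function name is hypothetical.
resolve_lib() {   # args: directory name
    if [ -e "$1/lib$2.so" ]; then
        echo "$1/lib$2.so"
    elif [ -e "$1/lib$2.a" ]; then
        echo "$1/lib$2.a"
    else
        return 1
    fi
}

# Usage (not run here):
#   resolve_lib /home/username/fftw3/lib fftw3_mpi
```

For the directory listing quoted above, which holds only libfftw3.a and libfftw3_mpi.a, this would report the static archives, consistent with the explanation that -lfftw3 can resolve to either form.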
Re: [Wien] parallel instalation of Wien2k: elpa, fftw
On Mon, 2020-09-14 at 06:08 -0600, Gavin Abo wrote:
> See that "./configure --enable-mpi" was used.
>
> Of note, sometimes -gcc-sys is needed:
>
> https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg18664.html

Out of interest I went through the link and I don't see how the linker can find/use the FFTW libraries. The FFTW_LIBS and FFTW_PLIBS clearly specify to link with the dynamic libraries:

FFTW_OPT : -DFFTW3 -I/home/username/fftw3/include
FFTW_LIBS : -L/home/username/fftw3/lib -lfftw3
FFTW_PLIBS : -lfftw3_mpi

however the directory with FFTW contains only the static libraries:

/home/username/fftw3/lib:
total 2108
drwxr-xr-x 3 username username    4096 May 27 22:57 cmake
-rw-r--r-- 1 username username 1933432 May 27 22:57 libfftw3.a
-rwxr-xr-x 1 username username     893 May 27 22:57 libfftw3.la
-rw-r--r-- 1 username username  201232 May 27 22:57 libfftw3_mpi.a
-rwxr-xr-x 1 username username     939 May 27 22:57 libfftw3_mpi.la
drwxr-xr-x 2 username username    4096 May 27 22:57 pkgconfig

IMO this could work only under two lucky circumstances: either one has another libfftw3.so and libfftw3_mpi.so somewhere in the system path, or it in fact doesn't link with the FFTW libs in /home/username/fftw3/lib but with the FFTW-compatible interface inside the MKL

> On 9/13/2020 2:53 AM, Ilias Miroslav, doc. RNDr., PhD. wrote:
> > Hello,
> >
> > to profit from parallelization, one has to install ELPA and the parallel version fftw library.
> >
> > For fftw library I used ./configure --enable-mpi, but Wien2k installator says "!!! WARNING: No MPI version of the FFTW library found!" But my installed fftw-3.3.8/lib contains also libfftw3_mpi.a .
> >
> > Any clues what is wrong ? Maybe it would be better to have defined environmental variables from ELPA, fftw ?
> >
> > Miro
Re: [Wien] parallel instalation of Wien2k: elpa, fftw
On Mon, 2020-09-14 at 06:51 -0500, Laurence Marks wrote:
> ?
>
> I have never used dynamic fftw, and never had a problem so I doubt that is the issue.

I also don't have any issues with static linking, but I do fix the Makefiles manually when siteconfig fails me. If you have a way to link with the static libraries using siteconfig alone, i.e., by setting just FFTWROOT, FFTW_VERSION, FFTW_LIB and FFTW_LIBNAME in siteconfig, I would be interested in your config...

Best regards
Pavel
Re: [Wien] parallel instalation of Wien2k: elpa, fftw
I think I see the issue: libfftw3_mpi.a is a static library. Please build FFTW with something like --enable-shared to get the dynamic libraries as well (I don't remember the exact switch, but ./configure --help lists all the options, so just find it there). If you have libfftw3_mpi.so and you still have issues, just post all of your FFTW options and the full path to the directory where you have the libraries (and the list of its contents), so we can debug further.

As a further remark, FFTW is such a common library that it should be available on pretty much every system, so I would suggest saving yourself some trouble and not compiling it on your own. If this is your own computer, just install the correct packages from the repos; if this is on a cluster, just use the FFTW module provided by the admins. This approach has the advantage that the libraries will either be installed into your system paths or loading the module will update the default paths, so even if you mess up your FFTW settings in siteconfig, the linker should still be able to find the libraries...

Best regards
Pavel

BTW linking with static libs is doable, but doing it through siteconfig is pretty much impossible, so if you really want to do it, you need to edit the Makefiles manually.

On Sun, 2020-09-13 at 08:53 +, Ilias Miroslav, doc. RNDr., PhD. wrote:
> Hello,
>
> to profit from parallelization, one has to install ELPA and the parallel version fftw library.
>
> For fftw library I used ./configure --enable-mpi, but Wien2k installator says "!!! WARNING: No MPI version of the FFTW library found!" But my installed fftw-3.3.8/lib contains also libfftw3_mpi.a .
>
> Any clues what is wrong ? Maybe it would be better to have defined environmental variables from ELPA, fftw ?
>
> Miro
Re: [Wien] error in mixer
> forrtl: severe (168): Program Exception - illegal instruction

Did you compile Wien2k on a different machine than the one you are running it on now? What were your compilation options? This looks like your lapw1 binary was compiled with instructions that are not available on this machine...

Best regards
Pavel
Re: [Wien] Why the GAP appeared in DOS plot is lower than the GAP obtained during "Analysis"
Another option is that you applied some broadening to your DOS; the gap in the DOS would then look smaller.

Best regards
Pavel

On Thu, 2020-04-23 at 08:12 +, Tran, Fabien wrote:
> The gap shown in Analysis is :GAP in case.scf. If a different k-mesh (than the one used during the SCF calculation) is used for generating the DOS, then there may be a difference. Typically, one should increase the k-mesh for DOS.
> What is the difference between the gaps in Analysis and DOS in your case?
>
> From: Wien on behalf of shamik chakrabarti
> Sent: Thursday, April 23, 2020 10:01 AM
> To: A Mailing list for WIEN2k users
> Subject: [Wien] Why the GAP appeared in DOS plot is lower than the GAP obtained during "Analysis"
>
> Dear Wien2k users,
>
> We have seen that the GAP that appears in the DOS plot is lower than the GAP obtained during "Analysis". Why is it so? What is the meaning of the GAP shown during "Analysis"?
>
> Looking forward to your reply in this regard.
>
> with regards,
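The broadening effect mentioned in the reply can be illustrated with a small, self-contained sketch. Everything here is hypothetical: a model DOS with a 2 eV gap, a Gaussian of width 0.1 eV, and an arbitrary threshold of 0.01 for deciding where the DOS is "zero". It is not how WIEN2k computes :GAP, only a demonstration that a broadened curve shows a narrower empty region.

```python
import math

def gaussian_broaden(energies, dos, sigma):
    """Convolve a DOS on a uniform energy grid with a normalized Gaussian."""
    step = energies[1] - energies[0]
    norm = step / (sigma * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(d * math.exp(-0.5 * ((e - e2) / sigma) ** 2)
                   for e2, d in zip(energies, dos))
        for e in energies
    ]

def apparent_gap(energies, dos, threshold=0.01):
    """Width of the energy region where the DOS stays below the threshold."""
    step = energies[1] - energies[0]
    return step * sum(1 for d in dos if d < threshold)

# Hypothetical model: flat DOS of 1.0 with a gap between -1 eV and +1 eV.
energies = [i / 20.0 - 3.0 for i in range(121)]
raw_dos = [0.0 if abs(e) < 1.0 else 1.0 for e in energies]
broadened_dos = gaussian_broaden(energies, raw_dos, sigma=0.1)

print("gap read from raw DOS:      ", apparent_gap(energies, raw_dos))
print("gap read from broadened DOS:", apparent_gap(energies, broadened_dos))
```

The broadened curve leaks states into the gap from both edges, so any gap read off the plot comes out smaller than the one from the eigenvalues.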
Re: [Wien] Lapw.2 error
Not directly related to this thread, but since reports of this bug (and some others with known fixes) keep reappearing, how about making a 19.2 (or 19.1.1) bugfix release (just the 19.1 code + fixes from Gavin's repo)? Unless the next major version is already around the corner.

Just my two cents

Best regards
Pavel

On Fri, 2020-04-17 at 12:29 -0600, Gavin Abo wrote:
> Also, did you compile with gfortran and not apply the bug fix to WIEN2k 19.1:
>
> https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg18771.html
> https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg19741.html
>
> On 4/17/2020 12:20 PM, Laurence Marks wrote:
> > This may have nothing to do with it, but why is your directory WIEN2k? The normal convention is, for instance, to have a directory TiC in which your structure file is TiC.struct etc. I hope you are not running in the installation directory, I suspect that could lead to chaos!
> >
> > On Fri, Apr 17, 2020 at 1:02 PM Johnathon Street wrote:
> > > Prof. Blaha,
> > >
> > > I am running Wien2k version 19.1 on Ubuntu. When running the SCF cycle I am receiving the following error in the lapw2.error file.
> > >
> > > 'LAPW2' - can't open unit: 15
> > > 'LAPW2' -filename: WIEN2k.tmp
> > > 'LAPW2' - status: scratch form: unformatted
> > >
> > > I have searched the mailing list and found a possible solution would be to delete line 15 in the lapw2.def file which appears as below:
> > >
> > > 2,'WIEN2k.nsh','unknown','unformatted',0
> > > 3,'WIEN2k.in1', 'unknown','formatted',0
> > > 4,'WIEN2k.inso', 'unknown','formatted',0
> > > 5,'WIEN2k.in2', 'old','formatted',0
> > > 6,'WIEN2k.output2','unknown','formatted',0
> > > 7,'WIEN2k.vorb','unknown','formatted',0
> > > 8,'WIEN2k.clmval','unknown','formatted',0
> > > 10,'./WIEN2k.vector', 'unknown','unformatted',9000
> > > 13,'WIEN2k.recprlist', 'unknown','unformatted',9000
> > > 14,'WIEN2k.kgen','unknown','formatted',0
> > > 16,'WIEN2k.qtl', 'unknown','formatted',0
> > > 17,'WIEN2k.weightaver','unknown','formatted',0
> > > 18,'WIEN2k.vsp', 'old','formatted',0
> > > 19,'WIEN2k.vns', 'unknown','formatted',0
> > > 20,'WIEN2k.struct', 'old','formatted',0
> > > 21,'WIEN2k.scf2','unknown','formatted',0
> > > 922,'WIEN2k.rotlm', 'unknown','formatted',0
> > > 23,'WIEN2k.radwf', 'unknown','formatted',0
> > > 26,'WIEN2k.weight', 'unknown','formatted',0
> > > 27,'WIEN2k.weightdn', 'unknown','formatted',0
> > > 29,'WIEN2k.energydn','unknown','formatted',0
> > > 30,'WIEN2k.energy', 'unknown','formatted',0
> > > 32,'WIEN2k.qdmft', 'unknown','formatted',0
> > > 34,'WIEN2k.oubwin', 'unknown','formatted',0
> > > 231,'WIEN2k.dmftsym', 'unknown','formatted',0
> > >
> > > so that the lapw.def file looks like this.
> > >
> > > line 15 prior to deletion is as follows:
> > >
> > > 15, 'WIEN2k.tmp', 'scratch', unformatted' ,0
> > >
> > > I continue to get the same error following deleting line 15. Do you have any suggestions?
> > >
> > > Thank you,
> > > Johnathon Street
Re: [Wien] fold2Bloch - gfortran
Hi,

gfortran should in theory compile anything that ifort does (assuming the code is valid standard Fortran, which might not be the case for code that is only used and tested with ifort). Just set the compiler to gfortran instead of ifort, use the same flags as for Wien2k ("-ffree-form -ffree-line-length-none -O2" might be a good start), and if you run into any issues, provide the specific compile commands you use and the errors.

Best regards
Pavel

On Mon, 2020-03-30 at 11:00 +0200, Catalina Coll wrote:
> Dear users and developers,
>
> I would like to know if it is possible to compile fold2Bloch with gfortran (I currently have WIEN2k compiled with gfortran).
>
> Thanks in advance.
>
> Catalina Coll
> PhD Candidate
> LENS - Laboratory of Electron Nanoscopy
> MIND - Micro-Nanotechnology and Nanoscopies for electrophotonic Devices
> IN2UB - Institute of Nanoscience and Nanotechnology
> Departament d'Enginyeria Electrònica i Biomèdica - Universitat de Barcelona
Re: [Wien] TELNES calculation
Dear Ali,

please do core holes only for atoms with multiplicity 1 (otherwise you add multiple core holes at once and get interaction between them, which is what you want to avoid with the supercell in the first place)! Just name (number) one oxygen atom for every non-equivalent oxygen position, so that the symmetry is reduced as needed. Then of course you do the core hole and TELNES calculations just for the named atoms and sum the spectra with weights according to the correct multiplicities.

Best regards
Pavel

On Tue, 2020-03-24 at 07:37 +, Ali Baghizhadeh wrote:
> Dear Prof. Blaha
> Thank you very much. I did create a supercell (2x2x1) and I am using LDA+U. Again some oxygens have a multiplicity of 3, which may result in an increase in the intensity of that specific oxygen. Currently I do non-spin-polarized calculations, but I wish to introduce an AFM state in the cell, on the Fe ions. As oxygen is non-magnetic, I do not know how much the spin state of the Fe ions will affect the TELNES spectra?
>
> Best regards
> Ali
>
> From: Wien on behalf of Peter Blaha
> Sent: 24 March 2020 07:05:43
> To: wien@zeus.theochem.tuwien.ac.at
> Subject: Re: [Wien] TELNES calculation
>
> ad 1) No, case.inm has no effect on telnes. It is used only during run_lapw.
> ad 2) Yes, you should do the calculations for all non-equivalent O atoms and sum the results including their multiplicity (at least when you see some differences in their corresponding DOS).
>
> What you did not mention: You should create a supercell and create the core holes in the supercells. Please read the corresponding literature (or the XAS/TELNES sections in the UG and in our workshop lectures).
>
> And: LuFeO3 is certainly a correlated material. Use GGA+U or mBJ for these calculations.
>
> Am 23.03.2020 um 22:42 schrieb Ali Baghizhadeh:
> > Dear WIEN2k users
> >
> > I am trying to calculate the K-edge of oxygen in h-LuFeO3 using the TELNES program. I have two questions regarding a structure having a few oxygen ions of different Wyckoff positions and multiplicity. For the oxygen K-edge calculation, I assume we change the occupancy of the specific oxygen in case.inc and add an electron to the background in case.inm to run SCF.
> >
> > 1- After SCF convergence and before TELNES, should we modify case.inm again and remove the additional background electron or not?
> >
> > 2- Should we repeat the SCF calculation for all non-equivalent oxygens in the structure and sum the spectra of all oxygens, to represent the experimental spectrum?
> >
> > Thank you in advance.
> >
> > Ali Baghi zadeh
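The last step of the recipe in the reply (sum the per-site spectra with weights according to the multiplicities) can be sketched as follows. This is only an illustration under assumptions: the function name sum_spectra is made up, the spectra are assumed to be already interpolated onto one common energy grid, and the multiplicities are those of the original (unnamed) structure.

```python
def sum_spectra(spectra, multiplicities):
    """Weighted sum of per-site spectra that share one energy grid.

    spectra        -- list of intensity lists, one per non-equivalent site
    multiplicities -- number of equivalent atoms each site represents
    """
    if len(spectra) != len(multiplicities):
        raise ValueError("need one multiplicity per spectrum")
    n_points = len(spectra[0])
    if any(len(s) != n_points for s in spectra):
        raise ValueError("all spectra must use the same energy grid")
    return [
        sum(m * s[i] for s, m in zip(spectra, multiplicities))
        for i in range(n_points)
    ]

# Hypothetical example: two non-equivalent O sites with multiplicities 1 and 3.
site_a = [1.0, 2.0]
site_b = [3.0, 4.0]
print(sum_spectra([site_a, site_b], [1, 3]))
```

Dividing the result by the sum of the multiplicities gives a per-atom average instead, which can be more convenient when comparing with a normalized experimental spectrum.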
Re: [Wien] PES issues
Thanks for the comments. I played with it a bit more today and indeed c) looks the most probable. In fact the fit is very unstable (as expected): when I change the DOS a bit (for example with a slightly different broadening), the fitted values can change significantly. But it is hard to tell for sure without actually seeing the correlation matrix.

Best regards
Pavel

On Fri, 2020-03-13 at 15:41 +0100, Peter Blaha wrote:
> It looks a bit strange if this is a simple supercell without changes.
>
> a) You should increase the limit for the fit (after the first fit, changes are limited to +-0.20; with "l" you can redo the fit with a larger limit until the "optimized weight" does not improve anymore).
>
> b) In particular the N-s states look strange. These are well separated bands of almost only 2s character. Thus the total DOS in this region should be perfectly well represented by the sum of the renormalized PDOS.
>
> c) It could be that the fit is not unique, since without changes, all PDOS of the different atoms should have the same shape and one can probably have the same total DOS with some "arbitrarily" chosen weights. After all you fit: Y = x1*A + x2*A (A is the identical PDOS of atom 1 and 2). Then x1 and x2 are not uniquely defined.
>
> On 3/13/20 3:27 PM, Pavel Ondračka wrote:
> > Dear Peter,
> >
> > thanks for the new version. It seems to be working now (I've also included the N2s states). The V states seem to be too strong with respect to my experimental measurements, but maybe this is just a problem with the cross sections, I'll try to play with it a bit. What I've however noticed is that the optimized q_sphere differ significantly between different atoms of the same type:
> >
> >   Optimize q_sphere by fitting total DOS?(Y/n)
> >   Y
> >   Mean deviation of (total-DOS - sum(PDOS))**2 using
> >   outputst-weights:  optimized weights:  with limit:
> >   12672.3980         1069.0124           +-0.20
> >
> >   Partial Orbital  Case.outputst  Optimized
> >   V 4s             0.1342         0.33429998
> >   V 4p             1.             0.8001
> >   V 3d             0.81809998     0.9901
> >   V 4s             0.1342         0.33429998
> >   V 4p             1.             0.8001
> >   V 3d             0.81809998     0.9901
> >   V 4s             0.1342         0.23524959
> >   V 4p             1.             0.8001
> >   V 3d             0.81809998     0.9901
> >   V 4s             0.1342         0.33429998
> >   V 4p             1.             0.8001
> >   V 3d             0.81809998     0.81147813
> >   N 2s             0.80360001     0.9901
> >   N 2p             0.73790002     0.93790001
> >   N 2s             0.80360001     0.60360003
> >   N 2p             0.73790002     0.53790003
> >   N 2s             0.80360001     0.60360003
> >   N 2p             0.73790002     0.53790003
> >   N 2s             0.80360001     0.9901
> >   N 2p             0.73790002     0.93790001
> >   N 2s             0.80360001     0.9901
> >   N 2p             0.73790002     0.93790001
> >   N 2s             0.80360001     0.87147462
> >   N 2p             0.73790002     0.60641075
> >
> > This is kinda unexpected as this is in fact a perfect cubic VN supercell (with a single named atom for core-hole calculations, but no core hole yet in this case).
> >
> > Is the fitting working as expected?
> >
> > Best regards
> > Pavel
> >
> > On Fri, 2020-03-13 at 14:14 +0100, Peter Blaha wrote:
> > > Dear Pavel,
> > >
> > > Thanks for your report.
> > >
> > > I tried to reproduce it, but my version of pes has already been changed significantly. In particular optimize_charge.f is now quite different.
> > >
> > > However, when compiling with -C I detected another problem with the variable "start" (never set to zero) and in read_dos.f.
> > >
> > > I've uploaded SRC_pes.tar.gz into the "files" directory of the wien2k-download and you can download it from there, if interested.
> > >
> > > PS: I've not fixed the gfortran problem yet, but will try to do it soon. So if it is not timely, you can also wait with the download until this is also fixed.
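Point c) in the quoted reply, the fit Y = x1*A + x2*A not being unique, can be checked numerically. The sketch below uses made-up numbers and hypothetical helper names (residual, normal_matrix_det); it is not the algorithm used in pes, just a least-squares illustration of why identical PDOS shapes leave the individual weights undetermined.

```python
def residual(y, a1, a2, x1, x2):
    """Sum of squared deviations between y and the model x1*a1 + x2*a2."""
    return sum((yi - x1 * p - x2 * q) ** 2 for yi, p, q in zip(y, a1, a2))

def normal_matrix_det(a1, a2):
    """Determinant of the 2x2 least-squares normal matrix for basis a1, a2.

    A zero determinant means the two weights cannot be determined separately.
    """
    s11 = sum(p * p for p in a1)
    s12 = sum(p * q for p, q in zip(a1, a2))
    s22 = sum(q * q for q in a2)
    return s11 * s22 - s12 * s12

# Hypothetical PDOS shape shared by two equivalent atoms, and a "total DOS"
# built from it with a combined weight of 1.6.
pdos = [0.0, 1.0, 2.0, 1.0]
total = [1.6 * v for v in pdos]

print(residual(total, pdos, pdos, 0.8, 0.8))   # symmetric weights
print(residual(total, pdos, pdos, 1.0, 0.6))   # very different weights, same fit
print(normal_matrix_det(pdos, pdos))           # singular normal matrix
```

Only the sum x1 + x2 is fixed by the data, so small changes in the DOS (such as a different broadening) can move the individual weights anywhere within the allowed limit.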
Re: [Wien] PES issues
Dear Peter,

thanks for the new version. It seems to be working now (I've also included the N2s states). The V states seem to be too strong with respect to my experimental measurements, but maybe this is just a problem with the cross sections, I'll try to play with it a bit. What I've however noticed is that the optimized q_sphere differ significantly between different atoms of the same type:

  Optimize q_sphere by fitting total DOS?(Y/n)
  Y
  Mean deviation of (total-DOS - sum(PDOS))**2 using
  outputst-weights:  optimized weights:  with limit:
  12672.3980         1069.0124           +-0.20

  Partial Orbital  Case.outputst  Optimized
  V 4s             0.1342         0.33429998
  V 4p             1.             0.8001
  V 3d             0.81809998     0.9901
  V 4s             0.1342         0.33429998
  V 4p             1.             0.8001
  V 3d             0.81809998     0.9901
  V 4s             0.1342         0.23524959
  V 4p             1.             0.8001
  V 3d             0.81809998     0.9901
  V 4s             0.1342         0.33429998
  V 4p             1.             0.8001
  V 3d             0.81809998     0.81147813
  N 2s             0.80360001     0.9901
  N 2p             0.73790002     0.93790001
  N 2s             0.80360001     0.60360003
  N 2p             0.73790002     0.53790003
  N 2s             0.80360001     0.60360003
  N 2p             0.73790002     0.53790003
  N 2s             0.80360001     0.9901
  N 2p             0.73790002     0.93790001
  N 2s             0.80360001     0.9901
  N 2p             0.73790002     0.93790001
  N 2s             0.80360001     0.87147462
  N 2p             0.73790002     0.60641075

This is kinda unexpected as this is in fact a perfect cubic VN supercell (with a single named atom for core-hole calculations, but no core hole yet in this case).

Is the fitting working as expected?

Best regards
Pavel

On Fri, 2020-03-13 at 14:14 +0100, Peter Blaha wrote:
> Dear Pavel,
>
> Thanks for your report.
>
> I tried to reproduce it, but my version of pes has already been changed significantly. In particular optimize_charge.f is now quite different.
>
> However, when compiling with -C I detected another problem with the variable "start" (never set to zero) and in read_dos.f.
>
> I've uploaded SRC_pes.tar.gz into the "files" directory of the wien2k-download and you can download it from there, if interested.
>
> PS: I've not fixed the gfortran problem yet, but will try to do it soon. So if it is not timely, you can also wait with the download until this is also fixed.
>
> PPS: Since you have the N-s basis and it is used in the fit, you must include its PDOS also. Thus you must include the PDOS for lower energy, such that the N-s band is included. Otherwise the fit may give nonsense.
>
> Best regards
> Peter
>
> On 3/13/20 11:58 AM, Pavel Ondračka wrote:
> > Dear Wien2k mailing list,
> >
> > I'm experiencing a crash when trying to calculate valence band spectra for VN.
> >
> > (This is a resend of a previous email which is stuck in the queue due to being slightly over the size limit, now with a link instead. I apologize for double posting if the original one eventually makes it to the list as well.)
> >
> > There is an out-of-bounds write during optimization of q_spheres:
> >   Program received signal SIGSEGV, Segmentation fault.
> >   0x0040d494 in optimize_charge () at optimize_charge.f:95
> >   95 temp(l,recon_counter)=temp(l,j)+temp(l,i)
> >   (gdb) print recon_counter
> >   $1 = 27
> >   (gdb) print output_counter
> >   $2 = 24
> > (it tries to write at index 27, but the size is just 24, as defined by output_counter)
> >
> > The files needed to reproduce this and the terminal output (together with the manual keyboard input needed to reproduce the crash) are at https://drive.google.com/open?id=1NZ8lSkfrgigtdQZrDZLp8Y-mFf4uSyk_ . I'm not a regular user of the pes program so there is a high chance that there is something wrong with my input.
> >
> > BTW While taking a quick look I spotted some likely unrelated small issues, for instance pes is also influenced by the well known issue with gfortran using the units 5 and 6 (have to change it manually to something else otherwise stdin and stdout don't work) and there are some valgrind warnings even before the crash, for example:
> >
> >   ==57304== Conditional jump or move depends on uninitialised value(s)
[Wien] PES issues
Dear Wien2k mailing list,

I'm experiencing a crash when trying to calculate valence band spectra for VN.

(This is a resend of a previous email which is stuck in the queue due to being slightly over the size limit, now with a link instead. I apologize for double posting if the original one eventually makes it to the list as well.)

There is an out-of-bounds write during optimization of q_spheres:

  Program received signal SIGSEGV, Segmentation fault.
  0x0040d494 in optimize_charge () at optimize_charge.f:95
  95 temp(l,recon_counter)=temp(l,j)+temp(l,i)
  (gdb) print recon_counter
  $1 = 27
  (gdb) print output_counter
  $2 = 24

(it tries to write at index 27, but the size is just 24, as defined by output_counter)

The files needed to reproduce this and the terminal output (together with the manual keyboard input needed to reproduce the crash) are at https://drive.google.com/open?id=1NZ8lSkfrgigtdQZrDZLp8Y-mFf4uSyk_ . I'm not a regular user of the pes program, so there is a high chance that something is wrong with my input.

BTW while taking a quick look I spotted some likely unrelated small issues. For instance, pes is also affected by the well-known issue with gfortran using the units 5 and 6 (you have to change them manually to something else, otherwise stdin and stdout don't work), and there are some valgrind warnings even before the crash, for example:

  ==57304== Conditional jump or move depends on uninitialised value(s)
  ==57304==    at 0x419919: abs_smooth_ (SPLINE.f:173)
  ==57304==    by 0x41852E: setup_ (SPLINE.f:91)
  ==57304==    by 0x4199FE: spline_ (SPLINE.f:16)
  ==57304==    by 0x41582B: read_database2_ (read_database2.f:124)
  ==57304==    by 0x403FE7: MAIN__ (pes.f:151)
  ==57304==    by 0x4066FF: main (pes.f:3)
  ==57304== Uninitialised value was created by a stack allocation
  ==57304==    at 0x4199B3: spline_ (SPLINE.f:1)

  SPLINE.f:173 if (x >= delta_x) then

The uninitialized variable is delta_x, which was passed from setup (SPLINE.f:91):

  call abs_smooth(m4 - m3, delta_x, w1)

It was itself allocated on the stack at the beginning of spline but is not initialized anywhere as far as I can see. So I set it to 0.0d0 (the default for ifort).

And one more, which should be harmless...

  ==57389== Conditional jump or move depends on uninitialised value(s)
  ==57389==    at 0x47213EC5: bcmp (vg_replace_strmem.c:1113)
  ==57389==    by 0x474A3B7A: _gfortran_compare_string (string_intrinsics_inc.c:98)
  ==57389==    by 0x41677F: read_outputst_ (read_outputst.f:37)
  ==57389==    by 0x4048E2: MAIN__ (pes.f:222)
  ==57389==    by 0x4066FF: main (pes.f:3)
  ==57389== Uninitialised value was created by a stack allocation
  ==57389==    at 0x416559: read_outputst_ (read_outputst.f:1)

Best regards
Pavel
Re: [Wien] Ask for help
Dear Siham,

yes, you can get the full dielectric tensor from Wien2k, just select the appropriate components in the case.inop and case.injoint files. If I remember correctly, the tensor is oriented so that xx is along the a lattice vector, yy lies in the plane defined by the a and b lattice vectors (perpendicular to xx), and zz is perpendicular to both xx and yy. So for some arbitrary direction you will have to do the transformation yourself.

Best regards
Pavel

On Tue, 2020-01-28 at 13:43 +0100, Siham Malki wrote:
> Dear All,
> I calculated the dielectric function with Wien2k, so I obtained this function vs. energy. I need to know how to change the wave vector of light q to determine the variation of the dielectric function as a function of the wave vector q. Can you help me please.
> Best regards
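One common way to do "the transformation yourself" is to project the tensor onto a unit vector n, i.e. to evaluate n^T eps n for the polarization direction of interest. The sketch below assumes this projection is what is wanted and that the direction is given in the same Cartesian frame as the tensor (the frame described in the reply); the function name epsilon_along and the numbers are hypothetical, and the tensor would be rebuilt at every energy point from the components computed by the optics programs.

```python
import math

def epsilon_along(eps, direction):
    """Project a 3x3 dielectric tensor onto a direction: n^T eps n.

    eps       -- 3x3 nested list (real or complex) in the Cartesian frame
    direction -- any non-zero 3-vector in the same frame; it is normalized here
    """
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    n = [c / norm for c in direction]
    return sum(n[i] * eps[i][j] * n[j] for i in range(3) for j in range(3))

# Hypothetical uniaxial example at a single energy: eps_xx = eps_yy = 4, eps_zz = 9.
eps = [[4.0, 0.0, 0.0],
       [0.0, 4.0, 0.0],
       [0.0, 0.0, 9.0]]
print(epsilon_along(eps, [0, 0, 1]))   # along z
print(epsilon_along(eps, [1, 0, 1]))   # halfway between x and z
```

If the direction is known in lattice coordinates instead, it first has to be converted to this Cartesian frame using the lattice vectors.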
Re: [Wien] Wien2k 19.1 with linux+gfortran benchmarks
I concur. In general, for the serial test case on a modern CPU (with AVX2 instructions) your runtime should be around or below 30 seconds for a single thread. However, as this is an almost 10-year-old mobile CPU with just AVX instructions, the total runtime of slightly above 1 minute is expected.

Regarding the scaling: even when not memory bound, I can get to around 35% of the serial runtime with OpenBLAS (MKL scales slightly better). Small speedups could probably be gained with some work on the HNS section (as this is the worst-scaling part that we have more or less under our control), but for the DIAG part we just depend on BLAS/LAPACK to scale properly.

If you have multiple k-points and your total memory permits it, it is best to use k-point parallelization and use OpenMP just for lapw0 and mixer...

Pavel

On Thu, 2019-12-12 at 13:42 +0100, Peter Blaha wrote:
> It is perfectly ok for your hardware.
>
> The cpu time is not so important for you, what counts is the WALL-time (this is the time it really takes until it finishes).
>
> You can see that Hamilt parallelizes fairly well (3.7 vs. 12.3 seconds / speedup factor 3.3), HNS is not so good (3.8 vs. 8.8 s / factor 2.3) and DIAG is worse (23.2 vs. 48.2 / factor 2.1).
>
> Part of the reason that you can never see a factor of 4 is the slow memory access, so when 4 cores do some calculations, they have to wait sometimes for data from the memory.
>
> On machines with more cores and a better memory bus, you will get other speed-ups, but basically no machine can use all cores with 100% efficiency because of this limited memory access.
>
> On 12/12/19 1:07 PM, Hemza wrote:
> > Hi everybody:
> > I just finished updating my wien2k installation to 19.1 with OpenMP support (linux (4.19.88), gfortran (9.2.0), openblas-lapack-openmp (0.3.7), fftw3 (3.3.8), libxc (4.3.4)), and patches from "https://github.com/gsabo/WIEN2k-Patches".
> > I intend to use it for relatively small cases (less than 25 atoms/unit cell). I ran 'x lapw1' on the test_case.
> > With OMP_NUM_THREADS=4 in bashrc:
> >
> >   $ x lapw1
> >   STOP LAPW1 END
> >   113.876u 2.097s 0:31.36 369.7% 0+0k 424+37840io 2pf+0w
> >   $ grep HORB *output1*
> >   test_case.output1: TIME HAMILT (CPU) =13.5, HNS = 12.6, HORB = 0.0, DIAG =87.3, SYNC = 0.0
> >   test_case.output1: TIME HAMILT (WALL) = 3.7, HNS = 3.8, HORB = 0.0, DIAG =23.2, SYNC = 0.0
> >
> > and with OMP_NUM_THREADS=1, I got:
> >
> >   $ x lapw1
> >   STOP LAPW1 END
> >   69.380u 0.339s 1:09.88 99.7% 0+0k 352+37848io 2pf+0w
> >   $ grep HORB *output1*
> >   test_case.output1: TIME HAMILT (CPU) =12.0, HNS = 8.8, HORB = 0.0, DIAG =48.1, SYNC = 0.0
> >   test_case.output1: TIME HAMILT (WALL) =12.3, HNS = 8.8, HORB = 0.0, DIAG =48.2, SYNC = 0.0
> >
> > I do not feel I really understand the output and I do not know if these timings are good, so I am eager to read your comments!
> >
> > My machine ('inxi -dm' output):
> >
> >   System:   Host: dojo Kernel: 4.19.88-1-lts x86_64 bits: 64 Desktop: i3 4.17.1 Distro: Artix rolling
> >   Machine:  Type: Laptop System: ASUSTeK product: K53SD v: 1.0 serial:
> >             Mobo: ASUSTeK model: K53SD v: 1.0 serial:
> >             BIOS: American Megatrends v: K53SD.202 date: 11/02/2011
> >   Battery:  ID-1: BAT0 charge: 33.8 Wh condition: 33.8/59.4 Wh (57%)
> >   Memory:   RAM: total: 7.57 GiB used: 4.84 GiB (63.9%)
> >             RAM Report: permissions: Unable to run dmidecode. Are you root?
> >   CPU:      Quad Core: Intel Core i7-2670QM type: MT MCP speed: 849 MHz min/max: 800/3100 MHz
> >   Graphics: Device-1: Intel 2nd Generation Core Processor Family Integrated Graphics driver: i915 v: kernel
> >             Device-2: NVIDIA GF119M [GeForce 610M] driver: nouveau v: kernel
> >             Display: x11 server: X.org 1.20.6 driver: intel,nouveau unloaded: fbdev,modesetting,vesa resolution:
> >             Message: Unable to show advanced data. Required tool glxinfo missing.
> >   Network:  Device-1: Intel Centrino Wireless-N 100 driver: iwlwifi
> >             Device-2: Qualcomm Atheros AR8151 v2.0 Gigabit Ethernet driver: atl1c
> >   Drives:   Local Storage: total: 2.05 TiB used: 1.45 TiB (70.8%)
> >   Info:     Processes: 300 Uptime: 1d 1h 46m Shell: bash inxi: 3.0.26
> >
> > regards
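The speedup factors quoted in the reply follow directly from the WALL times in the two case.output1 excerpts. A minimal sketch of that arithmetic, with the numbers copied from the posted output and a hypothetical helper name (speedups):

```python
def speedups(serial_wall, threaded_wall):
    """Speedup per section: serial wall time divided by threaded wall time."""
    return {part: serial_wall[part] / threaded_wall[part] for part in serial_wall}

# WALL times (seconds) from the two lapw1 runs posted in the thread.
serial = {"HAMILT": 12.3, "HNS": 8.8, "DIAG": 48.2}      # OMP_NUM_THREADS=1
threaded = {"HAMILT": 3.7, "HNS": 3.8, "DIAG": 23.2}     # OMP_NUM_THREADS=4
threads = 4

for part, factor in speedups(serial, threaded).items():
    # Efficiency is the fraction of the ideal speedup (the thread count).
    print(f"{part}: speedup {factor:.1f}, efficiency {factor / threads:.0%}")

overall = sum(serial.values()) / sum(threaded.values())
print(f"all three sections together: speedup {overall:.1f}")
```

Because DIAG dominates the runtime, its modest factor largely determines the overall speedup of lapw1 for this test case.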
Re: [Wien] 24k points on 36processors ??. (a Fractional k-point per core)
I'm neither a PBS nor a csh expert, but to me it looks like you are spawning just a single k-point job on each node, and for lapw0 also just a single process per node. If you run just a single k-point job per node and there is still not enough memory, then you probably need MPI. Or maybe the default memory PBS gives you is not enough (maybe you need to ask specifically for a larger amount, no idea).

For the example from my last email, to spawn 4 k-point jobs per node on three nodes with 3 OpenMP threads each, the final .machines file should look like this (with node1, node2 and node3 replaced by the actual node names from $PBS_NODEFILE):

1:node1
1:node1
1:node1
1:node1
1:node2
1:node2
1:node2
1:node2
1:node3
1:node3
1:node3
1:node3
granularity:1
extrafine:1
omp_lapw2:3
omp_lapw1:3
omp_lapw0:4
omp_global:12
lapw0: node1:3 node2:3 node3:3

I would advise you to read the .machines file section of the manual once more, try to understand what your .machines file should look like, and then consult with whoever wrote your PBS script in the first place to modify it so that it generates the .machines file you need.

Best regards
Pavel

BTW you are actually not asking for 12 CPUs but just 8 CPUs...

On Thu, 2019-12-12 at 17:04 +0530, Ashwani Kumar wrote:
> Dear Sir,
> Hyper-threading is disabled (just checked with the facility expert), so there are 12 physical cores per node (Intel Xeon, Nehalem-based architecture). Available memory is 4 GB/core (48 GB/node).
> lapw1 stops with the error "insufficient virtual memory", so I thought it better to use 36 cores for 24 k-points, as the extra memory (48 GB) will then be available. I am using the PBS queuing system (Wien2k 19.1 compiled with OpenMPI parallelization), which generates the .machines file when the job script is executed. How do I then set the OpenMP threads in the .machines file? (Job script attached for your reference.)
>
> thanks,
> A. kumar
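A minimal sketch of how a job script could build such a .machines file from the node list, assuming plain k-point parallelization for lapw1/lapw2 and MPI for lapw0 as in the example above. The function names and parameters are made up for illustration, the omp_global line is left out, and the real PBS script may need a different layout:

```python
def unique_nodes(pbs_nodefile_text):
    """Node names from the content of $PBS_NODEFILE, duplicates removed, order kept."""
    seen = []
    for name in pbs_nodefile_text.split():
        if name not in seen:
            seen.append(name)
    return seen


def build_machines(nodes, kjobs_per_node, omp_lapw1, lapw0_mpi_per_node, omp_lapw0):
    """Return the text of a .machines file (hypothetical helper, not part of Wien2k)."""
    lines = []
    # One "1:host" line per k-point job; the leading 1 is the relative weight.
    for node in nodes:
        lines.extend(f"1:{node}" for _ in range(kjobs_per_node))
    lines.append("granularity:1")
    lines.append("extrafine:1")
    # OpenMP threads per process for the k-parallel programs and for lapw0.
    lines.append(f"omp_lapw1:{omp_lapw1}")
    lines.append(f"omp_lapw2:{omp_lapw1}")
    lines.append(f"omp_lapw0:{omp_lapw0}")
    # lapw0 runs MPI-parallel: "host:number_of_processes" for every node.
    lapw0_hosts = " ".join(f"{node}:{lapw0_mpi_per_node}" for node in nodes)
    lines.append(f"lapw0: {lapw0_hosts}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Stand-in for the content of $PBS_NODEFILE (one line per allocated core).
    nodefile = "node1\nnode1\nnode2\nnode2\nnode3\nnode3\n"
    print(build_machines(unique_nodes(nodefile), 4, 3, 3, 4), end="")
```

With 3 nodes, 4 k-point jobs per node and 3 threads each, this gives 12 jobs times 3 threads, i.e. the 36 cores discussed in this thread.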
Re: [Wien] 24k points on 36processors ??. (a Fractional k-point per core)
Hi,

do you have hyperthreading or not (in other words, does the number 12 already mean 6 physical cores and 12 virtual ones, or 12 physical cores)? This might influence the advice a bit...

Otherwise you need to experiment; the optimal setting depends heavily on your specific CPU, memory speed and what you are calculating (system size).

For 24 k-points and 36 processors, running 4 k-points on each node (12 k-points in parallel) with 3 OpenMP threads each might be a reasonable setting.

It is also possible that leaving some cores idle is the best thing to do (when running a lot of k-points in parallel you can get limited by the memory speed, so leaving some cores idle means more memory bandwidth for the others). This would correspond to running 8 k-points on each node, or 4 k-points on each node with 2 OpenMP threads each.

The Linux kernel and modern processors are also usually good at handling a small overload and at load balancing, so you can also try to overload the system a bit, i.e., 8 k-points per node with 2 OpenMP threads each.

Just try the different settings (a single lapw1 run for each should be enough to get some idea) and compare the timings.

Best regards
Pavel

BTW for lapw0 I would go with something like 3 MPI processes per node with 4 OpenMP threads each in this case.

On Thu, 2019-12-12 at 12:28 +0530, Ashwani Kumar wrote:
> Hi,
> This is related to the number of k-points provided during the initialization. Number of k-points given to kgen: 120 with shifted mesh. Irreducible k-points: 24. The job runs on 3 nodes (12 x 3 processors, 48 GB x 3 RAM), but only on 24 processors (with granularity:1, extrafine:1 in the .machines file), which means 1 k-point per core. How can 24 k-points be made to run on 36 cores? Or how can 24 k-points be distributed equally between 36 cores (or, let's say, 12 k-points on 24 processors to make the calculation converge faster)?
>
> thanks,
> A. Kumar
Re: [Wien] Issues with Wien2k installation
On Wed, 2019-12-04 at 06:29 +, Vidit Zala wrote:
> Dear Sir,
> I am using Wien2k version 19.1 on a ThinkStation with an i7 processor running Ubuntu. I have just installed Wien2k on the machine, using the gfortran and gcc compilers.
> After the installation, I made a structure with the makestruct command and initialized it with the init_lapw command.
> While running the scf calculation with the run_lapw command, I am facing an error. I looked through the mailing list but haven't found a solution. The error is the following:
> hup: Command not found.

This is harmless (you can search some old mailing list threads for more details).

> no Fe.clmsum(_old) file found, which is necessary for lapw0

This is the real issue. The clmsum file contains the total charge density and should have been created during the init_lapw step. Were there any errors during the initialization? Is your structure reasonable?

Best regards
Pavel

> !grep: *scf1*: No such file or directory
> grep: lapw2*.error: No such file or directory
>
> >   stop error
>
> Please guide me to solve this query.
>
> Thanking you in anticipation.
>
> Regards,
> Vidit Zala
> Gujarat University
Re: [Wien] lapw2 crashed error
Hi,

I can't comment on the lapw2 error, just a small note about the .machines file. The four "100:localhost" lines mean that you run lapw1, lapw2 and hf in parallel over k-points (in four separate processes). The "omp_global:4" line means that every Wien2k process will try to use 4 threads. Therefore, the total load during lapw1 and lapw2 will be 16 on your four cores, potentially leading to a large overload and suboptimal speed. I would suggest replacing "omp_global:4" with the lines "omp_lapw0:4" and "omp_mixer:4", so that the OpenMP parallelization is used only in those parts of Wien2k where there is no parallelization over k-points.

Best regards
Pavel

On Tue, 2019-11-26 at 05:53 +0530, Peeyush kumar kamlesh wrote:
> Sir,
> I am using a single node with four cores. My machines file is below:
>
> 100:localhost
> 100:localhost
> 100:localhost
> 100:localhost
> granularity:1
> extrafine:1
> omp_global:4
>
> On Mon, Nov 25, 2019 at 10:06 PM Peeyush kumar kamlesh <peeyush.physik@gmail.com> wrote:
> > Hello Wien2k user,
> > Greetings!
> > I am running an scf cycle with the hf potential. When I run the command "run_lapw -hf -p", then after successful completion of 7 cycles I get an error in cycle 8. In the terminal it looks as follows:
> >
> > in cycle 8    ETEST: .000491915000   CTEST: .0035867
> > hup: Command not found.
> > LAPW0 END
> > LAPW0 END
> > LAPW1 END
> > LAPW1 END
> > LAPW1 END
> > LAPW1 END
> > sed: Command not found.
> > LAPW2 - Error. Check file lapw2.error
> > cp: cannot stat '.in.tmp': No such file or directory
> >
> > >   stop error
> >
> > When I checked the lapw2.error file I found the following details:
> >
> > 'LAPW2' - can't open unit: 10
> > 'LAPW2' -filename: /case.vector
> > 'LAPW2' - status: unknown form: unformatted
> > ** testerror: Error in Parallel LAPW2
> >
> > I also tried to search and understand the previous threads, but I was unable to do so. Kindly suggest why this error appears and how it can be resolved.
> >
> > Thanks and Regards
> > Peeyush Kumar Kamlesh
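The load arithmetic from the reply above (4 k-point processes times 4 threads = 16) can be sketched as a small check of a .machines file. This is only an illustration under stated assumptions: the function name is made up, only plain "weight:host" lines are counted (MPI-style "weight:host:n" lines are not handled), a specific omp_lapw1 line is assumed to take precedence over omp_global, and one thread per process is assumed when neither is given:

```python
from collections import Counter


def lapw1_load_per_host(machines_text):
    """Estimate threads per host during lapw1: k-point jobs times OpenMP threads per job."""
    jobs = Counter()
    omp = {}
    for raw in machines_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key.startswith("omp_"):
            omp[key] = int(value)      # e.g. omp_global:4, omp_lapw1:3
        elif key.isdigit():
            jobs[value] += 1           # "weight:host" line = one k-point job
        # everything else (granularity, extrafine, lapw0: ...) is ignored here
    threads = omp.get("omp_lapw1", omp.get("omp_global", 1))
    return {host: n * threads for host, n in jobs.items()}


if __name__ == "__main__":
    original = "100:localhost\n" * 4 + "granularity:1\nextrafine:1\nomp_global:4\n"
    suggested = "100:localhost\n" * 4 + "granularity:1\nextrafine:1\nomp_lapw0:4\nomp_mixer:4\n"
    print(lapw1_load_per_host(original))
    print(lapw1_load_per_host(suggested))
```

Comparing the returned number with the number of physical cores on the host shows whether a setting would overload or underload the node.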
[Wien] possible overload/underload with OpenMP or threading in general
Dear Wien2k mailing list,

in a recent discussion with profs. Marks and Blaha it turned out that under some circumstances the threading parallelization in Wien2k and its interaction with the threading environment variables of the BLAS/LAPACK library (MKL, but possibly also OpenBLAS and others) can behave unexpectedly, leading either to imperfect utilization of the nodes (underload) or to too many contending threads (overload), both of which slow the calculations down.

Short story with just three points:
- Occasionally check the load of your nodes when running (with "top", a similar program, or the reporting of your job scheduler). If it's much higher or lower than the number of cores, then this could be a problem and please continue reading.
- If you have previously set MKL_NUM_THREADS, OPENBLAS_NUM_THREADS or any other equivalent BLAS/LAPACK-specific threading variable, please unset them.
- If you linked with non-default MKL settings or with a different threaded BLAS/LAPACK such as OpenBLAS, please make sure that your BLAS/LAPACK library is internally threaded with OpenMP (not pthreads, TBB or any other threading library) and that it uses the same OpenMP library as Wien2k (one example of a problematic configuration is Wien2k compiled with gfortran and MKL, using libiomp5 for MKL threading but libgomp for the OpenMP threading in Wien2k itself).

Best regards
Pavel

P.S.: Long story for people interested in the technical details:

Wien2k links with the threaded MKL by default, and a threaded OpenBLAS is usually also the default that distributions provide. In Wien2k versions before 19, when running k-parallel without OMP_NUM_THREADS (or the BLAS-specific equivalent environment variables) set, the parallel BLAS/LAPACK libraries usually tried to use the maximum number of cores, leading to overload if multiple k-points were running on a single node. This was fixed in Wien2k 19.1, where the threading is explicitly controlled from the .machines file and, when no threading is specified, defaults to one thread per process.

Another problem is with the BLAS/LAPACK-specific threading variables such as MKL_NUM_THREADS, OPENBLAS_NUM_THREADS, etc. They have higher priority than OMP_NUM_THREADS, which Wien2k sets internally based on the omp_xxx:y lines in the .machines file, and therefore they can override the threading chosen by the user. Unsetting them makes the parallel BLAS/LAPACK obey the settings from the .machines file.

More problems can occur when combining different threading models in Wien2k and BLAS/LAPACK (such as OpenMP and POSIX threads), or when using OpenMP threading in both but with different OpenMP libraries (for example Intel and GNU). This is most likely to happen when using gfortran and a distro-provided OpenBLAS, as its default threading is with pthreads. The OpenMP parallelization in Wien2k works in such a way that there are some explicit OpenMP parallel regions which may also contain BLAS/LAPACK calls. In other places the BLAS/LAPACK calls are made from serial regions and we depend on the parallelization at the BLAS/LAPACK level. If OpenMP and the same OpenMP library are used everywhere, then in the first case the BLAS/LAPACK library recognizes that it is already being called from a parallel region and runs single-threaded, while in the second case it runs with multiple threads as expected. If threading models or threading libraries are mixed, the BLAS/LAPACK calls from OpenMP parallel regions have no way of knowing that there are already multiple threads, and each of them can spawn more threads, leading again to overload.
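The second point of the short story can be checked with a few lines of Python before a job is started. This is just a sketch: the two variable names are the ones mentioned above, other BLAS libraries may use further variables, and the helper names are made up for illustration:

```python
import os

# BLAS/LAPACK-specific threading variables named in the message above.
BLAS_THREAD_VARS = ("MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def blas_thread_overrides(environ=None):
    """Return the BLAS-specific threading variables that are currently set."""
    env = os.environ if environ is None else environ
    return {name: env[name] for name in BLAS_THREAD_VARS if name in env}


def without_blas_thread_overrides(environ=None):
    """Copy of the environment with those variables removed, e.g. for a subprocess."""
    env = os.environ if environ is None else environ
    return {k: v for k, v in env.items() if k not in BLAS_THREAD_VARS}


if __name__ == "__main__":
    found = blas_thread_overrides()
    if found:
        print("Set and possibly overriding the .machines threading:", found)
    else:
        print("No BLAS-specific threading variables are set.")
```

In a csh or bash job script the same effect is achieved by simply unsetting the variables before calling run_lapw.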
Re: [Wien] lapw2c tries to read an anomalous amount of data
If the maximum memory used by a lapw1c process scales linearly with the number of threads even at low thread counts, then something strange is going on. In fact, the most memory-consuming part, the diagonalization, should take more or less the same amount of memory independent of the number of threads. The only part in which memory consumption scales roughly with the number of threads is the hamilt routine at the beginning of lapw1; however, unless you use more than ~10 threads, it should still consume less than the later diagonalization. Therefore it should not increase the maximum memory footprint, and at small thread counts the maximum memory consumption of both lapw1 and lapw2 shouldn't depend much on the number of threads. If you see something different, then please provide some specific numbers so I can check it.

You are also right that the vector size is unchanged, and therefore the total amount of I/O is of course the same with and without OpenMP. However, with k-point parallelization the issues can be caused by all the processes running at the same time and accessing the vector files simultaneously, so when you reduce the number of processes the peak I/O load should be reduced and better balanced. Prof. Blaha already explained this in his email on Friday.

Another thing which can unexpectedly influence the results is the filesystem cache. The Linux kernel is usually quite clever at caching the most recently accessed files in unused memory, so this can help significantly as well (and again depends on the overall memory pressure).

Hope this helps to explain it.
Pavel

On Mon, 2019-07-29 at 12:19 +0200, Luc Fruchter wrote:
> What comes as a surprise (for me) is that the memory needed for the lapw2s does not scale with the number of CPUs, while it does for the lapw1s: when I reduce the number of CPUs, the lapw1s' memory scales down proportionally, while the total .vector file size is unchanged, and so there is no improvement in handling them with the lapw2s.
Re: [Wien] lapw2c tries to read an anomalous amount of data
The easiest solution to reduce memory pressure is to reduce the number of k-points run in parallel... You should experiment with other parallelization options. If running 4 k-points in parallel does not fit in your memory (or is slow), try running, for example, just two k-points but with 2 OpenMP threads per process. Using MPI is another option, and it also reduces the memory required per CPU.

On Fri, 2019-07-26 at 09:37 +0100, Laurence Marks wrote:
> If I remember right, the largest piece of memory is the vector file, so this should be a reasonable estimate.
>
> During the scf convergence you can reduce this by carefully changing the numbers at the end of case.in1(c). You don't really need to go to 1.5 Ryd above E_F (and similarly reduce nband for ELPA). For DOS etc. later you increase these and rerun lapw1 etc.
>
> On Fri, Jul 26, 2019 at 9:27 AM Luc Fruchter wrote:
> > Yes, I have shared memory. Swap on disk is disabled, so the system must manage differently here.
> >
> > I just wonder now: is there a way to estimate the memory needed for the lapw2s without running the scf up to these? Is this the total .vector size?
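A small sketch of the estimate discussed above, i.e. summing the sizes of the vector files after a lapw1 run. It assumes that all vector files sit in one directory and have ".vector" in their name (case.vector, case.vector_1, case.vectorup_1, ...); whether the total is a good proxy for the lapw2 memory is the rough guess from the quoted message, not something verified here. The function name is made up for illustration:

```python
from pathlib import Path


def vector_files_size(directory):
    """Total size in bytes of all files in `directory` whose name contains '.vector'."""
    total = 0
    for path in Path(directory).iterdir():
        if path.is_file() and ".vector" in path.name:
            total += path.stat().st_size
    return total


if __name__ == "__main__":
    # Run in the case directory (or point it at $SCRATCH if the vectors are written there).
    size = vector_files_size(".")
    print(f"vector files: {size / 1024**3:.2f} GiB")
```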
Re: [Wien] 3p or 4p PDOS for 3d transition metal
If I remember correctly, I had some similar problems in the past when the width of the deep-lying states was too narrow (with respect to the energy step). Try to reduce the energy step in tetra, or maybe increase the broadening...

Best regards
Pavel

On Fri, 2019-07-12 at 15:30 +0800, 杨柯 wrote:
> Thanks for the reply.
>
> The problem is that the energy levels of the 3p and 3p* orbitals, treated as valence states of the Co atom, do not show up in my case.dos1eVup file.
> I already changed Emin in the case.inst file. Still the 3p and 3p* states at about -4.5 Ry do not show up.
> I checked the case.qtlup file; the energies start at -4 Ry. It looks like the energies around -4.5 Ry are lost.
> I have no idea what happened.
> Any suggestions are welcome.
>
> > -----Original message-----
> > From: t...@theochem.tuwien.ac.at
> > Sent: 2019-07-12 15:06:21 (Friday)
> > To: "A Mailing list for WIEN2k users" <wien@zeus.theochem.tuwien.ac.at>
> > Cc:
> > Subject: Re: [Wien] 3p or 4p PDOS for 3d transition metal
> >
> > Hi,
> >
> > In WIEN2k, the DOS is plotted only for the valence states (the core states are not shown). If they were plotted, each core state would just correspond to a vertical line.
> >
> > In the case where two sets of states with the same angular momentum but different principal quantum number were both treated as valence, case.outputst would tell you approximately where they should appear in the DOS.
> >
> > FT
> >
> > On Friday 2019-07-12 03:13, 杨柯 wrote:
> >
> > > Date: Fri, 12 Jul 2019 03:13:13
> > > From: 杨柯
> > > Reply-To: A Mailing list for WIEN2k users <wien@zeus.theochem.tuwien.ac.at>
> > > To: A Mailing list for WIEN2k users <wien@zeus.theochem.tuwien.ac.at>
> > > Subject: Re: [Wien] 3p or 4p PDOS for 3d transition metal
> > >
> > > Dear Blaha,
> > >
> > > Thank you very much for your detailed reply.
> > >
> > > I have another question that I hope you could help me with.
> > >
> > > The case.outputst file has the information about which orbitals are treated as core states and which as valence states.
> > > In the case I am working on, for example for the Co atom, 1s, 2s, 2p, 3s were treated as core states and 3p, 3d, 4s as valence states.
> > > With the initial input like this, I use "x lapw2 -orb -up/-dn -qtl" for the PDOS, "configure_int" to choose tot, s, p and the projected d orbitals, and "x tetra -up/-dn" to show the DOS.
> > > It is obvious that the d orbital is the 3d orbital of the Co atom.
> > > But I'm not sure which orbitals the s and p PDOS correspond to: 3s, 3p or 3p, 4s?
> > > Is the principal quantum number (n) of the ns, np PDOS related to the orbitals chosen as core states and valence states?
> > >
> > > I hope you can help me clear up my doubts.
> > >
> > > --
> > > Yours sincerely,
> > > Ke Yang
> > > Email: kyan...@fudan.edu.cn
> > > Address: Department of Physics, Fudan University, Handan Road 220, Shanghai 200433, China
> > >
> > > > -----Original message-----
> > > > From: "Peter Blaha"
> > > > Sent: 2019-07-11 23:20:37 (Thursday)
> > > > To: wien@zeus.theochem.tuwien.ac.at
> > > > Cc:
> > > > Subject: Re: [Wien] 3p or 4p PDOS for 3d transition metal
> > > >
> > > > Obviously, 3s,3p and 4s,4p states differ in their energy. The principal QNs are not "labelled" explicitly.
> > > >
> > > > Depending on which TM atom you have, the 3s,3p states may range from -2.0 (Sc) to -7 Ry (Cu). Eventually, their bandwidth can be very small and usually we do not "plot" the corresponding DOS.
> > > >
> > > > The 4s,4p states are in the valence region around EF. However, their wave functions are very delocalized and thus have very little contribution inside the atomic sphere. One 4s electron may eventually have only 0.1 electrons within the sphere, thus the corresponding PDOS will be very small.
> > > > For these reasons, I'd recommend using the recent xps package (see UG). If you provide all possible PDOS (of all atoms and all l values), this package can renormalize the PDOS such that the interstitial DOS is "removed" and distributed into the corresponding atomic PDOS.
> > > >
> > > > Am 11.07.2019 um 16:54 schrieb 杨柯:
> > > > > Dear Blaha and others,
> > > > >
> > > > > Now, I'm trying to plot the 3s,3p or 4s,4p PDOS for a 3d transition metal.
> > > > >
> > > > > I'm not sure whether the standard output s,p orbitals for a transition metal are 3s,3p or 4s,4p.
> > > > >
> > > > > If not, how can I obtain the 3s,3p,4s,4p PDOS for the transition metal?
> > > > >
> > > > > Any suggestions are welcome.
> > > > >
> > > > > Thank you very much for your reply.
> > > > >
> > > > > --
> > > > >
> > > > > Yours
Re: [Wien] FFTW compiling
Maybe Gavin can clarify, but I'm actually not sure the instructions in the mentioned email are 100% correct. FFTW is compiled there with the default static libs (libfftw3.a and libfftw3_mpi.a), but the -lfftw3 -lfftw3_mpi flags used for linking are for the dynamic libraries (libfftw3.so and libfftw3_mpi.so). My guess is that it worked for Gavin because there was another, dynamic fftw somewhere in the path (e.g., the system fftw libraries)?

Anyway, if you get libfftw3.a but not libfftw3_mpi.a even when you pass --enable-mpi to the fftw configure, there is likely some problem with your MPI installation. Were there any warnings during the fftw configure? BTW on some distributions like Fedora, only installing openmpi or mpich is not enough; you must also load it to set all the paths properly (for example on Fedora with "module load openmpi"). This is distribution specific though.

To get the dynamic library, just pass --enable-shared to the fftw configure; the Wien2k flags should then work (provided you can figure out the missing MPI fftw lib). However, I would highly recommend getting fftw from the distribution repositories (something like the libfftw3-mpi-dev and libfftw3-dev packages is needed, but the specific package names vary with the distribution as well), unless you are on some enterprise system with a legacy fftw version.

Best regards
Pavel

On Wed, 2019-07-03 at 14:57 +0530, Riyajul Islam wrote:
> Hello Wien2k users,
> I am facing a problem with compiling FFTW in siteconfig. I have followed all the steps as in "https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg18664.html", but could not access "/home/username/fftw3/mpi/.libs/libfftw3_mpi.a" as described in the mailing list.
> The current settings for FFTW are:
> F FFTW options: -DFFTW3 -I/home/edison1/fftw3/include
> FFTW-LIBS: -L/home/edison1/fftw3/lib64 -lfftw3
> FFTW-PLIBS: NOT FOUND!
>
> How can I solve this issue?
>
> Regards,
> Riyajul Islam
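To see quickly which of the four library files discussed above actually exist in an FFTW lib directory, something like the following can be used. It is only a convenience sketch with a made-up helper name; the directory in the example is the one from the quoted settings, and it checks file names only, not whether the libraries are usable:

```python
from pathlib import Path

# Static and shared variants of the serial and MPI FFTW3 libraries.
FFTW_LIBS = ("libfftw3.a", "libfftw3_mpi.a", "libfftw3.so", "libfftw3_mpi.so")


def fftw_libs_present(libdir):
    """Map each expected library file name to True/False depending on whether it exists."""
    d = Path(libdir)
    return {name: (d / name).exists() for name in FFTW_LIBS}


if __name__ == "__main__":
    for name, found in fftw_libs_present("/home/edison1/fftw3/lib64").items():
        print(f"{name:18s} {'found' if found else 'missing'}")
```

A missing libfftw3_mpi.* points to the MPI problem during the fftw configure, while missing .so files mean that --enable-shared was not used.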
Re: [Wien] lapw2c tries to read an anomalous amount of data
Just out of curiosity, how do you measure the current disk usage per process? Also, how large is your memory consumption? Very high disk I/O (and bad speed in general) can also be associated with swapping. lapw2 can be the most memory-intensive part of the scf cycle.

Best regards
Pavel

On Tue, 2019-07-02 at 19:10 +0200, Luc Fruchter wrote:
> I am facing a problem with lapw2c on a machine running version 18.2 of Wien2k. I suspect this is a machine problem rather than a Wien2k one, but would like to be sure:
>
> As lapw2c runs in parallel in a cycle, the lapw2c processes all try to read a very large amount of data from the disk (several hundred GB), keeping the system busy endlessly.
>
> As the input files for lapw2c (.energy, .vector) are only a few GB, this reading of hundreds of GB seems suspect, and rather a disk problem. However, all lapw2c processes experience the same problem when run in parallel, which I would not expect if the reading of one file was problematic.
>
> On the other hand, the first cycles ran without any trouble.
>
> Thanks
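One way to look at both questions (disk I/O and swapping per process) on Linux is to read /proc/<pid>/io and /proc/<pid>/status. The sketch below assumes a Linux /proc filesystem and permission to read the files of the process; the helper names are made up for illustration:

```python
def parse_proc_fields(text):
    """Parse 'name: value [unit]' lines into {name: int}, skipping non-numeric entries."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def process_io_and_swap(pid):
    """Bytes really read/written from storage plus resident and swapped memory (kB)."""
    with open(f"/proc/{pid}/io") as fh:
        io = parse_proc_fields(fh.read())
    with open(f"/proc/{pid}/status") as fh:
        status = parse_proc_fields(fh.read())
    return {
        "read_bytes": io.get("read_bytes"),
        "write_bytes": io.get("write_bytes"),
        "VmRSS_kB": status.get("VmRSS"),
        "VmSwap_kB": status.get("VmSwap"),
    }


if __name__ == "__main__":
    import os
    print(process_io_and_swap(os.getpid()))
```

A large and growing VmSwap for the lapw2c processes would point to swapping rather than to a disk problem.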
Re: [Wien] core-hole calculation in a molecule
So just a brief follow-up, in case someone finds this interesting.

First of all, I made a mistake in my previous calculations: there actually is some dependence on the supercell size for Slater's transition state approach. However, the difference in binding energies is only ~0.05-0.1 eV when going from 5 Å to 10 Å of vacuum, and it changes by less than ~0.01 eV when going further to 15 Å...

I've tried the Δ-SCF approach as well and it is much worse. The difference in binding energies is ~0.3-0.4 eV when going from 5 Å to 10 Å of vacuum, and it changes by another ~0.2 eV when going further to 15 Å...

The absolute energy values are better for the Δ-SCF approach (by approx. 0.5 eV), but since we are about 5 eV from the absolute experimental values anyway, this is likely meaningless. For example, taking LDA instead of PBE can change the absolute values by > 2 eV. More importantly, the relative shifts between different carbon atoms (compared with experimental data) are also better for Slater's transition state than for the Δ-SCF approach (with Slater's transition state I get around 10% difference from experiment, while for Δ-SCF it is more like 20%).

In general I'm very happy with the calculations now, except for the speed ;-)

Best regards
Pavel
Re: [Wien] core-hole calculation in a molecule
On Wed, 2019-06-19 at 16:25 +0200, Peter Blaha wrote:
> This is certainly interesting.
>
> For a molecule an alternative is to remove one electron and then use E-tot(N) - E_tot(N-1) as binding energy. However, in this case, due to the charged cells, I'd expect quite some dependency on the cell size, and some correction might be necessary.
>
> Your findings indicate that Slater's transition state method is much better.

I will try the Δ-SCF approach as well to see if it behaves differently. But still, I've now done a lot of similar calculations and there was always some dependency on the cell size, so this is a really big surprise...

BTW for Δ-SCF "E-tot(N) - E_tot(N-1)" is not enough; μ is needed as well, which surprisingly no manuals mention...

> On the other hand: If you really want to do only organic molecules (but many of them), any non-periodic molecular code (eg. NWChem, which is free) will be MUCH cheaper and faster.

Right, the problem is that ultimately I would like to study the interaction with a surface as well (and look for changes), so I still need periodic boundary conditions. In general I agree: when hydrogen comes into play, the lapw approach is super slow... For now I'm just exploring this, so burning some extra CPU time is not an issue if it ultimately saves me the trouble of learning yet another DFT code.

> Your last question, comparison to bulk materials, you have to find out yourself.
> I would not expect perfect agreement with experiment in all cases, simply because of the problem of having a common energy zero (we use EF for this, but EF is well defined only in metals; the VBM of an insulator or the HOMO of a molecule is not the same "Fermi energy").
>
> Suppose you put a molecule far away from a metal surface: the DFT simulation will give you a common EF (which is most likely not where the HOMO of the molecule is). Thus (E-1s - EF) will be different if you do a combined system or the molecule alone, even when the molecule is so far away that it behaves as a free molecule.

I'm actually hoping to use the core electron binding energies of atoms far from the surface (in both the bulk and the molecule) as a reference to check the core electron binding energy shifts of atoms directly at the surface, but I don't know how this will work in reality.

Thanks for all the feedback on the list (and in the off-list emails I've received as well)
Pavel
[Wien] core-hole calculation in a molecule
Dear Wien2k mailing list, I'm trying to calculate core electron binding energies using Slater's transition state approach (half an electron removed from the core, compensated by the background charge) in an organic molecule. As part of the usual convergence checking I did four calculations with different amounts of vacuum (5 Å, 10 Å, 15 Å and 20 Å in all directions) in order to get a trend and try to extrapolate the final values. This is similar to the approach I use for insulators (increasing the supercell size) to estimate the supercell size error due to the Coulomb interaction between the periodic images of the charged atom. However, to my surprise, there is practically no change in the binding energies (~0.01 eV). Thinking about it more it makes sense though, as there is no screening in the vacuum, so there probably is no reduction of the interaction (like in the simple electrostatic example where the electric field intensity next to an infinite charged plane doesn't depend on the distance to it). I'm looking for advice on whether someone has already tried something like this and whether this kind of calculation (i.e., a core hole for a molecule, a single atom, or even a 2D material) actually makes sense from the physical point of view and also within the lapw framework... For now I'm comparing the relative shifts of the core electron binding energies of different carbon atoms within the molecule, and the results agree quite well with the literature. However, I'm not sure how much I can trust the results and whether I can also compare the values with bulk materials. Any advice would be appreciated. Best regards Pavel
Re: [Wien] compiling error of lapw1_mpi
The first option to fix this is to compile with ELPA (which I would recommend anyway as it is much faster).

To fix the compilation without ELPA you can move the definitions outside the "#ifdef ELPA" block, i.e., change line number 12 in seclr4.F from:

NPE, myrowhs, mycolhs,ictxt,ictxtall, NBELW_PARA, &

to

NPE, myrowhs, mycolhs,ictxt,ictxtall, NBELW_PARA, lcolhs, dlcolhs, &

and line number 14 from

lcolhs, dlcolhs, elpa_switch, scala_switch, elpa_mem_default

to

elpa_switch, scala_switch, elpa_mem_default

This should fix the compilation, but I'm not familiar enough with this part of the code to say how it's intended to work. Hopefully prof. Blaha can confirm this is the correct solution.

Best regards
Pavel

On Fri, 2019-06-14 at 11:31 +0200, Wien2k User wrote:
> YES, we are compiling lapw1_mpi with just Intel MPI, without ELPA
Re: [Wien] compiling error of lapw1_mpi
I'll make a guess that this is with MPI but without ELPA?

The DLCOLHS and the others are defined inside the "#ifdef ELPA" block

#ifdef ELPA
lcolhs, dlcolhs, elpa_switch, scala_switch, elpa_mem_default
#else

but are also used in "#ifdef parallel" blocks (specifically the one starting from line 515). There is also an error on line 712, probably for the same reason, but I got lost in the ifdef magic there so dunno.

Pavel

On Fri, 2019-06-14 at 09:36 +0200, Peter Blaha wrote:
> Sorry, but I cannot reproduce this.
>
> On 6/14/19 1:10 AM, Wien2k User wrote:
> > Dear Prof. P. Blaha
> >
> > I got this error when compiling lapw1_mpi with mpiifort intel cluster edition 2018
> >
> > seclr4_tmp_.F(520): error #6404: This name does not have a type, and must have an explicit type. [DLCOLHS]
> > allocate(H(DLDHS,DLCOLHS))
> > seclr4_tmp_.F(520): error #6385: The highest data type rank permitted is INTEGER(KIND=8). [DLCOLHS]
> > allocate(H(DLDHS,DLCOLHS))
> > seclr4_tmp_.F(524): error #6385: The highest data type rank permitted is INTEGER(KIND=8). [DLCOLHS]
> > allocate(Z(DLDHS,DLCOLHS))
> > seclr4_tmp_.F(712): error #6404: This name does not have a type, and must have an explicit type. [LCOLHS]
> > deallocate(Z) ; allocate(Z(LDHS,LCOLHS))
> > seclr4_tmp_.F(712): error #6385: The highest data type rank permitted is INTEGER(KIND=8). [LCOLHS]
> > deallocate(Z) ; allocate(Z(LDHS,LCOLHS))
Re: [Wien] System configuration
Right, as was written in the previous email, the provided config is a weird mix of ifort and gfortran options. Also, at some point in siteconfig you chose a parallel build, which now fails.

> SRC_dstart/compile.msg:make: *** [para] Error 2

All the errors which I have seen up to now are from building the parallel MPI programs. It is likely that the serial stuff still built fine. BTW, even after fixing the flags (using for example the instructions in Gavin's email) you will still be missing the MPI libraries, therefore it will not help much. Unfortunately, I don't know how to disable the parallel build after it has been enabled (and in general siteconfig is not very good at completely clearing already set options), so you just have to ignore the errors for now and hope that the rest is fine (or clean your Wien2k folder, start from scratch with a fresh gfortran config, and when it asks you about finegrained parallel just say no).

The more important thing is: after using the new compile flags I suggested in an earlier email, together with -lopenblas instead of the -llapack -lblas flags for the linker (and optionally with the provided patch), is lapw1 faster?

Best regards
Pavel

On Thu, 2019-05-23 at 20:18 -0600, Gavin Abo wrote:
> The -mp1, -pad, -traceback, and so on look like ifort-specific compiler flags.
> If you are using gfortran, compiler flags for gfortran need to be used for the Compiling Options in siteconfig. A good starting point is to use the "Recommended options" by siteconfig for linuxgfortran, which is seen in the post [1], before you start customizing it with your own flags. For example, gfortran has -fbacktrace [2] instead of the -traceback that ifort has [3].
> [1] https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg17903.html
> [2] https://gcc.gnu.org/onlinedocs/gfortran/Option-Summary.html
> [3] https://software.intel.com/en-us/fortran-compiler-developer-guide-and-reference-traceback
>
> On 5/23/2019 12:37 PM, Indranil mal wrote:
> > I did the patching but after compiling I am getting the
> > SRC_dstart/compile.msg:gfortran: error: buffered_io: No such file or directory
> > SRC_dstart/compile.msg:gfortran: error: unrecognized command line option ‘-mp1’
> > SRC_dstart/compile.msg:gfortran: error: unrecognized command line option ‘-prec_div’; did you mean ‘-mrecip’?
> > SRC_dstart/compile.msg:gfortran: error: unrecognized command line option ‘-pc80’; did you mean ‘-mpc80’?
> > SRC_dstart/compile.msg:gfortran: error: unrecognized command line option ‘-pad’
> > SRC_dstart/compile.msg:gfortran: error: unrecognized command line option ‘-ip’; did you mean ‘-p’?
> > SRC_dstart/compile.msg:gfortran: error: unrecognized command line option ‘-traceback’
> > SRC_dstart/compile.msg:gfortran: error: unrecognized command line option ‘-assume’; did you mean ‘-msse’?
> > SRC_dstart/compile.msg:make[1]: *** [module.o] Error 1
> > SRC_dstart/compile.msg:make: *** [para] Error 2
> > ...
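A quick way to spot such leftovers before recompiling is to scan the flag string for the options gfortran rejected in the compile.msg output in this message. The sketch below is only an illustration: the function name ifort_only_flags and the IFORT_ONLY list are made up here, and the list contains only the flags that appear in the quoted errors, not every ifort-specific option.

```shell
#!/bin/sh
# Flags that gfortran rejected in the compile.msg output quoted in this thread.
# This list is an assumption built from those errors; it is not exhaustive.
IFORT_ONLY="-mp1 -prec_div -pc80 -pad -ip -traceback -assume"

# Print each word of the flag string "$1" that appears in IFORT_ONLY.
# The body runs in a subshell so the loop variables do not leak out.
ifort_only_flags() (
    for flag in $1; do
        for bad in $IFORT_ONLY; do
            if [ "$flag" = "$bad" ]; then
                printf '%s\n' "$flag"
            fi
        done
    done
)

# FPOPT as posted in this thread (single quotes keep $(MKLROOT) literal).
FPOPT='-O1 -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include'
ifort_only_flags "$FPOPT"
```

An empty result for the flags actually in use would suggest that at least the options named in these errors are gone; it says nothing about other flags.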
Re: [Wien] System configuration
I'm also putting this back on the list after I received several private emails.

Your timings and the ldd output show that you are linking against the reference LAPACK and BLAS. You need to replace -llapack -lblas in R_LIBS with -lopenblas (this was discussed before in this thread: https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg18194.html )

Also your config is a weird mix of ifort and gfortran options, which results in a ton of errors for the parallel programs (as was shown in another off-the-list email). At this moment this doesn't matter, as we need to get the serial stuff working first.

Best regards
Pavel

"
grep "TIME HAMILT" test_case.output1
TIME HAMILT (CPU) = 22.8, HNS = 12.3, HORB = 0.0, DIAG = 78.9
TIME HAMILT (WALL) = 22.9, HNS = 12.4, HORB = 0.0, DIAG = 78.9
"
"
current:FOPT:-ffree-form -O2 -ffree-line-length-none
current:FPOPT:-O1 -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include
current:LDFLAGS:$(FOPT)
current:DPARALLEL:'-DParallel'
current:R_LIBS:-llapack -lblas -lpthread
current:FFTWROOT:
current:FFTW_VERSION:
current:FFTW_LIB:
current:FFTW_LIBNAME:
current:LIBXCROOT:/opt/etsf/
current:LIBXC_FORTRAN:xcf03
current:LIBXC_LIBNAME:xc
current:LIBXC_LIBDNAME:lib/
current:SCALAPACKROOT:
current:SCALAPACK_LIBNAME:
current:BLACSROOT:
current:BLACS_LIBNAME:
current:ELPAROOT:
current:ELPA_VERSION:
current:MPIRUN:mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_
current:CORES_PER_NODE:1
current:MKL_TARGET_ARCH:intel64
current:RP_LIBS:

linux-vdso.so.1 (0x7ffd78bac000)
liblapack.so.3 => /usr/lib/x86_64-linux-gnu/liblapack.so.3 (0x15344ad82000)
libblas.so.3 => /usr/lib/x86_64-linux-gnu/libblas.so.3 (0x15344ab15000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x15344a8f6000)
libgfortran.so.4 => /usr/lib/x86_64-linux-gnu/libgfortran.so.4 (0x15344a517000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x15344a179000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x153449d88000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x153449b7)
/lib64/ld-linux-x86-64.so.2 (0x15344ba2e000)
libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x15344993)

On Thu, May 23, 2019 at 8:52 PM Indranil mal wrote: " Thanks a lot. Sir my calculations are running when I do the x lapw1 may be due to that this time is too long. I have installed ifort and intel mpi mkl but could not configured that is why I am using GFORTRAN and gcc the basic gnu compiler and open blas. If you dont mind you can access my pc through team viewer.

On Thu, May 23, 2019 at 7:50 PM Pavel Ondračka wrote: "Well, first we need to figure out why your serial lapw is so slow... You definitely don't have the libmvec patches, however an almost two-minute runtime suggests that even your BLAS might be bad? In the test_case folder run: $ grep "TIME HAMILT" test_case.output1 and post the output. Also please go to the Wien2k folder and send the output of $ cat WIEN2k_OPTIONS and $ ldd lapw1 The next Wien2k version will have this simplified, however for now some patching needs to be done. The other option would be to get MKL and ifort from Intel and use them instead... Anyway if you don't want MKL, you need to download the attached patch to the SRC_lapw1 folder in the Wien2k base folder. Go to the folder, and apply the patch with (you might need the patch package for that) $ patch -p1 < lapw1.patch then set the FOPT compile flags via siteconfig to: -ffree-form -O2 -ffree-line-length-none -march=native -ftree-vectorize -DHAVE_LIBMVEC -fopenmp and recompile lapw1. Now when you again do $ ldd lapw1 it should show a line with "libmvec.so.1 => /lib64/libmvec.so.1" Compare timings again with the test_case.
Also try: $ OMP_NUM_THREADS=2 x lapw1 $ OMP_NUM_THREADS=4 x lapw1 And after each run show total timings as well as $ grep "TIME HAMILT" test_case.output1 Hopefully, you are already linking the multithreaded Openblas (but dunno what is the Ubuntu default)... I'll help you with the parallel execution in the next step. Best regards Pavel On Thu, 2019-05-23 at 18:58 +0530, Indranil mal wrote: > Dear sir > > After running x lapw1 I got the following > > ~/test_case$ x lapw1 > STOP LAPW1 END > 114.577u 0.247s 1:54.82 99.9% 0+0k 0+51864io 0pf+0w > > I am using parallel k point execution only 8 GB memory is in use and > for 100 atom (100 kpoints) calculation it is taking around 12 hours > to complete one cycle. > please help me. > > Thanking you > > Indranil > > On Thu, May 23, 2019 at 11:22 AM Pavel Ondračka < > pavel.ondra...@email.cz(mailto:pavel.ondra...@email.cz)> wrote: > > Hi Indranil, > > > > While the
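The ldd check described in this message can be scripted. The following is a minimal sketch, assuming only that the output of ldd lapw1 is piped in; the function name blas_report is made up, and matching on file names is a heuristic (on some distributions libblas.so.3 may itself point to an optimized BLAS, so a generic name is a hint rather than proof).

```shell
#!/bin/sh
# Read `ldd` output on stdin and print which of a few library names occur.
# Heuristic only: it looks at library file names, nothing more.
blas_report() {
    input=$(cat)
    for lib in libopenblas libmkl libmvec liblapack.so libblas.so; do
        case $input in
            *"$lib"*) printf '%s\n' "$lib" ;;
        esac
    done
}

# Example with three lines taken from the ldd output posted in this thread.
# Real use from the Wien2k folder would be:  ldd lapw1 | blas_report
blas_report <<'EOF'
liblapack.so.3 => /usr/lib/x86_64-linux-gnu/liblapack.so.3 (0x15344ad82000)
libblas.so.3 => /usr/lib/x86_64-linux-gnu/libblas.so.3 (0x15344ab15000)
libgfortran.so.4 => /usr/lib/x86_64-linux-gnu/libgfortran.so.4 (0x15344a517000)
EOF
```

For the posted output this lists only liblapack.so and libblas.so, with no libopenblas and no libmvec, which matches the diagnosis given above.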
Re: [Wien] System configuration
Hi Indranil,

I'm sending this again, this time also to the list (I hadn't noticed you removed it), in the hope it might be useful for someone optimizing with gfortran as well...

Pavel

"Well, first we need to figure out why your serial lapw is so slow... You definitely don't have the libmvec patches, however an almost two-minute runtime suggests that even your BLAS might be bad? In the test_case folder run:

$ grep "TIME HAMILT" test_case.output1

and post the output. Also please go to the Wien2k folder and send the output of

$ cat WIEN2k_OPTIONS

and

$ ldd lapw1

The next Wien2k version will have this simplified, however for now some patching needs to be done. The other option would be to get MKL and ifort from Intel and use them instead...

Anyway if you don't want MKL, you need to download the attached patch to the SRC_lapw1 folder in the Wien2k base folder. Go to the folder, and apply the patch with (you might need the patch package for that)

$ patch -p1 < lapw1.patch

then set the FOPT compile flags via siteconfig to:

-ffree-form -O2 -ffree-line-length-none -march=native -ftree-vectorize -DHAVE_LIBMVEC -fopenmp

and recompile lapw1. Now when you again do

$ ldd lapw1

it should show a line with "libmvec.so.1 => /lib64/libmvec.so.1"

Compare timings again with the test_case. Also try:

$ OMP_NUM_THREADS=2 x lapw1
$ OMP_NUM_THREADS=4 x lapw1

And after each run show the total timings as well as

$ grep "TIME HAMILT" test_case.output1

Hopefully, you are already linking the multithreaded OpenBLAS (but dunno what the Ubuntu default is)... I'll help you with the parallel execution in the next step.

Best regards
Pavel

On Thu, 2019-05-23 at 18:58 +0530, Indranil mal wrote:
> Dear sir
>
> After running x lapw1 I got the following
>
> ~/test_case$ x lapw1
> STOP LAPW1 END
> 114.577u 0.247s 1:54.82 99.9% 0+0k 0+51864io 0pf+0w
>
> I am using parallel k point execution only 8 GB memory is in use and for 100 atom (100 kpoints) calculation it is taking around 12 hours to complete one cycle.
> please help me. > > Thanking you > > Indranil > > On Thu, May 23, 2019 at 11:22 AM Pavel Ondračka < > pavel.ondra...@email.cz> wrote: > > Hi Indranil, > > > > While the k-point parallelization is usually the most efficient > > (provided you have sufficient number of k-points) and does not need > > any > > extra libraries, for 100atoms case it might be problematic to fit > > 12 > > processes into 32GB of memory. I assume you are already using it > > since > > you claim to run on two cores? > > > > Instead check what is the maximum memory requirement of lapw1 when > > run > > in serial and based on that find how much processes you can run in > > parallel, than for each place one line "1:localhost" into .machines > > file (there is no need to copy .machines from templates, or use > > random > > scripts, instead read the userguide to understand what you are > > doing, > > it will save you time in the long run). If you can run at least few > > k- > > points in parallel it might be enough to speed it up significantly. > > > > For MPI you would need openmpi-devel scalapack-devel and fftw3- > > devel > > (I'm not sure how exactly are they named on Ubuntu) packages. > > Especially the scalapack configuration could be tricky, it is > > probably > > easiest to start with lapw0 as this needs only MPI and fftw. > > > > Also based on my experience with default gfortran settings, it is > > likely that you don't have even optimized the single core > > performance, > > try to download the serial benchmark > > http://susi.theochem.tuwien.ac.at/reg_user/benchmark/test_case.tar.gz > > untar, run x lapw1 and report timings (on average i7 CPU it should > > take > > below 30 seconds, if it takes significantly more, you will need > > some > > more tweaks). > > > > Best regards > > Pavel > > > > On Thu, 2019-05-23 at 10:42 +0530, Dr. K. C. 
Bhamu wrote: > > > Hii, > > > > > > If you are doing k-point parallel calculation (having number of > > k- > > > points in IBZ more then 12) then use below script on terminal > > where > > > you want to run the calculation or use in your job script with > > -p > > > option in run(sp)_lapw (-so). > > > > > > if anyone knows how to repeat a nth line m times in a file then > > this > > > script can be changed. > > > > > > Below script simply coping machine file from temple directory and > > > updating it as per your need. > > > So you do not need copy it, open it in your favorite editor and > >
Re: [Wien] System configuration
Hi Indranil,

While the k-point parallelization is usually the most efficient (provided you have a sufficient number of k-points) and does not need any extra libraries, for a 100-atom case it might be problematic to fit 12 processes into 32GB of memory. I assume you are already using it since you claim to run on two cores?

Instead, check what the maximum memory requirement of lapw1 is when run in serial, and based on that find out how many processes you can run in parallel. Then for each of them place one line "1:localhost" into the .machines file (there is no need to copy .machines from templates, or use random scripts; instead read the userguide to understand what you are doing, it will save you time in the long run). If you can run at least a few k-points in parallel it might be enough to speed it up significantly.

For MPI you would need the openmpi-devel, scalapack-devel and fftw3-devel packages (I'm not sure how exactly they are named on Ubuntu). Especially the scalapack configuration could be tricky; it is probably easiest to start with lapw0 as this needs only MPI and fftw.

Also, based on my experience with default gfortran settings, it is likely that you haven't even optimized the single core performance. Try to download the serial benchmark http://susi.theochem.tuwien.ac.at/reg_user/benchmark/test_case.tar.gz untar it, run x lapw1 and report the timings (on an average i7 CPU it should take below 30 seconds; if it takes significantly more, you will need some more tweaks).

Best regards
Pavel

On Thu, 2019-05-23 at 10:42 +0530, Dr. K. C. Bhamu wrote:
> Hii,
>
> If you are doing k-point parallel calculation (having number of k-points in IBZ more than 12) then use the script below on the terminal where you want to run the calculation, or use it in your job script with the -p option in run(sp)_lapw (-so).
>
> if anyone knows how to repeat a nth line m times in a file then this script can be changed.
>
> The script below simply copies the machines file from the templates directory and updates it as per your need.
> So you do not need to copy it, open it in your favorite editor and do it manually.
>
> cp $WIENROOT/SRC_templates/.machines . ; grep localhost .machines | perl -ne 'print $_ x 6' > LOCALHOST.dat ; tail -n 2 .machines > grang.dat ; sed '22,25d' .machines > MACHINE.dat ; cat MACHINE.dat LOCALHOST.dat grang.dat > .machines ; rm LOCALHOST.dat MACHINE.dat grang.dat
>
> regards
> Bhamu
>
> On Wed, May 22, 2019 at 10:52 PM Indranil mal wrote:
> > respected sir/ Users,
> > I am using a PC with intel i7 8th gen (with 12 cores) 32GB RAM and 2TB HDD with UBUNTU 18.04 LTS. I have installed OpenBLAS-0.2.20 and using GNU FORTRAN and c compiler. I am trying to run a system with 100 atoms only two cores are using the rest of them are idle and the calculation taking a too long time. I have not installed mpi ScaLAPACK or elpa. Please help me what should I do to utilize all of the cores of my cpu.
> >
> > Thanking you
> >
> > Indranil
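The memory-based recipe from the reply above (measure one serial lapw1, see how many copies fit, write one "1:localhost" line per copy) can be sketched in a few lines of shell. The function names max_procs and machines_lines and all numbers are hypothetical; only the "1:localhost" line itself comes from the thread, and any other lines your setup needs in .machines are left out here.

```shell
#!/bin/sh
# How many lapw1 processes fit: limited by memory and by the core count.
# Arguments are integers; both memory values must use the same unit (e.g. MB).
max_procs() {
    total_mem=$1; per_proc_mem=$2; cores=$3
    n=$(( total_mem / per_proc_mem ))
    if [ "$n" -gt "$cores" ]; then n=$cores; fi
    printf '%s\n' "$n"
}

# Print N "1:localhost" lines, one per k-point parallel process.
machines_lines() {
    i=0
    while [ "$i" -lt "$1" ]; do
        printf '1:localhost\n'
        i=$(( i + 1 ))
    done
}

# Hypothetical numbers: 32000 MB RAM, 6000 MB peak for one serial lapw1, 12 cores.
n=$(max_procs 32000 6000 12)
machines_lines "$n"    # redirect into .machines once the numbers are real
```

In practice one would probably leave some memory for the operating system, which this sketch does not account for.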
Re: [Wien] Wien2k on AVX512 CPUs
On Wed, 2019-02-27 at 06:54 -0600, Laurence Marks wrote:
> The script I used is below, works fine with versions 19.0.2.187 20190117. You might have warnings/issues with the compilation of their test programs; I hacked configure.ac to remove them.
>
> I suspect the issue with HAMILT is misleading, as it has very little MKL. I suggest doing "grep Time case.output1" to look at the individual parts.

You are of course right. Sigh, I need to read my mails at least once more before sending... I was of course thinking about the DIAG part (specifically about the single ZHETRD call). HAMILT and HNS are OK. So at least that part I got right.

> ---
> export FFLAGS="-O2 -pc80 -msse4.2 -fminshared -axCORE-AVX512 -pad -ip -fimf-precision=high -prec_div -traceback -no-complex-limited-range -no-fast-transcendentals -no-ftz "
> export CFLAGS="-O2 -pc80 -msse4.2 -fminshared -axCORE-AVX512 -ip -fimf-precision=high -prec_div -traceback -no-complex-limited-range -no-fast-transcendentals -no-ftz "
>
> export MPICC=mpiicc
> export CC=mpiicc
> export CXX=mpiicc
> export F77=mpiifort
> export F90=mpiifort
> export FC=mpiifort
> export FCFLAGS=$FFLAGS
> export MPIFC=mpiifort
> export MPIF90=mpiifort
>
> export SCALAPACK_LDFLAGS=
> export SCALAPACK_FCFLAGS=
> export CFLAGS+="-mkl=cluster"
> export FCFLAGS+="-mkl=cluster"
> export CXXFLAGS=$CFLAGS
>
> ./configure --prefix=/opt/elpaRC1 --disable-shared --enable-avx512 --disable-tests --disable-legacy-interface
> make install

Thank you, I'll test this and post results.

Best regards
Pavel
Re: [Wien] Wien2k on AVX512 CPUs
> > > working with this MKL_ENABLE_INSTRUCTIONS variable:
> > > --avx512
> > > TIME HAMILT (CPU) = 5.1, HNS = 2.1, HORB = 0.0, DIAG = 15.3
> > > TIME HAMILT (WALL) = 5.4, HNS = 2.1, HORB = 0.0, DIAG = 15.3
> > > --avx2
> > > TIME HAMILT (CPU) = 5.8, HNS = 2.5, HORB = 0.0, DIAG = 16.3
> > > TIME HAMILT (WALL) = 6.1, HNS = 2.5, HORB = 0.0, DIAG = 16.3
> > >
> > > However, when using OMP_NUM_THREADS=8, this difference is further reduced (probably due to memory bounds ?)
> > > --avx512
> > > TIME HAMILT (CPU) = 19.9, HNS = 7.7, HORB = 0.0, DIAG = 24.2
> > > TIME HAMILT (WALL) = 2.6, HNS = 1.0, HORB = 0.0, DIAG = 3.2
> > > --avx2
> > > TIME HAMILT (CPU) = 20.0, HNS = 7.4, HORB = 0.0, DIAG = 27.0
> > > TIME HAMILT (WALL) = 2.6, HNS = 1.0, HORB = 0.0, DIAG = 3.5
> > >
> > > > Yes, we have the latest ELPA elpa-2018.11.001 installed. Seems to run without problems and is overall significantly better than the old ELPA, but it requires a change in the user interface. The next release of WIEN2k will have two elpa versions supported, a ELPA15 (which is in WIEN2k_18), and a new ELPA interface for elpa versions later than 2017 (this is somehow like FFTW2 and FFTW3 versions).
> > > >
> > > > So in essence: with the present code one cannot use ELPA-versions from 2017 or later.
> > > >
> > > > On 2/27/19 7:34 AM, Pavel Ondračka wrote:
> > > > > Dear mailing list,
> > > > >
> > > > > just out of curiosity has anyone any experience running Wien2k on a AVX512 capable machine (eg. the KNL accelerators or recent Intel skylake-avx512 CPUs)?
> > > > >
> > > > > Recently my cluster updated to this skylake-avx512 machines however I'm unable to get any better performance for Wien2k. In particular MKL seem to suck, for example in single core performance (with the serial test_case) the eigenvalue problem is actually faster when I forbid the usage of AVX512 instructions:
> > > > >
> > > > > running with MKL_VERBOSE=1 MKL_ENABLE_INSTRUCTIONS=AVX2
> > > > > MKL_VERBOSE ZHETRD(L,3481,0x2b74d8567cc0,3481,0x2b74d82121c0,0x2b74d8218e88,0x2b74ef769b00,0x2b74ef777490,452530,0) 10.21s CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:1
> > > > >
> > > > > with MKL_ENABLE_INSTRUCTIONS=AVX512
> > > > > MKL_VERBOSE ZHETRD(L,3481,0x2b5397c96cc0,3481,0x2b53979411c0,0x2b5397947e88,0x2b53aee98b00,0x2b53aeea6490,452530,0) 12.31s CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:1
> > > > >
> > > > > This is somewhat compensated by speedups in the hamilt part (the VML stuff and various ?GEMMs seem to be actually slightly faster), but overall the performance is mostly the same wi
[Wien] Wien2k on AVX512 CPUs
Dear mailing list, just out of curiosity, does anyone have any experience running Wien2k on an AVX512-capable machine (e.g. the KNL accelerators or recent Intel skylake-avx512 CPUs)? Recently my cluster was updated to these skylake-avx512 machines, however I'm unable to get any better performance for Wien2k. In particular MKL seems to suck, for example in single core performance (with the serial test_case) the eigenvalue problem is actually faster when I forbid the usage of AVX512 instructions: running with MKL_VERBOSE=1 MKL_ENABLE_INSTRUCTIONS=AVX2 MKL_VERBOSE ZHETRD(L,3481,0x2b74d8567cc0,3481,0x2b74d82121c0,0x2b74d8218e88,0x2b74ef769b00,0x2b74ef777490,452530,0) 10.21s CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:1 with MKL_ENABLE_INSTRUCTIONS=AVX512 MKL_VERBOSE ZHETRD(L,3481,0x2b5397c96cc0,3481,0x2b53979411c0,0x2b5397947e88,0x2b53aee98b00,0x2b53aeea6490,452530,0) 12.31s CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:1 This is somewhat compensated by speedups in the hamilt part (the VML stuff and various ?GEMMs seem to be actually slightly faster), but overall the performance is mostly the same with and without the AVX512 stuff. OpenBLAS is maybe 15% slower so not an option either... Moreover, for the MPI version I'm not able to get a correctly working ELPA compiled with AVX512 support (I went for the latest elpa-2018.11.001 version); it just returns bogus results and diverges after a few iterations. If someone has this working I'd be really grateful for a working configure line, and advice on which ELPA and which compiler version this was with. Unfortunately I was not able to get any support from the cluster admins beyond "We see a 30% per-core performance increase in average", therefore I'm asking here if anyone has experience with such machines. Any advice would be appreciated. Best regards Pavel
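To compare such runs without reading the raw log, the per-call times can be pulled out of the MKL_VERBOSE lines. This is only a sketch: the function name mkl_call_times is made up, it assumes the line layout shown in the two examples above, and it deliberately skips calls whose time is not reported in plain seconds.

```shell
#!/bin/sh
# Read MKL_VERBOSE output on stdin and print "<routine> <seconds>" for each
# call line whose reported time is given in seconds (e.g. "10.21s").
mkl_call_times() {
    awk '
        /^MKL_VERBOSE [A-Z0-9_]+\(/ {
            name = $2
            sub(/\(.*/, "", name)          # keep only the routine name
            for (i = 3; i <= NF; i++) {
                if ($i ~ /^[0-9.]+s$/) {   # a field such as 10.21s
                    t = $i
                    sub(/s$/, "", t)
                    print name, t
                }
            }
        }
    '
}

# The two ZHETRD lines posted above (AVX2 first, then AVX512).
mkl_call_times <<'EOF'
MKL_VERBOSE ZHETRD(L,3481,0x2b74d8567cc0,3481,0x2b74d82121c0,0x2b74d8218e88,0x2b74ef769b00,0x2b74ef777490,452530,0) 10.21s CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:1
MKL_VERBOSE ZHETRD(L,3481,0x2b5397c96cc0,3481,0x2b53979411c0,0x2b5397947e88,0x2b53aee98b00,0x2b53aeea6490,452530,0) 12.31s CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:1
EOF
```

Feeding it the full verbose log of each run and comparing the ZHETRD rows gives the same AVX2 versus AVX512 comparison as above in a form that is easier to tabulate.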
[Wien] expand script name clash with the GNU coreutils utility expand
Dear Wien2k mailing list, this is not a bug report per se, but it might be wise to remove the short "expand" symlink to the expand_lapw script from the Wien2k package. There is a name clash with the "expand" utility from GNU coreutils ( https://www.gnu.org/software/coreutils/manual/html_node/expand-invocation.html#expand-invocation , installed by default on any Linux box). Most users are probably unaffected, as it is not that widely used, but it has some uses (in my case I was compiling the latest glibc and the build script uses it for some text processing). Best regards Pavel
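Whether the clash matters on a given machine depends on which "expand" comes first on PATH. A minimal sketch for checking this follows; the function name list_on_path is made up, and empty PATH entries (which mean the current directory) are not treated specially.

```shell
#!/bin/sh
# Print every executable file named "$1" found on PATH, in lookup order.
# The body runs in a subshell so that changing IFS does not leak out.
list_on_path() (
    IFS=:
    for dir in $PATH; do
        if [ -x "$dir/$1" ]; then
            printf '%s\n' "$dir/$1"
        fi
    done
)

list_on_path expand
```

If the first line printed lies inside the Wien2k folder, build scripts that expect the coreutils tool will pick up the symlink instead.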
Re: [Wien] N1s binding energy in TiN
On Thu, 2019-01-17 at 09:28 +0100, Peter Blaha wrote: > Sorry for the confusion. The quoted values in the pdf are probably > from > a lousy calculation and are meant only to demonstrate the effect (if > I > remember correctly, only a 4x bigger P supercell, but for sure not > converged, I also don't remember which functional, ) and only > accidentally match experiment. From my current testing the supercell effect (the core-hole interaction with its periodic images) is minimal for TiN, e.g. the N1s binding energy difference between 8 atoms and 216 atoms is only about 0.1 eV (with otherwise similar parameters). The screening of the core hole is very good here (at least for TiN, but probably also in other metals; from my testing, cells of around 64 atoms are reasonably converged). This is of course a completely different story for insulators... What I saw during my testing was that the difference between the small 8-atom cell with default parameters and the 216-atom cell with well-tweaked numerical parameters was below 0.2 eV. Basically the only input which makes any significant difference is the functional. Which made me think that maybe I'm missing some secret ingredient. Probably not... :-) > My general experience is that core-eigenvalues (taken with respect to > EF > !!) are 10-15% off, while slater TS gives 1-2 %, i.e. an order of > magnitude better. Right, the main problem with the core eigenvalues is not that they are shifted in absolute value but rather that they show almost no chemical shifts, which only become visible when the core hole is added. The 1-2% is also what I see here and in general is reasonable, I was just wondering if I could do better. Best regards Pavel
Re: [Wien] N1s binding energy in TiN
On Thu, 2019-01-17 at 02:46 -0600, Laurence Marks wrote:
> My three cents. I think an agreement of 0.1eV should be considered as fortuitous. There are many issues which are glossed over even with the miraculous exact functional:
>
> 1) The Slater method is a very clever use of the mean value theorem for an integral. However, it is only 1 value. You can check the literature, I remember seeing papers where people use a range of holes to more accurately do the integral.

Thanks for the pointer, I'll do some reading. This should not be a problem for the delta-scf method, right? I've always stuck with Slater's method up to now since, if I understand the delta-scf method properly, for insulators you need to place the electron in the background and E_b is then calculated as E^tot_finalstate(N-1) - E^tot_initialstate(N) + μ. However, I'm not sure how to include the last part. And if you do just E^tot_finalstate(N-1) - E^tot_initialstate(N) you get really bad results. Metals are better since you can place the electron in the valence band and do E^tot_initialstate(N) - E^tot_finalstate(N), which in practice gives almost identical results to Slater's approach. BTW I did find it interesting that for metals and Slater's transition state calculations it actually doesn't matter if you place the extra half electron in the valence band or in the background (0.02eV difference for TiN).

> 2) The simple dft-based calculations assume that the final states are plane waves. Rigorously the exiting photoelectron in XPS is an evanescent Bloch wave (for a crystal). There is literature on this, but I doubt that it has been combined with DFT.

This is probably beyond my knowledge/interest, though I could check how large a difference this could cause. I've actually seen some articles where the authors claim to calculate good absolute binding energies (Ozaki, T., & Lee, C.-C. (2017). Absolute Binding Energies of Core Levels in Solids from First Principles. Physical Review Letters, 118(2), 026401. https://doi.org/10.1103/PhysRevLett.118.026401); their route to good absolute results was the exact Coulomb cutoff method and some penalty functional, but I'm not sure how applicable this would be to the LAPW method.

> 3) In experiments you have to worry about photoelectron diffraction, and there will be some shifts to higher apparent binding energy due to phonon inelastic scattering. And you have to worry about charging and band bending for insulators, chemisorption induced work function changes (how clean is your XPS?)

The photoelectron diffraction and phonons are probably something I can't do anything about, except hope that they do not affect the relative shifts. However, I've always thought that the work function, charging, etc. could be neglected provided you do proper charge compensation and align the measured spectra properly (either with the adventitious carbon or preferably with respect to Au); then you should have just the proper value with respect to the Fermi energy? What I have found more problematic for semiconductors is that from normal calculations you don't get the correct position of the Fermi level, since it's somewhere in the band gap. In those cases I've tried to align the experimental spectra to the valence band maximum for comparison with calculations, which sort of works but introduces extra uncertainties. Anyway, thank you for the thoughts.

Best regards
Pavel Ondračka
[Wien] N1s binding energy in TiN
Dear Wien2k mailing list,

I'm looking for some advice regarding the calculation of core level binding energies (to compare with XPS experiments). First of all, there is this nice lecture where prof. Blaha actually shows some calculations of core levels with perfect results: http://susi.theochem.tuwien.ac.at/reg_user/textbooks/WIEN2k_lecture-notes_2011/Blaha_xas_eels.pdf For example, with TiN the deltaSCF method gets 397.1eV for the N1s level as compared to 397.0eV in experiment.

The trouble is that I'm not able to reproduce this. I've done some such calculations before and I was never really happy with the absolute values, which were always a few eV off, but I've always thought this was just a limitation of the xc functional or the methodology. Hence seeing the nice results in the lecture surprised me. However, I'm not able to reproduce the values even for the metals from the example. For TiN I'm getting 404.8eV with Slater's transition state approach and 404.6eV with delta-scf (here I'm using the formula for metals, E_b = E^tot_initialstate(N) - E^tot_finalstate(N), i.e. placing the core electron in the valence band, and PBE). I thought that this might be a functional difference, since taking LDA instead of PBE shifts the result by almost 4eV (to 400.9eV). However, with PBE I get the core energy ε_i as 377.4eV (consistent with the mentioned pdf, where it is 377.5eV), so maybe this is not just about the functional? I've already checked convergence with supercell size as well as numerical parameters and I'm actually out of ideas.

To be honest, I'm personally not much concerned about the discrepancy, since the chemical shifts seem to be reasonable even if the absolute values are not. I just think that if it is possible to get the absolute values right (or at least closer to experiment) as in the lecture pdf, the results would of course look way better, therefore I'll be grateful for any comments and help.

Best regards
Pavel
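As a rough illustration of the numbers quoted in this message, the following Python sketch computes the relative deviation of each calculated N1s value from the 397.0eV experimental value. The function name and the dictionary labels are made up for this illustration; only the energies come from the message, and the message does not say which method the LDA value belongs to.

```python
# Relative deviation of the calculated N1s values for TiN from experiment.
# Energies are the ones quoted in the email; all names are illustrative only.

EXPERIMENT_EV = 397.0

calculated_ev = {
    "Slater transition state (PBE)": 404.8,
    "delta-SCF (PBE)": 404.6,
    "LDA instead of PBE": 400.9,
    "core eigenvalue (PBE)": 377.4,
}


def relative_deviation_percent(calculated, reference):
    """Signed deviation of a calculated value from the reference, in percent."""
    return 100.0 * (calculated - reference) / reference


deviations = {
    label: relative_deviation_percent(value, EXPERIMENT_EV)
    for label, value in calculated_ev.items()
}

for label, deviation in deviations.items():
    print(f"{label:32s} {calculated_ev[label]:7.1f} eV  {deviation:+5.1f} %")
```

The sign shows the direction of the error: the total-energy based values come out above the experimental value, the bare core eigenvalue below it.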
Re: [Wien] non-zero value of epsilon 2 below the energy band gap
Dear Anup Pradhan Sakhya,

this is probably just due to the broadening procedure. If you set the broadening to zero in kram, then you should have no absorption below the band gap (with plain optic and no scissor). Or just take a look at the output from joint, which should contain the unbroadened imaginary part of the dielectric tensor (epsilon 2).

In general, if you want to compare your spectra against experiment, some broadening is needed for good agreement (under the assumption that your DOS is already good and the approximations in the optic package work for your material). There are processes which cause absorption below the gap in the experiment as well (the Urbach tail, instrumental broadening, etc.), hence this is nothing to worry about. The Lorentzian broadening in kram is not really a good model though, especially regarding the absorption below the band gap, since it decays quite slowly and hence can produce nonzero absorption far below the gap. I have had much better experience with Gaussian broadening (and it should theoretically be a better match for the processes which happen in experiment). Unfortunately, kram only supports Lorentzian broadening.

Best regards
Pavel

On Sat, 2018-12-15 at 01:19 +0530, Anup Shakya wrote:
> Dear All,
>
> I have performed calculations for two double perovskite oxide materials and the band gap of the material is found to be more than 1.3 eV for both materials. The calculations have been performed using GGA+U, since it contains rare earth materials. The value of U have been used from the literature. The energy convergence was performed till 0.0001 Ry and the optical properties were calculated using optic. However, the imaginary part of the dielectric constant (epsilon 2) shows non-negligible value below the energy band gap. PBE was used as the exchange correlation functional and if I am not wrong then for the calculation of epsilon 2 the contribution of inter-band transitions are taken into consideration and the intra-bands are neglected. Then what is the reason for the non-zero value of epsilon 2 below the energy band gap?
> If anyone could suggest some views, I would be very grateful. Please let me know, if anyone needs more information.
>
> Sincerely,
> Anup Pradhan Sakhya,
> TIFR.
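The difference in tail behaviour mentioned in the reply can be sketched in a few lines of Python. This is only an illustration of the two line shapes at equal FWHM, not what kram does internally; the 0.1 eV width and the distances are arbitrary values chosen for the example, and the function names are hypothetical.

```python
import math


def lorentzian(x, fwhm):
    """Area-normalized Lorentzian centred at 0."""
    gamma = fwhm / 2.0  # half width at half maximum
    return (gamma / math.pi) / (x * x + gamma * gamma)


def gaussian(x, fwhm):
    """Area-normalized Gaussian centred at 0."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


FWHM_EV = 0.1  # arbitrary illustrative broadening, not a kram default

for distance_ev in (0.05, 0.2, 0.5, 1.0):
    lorentz_value = lorentzian(distance_ev, FWHM_EV)
    gauss_value = gaussian(distance_ev, FWHM_EV)
    print(f"{distance_ev:4.2f} eV from the transition: "
          f"Lorentzian {lorentz_value:.3e}  Gaussian {gauss_value:.3e}")
```

The Lorentzian falls off only as 1/x^2, so a strong transition just above the gap still contributes visibly far below it, whereas the Gaussian contribution becomes negligible within a few widths.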
Re: [Wien] "OpenBlas" package instead of default "blas"
On Mon, 2018-11-19 at 23:54 +0530, Ashwani Kumar wrote:
> Dear Dr. Pavel Ondracka,
> In previous thread, https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg18098.html
> you advised to use OpenBlas package to extract best performance from processor. Since i was having problem with wien2k installation, so i went with Dr. Gavin's set of instructions (for lapack devel package). Now i want to speed up the wien2k execution (simple oxides too take much time). Further i noted that at a time, only one thread remains 100% busy, rest threads shows load level 1-5%.
> Configuration of my pc: i7-8700 (6 cores, 12 threads), 8 gb ram (can be upgraded to 16 gb), fedora-28, graphic card (gtx...)
>
> I understand that "openBlas" need to be installed and set R_path to -lopenblas. I also want to utilize thread level parallelism if it boosts the processor's performance further by a factor of >= x1.5 times.

Dear A. Kumar,

I don't fully understand your comment about the thread load. Wien2k does not at the moment spawn multiple threads (unless you use a threaded blas/lapack). The k-point (or MPI) parallel calculations spawn multiple processes, but those should never be at 1-5% load... IMO there are likely two problems here:

1) If you are only using one machine and your case has a lot of k-points (and you are not memory-bound), what you want is k-point parallelism. This can be done with the .machines file (and the -p switch). If you are only using a single machine, your .machines file should contain a "1:localhost" line for every processor on your computer (i.e. in your specific case a reasonable .machines file would have 6 identical lines, maybe even 12 with hyperthreading, but you need to test your optimal setup). Please check the userguide for more details about the k-point parallel execution and the .machines file in general.

2) Regarding openblas: what you need is an openblas devel package. In the beginning I suggest the serial openblas ("dnf install openblas-devel") and setting R_LIBS to just "-lopenblas". If you want to squeeze out more speed (and you are using only a single computer), also add "-ftree-vectorize -march=native" to your FOPT flags.

If you really want to go with the threaded openblas I can help you later, but IMO this should not be needed in the beginning (as the k-point parallelism is the optimal one). You will also need some further tricks to make lapw1 fast with libmvec. Either see https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg16159.html or I can provide some new patches which do the same with OpenMP (but first get the k-point parallelism and serial openblas working).

Hope this helps
Best regards
Pavel

> Waiting for your expert advise,
>
> thanks,
> A. Kumar
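The single-machine .machines file described in point 1 can be generated with a short Python sketch like the one below. The helper name is hypothetical, and the sketch writes only the "1:localhost" lines mentioned in the reply; any further .machines options should be taken from the userguide.

```python
def machines_file_content(n_processes, host="localhost"):
    """Return .machines text with one '1:<host>' line per k-point process."""
    if n_processes < 1:
        raise ValueError("need at least one process")
    return "".join(f"1:{host}\n" for _ in range(n_processes))


# 6 lines for the 6 physical cores of the i7-8700 from the question;
# whether 12 (hyperthreading) is faster has to be tested.
content = machines_file_content(6)
print(content, end="")

# To use it, write the text to a file named .machines in the case directory
# and run the calculation with the -p switch:
# with open(".machines", "w") as handle:
#     handle.write(content)
```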
[Wien] crash in tetra -enefile
Dear Wien2k mailing list,

there is a small bug in tetra -enefile which could occasionally result in a crash like this:

x tetra -enefile
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Segmentation fault (core dumped)
0.023u 0.389s 0:00.56 71.4% 0+0k 0+216io 0pf+0w

This is an uninitialized variable problem (here reproducible with gfortran 8.2.1 and with the right star alignment, due to the dependence on random uninitialized memory). The crash happens at tetra.f:464

tetra.f:462 if(nnsum_dos.gt.0) then
tetra.f:463 do i=1,nnsum_dos
tetra.f:464 WRITE(6,1176) i,(isumdos(i,i1),i1=1,nnsum_dos_max)

with an out-of-bounds read of isumdos. I don't have any SUM in my int file, hence the "if(nnsum_dos.gt.0)" should be false, but nnsum_dos is uninitialized at this point.

valgrind:
==30563== Conditional jump or move depends on uninitialised value(s)
==30563==    at 0x40A38E: MAIN__ (tetra.f:462)
==30563==    by 0x40B3B3: main (tetra.f:6)

(gdb) print nnsum_dos
$1 = 528

The variable is supposed to be set here:

tetra.f:256 nnsum_dos=0
tetra.f:257 read(5,'(a)',end=91) system
tetra.f:258 if(system(1:3).ne.'SUM') goto 91
tetra.f:259 read(system(5:70),*,ERR=91,END=91) nnsum_dos,nnsum_dos_max

however, the entire block is skipped with -enefile due to

tetra.f:216 if(enefile) goto 200

which jumps to

tetra.f:343 200 CONTINUE

The solution is to zero-initialize the nnsum_dos variable earlier (before the goto 200 jump or at the beginning of the file). While the crash looks scary, it is likely harmless since it happens almost at the end, where all important data should already be written; I'm reporting it nevertheless.

Best regards
Pavel
Re: [Wien] Help Request for making WIEN2K (ver18.2) programs executable.
It would probably be reasonable to add "-ffpe-summary=none" to the default gfortran flags so as not to scare new users, as this "issue" is being brought up over and over again (hint to prof. Blaha). In fact, I tried quite hard to debug a lot of those things and in general to hunt down the uninitialized stuff, and the remaining ones (at least in the common codepaths) should be harmless. IIRC the denormals in lapw0 and lapw2 come from something like "exp(-big_number)", which is completely OK, and the scary ones (IEEE_INVALID_FLAG, IEEE_DIVIDE_BY_ZERO) in mixer come from the lapack-netlib/SRC/ieeeck.f function, which intentionally does all the divide-by-zero and infinity math to check whether it can depend on compliance with the IEEE 754 standard.

Regarding ifort vs gfortran, this is mostly a matter of personal taste. gfortran strictly adheres to the Fortran specification and this caused some issues in the Wien2k code. gfortran is also somewhat slower, but this is due to much more conservative default flags. At -O2 gfortran won't do any SIMD vectorization or loop unrolling in general, no link-time (interprocedural) optimizations, and it strictly adheres to IEEE compliance (and does a lot of other stuff, like properly setting errno after library calls etc., which ifort does not do with the Wien2k flags). Maybe someday I can write a blog post on which flags to use to get ifort-like behavior and speed.

Best regards
Pavel

-- Original e-mail --
From: Gavin Abo
To: wien@zeus.theochem.tuwien.ac.at
Date: 24. 10. 2018 4:19:56
Subject: Re: [Wien] Help Request for making WIEN2K (ver18.2) programs executable.

"Those gfortran warnings have been seen in symmetry [ https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg17396.html ] and dstart [ https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg17389.html ], which if I recall correctly were resolved by the fixes seen on the updated page [ http://susi.theochem.tuwien.ac.at/reg_user/updates/ ]:

VERSION_18.1: 1.6.2018
SRC_dstart: fix of zamt initialization
SRC_symmetry: setting yvec,zvec=0

From the above, you can see that uninitialized variables in the code tend to be the cause of those type of warnings. Apparently, ifort either handles them better (or ignores the issue). It is not surprising and perhaps expected that those warnings might appear in more programs than just lapw0. As I mentioned before, WIEN2k compiled with gfortran is less vetted and less maintained by the developers [ https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg18018.html ]. I suppose that is one drawback of using that free compiler. Though, I suppose Intel's recent 2016 or newer ifort compiler standardization changes breaking some of the WIEN2k code is perhaps not better in some cases [ https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg17542.html ]. So, pick your poison.

As mentioned on stackoverflow page at the link in the previous post at https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg17385.html , I think the statement there describes well the meaning of those warning messages: They can be a "hint about numerical problems in your code, but it is not an error per se."

Adding "-ffpe-summary=none" to the compiler settings might suppress that warning message when compiling with gfortran. However, I don't recommend doing that as it might suppress other more important warnings/errors. Someday in the future, 'maybe' the WIEN2k code could be reprogrammed to remove that warning message (which I think would be the proper way to remove that message).

In summary, unless you can fix the code yourself to remove the error, you would have to ignore them and continue on with your calculation unless it results in something absurdly wrong.

P.S., Pavel has been good at gfortran debugging and resolving those, which has been quite appreciated. Though, I believe he is a user (not developer) less obligated to help fix such problems.

On 10/23/2018 2:27 PM, Ashwani Kumar wrote:
> thank you, Program compiled successfully without any error. I tried to remove manually but still some error occurred, Followed Mr. Gavin's method (from previous thread) for LIBXC link to R_LIBS. Done successfully. But while running an example of TiC (to check everything is fine), STDOUT file shows warning message (for lapw0 and lapw2) but program executed without error. I checked makefile, makefile.orig (and makefile.orig_14 also for lapw0) and found nothing suspicious.
> **
> in cycle 9 ETEST: .1542 CTEST: .0009143
> STOP LAPW0 END
> Note: The following floating-point exceptions are signalling: IEEE_DENORMAL
> STOP LAPW1 END
> STOP LAPW2 END
> Note: The following floating-point exceptions are signalling: IEEE_DENORMAL
> STOP CORE END
> Note: The following floating-point exceptions are signalling:
>
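The reply at the top of this message attributes the IEEE_DENORMAL notes to expressions like exp(-big_number). A minimal Python sketch of that effect is shown below; the exponents -740 and -800 are arbitrary values picked for the illustration and are not taken from the Wien2k sources.

```python
import math
import sys

# Smallest positive *normal* IEEE 754 double, about 2.2e-308.
smallest_normal = sys.float_info.min

# exp of a large negative number lands below the normal range
# but is still nonzero: a denormal (subnormal) result.
value = math.exp(-740.0)
is_subnormal = 0.0 < value < smallest_normal

print(f"exp(-740) = {value!r}, subnormal: {is_subnormal}")
print(f"exp(-800) = {math.exp(-800.0)!r}")  # underflows all the way to zero
```

A result like this is numerically harmless when the term is meant to be negligible anyway, which is why the note is described above as completely OK.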
Re: [Wien] Fwd: Help Request for making WIEN2K (ver18.2) programs executable.
Dear Ashwani,

the problem is that the libxc modules are not installed in /usr/include on Fedora (and some other distros). This is kinda stupid (the rationale being that the mod files are not headers in the standard sense, but rather binary, compiler- and arch-dependent files). They are in $(LIBDIR)/gfortran/modules/ on Fedora. Unfortunately, siteconfig is not flexible enough to allow you to specify this directory. Therefore it is currently not possible to compile with libxc on Fedora without manually changing the SRC_lapw0 Makefile.

You would do best to remove libxc altogether. But to be honest I don't know how to do that from siteconfig (it does not allow me to reset LIBXCROOT to empty). Hence your best chance is to edit the WIEN2k_OPTIONS file manually, delete all the lines starting with current:LIBXC* (or hopefully someone more experienced can advise how to reset LIBXCROOT to empty from siteconfig) and regenerate the makefiles.

If you really need libxc, set:

LIBXCROOT = /usr/
LIBXC_FORTRAN = xcf03
LIBXC_LIBDNAME = lib64
LIBXC_LIBNAME = xc

and manually edit the " LIBXC_FOPT = -DLIBXC -I$(LIBXCROOT)include" line in SRC_lapw0/Makefile to " LIBXC_FOPT = -DLIBXC -I$(LIBXCROOT)/$(LIBXC_LIBDNAME)/gfortran/modules" (BTW also check that you have the libxc-devel package installed: "dnf install libxc-devel").

Best regards
Pavel

BTW it's a pity that the siteconfig package doesn't use the most common way of package detection (i.e. the package config files). Nowadays all packages such as fftw, elpa, OpenBLAS or libxc (with scalapack being the exception) have proper package configs (at least upstream), and information like where to find the Fortran modules (or other required compile/link flags) can be found in them.

-- Original e-mail --
From: Ashwani Kumar
To: wien@zeus.theochem.tuwien.ac.at
Date: 23. 10. 2018 6:56:47
Subject: [Wien] Fwd: Help Request for making WIEN2K (ver18.2) programs executable.

"Mr. Pavel, i have just noted down your point (and will imply once i start using WIEN2K and gets more comfortable with the code). SPEED MATTERS A LOT. Thanks Mr. Gavin. Earlier issue solved. Now lapw0 and lapw2 not executable which i doubt is due to LIBXC (or may not). Your previous reply indicated not to use LIBXC. I re-installed everything fresh but LIBXC setting remains there. please find the compile errors:

Compiling All Program:
**
Compile time errors (if any) were:
SRC_lapw0/compile.msg:Fatal Error: Can't open module file ‘xc_f03_lib_m.mod’ for reading at (1): No such file or directory
SRC_lapw0/compile.msg:make[1]: *** [Makefile:170: inputpars.o] Error 1
SRC_lapw0/compile.msg:make: *** [Makefile:119: seq] Error 2
Check file compile.msg in the corresponding SRC_* directory for the compilation log and more info on any compilation problem.
**
Compiling lapw0 alone :
**
RC_lapw0 ...
if [ -f .parallel ]; then \
rm -f .parallel modules.o W2kinit.o fft_modules.o reallocate.o energy.o getff1.o getfft.o gtfnam.o lapw0.o outerr.o rean0.o rean3.o rean4.o setff1.o setff2.o setfft.o xcpot1.o xcpot3.o eramps.o *.mod; \
fi
touch .sequential
make ./lapw0 FORT=gfortran FFLAGS=' -ffree-form -O2 -ffree-line-length-none -DLIBXC -I/usr/include '
make[1]: Entering directory '/home/hardy/WIEN2K/SRC_lapw0'
make[1]: Circular pwxad4.o <- pwxad4.o dependency dropped.
gfortran -ffree-form -O2 -ffree-line-length-none -DLIBXC -I/usr/include -c inputpars.F
inputpars.F:6:10:
use xc_f03_lib_m
1
Fatal Error: Can't open module file ‘xc_f03_lib_m.mod’ for reading at (1): No such file or directory
compilation terminated.
make[1]: *** [Makefile:170: inputpars.o] Error 1
make[1]: Leaving directory '/home/hardy/WIEN2K/SRC_lapw0'
make: *** [Makefile:119: seq] Error 2
make: *** No rule to make target 'complex'. Stop.
Copying programs
WARNING: no executable found in SRC_lapw0. Check compile.msg in this directory
done.
Compile time errors (if any) were:
SRC_lapw0/compile.msg:Fatal Error: Can't open module file ‘xc_f03_lib_m.mod’ for reading at (1): No such file or directory
SRC_lapw0/compile.msg:make[1]: *** [Makefile:170: inputpars.o] Error 1
SRC_lapw0/compile.msg:make: *** [Makefile:119: seq] Error 2
**
init_lapw is executing succesfully while run_lapw shows error
**
* /home/hardy/WIEN2K/lapw0: Command not found.
grep: lapw2*.error: No such file or directory
> stop error

thanking you,
A. Kumar
[Wien] OpenMP parallelization in lapw0
Dear Wien2k mailing list, there is some ongoing work to use OpenMP in lapw0 to provide another level of parallelization in addition to MPI (which parallelizes over atoms). The current version should be almost production ready and has already been tested with some standard stuff (LDA, PBE, sp, etc.) by myself and prof. Blaha. I'm looking for some experienced Wien2k users for further testing. If you are willing to help, just email me directly and I'll provide the patches and further instructions. The OpenMP version allows for efficient parallelization of small cases where the MPI version is too heavy a hammer (or in general for single-computer installations if you don't want to mess with MPI). It can also be quite effective on large clusters in combination with MPI (when there are more cores than atoms). Best regards Pavel
Re: [Wien] $DEC! NOOPTIMIZE equivalents in gfortran?
I can look at gfortran, what is your testcase? I tried to take a quick look with the full mixer using one random TiO2 case. I put a breakpoint after some random Kahan sum (specifically this was at charge.f:150 in Wien2k 18.2) and I looked for the differences between -O0 and -O2. I was actually looking for small differences, but the value of sum was 0 with -O2 vs 739.29 with -O0! Hence in this case it looks like either the different optimization levels influence the program flow, or the optimizations caused the breakpoint to shift to some other place. It might also be possible that this is a gdb problem, since there is a lot of

** On entry to DHSEQR parameter number 4 had an illegal value
** On entry to DGEBAL parameter number 3 had an illegal value
** On entry to DGEHRD parameter number 2 had an illegal value

spam which I have no idea about, and BTW valgrind is also not happy with the mixer (even at -O0 there are a lot of "Use of uninitialised value ..." messages and "On entry to DHSEQR parameter number 4 had an illegal value").

If you can produce a simple testcase, I'd be happy to look into the Kahan sum problem, but at the moment I can't reproduce it with the full mixer due to the aforementioned problems.

Best regards
Pavel

Laurence Marks wrote on Wed, 08. 08. 2018 at 11:44 -0500:
> I am testing adding the compiler directive !DEC$ NOOPTIMIZE to the Kahan summations in charge.f in order to prevent ifort from optimizing the summation away. It seems to help.
>
> Does anyone know if there are equivalents in gfortran or other compilers? (I can't find anything for gfortran.)
>
> N.B., if anyone has experience with directives and wants to suggest others that may be faster but will avoid optimizing away the summation I am open to suggestions.
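For readers unfamiliar with the algorithm discussed in this thread, here is a minimal Python sketch of Kahan (compensated) summation. It is not the code from charge.f; the function name and the test data are chosen only for illustration. The comment in the loop marks the step that an optimizer treating floating-point arithmetic as exact algebra would reduce to zero.

```python
import math


def kahan_sum(values):
    """Sum floats while carrying a running correction for lost low-order bits."""
    total = 0.0
    compensation = 0.0
    for value in values:
        y = value - compensation
        t = total + y
        # (t - total) is what was really added; subtracting y leaves the
        # rounding error. Algebraically this is zero, so value-unsafe
        # optimizations can remove the correction entirely.
        compensation = (t - total) - y
        total = t
    return total


values = [0.1] * 1_000_000
reference = math.fsum(values)  # correctly rounded sum from the standard library

naive = 0.0
for value in values:  # explicit loop: plain left-to-right accumulation
    naive += value

compensated = kahan_sum(values)
print(f"naive error:       {abs(naive - reference):.3e}")
print(f"compensated error: {abs(compensated - reference):.3e}")
```

The explicit loop is used for the naive sum on purpose, since the built-in sum() in recent Python versions may itself apply a compensated scheme to floats.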
> > ___ > Wien mailing list > Wien@zeus.theochem.tuwien.ac.at > http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien > SEARCH the MAILING-LIST at: > http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html ___ Wien mailing list Wien@zeus.theochem.tuwien.ac.at http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien SEARCH the MAILING-LIST at: http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
Re: [Wien] crash with -it
On Wed, 2018-07-11 at 11:21 +0200, Peter Blaha wrote:
> I've been able to reproduce the problem with gfortran, and also made a fix, which according to my tests seems to work.
>
> Please try the attached jacdavblock.F file.
>
> This fix is not necessary if ifort is used, but should not harm either.
>
> Regards

It works fine now, thanks for the fix.

Best regards
Pavel
Re: [Wien] compilation problems in the new pes module
On Wed, 2018-07-11 at 08:41 +0200, Peter Blaha wrote:
> PES is for valence PES.
> Ti 3s and 3p are "core" states (from the chemists point of view).
>
> You should specify the PDOS as I said before: Use
>
> configure_int
>
> total
> 1 s,d
> 2 s,p
>
> Regards

Thanks, it seems I misunderstood the terminology; in the Wien2k context I understood "core" states as the states which are defined in case.inc.

BTW even with the fixed DOS files, there are still some problems with gfortran. Besides the few uninitialized variables indicated earlier, the biggest problems are with the quad precision. A lot of the uninitialized values come from the fact that the following code:

REAL*16,DIMENSION(1:5) :: temp_data
READ(4,*,IOSTAT=io)(temp_data(n),n=1,5)

where the file contains lines like this:

100. 42.75000 1.998000 0.142 -1.219E-06

does not work, i.e. the values that are read are completely messed up:

(gdb) print temp_data
$6 = (0, 2.56176902595376107227e+2170, 2.01476099468585859245e+3117, -5.32047756928494954508e-2034, )

IMO this should work (i.e. this is likely a gfortran problem). Surprisingly, if I try to isolate this into a simple test case it works, so there is some subtle bug going on. Converting the code to doubles fixes this particular case, but this is more of a workaround than a fix. I'll report when I have a better idea of what is going on. Thanks for all the help.

Best regards
Pavel
Re: [Wien] compilation problems in the new pes module
-- Original e-mail --
From: Peter Blaha
To: wien@zeus.theochem.tuwien.ac.at
Date: 10. 7. 2018 22:01:12
Subject: Re: [Wien] compilation problems in the new pes module

"I guess your case.int (and thus the dos files) is wrong."

This is definitely possible ;-)

"The output says: Valence orbitals according to periodic table data: Ti4s3d O 2s2p"

What about the Ti 3s, and Ti 3d? I can see them in my DOS (around -55 and -32eV as expected), are they too deep?

"so we need the Ti 4s and 3d PDOS and O 2s,2p PDOS (and the total DOS)"

Thanks for the clarification, but I think I still don't quite get this. According to your comment I need only the total DOS, Ti4s dos, Ti3d dos, O2s dos and O2p dos?

The userguide says "You need to generate the partial DOS for ALL atoms and ALL “chemical angular momenta” (eg. C-s and C-p; or Ti-s and Ti-d) using the tetra program." This is also not very clear since, contrary to your comment, it does not speak about the total DOS at all. And it's also not very clear to me if the "generate the partial DOS for ALL atoms" belongs to the rest of the sentence, i.e. if I also need the O-total dos and Ti-total dos in addition to the O-s, O-p, Ti-s, Ti-d?

Would you be so kind as to share some example case, as I believe it might save some further explanation?

BTW can the [Bagheri and Blaha 2018] manuscript already be accessed somewhere?

Best regards
Pavel Ondračka

"You always have to define the "chemical valence orbitals", but not all possible PDOS."
Re: [Wien] compilation problems in the new pes module
So after applying the suggested compilation fixes (plus the one uninitialized-variable fix suggested later) I started to test the pes program. My test case was simple anatase TiO2. The program finishes fine, however the spectrum is strange (it looks like the Ti3p peak is almost invisible). So I ran it under valgrind to see if there are some obvious errors.

  ==16577== Conditional jump or move depends on uninitialised value(s)
  ==16577==    at 0x40DD43: read_database2_ (read_database2.f:26)
  ==16577==    by 0x402CE3: MAIN__ (pes.f:151)
  ==16577==    by 0x40176C: main (pes.f:3)
  ==16577== Uninitialised value was created by a stack allocation
  ==16577==    at 0x40DC5A: read_database2_ (read_database2.f:1)

ios is used uninitialized at read_database2.f:26

  do while (ios == 0)

---

  ==22496== Conditional jump or move depends on uninitialised value(s)
  ==22496==    at 0x41054E: abs_smooth (SPLINE.f:170)
  ==22496==    by 0x41054E: setup_ (SPLINE.f:88)
  ==22496==    by 0x41099C: spline_ (SPLINE.f:15)
  ==22496==    by 0x40E791: read_database2_ (read_database2.f:111)
  ==22496==    by 0x402CE3: MAIN__ (pes.f:151)
  ==22496==    by 0x40176C: main (pes.f:3)
  ==22496== Uninitialised value was created by a stack allocation
  ==22496==    at 0x410980: spline_ (SPLINE.f:1)

This is the delta_x variable, first defined in SPLINE and passed down to setup in

  SPLINE.f:15   call setup(delta_x,X,F,N,strt,stp,J,interpolation)

further down to abs_smooth at

  SPLINE.f:88   call abs_smooth(m4 - m3, delta_x, w1)

where it is finally used in

  SPLINE.f:170  if (x >= delta_x) then

without initialization.

---

  ==29640== Conditional jump or move depends on uninitialised value(s)
  ==29640==    at 0x40E3EA: unpolarized (read_database2.f:153)
  ==29640==    by 0x40E3EA: read_database2_ (read_database2.f:79)
  ==29640==    by 0x402CE3: MAIN__ (pes.f:151)
  ==29640==    by 0x40176C: main (pes.f:3)
  ==29640== Uninitialised value was created by a stack allocation
  ==29640==    at 0x40DC5A: read_database2_ (read_database2.f:1)

This is the counter variable at

  read_database2.f:153  do I=1,counter

Even though it should be initialized at

  read_database2.f:40   counter = 0

that line was not reached at this point, since it is guarded by several ifs, so it is possible that in some cases the variable stays uninitialized. Should the counter ever be 0 at the read_database2.f:153 line?

--

  ==22325== Conditional jump or move depends on uninitialised value(s)
  ==22325==    at 0x4801F7A0: __multf3 (in /usr/lib64/libgcc_s-8-20180502.so.1)
  ==22325==    by 0x41251F: unpolarized_ (read_database2.f:155)
  ==22325==    by 0x412E4A: read_database2_ (read_database2.f:79)
  ==22325==    by 0x403414: MAIN__ (pes.f:151)
  ==22325==    by 0x40560A: main (pes.f:3)
  ==22325== Uninitialised value was created by a stack allocation
  ==22325==    at 0x4125C9: read_database2_ (read_database2.f:1)

This is the uninitialized beta variable. Obviously, similar to the above warning, we somehow missed the initialization block starting at read_database2.f:46 and ending at read_database2.f:56.

There are more troubles ahead (another 1000+ valgrind warnings). At this point I stopped, since I suspect something went wrong earlier and there is no reason to debug code paths which should probably never be reached. It is possible that I messed something up in the initialization or that my input files are not in order...

The complete output looks like this:

  Please enter the excitation energy (ev):
  1486.6
  nat: 2
  mult: 2 4
  aname :Ti O
  ___
  Valence orbitals according to periodic table data:
  Ti4s3d O 2s2p
  ___
  opening 9 100k_7Rk_PBE.dos2ev
  opening 10 100k_7Rk_PBE.dos3ev
  Valence Partial orbital found in case.DOS file:
  Ti4s 1 Ti3d 1 O 2s 2 O 2p 2
  Enter database (default: 1)
  1 Total cross section & Asymmetry & Non-dipole parameters 100-1 eV (Trzhaskovskaya etal., Unpolarized & linearly polarized X-ray source) Recommended option !
  2 Total cross section 10-1500eV (Yeh & Lindau 1985; Unpolarized X-ray source)
  3 No cross sections (for testing renormalization)
  1
  Enter Calulation Scheme (default: 1)
  1 Unpolarized X-ray source (general)
  Linearlly polarized X-ray source
  2 Dipole & NON Dipole - parallel
  3 Dipole - perpendicular
  4 Dipole & NON Dipole - perpendicular
  5 LDAD
  1
  __
  Partial Orbital    Cross section
  Ti4s               0.54346D-01
  Ti3d               0.54346D-01
  O 2s               0.54346D-01
  O 2p               0.54346D-01
  Continue with q_sphere?(Recommended)(Y/n)
  y
  Partial Orbital    Average q_sphere
  Ti4s               0.7378D-01
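With more than a thousand valgrind warnings, it can help to collapse them into a list of distinct source locations before reading any single trace. The following is only a minimal sketch in Python, assuming the log has the layout shown in the traces above (a warning line followed by an "at 0x...: function (file:line)" frame); the function name count_uninitialised is a hypothetical helper, not part of valgrind or WIEN2k.

```python
import re
from collections import Counter

# Matches the first stack frame of a valgrind report, e.g.
# "==16577==    at 0x40DD43: read_database2_ (read_database2.f:26)"
FRAME = re.compile(r"^==\d+==\s*at 0x[0-9A-Fa-f]+: (\S+) \(([^)]+)\)")
WARNING = "depends on uninitialised value"


def count_uninitialised(log_lines):
    """Count uninitialised-value warnings per (function, file:line)."""
    counts = Counter()
    expect_frame = False
    for line in log_lines:
        if WARNING in line:
            # The next line should be the innermost frame of this warning.
            expect_frame = True
        elif expect_frame:
            match = FRAME.match(line)
            if match:
                counts[(match.group(1), match.group(2))] += 1
            expect_frame = False
    return counts
```

Calling count_uninitialised(open("valgrind.log")) (the file name is an assumption) and printing counts.most_common() would show which few source lines produce most of the warnings, which is usually where the first missing initialization sits.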
[Wien] crash with -it
Dear Wien2k mailing list,

I noticed that the old crash with -it (with gfortran only, of course) is still present in version 18.1:

  At line 140 of file jacdavblock_tmp_.F (unit = 200, file = './case.storeHinv_1_proc_0')
  Fortran runtime error: Sequential READ or WRITE not allowed after EOF marker, possibly use REWIND or BACKSPACE

It was first reported here:
https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg17338.html
and some analysis was also provided by Gavin:
https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg17343.html
but it has probably been forgotten since then. The workaround is simply to use the -noHinv flag, so this is not really critical, but it would be nice to have it fixed in some future release anyway.

Best regards
Pavel
Re: [Wien] supported ELPA versions?
On Mon, 2018-07-09 at 14:20 +, Ruh, Thomas wrote:
> Dear Pavel,
>
> we used the ELPA version 2015.11.001 for implementation. There this module is still present - apparently there was a change of the interface between this version and the newer ones.
> I will look into getting the newest ELPA working with WIEN2k, however, at the moment, I am having problems even compiling the newest ELPA version on our system. Did the installation work for you without problem?

Dear Thomas,

I had the 2017.05.003 and 2018.05.001 packages already available on my system, that's why I tried them in the first place. A quick compile of the 2018.05.001 version works fine (I tested only the MPI version, no OMP or GPU stuff though). What problems do you see?

> I will keep you updated regarding the usability of newer versions, but in the meantime you could use version 2015.11.001.

Thanks, 2015.11.001 compiles fine, but now I'm seeing:

  Operating system error: Cannot allocate memory
  Allocation would exceed memory limit

which is strange, since this is with the mpi-benchmark, which doesn't need that much memory in the first place. I'm only starting to test now, with a few processes on a single node and more than 50 GB of free memory, so memory pressure seems unlikely; with ScaLAPACK only it uses less than 10% of the available memory. I checked the ulimit as an obvious suspect, but it is set to unlimited. I will need to dig into the code to see how much memory it actually wants...

Best regards
Pavel

> Kind regards,
> Thomas
> ________
> From: Wien on behalf of Pavel Ondračka
> Sent: Monday, 9 July 2018 15:26
> To: wien@zeus.theochem.tuwien.ac.at
> Subject: [Wien] supported ELPA versions?
>
> Dear Wien2k mailing list,
>
> what is the recommended ELPA version? I've tried the 2017.05.003 and 2018.05.001 versions with no luck (missing mod_blacs_infrastructure.mod file).
>
> Best regards
> Pavel
Re: [Wien] compilation problems in the new pes module
Thanks for the fixes, the code compiles now. I've prepared a patch so that other users don't have to patch by hand, and also for Gavin if he continues the great work of collecting fixes in his repo. Copy to the SRC_pes folder and apply with

  patch -p1 < pes-patch.txt

Best regards
Pavel

On Tue, 2018-07-10 at 08:33 +0200, Peter Blaha wrote:
> Thanks for the report. See inlined comments.
>
> PS: Unfortunately, when I looked into the code, I saw it is in terrible shape. It mixes real*4 up to real*16 variables randomly and has a couple of unclear things in it (for instance just before calling spline
>
> Peter Blaha
>
> > I'm interested in the new pes module. Unfortunately, the compilation of the module faces some problems with gfortran, specifically:
> >
> > pes.f:114:19:
> >     read (*,'(I)') database
> > Error: Nonnegative width required in format string at (1)
> > pes.f:146:21:
> >     read (*,'(i)') scheme
> > Error: Nonnegative width required in format string at (1)
> >
> > - This is nonstandard behavior, looking at the expected values it should be probably I1 in both cases
>
> Yes I1 is fine.
>
> > pes.f:235:39:
> >     500 format(A,A16,2x,A16,2x,<7>(A16,2x))
> > Error: Unexpected element ‘<’ in format string at (1)
> > pes.f:239:42:
> >     600 format(f16.8,2x,e16.8,2x,<7>(e16.8,2x))
> > Error: Unexpected element ‘<’ in format string at (1)
> > ind_p.f:39:26:
> >     100 format(<15>A1)
> > Error: Unexpected element ‘<’ in format string at (1)
> > optimize_charge.f:239:21:
> >     1013 FORMAT(<3>A15)
> > Error: Unexpected element ‘<’ in format string at (1)
> >
> > - Another nonstandard ifort specific stuff. Since the value is constant the brackets are not needed anyway.
>
> Yes, the "<" and ">" characters should simply be removed.
>
> > pes.f:266:22:
> >     800 format(4x,I)
> > Error: Non-negative width required in format string at (1)
> > optimize_charge.f:64:25:
> >     1001 FORMAT(3x,A1,I)
>
> This should be I3
>
> > Error: Nonnegative width required in format string at (1)
> > read_dos.f:41:21:
> >     301 FORMAT (7x,I)
>
> This should be I5
>
> > Error: Nonnegative width required in format string at (1)
> > read_dos.f:44:45:
> >     400 format(4x,f10.5,10x,i3,10x,i8,20x,f)
>
> should be f10.5
>
> > Error: Nonnegative width required in format string at (1)
> >
> > - No idea here about the required width, but needs to be set too.
> >
> > pes.f:279:26:
> >     if((ERROR.eq.0).AND.(STR.eq.'#')) then
>
> Yes, of course this should be STTR instead of STR
>
> > Error: Operands of comparison operator ‘.eq.’ at (1) are INTEGER(4)/CHARACTER(1)
> >
> > - It looks like the STR is undefined, probably a typo (did author want STTR in the comparison)?
> >
> > read_dos.f:51:36:
> >     600 format(f10.5,f14.8)
>
> Should simply be: 600 format(f10.5,7f14.8)
>
> > Error: Unexpected element ‘<’ in format string at (1)
> > Find_p.f:46:25:
> >     200 format(A1)
>
> It should be 15A1
>
> > Error: Unexpected element ‘<’ in format string at (1)
> > Find_p.f:50:25:
> >     300 format(A1)
>
> Also here: 15A1
>
> > Error: Unexpected element ‘<’ in format string at (1)
> >
> > - Can be rewritten with combination of internal output and string formats.
> >
> > for example:
> >     write(Anumber,200)(temp(l),l=1,k-1)
> >     200 format(A1)
> >
> > should be equivalent to
> >
> >     character(len=10) :: frmt
> >     write(frmt,'("(",I0,"A1)")') j-1
> >     write(Anumber,frmt)(temp(l),l=1,k-1)
> >
> > optimize_charge.f:103:9:
> >     IF(PCHECK(j).EQ. .FALSE.)THEN
> > Error: Logicals at (1) must be compared with .eqv. instead of .eq.
> > optimize_charge.f:329:12:
> >     IF (CHECK.EQ..FALSE.) THEN
> > Error: Logicals at (1) must be compared with .eqv. instead of .eq.
> > read_database2.f:68:5:
> >     if (data_exist.eq..false.)then
> > Error: Logicals at (1) must be compared with .eqv. instead of .eq.
> >
> > - Use .eqv. as suggested.
>
> Yes, in all these cases it should be .eqv.
>
> > SPLINE.f:15:14:
> >     call setup(p0, p1, p2, p3,
[Wien] supported ELPA versions?
Dear Wien2k mailing list,

what is the recommended ELPA version? I've tried the 2017.05.003 and 2018.05.001 versions with no luck (missing mod_blacs_infrastructure.mod file).

Best regards
Pavel
[Wien] compilation problems in the new pes module
Dear Wien2k mailing list,

I'm interested in the new pes module. Unfortunately, the compilation of the module faces some problems with gfortran, specifically:

pes.f:114:19:
    read (*,'(I)') database
Error: Nonnegative width required in format string at (1)
pes.f:146:21:
    read (*,'(i)') scheme
Error: Nonnegative width required in format string at (1)

- This is nonstandard behavior; looking at the expected values, it should probably be I1 in both cases.

pes.f:235:39:
    500 format(A,A16,2x,A16,2x,<7>(A16,2x))
Error: Unexpected element ‘<’ in format string at (1)
pes.f:239:42:
    600 format(f16.8,2x,e16.8,2x,<7>(e16.8,2x))
Error: Unexpected element ‘<’ in format string at (1)
ind_p.f:39:26:
    100 format(<15>A1)
Error: Unexpected element ‘<’ in format string at (1)
optimize_charge.f:239:21:
    1013 FORMAT(<3>A15)
Error: Unexpected element ‘<’ in format string at (1)

- More nonstandard, ifort-specific syntax. Since the value is constant, the brackets are not needed anyway.

pes.f:266:22:
    800 format(4x,I)
Error: Non-negative width required in format string at (1)
optimize_charge.f:64:25:
    1001 FORMAT(3x,A1,I)
Error: Nonnegative width required in format string at (1)
read_dos.f:41:21:
    301 FORMAT (7x,I)
Error: Nonnegative width required in format string at (1)
read_dos.f:44:45:
    400 format(4x,f10.5,10x,i3,10x,i8,20x,f)
Error: Nonnegative width required in format string at (1)

- No idea here about the required width, but it needs to be set too.

pes.f:279:26:
    if((ERROR.eq.0).AND.(STR.eq.'#')) then
Error: Operands of comparison operator ‘.eq.’ at (1) are INTEGER(4)/CHARACTER(1)

- It looks like STR is undefined, probably a typo (did the author want STTR in the comparison?).

read_dos.f:51:36:
    600 format(f10.5,f14.8)
Error: Unexpected element ‘<’ in format string at (1)
Find_p.f:46:25:
    200 format(A1)
Error: Unexpected element ‘<’ in format string at (1)
Find_p.f:50:25:
    300 format(A1)
Error: Unexpected element ‘<’ in format string at (1)

- These can be rewritten with a combination of internal output and string formats.

For example:
    write(Anumber,200)(temp(l),l=1,k-1)
    200 format(A1)

should be equivalent to

    character(len=10) :: frmt
    write(frmt,'("(",I0,"A1)")') k-1
    write(Anumber,frmt)(temp(l),l=1,k-1)

optimize_charge.f:103:9:
    IF(PCHECK(j).EQ. .FALSE.)THEN
Error: Logicals at (1) must be compared with .eqv. instead of .eq.
optimize_charge.f:329:12:
    IF (CHECK.EQ..FALSE.) THEN
Error: Logicals at (1) must be compared with .eqv. instead of .eq.
read_database2.f:68:5:
    if (data_exist.eq..false.)then
Error: Logicals at (1) must be compared with .eqv. instead of .eq.

- Use .eqv. as suggested.

SPLINE.f:15:14:
    call setup(p0, p1, p2, p3, delta_x,X,F,N,strt,stp,J,interpolation)
Error: Explicit interface required for ‘setup’ at (1): allocatable argument

- No idea here :-(

read_int.f:18:25:
    read(22,100),ndos
Warning: Legacy Extension: Comma before i/o item list at (1)
Find_p.f:66:65:
    write(output_names(output_counter),500) ,aname(m),composition(m,n),m
Warning: Legacy Extension: Comma before i/o item list at (1)

- Some unrelated, harmless, easy-to-fix warnings.

Most of the fixes are probably obvious, except the missing widths in the read formats, where the proper fix requires some knowledge about the structure of the input, and also the "Explicit interface required" error.

Best regards
Pavel
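Two of the fixes in this message are purely mechanical: dropping the angle brackets around a constant repeat count, and replacing .eq. with .eqv. when the right-hand side is a logical literal. As a rough sketch only (this is not the patch that was posted to the list, and fix_line is a hypothetical helper name), they could be applied line by line in Python like this:

```python
import re


def fix_line(line):
    """Apply two mechanical gfortran portability fixes to one source line."""
    # <15>A1 -> 15A1: drop the angle brackets around a constant repeat count.
    line = re.sub(r"<(\d+)>", r"\1", line)
    # x.eq..false. -> x .eqv. .false.: logicals must be compared with .eqv.
    line = re.sub(
        r"\s*\.eq\.\s*(\.(?:true|false)\.)",
        r" .eqv. \1",
        line,
        flags=re.IGNORECASE,
    )
    return line
```

The sketch deliberately leaves everything else alone: bare I and f descriptors need a width that depends on the input layout, and a variable expression inside the angle brackets needs the run-time format string approach described above, so neither can be fixed by a simple substitution.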
Re: [Wien] IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
On Tue, 2018-05-29 at 13:58 +0200, Peter Blaha wrote: > I'm working on the new distribution, WIEN2k_18, where this variable > should be initialized. > > It should come out within the next days ... Thanks a lot, I'm looking forward to the new release. Best regards Pavel > > On 05/29/2018 12:36 PM, Pavel Ondračka wrote: > > On Wed, 2018-05-09 at 14:17 -0500, Laurence Marks wrote: > > > You are right, this is a bug (potentially severe). In my local > > > version I replaced somm with a more standard integration as somm > > > interpolates to the origin (which is not right). > > > > Will there be a fix for this bug? > > > > Best regards > > Pavel > > > > > > On Wed, May 9, 2018 at 2:11 PM, Pavel Ondračka > > ail. > > > cz> wrote: > > > > -- Původní e-mail -- > > > > Od: Laurence Marks > > > > Komu: A Mailing list for WIEN2k users > > > n.ac > > > > .at> > > > > Datum: 9. 5. 2018 18:18:21 > > > > Předmět: Re: [Wien] IEEE_UNDERFLOW_FLAG IEEE_DENORMAL > > > > > > > > > Zamt is set by somm, which is a slightly inaccurate Simpson > > > > > summation (the accuracy does not matter). Hence it does not > > > > > need > > > > > to be set previously. > > > > > > > > Right, it is indeed set in the somm subroutine. The problem is > > > > that > > > > the DA variable (the local name for zamt variable in the somm > > > > subroutine) is set for the first time on line 11, while it is > > > > used > > > > (read) for the first time earlier on line 10 (and this is the > > > > line > > > > when the gcc and valgrind complains)! Hence when the subroutine > > > > is > > > > called for the first time, the results depends on read from > > > > uninitialized memory. 
> > > > Best regards > > > > Pavel > > > > > > > > > _ > > > > > Professor Laurence Marks > > > > > "Research is to see what everybody else has seen, and to > > > > > think > > > > > what nobody else has thought", Albert Szent-Gyorgi > > > > > www.numis.northwestern.edu > > > > > > > > > > On Wed, May 9, 2018, 10:42 AM Pavel Ondračka > > > > emai > > > > > l.cz> wrote: > > > > > > Laurence Marks píše v St 09. 05. 2018 v 11:51 +: > > > > > > > This appears to be due to a silly approach in gfortran, > > > > > > > and > > > > > > > > > > > > almost > > > > > > > certainly is not an error/problem and can be ignored -- > > > > > > > see h > > > > > > > > > > > > ttps://urldefense.proofpoint.com/v2/url?u=https- > > > > > > 3A__s=DwIGaQ=yHlS04HhBraes5BQ9ueu5zKhE7rtNXt_d012z2PA6w > > > > > > s= > > > > > > U_T4PL6jwANfAy4rnxTj8IUxm818jnvqKFdqWLwmqg0=R1lxmwh4Y3r4y > > > > > > Rx1Z > > > > > > yR4NN9mpSe7RuaT974qRm6Uhfw=KDobN5dacXxTk7OUpO1BdJBY45FBX3 > > > > > > 4Hf6 > > > > > > Q9hTempg0= > > > > > > > tackoverflow.com/questions/44308577/ieee-underflow-flag- > > > > > > > ieee- > > > > > > > denormal-in-fortran-77. > > > > > > > > > > > > > > > > > > > While I do agree that in most cases this is harmless, it > > > > > > can > > > > > > also > > > > > > suggest a bug. > > > > > > > > > > > > I only looked at dstart for the TiC and I have this > > > > > > specific > > > > > > example: > > > > > > > > > > > > dstart for a TiC case prints a "Note: The following > > > > > > floating- > > > > > > point > > > > > > exceptions are signalling: IEEE_DENORMAL" line. If you trap > > > > > > this with > > > > > > -ffpe-trap='denormal' flag, to inspect in gdb, the > > > > > > offending > > > > > > line is > > > > > > then: > > > > > > Program received signal SIGFPE, Arithmetic exception. > > > > > > 0x0041dc48 in somm (dr=..., dp=..., > > > > > > dpas=0.
Re: [Wien] IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
On Wed, 2018-05-09 at 14:17 -0500, Laurence Marks wrote: > You are right, this is a bug (potentially severe). In my local > version I replaced somm with a more standard integration as somm > interpolates to the origin (which is not right). Will there be a fix for this bug? Best regards Pavel > > On Wed, May 9, 2018 at 2:11 PM, Pavel Ondračka cz> wrote: > > -- Původní e-mail -- > > Od: Laurence Marks > > Komu: A Mailing list for WIEN2k users > .at> > > Datum: 9. 5. 2018 18:18:21 > > Předmět: Re: [Wien] IEEE_UNDERFLOW_FLAG IEEE_DENORMAL > > > > > Zamt is set by somm, which is a slightly inaccurate Simpson > > > summation (the accuracy does not matter). Hence it does not need > > > to be set previously. > > > > Right, it is indeed set in the somm subroutine. The problem is that > > the DA variable (the local name for zamt variable in the somm > > subroutine) is set for the first time on line 11, while it is used > > (read) for the first time earlier on line 10 (and this is the line > > when the gcc and valgrind complains)! Hence when the subroutine is > > called for the first time, the results depends on read from > > uninitialized memory. > > Best regards > > Pavel > > > > > _ > > > Professor Laurence Marks > > > "Research is to see what everybody else has seen, and to think > > > what nobody else has thought", Albert Szent-Gyorgi > > > www.numis.northwestern.edu > > > > > > On Wed, May 9, 2018, 10:42 AM Pavel Ondračka > > l.cz> wrote: > > > > Laurence Marks píše v St 09. 05. 
2018 v 11:51 +: > > > > > This appears to be due to a silly approach in gfortran, and > > > > almost > > > > > certainly is not an error/problem and can be ignored -- see h > > > > ttps://urldefense.proofpoint.com/v2/url?u=https- > > > > 3A__s=DwIGaQ=yHlS04HhBraes5BQ9ueu5zKhE7rtNXt_d012z2PA6ws= > > > > U_T4PL6jwANfAy4rnxTj8IUxm818jnvqKFdqWLwmqg0=R1lxmwh4Y3r4yRx1Z > > > > yR4NN9mpSe7RuaT974qRm6Uhfw=KDobN5dacXxTk7OUpO1BdJBY45FBX34Hf6 > > > > Q9hTempg0= > > > > > tackoverflow.com/questions/44308577/ieee-underflow-flag-ieee- > > > > > denormal-in-fortran-77. > > > > > > > > > > > > > While I do agree that in most cases this is harmless, it can > > > > also > > > > suggest a bug. > > > > > > > > I only looked at dstart for the TiC and I have this specific > > > > example: > > > > > > > > dstart for a TiC case prints a "Note: The following floating- > > > > point > > > > exceptions are signalling: IEEE_DENORMAL" line. If you trap > > > > this with > > > > -ffpe-trap='denormal' flag, to inspect in gdb, the offending > > > > line is > > > > then: > > > > Program received signal SIGFPE, Arithmetic exception. > > > > 0x0041dc48 in somm (dr=..., dp=..., > > > > dpas=0.013585429144994965, > > > > da=1.1588924125005173e-310, m=0, np=935) at somm.f:10 > > > > 10D1=DA+MM > > > > > > > > so this line looks completely harmless and some prints show why > > > > the > > > > compiler notes about this: > > > > > > > > (gdb) print DA > > > > $1 = 1.1588924125005173e-310 > > > > (gdb) print MM > > > > $2 = 1 > > > > > > > > i.e. we just add a really small number (denormal, since > > > > DOUBLE_MIN is > > > > around 1.8e-308) to 1, which is completely OK. But lets take a > > > > look > > > > where this incredibly small value comes from... 
> > > > > > > > (gdb) up > > > > #1 0x004143c6 in make_spheres (lcore=.FALSE., luse=7) > > > > at > > > > make_spheres.F:81 > > > > 81 call > > > > somm(rat(1,ia),rhoat(1,ia),dx(ia),zamt,0,nptat(ia)) > > > > > > > > So this is the zamt variable, surprisingly grepping around for > > > > zamt > > > > finds nothing. As far as I can see it is not declared or > > > > initialized > > > > anywere in dstart. If I just missed something please correct > > > > me! The > > > > same goes for the zamt1 and zamt2. > > > > > > > > Note that in this case we get lucky since the random memory > > > > value i
Re: [Wien] Problems when trying to plot E vs c/a
Riyajul Islam wrote on Fri, 18. 05. 2018 at 19:25 +0530:
> I also have the same problem with E vs c/a plot. Then when I replace optimize.pl with your attached one I get an error
> Failed to exec /home/dipraj/wien2k/SRC_w2web/htdocs/exec/optimize.pl : Permission denied

Dear Riyajul,

you probably have wrong file permissions, try:

  chmod +x /home/dipraj/wien2k/SRC_w2web/htdocs/exec/optimize.pl

Best regards
Pavel

> On 17 May 2018 at 17:00, Fecher, Gerhard wrote:
> > Hello Peter,
> > thanks for the files.
> > Unfortunately, optimize.pl still doesn't show the result of the fit (the plot is there).
> > The output, in a shortened version, is:
> >
> > Fit of: E = a1 + a2*x + a3*x^2 + a4*x^3 + a5*x^4
> > a1  1.000
> > a2  0.000  1.000
> > a3 -0.725 -0.000  1.000
> > a4 -0.000 -0.930  0.000  1.000
> > a5  0.648  0.000 -0.985 -0.000  1.000
> >
> > Line 174 should contain at least tail -15 (instead of -5),
> > which results in the output of the parameters and the correlation matrix:
> >
> > Fit of: E = a1 + a2*x + a3*x^2 + a4*x^3 + a5*x^4
> > Final set of parameters    Asymptotic Standard Error
> > =====
> > a1 = -5573.9       +/- 3.634e-06 (6.519e-08%)
> > a2 = 4.23124e-06   +/- 9.205e-06 (217.5%)
> > a3 = 0.000137795   +/- 2.93e-05  (21.26%)
> > a4 = 7.61902e-06   +/- 1.037e-05 (136.1%)
> > a5 = -1.43164e-05  +/- 2.725e-05 (190.3%)
> >
> > correlation matrix of the fit parameters:
> >     a1     a2     a3     a4     a5
> > a1  1.000
> > a2  0.000  1.000
> > a3 -0.725 -0.000  1.000
> > a4 -0.000 -0.930  0.000  1.000
> > a5  0.648  0.000 -0.985 -0.000  1.000
> >
> > A shorter version is to use tail -15 fit.log | head -7,
> > because I don't think that the correlation matrix is needed in the w2web output (it's found in fit.log anyway).
> > The result is then only:
> >
> > Fit of: E = a1 + a2*x + a3*x^2 + a4*x^3 + a5*x^4
> > Final set of parameters    Asymptotic Standard Error
> > =====
> > a1 = -5573.9       +/- 3.634e-06 (6.519e-08%)
> > a2 = 4.23124e-06   +/- 9.205e-06 (217.5%)
> > a3 = 0.000137795   +/- 2.93e-05  (21.26%)
> > a4 = 7.61902e-06   +/- 1.037e-05 (136.1%)
> > a5 = -1.43164e-05  +/- 2.725e-05 (190.3%)
> >
> > The optimize.pl file changed in the latter way is attached.
> >
> > Ciao
> > Gerhard
> >
> > DEEP THOUGHT in D. Adams; Hitchhikers Guide to the Galaxy:
> > "I think the problem, to be quite honest with you,
> > is that you have never actually known what the question is."
> >
> > Dr. Gerhard H. Fecher
> > Institut of Inorganic and Analytical Chemistry
> > Johannes Gutenberg - University
> > 55099 Mainz
> > and
> > Max Planck Institute for Chemical Physics of Solids
> > 01187 Dresden
> >
> > From: Wien [wien-boun...@zeus.theochem.tuwien.ac.at] on behalf of Peter Blaha [pbl...@theochem.tuwien.ac.at]
> > Sent: Thursday, 17 May 2018 12:32
> > To: wien@zeus.theochem.tuwien.ac.at
> > Subject: Re: [Wien] Problems when trying to plot E vs c/a
> >
> > Thanks for the report.
> >
> > Modified eplot_lapw
> > and
> > SRC_w2web/htdocs/exec/optimize.pl
> >
> > attached.
> >
> > On 05/16/2018 04:20 PM, Fecher, Gerhard wrote:
> > > Dear c/a fitters,
> > > This concerns the latest Wien2k version.
> > > I receive only the content of test_opt.analysis when I try with w2web to plot E vs c/a, but neither the result of the fit nor the plot is shown. This seems to be a problem with the present version of the eplot script.
> > >
> > > When I use the eplot script of version 14.2 it is nearly ok, however, there are still two issues: instead of the result of the fit, the "correlation matrix of the fit parameters" is shown, and the figure is missing.
> > > The reason is that eplot and optimize.pl do not work well together: optimize.pl prints the last 5 lines of fit.log (but the result is before these lines) and expects the graph as case.c_over_a.png (but it has a different name, .coa.)
> > >
> > > This can be solved by changing the two lines
> > > line 169 change "CASE.c_over_a.png"
> > >   $umps = qx(cp $DIR/$CASE.coa.png $tempdir/$SID-$$.png);
> > > line 173 change "tail -5"
> > >   $OUT .= qx(cd $DIR;echo ' ';echo "Fit of: E = a1 + a2*x + a3*x^2 + a4*x^3 + a5*x^4";tail -15 fit.log);
> > >
> > > or indeed, by changing eplot (I just did not find fast how to supress the output of the correlation
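The "tail -15 fit.log | head -7" selection suggested in the quoted message can be written down explicitly to make clear which lines end up in the w2web output. This is a sketch in Python rather than the Perl used in optimize.pl, and it assumes, as the thread does, that the parameter block starts 15 lines before the end of fit.log; final_parameters is a hypothetical helper name.

```python
def final_parameters(log_lines, tail=15, head=7):
    """Return the first `head` lines of the last `tail` lines of a log.

    Equivalent to `tail -15 fit.log | head -7` with the default arguments.
    """
    return log_lines[-tail:][:head]
```

With the defaults this keeps the two header lines and the five parameter lines a1 to a5, and drops the correlation matrix that follows them. If the fit had a different number of parameters, both offsets would have to change, which is one reason a fixed line count is fragile.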
Re: [Wien] IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
-- Original e-mail --
From: Laurence Marks
To: A Mailing list for WIEN2k users
Date: 9. 5. 2018 21:18:20
Subject: Re: [Wien] IEEE_UNDERFLOW_FLAG IEEE_DENORMAL

> You are right, this is a bug (potentially severe). In my local version I replaced somm with a more standard integration as somm interpolates to the origin (which is not right).

Thanks for looking into this, hopefully the fix is not too hard.

BTW, if I provide a similar analysis for the other occurrences of floating-point exceptions and/or valgrind uninitialized read/write errors, would you be willing to check whether those are harmless or real problems as well?

Best regards
Pavel
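For readers unfamiliar with what a "more standard integration" on a radial mesh could look like, here is a minimal sketch of composite Simpson integration on a logarithmic mesh r_i = r0*exp(i*dx). It is an illustration under stated assumptions only: it is neither the somm routine nor the replacement mentioned above, the function name simpson_log_mesh is hypothetical, and it ignores the contribution between the origin and the first mesh point.

```python
import math


def simpson_log_mesh(f, r0, dx, npts):
    """Integrate f(r) dr from r0 to r0*exp((npts-1)*dx) on a logarithmic mesh."""
    if npts < 3 or npts % 2 == 0:
        raise ValueError("composite Simpson needs an odd number of points >= 3")
    total = 0.0
    for i in range(npts):
        r = r0 * math.exp(i * dx)
        # dr = r dx, so on the uniform x grid the integrand is f(r) * r.
        g = f(r) * r
        if i == 0 or i == npts - 1:
            weight = 1.0
        elif i % 2 == 1:
            weight = 4.0
        else:
            weight = 2.0
        total += weight * g
    return total * dx / 3.0
```

The accumulator total is set to zero before the loop, so the result cannot depend on whatever value happened to be in memory beforehand, which is the property the thread is concerned with.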
Re: [Wien] IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
-- Original e-mail --
From: Laurence Marks <l-ma...@northwestern.edu>
To: A Mailing list for WIEN2k users <wien@zeus.theochem.tuwien.ac.at>
Date: 9. 5. 2018 18:18:21
Subject: Re: [Wien] IEEE_UNDERFLOW_FLAG IEEE_DENORMAL

> Zamt is set by somm, which is a slightly inaccurate Simpson summation
> (the accuracy does not matter). Hence it does not need to be set
> previously.

Right, it is indeed set in the somm subroutine. The problem is that the DA variable (the local name for the zamt variable inside somm) is assigned for the first time on line 11, while it is first read one line earlier, on line 10 (this is the line that gcc and valgrind complain about). Hence, when the subroutine is called for the first time, the result of that read depends on uninitialized memory.

Best regards
Pavel
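The read-before-assign pattern discussed in this thread can be mimicked with a toy model. The sketch below is not the WIEN2k somm routine: `accumulate_like` and its arguments are hypothetical names, and the "garbage" is simulated because Python has no uninitialized memory. It only shows that the returned sum can be independent of the incoming value while the first read is not; whether that first read matters in somm depends on how D1 is used afterwards, which the thread does not show.

```python
def accumulate_like(samples, acc):
    """Toy routine (hypothetical, NOT somm): reads `acc` before assigning it."""
    first_read = acc + 1.0   # the incoming value is read first ...
    acc = sum(samples)       # ... and only afterwards overwritten
    return acc, first_read

samples = [0.5, 1.5, 2.0]

# Simulated "random memory": the tiny value seen in gdb, a clean zero,
# and an arbitrary larger value standing in for less lucky garbage.
result_a, read_a = accumulate_like(samples, 1.1588924125005173e-310)
result_b, read_b = accumulate_like(samples, 0.0)
result_c, read_c = accumulate_like(samples, 3.0e5)

print(result_a == result_b == result_c)  # the final sum ignores the input
print(read_a == read_b)                  # tiny garbage behaves like zero
print(read_c == read_b)                  # larger garbage changes the read
```

Initializing the variable at the call site, as proposed in the thread, removes the dependence on whatever happens to be in memory.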
Re: [Wien] IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
Laurence Marks wrote on Wed, 09. 05. 2018 at 11:51:
> This appears to be due to a silly approach in gfortran, and almost
> certainly is not an error/problem and can be ignored -- see
> https://stackoverflow.com/questions/44308577/ieee-underflow-flag-ieee-denormal-in-fortran-77.

While I do agree that in most cases this is harmless, it can also point to a bug. I only looked at dstart for TiC, and I have this specific example:

dstart for a TiC case prints the line "Note: The following floating-point exceptions are signalling: IEEE_DENORMAL". If you trap this with the -ffpe-trap='denormal' flag to inspect it in gdb, the offending line is:

Program received signal SIGFPE, Arithmetic exception.
0x0041dc48 in somm (dr=..., dp=..., dpas=0.013585429144994965, da=1.1588924125005173e-310, m=0, np=935) at somm.f:10
10      D1=DA+MM

This line looks completely harmless, and a few prints show why the exception is flagged:

(gdb) print DA
$1 = 1.1588924125005173e-310
(gdb) print MM
$2 = 1

i.e. we just add a really small number (a denormal, since the smallest normal double, DBL_MIN, is around 2.2e-308) to 1, which is completely OK. But let's take a look at where this incredibly small value comes from...

(gdb) up
#1 0x004143c6 in make_spheres (lcore=.FALSE., luse=7) at make_spheres.F:81
81      call somm(rat(1,ia),rhoat(1,ia),dx(ia),zamt,0,nptat(ia))

So this is the zamt variable. Surprisingly, grepping around for zamt finds nothing. As far as I can see, it is not declared or initialized anywhere in dstart. If I just missed something, please correct me! The same goes for zamt1 and zamt2. Note that in this case we get lucky, since the random memory value is effectively zero; however, in my opinion this might lead to problems if you hit random memory holding another value. In fact, running dstart in valgrind shows this as well, and the terminal is spammed with "Conditional jump or move depends on uninitialised value(s)" and "Use of uninitialised value of size 8".

IMO this is a bug, so either the line needs to be changed to
somm(rat(1,ia),rhoat(1,ia),dx(ia),0,0,nptat(ia))
or the zamt variable needs to be declared and initialized somewhere. But I actually have no idea about the physical meaning of the code, so please correct me if I just missed something.

Best regards
Pavel
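As a quick cross-check of the arithmetic in this message, the following minimal sketch (plain Python, which uses the same IEEE 754 doubles; the helper name `is_subnormal` is made up for illustration) confirms that the value printed by gdb lies below the smallest normal double and that adding it to 1 leaves 1 unchanged.

```python
import sys

def is_subnormal(x):
    """True for a nonzero double smaller in magnitude than the smallest normal one."""
    return x != 0.0 and abs(x) < sys.float_info.min

da = 1.1588924125005173e-310   # value of DA printed by gdb in the message

print(sys.float_info.min)      # smallest positive normal double
print(is_subnormal(da))        # is DA a denormal (subnormal) number?
print(1.0 + da == 1.0)         # does the addition change the result?
```

The sum itself is therefore harmless; what the trap reveals is only that DA holds a value nobody assigned.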
Re: [Wien] Installation with MPI and GNU compilers
Laurence Marks wrote on Wed, 02. 05. 2018 at 21:17:
> When you say "as fast" do you mean for single core machines or
> multicore with threads and/or mpi? Almost everything slow in Wien2k
> is lapack/scalapack/elpa. For most parts of the code with 30-200 atom
> problems ifort is good but not as critical as the libraries and
> network.

Yeah, I was mostly talking about normal/high-end PCs, not HPC clusters, which always have ifort + MKL set up. I think that the gfortran + MKL and gfortran + OpenBLAS setups are mostly relevant for people new to Wien2k who just want to try it or run some simple/medium-sized cases (or people like myself who prefer open source software).

Personally, I think it is quite hard for new users to get Wien2k running properly. In a perfect world you would just tell the user to install the appropriate packages with "apt-get/dnf/whatever install openblas fftw elpa" etc. and have siteconfig autodetect them via the usual means (pkgconfig).

I have a few ideas for siteconfig improvements and will try to come up with some patches, but I'll start a new thread for that later.

Best regards
Pavel
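The pkg-config based autodetection mentioned in this message could look roughly like the sketch below. It is not part of siteconfig; the helper `pkg_config_flags` is hypothetical, and the module names `openblas` and `fftw3` are assumptions that differ between distributions. Only the standard `pkg-config` options `--exists`, `--cflags` and `--libs` are used.

```python
import shutil
import subprocess

def pkg_config_flags(module):
    """Return (cflags, libs) for a pkg-config module, or None if it is not found."""
    if shutil.which("pkg-config") is None:
        return None  # pkg-config itself is not installed
    if subprocess.run(["pkg-config", "--exists", module]).returncode != 0:
        return None  # no matching .pc file on the search path

    def query(option):
        out = subprocess.run(["pkg-config", option, module],
                             capture_output=True, text=True, check=True)
        return out.stdout.strip()

    return query("--cflags"), query("--libs")

# Module names are assumptions; adjust them to what the distribution ships.
for name in ("openblas", "fftw3"):
    print(name, pkg_config_flags(name))
```

A result of None would be the point where a configuration script falls back to asking the user for the library paths.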
Re: [Wien] Installation with MPI and GNU compilers
-- Original e-mail --
From: Fecher, Gerhard <fec...@uni-mainz.de>
To: Pavel Ondračka <pavel.ondra...@email.cz>, wien@zeus.theochem.tuwien.ac.at <wien@zeus.theochem.tuwien.ac.at>
Date: 2. 5. 2018 16:08:06
Subject: AW: [Wien] Installation with MPI and GNU compilers

> Dear Pavel,
> maybe it's better to ask Laurence, seems he was writing the VML things.
>
> I didn't look into the code within the last years, what I found on a quick look is:
>
> The only place where INTEL_VML is still used seems to be in Hamilt.f of LAPW1.
> I found that it is commented out in all other cases where it was once used.
>
> If you don't use INTEL_VML, the Intel ifort will vectorize the loops in vectf.f of LAPW1 (see the code in Hamilt.f that calls it)
> (as I mentioned, maybe one has to link libsvml explicitly).
>
> > BTW is svml part of the MKL or do you need the ifort for that?
>
> For example
> -O2 -xHost -qopt-report=1 -qopt-report-phase=vec
> will show you which loops were vectorized

Indeed, if I add -O2 and -xHost to the default Wien2k flags (with ifort and MKL), there is no performance hit when I remove -DINTEL_VML.

> I could not see that the svml has a reduced accuracy, however, you can set the performance/accuracy level in the VML.
> What you can do is to set a threshold for the loop size (similar to unroll), might need some short study of the manual.

Interesting, I will try to run some tests of the speed and accuracy of some basic trigonometric functions for ifort vs gfortran and standard glibc vs libmvec vs VML vs svml.

> I could not see that in W2kinit.F a threshold for the loops (size of the arrays) was set,
> only the precision was set there for the INTEL_VML script, however,
> I guess that Laurence used it where only large arrays appeared.
>
> NB: I enjoy more questions about how to increase the speed or how to improve the code.

Well, I do believe that the code is well optimized when you have ifort + MKL; however, the rest of the options are somewhat worse.

Since you can nowadays get the MKL library for free (but not ifort), there is the combination of gfortran + MKL, which does not have any default config and is slow, as was reported by Rui at the beginning of the thread. I'm quite sure this combination can be made almost as fast as ifort + MKL (either by reworking the INTEL_VML define to fix the missing ifcore problem, or possibly by using the -mveclibabi=svml gfortran switch or some other trick). I'm not sure how many people have this setup though.

The most problematic is the gfortran + OpenBLAS combination, where I was not able to force gfortran to use the vectorized (SIMD) math. It works with C code (which is why my approach to making lapw1 fast includes porting vectf.f to C) but not with Fortran. It is possible there is some way to make this work, but I have had no luck so far. The libmvec has a public interface, so it might be possible to call it directly, similarly to the VML; however, it would introduce a lot of #ifdef LIBMVEC into the code, which I guess is not a good idea.

I would like to have this working better out of the box, so I'll keep looking for a solution that would not require extensive changes to the code or the siteconfig script. Dunno if the authors are accepting patches anyway...

Best regards
Pavel

> Ciao
> Gerhard
>
> DEEP THOUGHT in D. Adams; Hitchhikers Guide to the Galaxy:
> "I think the problem, to be quite honest with you,
> is that you have never actually known what the question is."
>
> Dr. Gerhard H. Fecher
> Institute of Inorganic and Analytical Chemistry
> Johannes Gutenberg - University
> 55099 Mainz
> and
> Max Planck Institute for Chemical Physics of Solids
> 01187 Dresden
>
> From: Pavel Ondračka [pavel.ondra...@email.cz]
> Sent: Wednesday, 2 May 2018 12:05
> To: Fecher, Gerhard
> Subject: Re: [Wien] Installation with MPI and GNU compilers
>
> I'm using a private answer since this might be getting too technical for the list and in fact not interesting for the majority of users...
>
> Fecher, Gerhard wrote on Wed, 02. 05. 2018 at 09:00:
> > I never checked that: does the -DINTEL_VML switch correspond to the
> > VML library routines of MKL
> > or to the
> > SVML library routines of the compiler
>
> The lapw1 calls the VML library directly, for example the vdcos and vdsin functions, but I have not checked the rest of Wien2k.
>
> > this makes a difference, the svml routines are automatically invoked
> > by the INTEL compiler if one uses -O2 optimization or higher.
> > (check also the usage of the switches -vec, -no-vec, -vec-report)
> >
> > The VML routines of the MKL make only sense for appropriate sizes of
> > the vectors, otherwise, they may even slow down the program (how much
> > might also depend on threads etc.).
>
> The common usage of the VML in Wien2k is to call the VML functions with a _larg
Re: [Wien] Installation with MPI and GNU compilers
Rui Costa wrote on Mon, 30. 04. 2018 at 22:24 +0100:
> I have the VML libraries, i.e., the libmkl_vml_* files are in
> $MKLROOT/lib/intel_64, but when I tried compiling with -DINTEL_VML it
> gave me the error "Fatal Error: Can't open module file ‘ifcore.mod’
> for reading at (1): No such file or directory", and this file only
> comes with the compilers.

Yeah, I had not realized that the INTEL_VML ifdef also guards the use of the ifcore stuff. IMO this could be improved by using two defines, one for the actual VML calls (defined when MKL is present) and one for the ifcore library calls (defined only when ifort is detected as well).

BTW, as a quick hack to make lapw1 fast, just change all the
#if defined (INTEL_VML)
lines in SRC_lapw1/hamilt.F to
#if defined (INTEL_VML_HAMILT)
and add the -DINTEL_VML_HAMILT flag. This should be all that is needed to use the VML in lapw1.

> To use the libmvec library I would have to change a few lines of code
> in the mkl libraries and that is beyond my computer skills.

Actually, no changes to the MKL are required. The least obtrusive way, as described in https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg16159.html, only consists of copying a single C file to SRC_lapw1, compiling it by hand, and then rerunning make to link lapw1 with the new object file (i.e. no changes to any Wien2k files are needed). However, the VML way is easier when you already have the MKL set up.

Best regards,
Pavel

> Best regards,
> Rui Costa.
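The guard renaming described in this message is a plain text substitution. A minimal sketch of it, assuming the guards are spelled `#if defined (INTEL_VML)` as quoted (the function `rename_guard` and the sample text are made up for illustration and are not taken from hamilt.F):

```python
import re

# Matches "#if defined (INTEL_VML)" with flexible spacing, but not a guard
# that already reads INTEL_VML_HAMILT.
GUARD = re.compile(r"(#\s*if\s+defined\s*\(\s*)INTEL_VML(\s*\))")

def rename_guard(source_text):
    """Rewrite INTEL_VML guards so that they test INTEL_VML_HAMILT instead."""
    return GUARD.sub(r"\1INTEL_VML_HAMILT\2", source_text)

# Hypothetical stand-in for a guarded region of the source file.
sample = (
    "#if defined (INTEL_VML)\n"
    "! VML branch\n"
    "#else\n"
    "! fallback branch\n"
    "#endif\n"
)
print(rename_guard(sample))
```

To apply it for real, one would read SRC_lapw1/hamilt.F, write the rewritten text back (keeping a backup copy), and add -DINTEL_VML_HAMILT to the compiler flags as described in the message.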
Re: [Wien] Installation with MPI and GNU compilers
-- Original e-mail --
From: Rui Costa
To: A Mailing list for WIEN2k users
Date: 30. 4. 2018 19:39:44
Subject: Re: [Wien] Installation with MPI and GNU compilers

> I was able to install wien2k with gfortran+MKL. Apparently the MKL libraries are free [https://software.intel.com/en-us/performance-libraries] but not the compilers.
>
> While doing the benchmark tests we noticed that during the Hamilt there was a huge difference between this and an ifort+MKL compilation, and as Pavel said, this comes from the VML functions. This is not the case during DIAG because while the DIAG belongs to MKL, Hamilt is from wien2k. I then tried to compile with these VML functions but I couldn't because I need an ifcore.mod file that comes with intel compilers I think, at least it is not in the free MKL version.
>
> Do you have any recommendation about the compilation options that could better optimize wien2k?

Dear Rui,

so to make this clear: does your MKL come without the VML, or are you just not able to use/link it? I do not quite understand the part about ifcore.mod; however, the VML paths are guarded with some ifdef magic, so try adding -DINTEL_VML to your flags (FOPT, FPOPT) and see if it helps.

The second option is to use the libmvec library (provided you have a fairly new glibc), but it is unsupported by the Wien2k team and probably not tested by many people except me. If you cannot get the VML working, look for older emails discussing libmvec or contact me privately and I can give you some pointers.

No idea about the -it problem though.

Best regards
Pavel
Re: [Wien] Installation with MPI and GNU compilers
Laurence Marks wrote on Wed, 04. 04. 2018 at 16:01:
> I confess to being rather doubtful that gfortran+... is comparable to
> ifort+... for Intel cpu, it might be for AMD. While the mkl vector
> libraries are useful in a few codes such as aim, they are minor for
> the main lapw[0-2].

Well, some quick benchmark data then (serial benchmark, single core):

Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz (haswell)

Wien2k 17.1 - gfortran 7.3.1 + OPENBLAS 0.2.20 + glibc 2.26 (with the custom patch to use libmvec):

Time for al,bl    (hamilt, cpu/wall) :  0.2  0.2
Time for legendre (hamilt, cpu/wall) :  0.1  0.2
Time for phase    (hamilt, cpu/wall) :  1.2  1.2
Time for us       (hamilt, cpu/wall) :  1.2  1.2
Time for overlaps (hamilt, cpu/wall) :  2.6  2.8
Time for distrib  (hamilt, cpu/wall) :  0.1  0.1
Time sum iouter   (hamilt, cpu/wall) :  5.5  5.8
number of local orbitals, nlo (hamilt) 304
allocate YL    2.5 MB dimensions 15 3481 3
allocate phsc  0.1 MB dimensions 3481
Time for los      (hamilt, cpu/wall) :  0.4  0.3
Time for alm      (hns) :  0.1
Time for vector   (hns) :  0.3
Time for vector2  (hns) :  0.3
Time for VxV      (hns) :  2.1
Wall Time for VxV (hns) :  0.1
245 Eigenvalues computed
Seclr4(Cholesky complete (CPU)) :          1.380  40754.14 Mflops
Seclr4(Transform to eig.problem (CPU)) :   4.470  37745.44 Mflops
Seclr4(Compute eigenvalues (CPU)) :       12.750  17643.13 Mflops
Seclr4(Backtransform (CPU)) :              0.290  10237.08 Mflops
TIME HAMILT (CPU)  = 5.8, HNS = 2.5, HORB = 0.0, DIAG = 18.9
TIME HAMILT (WALL) = 6.1, HNS = 2.5, HORB = 0.0, DIAG = 19.0

real 0m28.610s
user 0m27.817s
sys  0m0.394s

---

Ifort 17.0.0 + MKL 2017.0:

Time for al,bl    (hamilt, cpu/wall) :  0.2  0.2
Time for legendre (hamilt, cpu/wall) :  0.1  0.2
Time for phase    (hamilt, cpu/wall) :  1.2  1.3
Time for us       (hamilt, cpu/wall) :  1.0  1.0
Time for overlaps (hamilt, cpu/wall) :  2.6  2.8
Time for distrib  (hamilt, cpu/wall) :  0.1  0.1
Time sum iouter   (hamilt, cpu/wall) :  5.4  5.6
number of local orbitals, nlo (hamilt) 304
allocate YL    2.5 MB dimensions 15 3481 3
allocate phsc  0.1 MB dimensions 3481
Time for los      (hamilt, cpu/wall) :  0.2  0.2
Time for alm      (hns) :  0.0
Time for vector   (hns) :  0.4
Time for vector2  (hns) :  0.4
Time for VxV      (hns) :  2.1
Wall Time for VxV (hns) :  0.1
245 Eigenvalues computed
Seclr4(Cholesky complete (CPU)) :          1.110  50667.31 Mflops
Seclr4(Transform to eig.problem (CPU)) :   3.580  47129.09 Mflops
Seclr4(Compute eigenvalues (CPU)) :       11.320  19873.04 Mflops
Seclr4(Backtransform (CPU)) :              0.250  11875.01 Mflops
TIME HAMILT (CPU)  = 5.7, HNS = 2.6, HORB = 0.0, DIAG = 16.3
TIME HAMILT (WALL) = 5.9, HNS = 2.6, HORB = 0.0, DIAG = 16.3

real 0m25.587s
user 0m24.857s
sys  0m0.321s

---

So I apologize for my statement in the last email, which was too ambitious. Indeed, in this particular case the open source stack is ~12% slower (25 vs 28 seconds). Most of this is in the DIAG part (which I believe is where OpenBLAS comes into play). However, on some other (older) Intel CPUs the DIAG part can be even faster with OpenBLAS; see the already mentioned email by prof. Blaha, https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg15106.html, where he tested on an i7-3930K (sandybridge). Hence, for those older CPUs I would expect the performance to be really comparable (with the small patch to utilize libmvec in order to speed up the HAMILT part).

In general, open source support is usually slow to materialize, hence the relative performance on older CPUs is better. This holds especially for OpenBLAS, where the optimizations for new CPUs and instruction sets are not provided by Intel (contrary to gcc, gfortran and glibc, where Intel engineers contribute directly), while MKL and ifort have good support from day 1.

I do agree that it is better to advise users to use MKL + ifort, since when they have it properly installed, siteconfig is almost always able to detect and build everything out of the box with the default config. This is unfortunately not the case with the open source libraries, where the detection does not work most of the time due to distro differences and the unfortunate fact that the majority of the needed libraries do not provide any good means for autodetection
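The "~12% slower" figure can be reproduced from the timings quoted in this message. The short sketch below only redoes that arithmetic; the dictionary layout and the helper `percent_slower` are illustrative choices, while the numbers are copied from the two benchmark listings.

```python
# Timings in seconds, copied from the two benchmark listings (CPU times for
# HAMILT/HNS/DIAG, wall-clock "real" time for the whole run).
gnu = {"real": 28.610, "hamilt": 5.8, "hns": 2.5, "diag": 18.9}
intel = {"real": 25.587, "hamilt": 5.7, "hns": 2.6, "diag": 16.3}

def percent_slower(slow, fast):
    """Relative slowdown of `slow` with respect to `fast`, in percent."""
    return 100.0 * (slow - fast) / fast

total_slowdown = percent_slower(gnu["real"], intel["real"])
diag_slowdown = percent_slower(gnu["diag"], intel["diag"])

# Share of the CPU-time difference of the three timed parts that is due to DIAG.
parts = ("hamilt", "hns", "diag")
cpu_diff = sum(gnu[p] for p in parts) - sum(intel[p] for p in parts)
diag_share = (gnu["diag"] - intel["diag"]) / cpu_diff

print(f"total run: {total_slowdown:.1f}% slower")
print(f"DIAG only: {diag_slowdown:.1f}% slower")
print(f"DIAG share of the CPU difference: {diag_share:.2f}")
```

The small HAMILT and HNS differences cancel each other, which supports the remark that the gap comes almost entirely from the DIAG part.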
Re: [Wien] Installation with MPI and GNU compilers
Rui Costa wrote on Wed, 04. 04. 2018 at 14:21 +0100:
> I will see what I can do about the Intel compilers. I've had a
> question about this, supposedly the intel compilers are the fastest
> [https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg13021.html], but how
> much faster are they than the others? I expect this to vary from case
> to case but on average, how much faster are they?

In fact, the compiler (e.g. ifort vs gfortran) hardly makes a difference. The important part is the linear algebra libraries. The open source OpenBLAS should be almost identical to Intel's MKL (see https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg15106.html for a comparison of OpenBLAS vs MKL). However, in that old benchmark the open source stack is still noticeably slower, since the MKL also provides the VML library of vectorized math functions, which did not have any open source alternative for a long time. Recently the libmvec library has started to provide such functions (you need a recent glibc), but there is no official Wien2k support for it. However, it is actually quite easy to get it working (see https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg16159.html). Hence, if you use gfortran + OpenBLAS + libmvec, the performance is virtually identical to ifort + MKL + VML. The setup is somewhat more difficult though.

Best regards
Pavel

> My objective is not to do simulations with mpi in the computer that
> I'm trying to install but to figure out how to install wien2k with
> mpi and then give some guidelines to the IT technician. I spent two
> weeks telling them that the simulations were not running because the
> packages were not compiled and in the end everything was poorly
> installed.
>
> Thank you for your help.
>
> Best regards,
> Rui Costa.