Dear Marcos,

Many thanks for your prompt reply. Today I performed a systematic study of the same system, varying the number of processors. I found that SIESTA runs successfully in parallel mode only up to a certain number of processors. Above that limit, the program stops with the error message described at the end of my original mail.

In addition, when the program stopped, I found the following message in the output file:

%%%%%%%%%%%%%%%%%%%%%%
Some processors are idle. Check PARALLEL_DIST
You have too many processors for the system size !!!
%%%%%%%%%%%%%%%%%%%%%%%%

The PARALLEL_DIST file reads:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Node 0 handles 4 orbitals.
Node 1 handles 4 orbitals.
Node 2 handles 4 orbitals.
Node 3 handles 4 orbitals.
Node 4 handles 4 orbitals.
Node 5 handles 4 orbitals.
Node 6 handles 4 orbitals.
Node 7 handles 4 orbitals.
Node 8 handles 4 orbitals.
Node 9 handles 4 orbitals.
Node 10 handles 4 orbitals.
Node 11 handles 4 orbitals.
Node 12 handles 4 orbitals.
Node 13 handles 0 orbitals.
Node 14 handles 0 orbitals.
Node 15 handles 0 orbitals.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

As you can see, 3 of the 16 processors are idle. It therefore seems that SIESTA stops when the orbitals are distributed so that some processors receive none. What do you think?

Best regards,
Andrea

> Andrea,
>
> In principle nothing should change, in terms of success in execution
> of Siesta from what I can remember. Basis sets and pseudos (as long as
> the latter are in psf format) should not pose a problem. From your
> error, it seems that the problem could be with your mpi, since it runs
> successfully in serial mode. Have you performed a system update lately?
> I think not too long ago someone was having problems like yours and
> re-compilation from scratch (mpi, then blacs, then scalapack, then
> siesta) was the solution because the libc's had been updated and that
> interfered with the mpich libraries. This shouldn't be too difficult
> to do since you already compiled it anyway. Unfortunately you don't
> provide much information, such as when the error occurs, if the input
> is read correctly and so on, so it gets difficult to think of a
> possible cause and a less painful solution. :)
>
> If re-compilation from scratch doesn't work, then it is indeed a
> Siesta error.
> In cases like this, a "one-change-at-a-time" search for
> the error is advisable.
>
> Some things you could try (one at a time):
>
> 1) Try re-initializing the DM from scratch, instead of using the old one.
> 2) Try replacing the MP occupation with Fermi-Dirac smearing. MP
> apparently has a bug.
> 3) Try using a standard DZP basis set just for the sake of testing.
> 4) Re-generate your pseudo; could it be that your pseudo is somehow
> corrupted? Some time ago, someone was having problems with pseudos
> from a previous run and found out there were spurious characters at
> the end of the file.
>
> Cheers,
>
> Marcos
>
> On Mon, Jul 5, 2010 at 8:56 PM, <[email protected]> wrote:
>> Hi,
>>
>> I have installed (in a cluster) Siesta 2.0, which runs successfully in
>> parallel mode (it was checked for several systems).
>>
>> Now I want to take up again a problem that I studied in the past
>> with Siesta 1.3 in serial mode. I am trying to run the same problem
>> now with Siesta 2.0, but it runs correctly only in serial mode, not
>> in parallel.
>>
>> Do I have to change anything in the pseudos & basis generated within
>> Siesta 1.3 in serial mode, so that they work with Siesta 2.0 in
>> parallel?
>>
>> The error message is at the end of my message. I also attach the
>> fdf file.
>>
>> Thanks in advance,
>>
>> Andrea
>>
>> -----------------------------------------------------------------------------
>> Error Message:
>>
>> forrtl: severe (174): SIGSEGV, segmentation fault occurred
>> Image            PC                Routine  Line     Source
>> siesta           000000000045CAF2  Unknown  Unknown  Unknown
>> siesta           0000000000448ACE  Unknown  Unknown  Unknown
>> siesta           000000000056515D  Unknown  Unknown  Unknown
>> siesta           0000000000410002  Unknown  Unknown  Unknown
>> libc.so.6        0000003C33C1D8B4  Unknown  Unknown  Unknown
>> siesta           000000000040FF29  Unknown  Unknown  Unknown
>>
>> forrtl: error (78): process killed (SIGTERM)
>> Image            PC                Routine  Line     Source
>> libmpich.so.1.0  00002B3CEF46D062  Unknown  Unknown  Unknown
>> libmpich.so.1.0  00002B3CEF44FEFA  Unknown  Unknown  Unknown
>> libmpich.so.1.0  00002B3CEF477CB6  Unknown  Unknown  Unknown
>> libmpich.so.1.0  00002B3CEF461122  Unknown  Unknown  Unknown
>> siesta           00000000007E116D  Unknown  Unknown  Unknown
>> siesta           00000000007DF350  Unknown  Unknown  Unknown
>> siesta           00000000006C451F  Unknown  Unknown  Unknown
>> siesta           00000000006C58D2  Unknown  Unknown  Unknown
>> siesta           000000000053D425  Unknown  Unknown  Unknown
>> siesta           000000000045D081  Unknown  Unknown  Unknown
>> siesta           0000000000448ACE  Unknown  Unknown  Unknown
>> siesta           000000000056515D  Unknown  Unknown  Unknown
>> siesta           0000000000410002  Unknown  Unknown  Unknown
>> libc.so.6        0000003C33C1D8B4  Unknown  Unknown  Unknown
>> siesta           000000000040FF29  Unknown  Unknown  Unknown
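
P.S. As a rough check of the PARALLEL_DIST listing quoted in my reply above, here is a minimal Python sketch that reproduces the reported counts. It assumes the orbitals are dealt out in a simple one-dimensional block-cyclic fashion, with 52 orbitals in total (13 nodes times 4) and a block size of 4. Both numbers are inferred from the listing, not read from the SIESTA source, and the function name block_cyclic_counts is purely illustrative, so please take it as a guess at the behaviour rather than a description of what the code actually does.

```python
def block_cyclic_counts(n_orbitals, n_nodes, block_size):
    """Orbitals per node under a 1-D block-cyclic distribution.

    Hypothetical model only: blocks of `block_size` consecutive orbitals
    are dealt to nodes 0, 1, 2, ... in turn, wrapping around.
    """
    counts = [0] * n_nodes
    n_blocks = -(-n_orbitals // block_size)  # ceiling division
    for block in range(n_blocks):
        start = block * block_size
        # The last block may be shorter than block_size.
        size = min(block_size, n_orbitals - start)
        counts[block % n_nodes] += size
    return counts


# Values assumed from the PARALLEL_DIST listing: 13 nodes x 4 orbitals.
counts = block_cyclic_counts(n_orbitals=52, n_nodes=16, block_size=4)
idle = [node for node, n in enumerate(counts) if n == 0]

for node, n in enumerate(counts):
    print(f"Node {node} handles {n} orbitals.")
print("Idle nodes:", idle)
```

In this model there are only 13 blocks to hand out, so any run with more than 13 nodes leaves some of them without orbitals, which would be consistent with the failure appearing only above a certain processor count. Whether the idle nodes are really what triggers the segmentation fault is still only my guess.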
