Andrea,

That's what seems to be happening, indeed. However, I had never seen
this error before - such a small system is not worth running in
parallel tests, especially on such a large number of processors! With
a larger system you would not see this error. Try making a 2x2x1
supercell - it should run with no problems.
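To make the idling concrete: Siesta appears to hand out orbitals in contiguous blocks of ceil(N_orbitals / N_processors), so with 52 orbitals on 16 processors the first 13 nodes take 4 orbitals each and the last 3 get none - exactly what your PARALLEL_DIST shows. A minimal sketch of that arithmetic (plain Python, not Siesta's actual Fortran routine):

```python
import math

def block_distribution(n_orbitals, n_procs):
    """Hand out orbitals in contiguous blocks of size
    ceil(n_orbitals / n_procs); trailing nodes may receive none."""
    blocksize = math.ceil(n_orbitals / n_procs)
    counts = []
    remaining = n_orbitals
    for _ in range(n_procs):
        handled = min(blocksize, remaining)
        counts.append(handled)
        remaining -= handled
    return counts

# 52 orbitals on 16 processors: nodes 0-12 get 4 each, nodes 13-15 sit idle
print(block_distribution(52, 16))
```

With a 2x2x1 supercell the orbital count quadruples, so every node receives work and the idle-processor check no longer triggers.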

As for efficiency - that's a whole other story :)

Cheers,

Marcos
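P.S. On point 4 of my earlier suggestions below (spurious characters at the end of a pseudo file): a quick way to spot them is to scan the tail of the .psf for non-printable bytes. This is just a rough sketch - the helper name and the 200-byte window are arbitrary choices of mine, not anything shipped with Siesta:

```python
def has_spurious_tail(path, window=200):
    """Return True if the last `window` bytes of the file contain
    non-printable bytes other than newline, carriage return, or tab."""
    with open(path, "rb") as f:
        data = f.read()
    allowed = set(b"\n\r\t")
    return any((b < 0x20 and b not in allowed) or b == 0x7F
               for b in data[-window:])
```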

On Wed, Jul 7, 2010 at 12:05 AM,  <[email protected]> wrote:
> Dear Marcos,
>
> Many thanks for your prompt reply. Today I performed a systematic
> study (for the same system), changing the number of processors. I found
> that SIESTA runs successfully in parallel up to a certain number of
> processors. Above that limit, the program stops with the error message
> described previously (at the end of my original mail).
>
> In addition, when the program stopped, I found a message within the output
> file reading:
>
> %%%%%%%%%%%%%%%%%%%%%%
> Some processors are idle. Check PARALLEL_DIST
> You have too many processors for the system size !!!
> %%%%%%%%%%%%%%%%%%%%%%%%
>
> and by checking the PARALLEL_DIST file
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%
>  Node            0  handles            4  orbitals.
>  Node            1  handles            4  orbitals.
>  Node            2  handles            4  orbitals.
>  Node            3  handles            4  orbitals.
>  Node            4  handles            4  orbitals.
>  Node            5  handles            4  orbitals.
>  Node            6  handles            4  orbitals.
>  Node            7  handles            4  orbitals.
>  Node            8  handles            4  orbitals.
>  Node            9  handles            4  orbitals.
>  Node           10  handles            4  orbitals.
>  Node           11  handles            4  orbitals.
>  Node           12  handles            4  orbitals.
>  Node           13  handles            0  orbitals.
>  Node           14  handles            0  orbitals.
>  Node           15  handles            0  orbitals.
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
> where you can see that 3 processors are idle. It therefore seems that
> SIESTA stops when the distribution of orbitals leaves some processors
> without work. What do you think?
>
> Best regards,
>
> Andrea
>
>> Andrea,
>>
>> In principle, nothing should change in terms of successful execution
>> of Siesta, as far as I can remember. Basis sets and pseudos (as long as
>> the latter are in psf format) should not pose a problem. From your
>> error, it seems the problem could be with your MPI, since the code runs
>> successfully in serial mode. Have you performed a system update lately?
>> I think not too long ago someone was having problems like yours and
>> re-compilation from scratch (mpi, then blacs, then scalapack, then
>> siesta) was the solution because the libc's had been updated and that
>> interfered with the mpich libraries. This shouldn't be too difficult
>> to do since you already compiled it anyway. Unfortunately you don't
>> provide much information, such as when the error occurs, if the input
>> is read correctly and so on, so it gets difficult to think of a
>> possible cause and a less painful solution. :)
>>
>> If re-compilation from scratch doesn't work, then it is indeed a
>> Siesta error. In cases like this, a "one-change-at-a-time-search" for
>> the error is recommendable.
>>
>> Some things you could try (one at a time):
>>
>> 1) Try re-initializing the DM from scratch, instead of using the old one.
>> 2) Try replacing the MP occupation with Fermi-Dirac smearing. MP
>> apparently has a bug.
>> 3) Try using a standard DZP basis set just for the sake of testing.
>> 4) Re-generate your pseudo - could it be that it is somehow corrupted?
>> Some time ago, someone had problems with pseudos from a previous run
>> and found spurious characters at the end of the file.
>>
>> Cheers,
>>
>> Marcos
>>
>> On Mon, Jul 5, 2010 at 8:56 PM,  <[email protected]> wrote:
>>> Hi,
>>>
>>>  I have installed (in a cluster) Siesta 2.0 which runs successfully in
>>> parallel mode (it was checked for several systems).
>>>
>>>  Now I want to take up again a problem that I studied in the past
>>> with Siesta 1.3 in serial mode. I am trying to run the same problem
>>> now with Siesta 2.0, but it runs correctly only in serial mode, not
>>> in parallel.
>>>
>>>  Do I have to change anything in the pseudos & basis generated with
>>> Siesta 1.3 in serial mode, so that they work with Siesta 2.0 in
>>> parallel?
>>>
>>>  The error message is at the end of my message. I have also attached
>>> the fdf file.
>>>
>>>  Thanks in advance,
>>>
>>>  Andrea
>>>
>>>> -----------------------------------------------------------------------------
>>>> Error Message:
>>>> forrtl: severe (174): SIGSEGV, segmentation fault occurred
>>>> Image              PC                Routine            Line        Source
>>>> siesta             000000000045CAF2  Unknown            Unknown     Unknown
>>>> siesta             0000000000448ACE  Unknown            Unknown     Unknown
>>>> siesta             000000000056515D  Unknown            Unknown     Unknown
>>>> siesta             0000000000410002  Unknown            Unknown     Unknown
>>>> libc.so.6          0000003C33C1D8B4  Unknown            Unknown     Unknown
>>>> siesta             000000000040FF29  Unknown            Unknown     Unknown
>>>> forrtl: error (78): process killed (SIGTERM)
>>>> Image              PC                Routine            Line        Source
>>>> libmpich.so.1.0    00002B3CEF46D062  Unknown            Unknown     Unknown
>>>> libmpich.so.1.0    00002B3CEF44FEFA  Unknown            Unknown     Unknown
>>>> libmpich.so.1.0    00002B3CEF477CB6  Unknown            Unknown     Unknown
>>>> libmpich.so.1.0    00002B3CEF461122  Unknown            Unknown     Unknown
>>>> siesta             00000000007E116D  Unknown            Unknown     Unknown
>>>> siesta             00000000007DF350  Unknown            Unknown     Unknown
>>>> siesta             00000000006C451F  Unknown            Unknown     Unknown
>>>> siesta             00000000006C58D2  Unknown            Unknown     Unknown
>>>> siesta             000000000053D425  Unknown            Unknown     Unknown
>>>> siesta             000000000045D081  Unknown            Unknown     Unknown
>>>> siesta             0000000000448ACE  Unknown            Unknown     Unknown
>>>> siesta             000000000056515D  Unknown            Unknown     Unknown
>>>> siesta             0000000000410002  Unknown            Unknown     Unknown
>>>> libc.so.6          0000003C33C1D8B4  Unknown            Unknown     Unknown
>>>> siesta             000000000040FF29  Unknown            Unknown     Unknown
>>>
>>
>
