Are the "current petsc-dev" build and the "PETSc 3.1-p8" build both built 
with the exact same MPI?

  Are you using shared or static libraries for OpenMPI and PETSc? 

  Are you using the exact same mpiexec to start up all the cases?

  If you change the order of the four nodes that you run this on, does the 
"oddball" result always come from the same physical node? That is, if the 
machine that is now used as the fourth node is instead used as the third 
node, does the wrong answer then appear on the third node or still on the 
fourth? If you use a different physical machine for the fourth node, does the 
problem persist?
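
  For the node-order check, it may help to have the program itself report which ranks disagree with the root. The sketch below shows only the comparison step and is not taken from test.c: the helper name find_mismatched_ranks and its parameters are hypothetical, and it assumes the received values have already been collected into an array indexed by rank (for example with MPI_Gather).

```c
#include <stddef.h>

/* Hypothetical helper, not part of test.c: given the value each rank
 * received (indexed by rank), record the ranks whose value differs from
 * the value the root broadcast. bad_ranks must have room for nranks
 * entries. Returns the number of mismatching ranks. */
size_t find_mismatched_ranks(const int *received, size_t nranks,
                             int expected, int *bad_ranks)
{
    size_t nbad = 0;
    for (size_t r = 0; r < nranks; ++r) {
        if (received[r] != expected) {
            bad_ranks[nbad++] = (int)r;
        }
    }
    return nbad;
}
```

  With the output quoted in this thread (450385 on ranks 0 to 2, 812855920 on rank 3), the helper reports one mismatch, at rank 3. Printing the name from MPI_Get_processor_name next to each rank would then show whether the mismatch follows the machine or the rank.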

  If you get rid of the rand() call and just set the fileRandomNumber value 
to, say, 450385, does it behave the same way?

  The reason I am asking all these questions is that this is a very strange 
error that defies easy explanation: since it is just an MPI call, the fact that 
PETSc is used shouldn't matter (yet it does).


   Barry

On Mar 22, 2011, at 12:50 PM, Thomas Witkowski wrote:

> Quoting Barry Smith <bsmith at mcs.anl.gov>:
> 
>> 
>> On Mar 22, 2011, at 11:08 AM, Thomas Witkowski wrote:
>> 
>>> Could some of you test the very small attached example? I make use of the 
>>> current petsc-dev, OpenMPI 1.4.1 and GCC 4.2.4. In this environment, using 
>>> 4 nodes, I get the following output, which is wrong:
>>> 
>>> [3] BCAST-RESULT: 812855920
>>> [2] BCAST-RESULT: 450385
>>> [1] BCAST-RESULT: 450385
>>> [0] BCAST-RESULT: 450385
>>> 
>>> The problem occurs only when I run the code on different nodes. When I 
>>> start mpirun on only one node with four threads
>> 
>>   You mean 4 MPI processes?
> 
> Yes.
> 
>> 
>> 
>>> or I make use of a four-core system, everything is fine. valgrind and 
>>> Allinea DDT both say that everything is fine. So I'm really not sure 
>>> where the problem is. Using PETSc 3.1-p8 there is no problem with this 
>>> example. It would be quite interesting to know if some of you can reproduce 
>>> this problem or not. Thanks for any try!
>> 
>>   Replace the PetscInitialize() and PetscFinalize() with MPI_Init() and 
>> MPI_Finalize() and remove the include of petsc.h. Now link under the old and 
>> new PETSc and run under the different systems.
>> 
>>   I'm thinking you'll still get the wrong result without the PETSc calls, 
>> indicating that it is an MPI issue.
> 
> No! I already did this test. In this case I get the correct results!
> 
> Thomas
> 
> 
>> 
>>   Barry
>> 
>>> 
>>> Thomas
>>> 
>>> <test.c>
>> 
>> 
>> 
> 
> 

