On Tue, Sep 24, 2013 at 8:41 PM, Matthew Knepley <[email protected]> wrote:
> On Tue, Sep 24, 2013 at 8:08 AM, Analabha Roy <[email protected]> wrote:
>
>> On Tue, Sep 24, 2013 at 1:42 PM, Jed Brown <[email protected]> wrote:
>>
>>> Analabha Roy <[email protected]> writes:
>>>
>>> > Hi all,
>>> >
>>> > Compiling and running this code
>>> > <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c>
>>> > that builds a petsc matrix gives different results when run with different
>>> > number of processors.
>>
>> Thanks for the reply.
>>
>>> Uh, if you call rand() on different processors, why would you expect it
>>> to give the same results?
>>
>> Right, I get that. The rand() was a placeholder.
>>
>> This original much larger code
>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c>
>> replicates the same loop structure and runs the same Petsc subroutines, but
>> running it by
>>
>> mpirun -np $N ./eth -lattice_size 5 -vector_size 1 -repulsion 0.0 -draw_out -draw_pause -1
>>
>> with N=1,2,3,4 gives different results for the matrix dumped out by lines
>> 514-519 <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>.
>> The matrix itself is evaluated in parallel, created in lines 263-275
>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#263>
>> and evaluated in lines 294-356
>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#294>
>>
>> (you can click on the line numbers above to navigate directly to them)
>>
>> Here is a sample <http://i43.tinypic.com/zyhf2f.jpg> of the output of lines
>> 514-519 <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>
>> for N=1,2,3,4 procs left to right.
>>
>> They're different for different procs. They should be the same, since
>> none of my input parameters are numprocs dependent, and I don't explicitly
>> use the size or rank anywhere in the code.
>
> You are likely not dividing the rows you loop over so you are redundantly
> computing.

Thanks for the reply.

Line 274 <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#274>
gets the local row indices of Petsc Matrix AVG_BDIBJ.

Line 295 <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#295>
iterates over the local rows and the lines below get the column elements.
For each row, the column elements are assigned by the lines up to
Line 344 <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#344>
and stored locally in colvalues[]. Dunno if the details are relevant.

Line 347 <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#347>
inserts the sitestride1^th row into the matrix.

Line 353+ <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#353>
does the mat assembly.

Then, after a lot of currently irrelevant code,
Line 514+ <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>
dumps the mat plot to graphics.

Different numprocs give different matrices.

Can somebody suggest what I did wrong (or didn't do)?

> Matt
>
>>
>>> for (sitestride1 = Istart; sitestride1 < Iend; sitestride1++)
>>>   {
>>>     for (sitestride2 = 0; sitestride2 < matsize; sitestride2++)
>>>       {
>>>         for (alpha = 0; alpha < dim; alpha++)
>>>           {
>>>             for (mu = 0; mu < dim; mu++)
>>>               for (lambda = 0; lambda < dim; lambda++)
>>>                 {
>>>                   vecval = rand () / rand ();
>>>                 }
>>>
>>>             VecSetValue (BDB_AA, alpha, vecval, INSERT_VALUES);
>>>
>>>           }
>>>         VecAssemblyBegin (BDB_AA);
>>>         VecAssemblyEnd (BDB_AA);
>>>         VecSum (BDB_AA, &element);
>>>         colvalues[sitestride2] = element;
>>>
>>>       }
>>>     //Insert the array of colvalues to the sitestride1^th row of H
>>>     MatSetValues (AVG_BDIBJ, 1, &sitestride1, matsize, idx, colvalues,
>>>                   INSERT_VALUES);
>>>
>>>   }
>>>
>>> > The code is large and complex, so I have created a smaller program
>>> > with the same
>>> > loop structure here.
>>> > <http://pastebin.ca/2457643>
>>> >
>>> > Compiling it and running it with "mpirun -np $N ./test -draw_pause -1" gives
>>> > different results for different values of N even though it's not supposed
>>> > to.
>>>
>>> What do you expect to see?
>>>
>>> > Here is a sample output <http://i42.tinypic.com/2s16ccw.jpg> for N=1,2,3,4
>>> > from left to right.
>>> >
>>> > Can anyone guide me as to what I'm doing wrong? Are any of the PETSc
>>> > routines used not parallelizable?
>>> >
>>> > Thanks in advance,
>>> >
>>> > Regards.
>>> >
>>> > --
>>> > ---
>>> > *Analabha Roy*
>>> > C.S.I.R <http://www.csir.res.in> Senior Research Associate <http://csirhrdg.res.in/poolsra.htm>
>>> > Saha Institute of Nuclear Physics <http://www.saha.ac.in>
>>> > Section 1, Block AF
>>> > Bidhannagar, Calcutta 700064
>>> > India
>>> > *Emails*: [email protected], [email protected]
>>> > *Webpage*: http://www.ph.utexas.edu/~daneel/
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener

--
---
*Analabha Roy*
C.S.I.R <http://www.csir.res.in> Senior Research Associate <http://csirhrdg.res.in/poolsra.htm>
Saha Institute of Nuclear Physics <http://www.saha.ac.in>
Section 1, Block AF
Bidhannagar, Calcutta 700064
India
*Emails*: [email protected], [email protected]
*Webpage*: http://www.ph.utexas.edu/~daneel/
