On Tue, Sep 24, 2013 at 8:35 AM, Analabha Roy <[email protected]> wrote:
>
> On Tue, Sep 24, 2013 at 8:41 PM, Matthew Knepley <[email protected]> wrote:
>
>> On Tue, Sep 24, 2013 at 8:08 AM, Analabha Roy <[email protected]> wrote:
>>
>>> On Tue, Sep 24, 2013 at 1:42 PM, Jed Brown <[email protected]> wrote:
>>>
>>>> Analabha Roy <[email protected]> writes:
>>>>
>>>> > Hi all,
>>>> >
>>>> > Compiling and running this code
>>>> > <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c>
>>>> > that builds a PETSc matrix gives different results when run with
>>>> > different numbers of processors.
>>>>
>>>
>>> Thanks for the reply.
>>>
>>>> Uh, if you call rand() on different processors, why would you expect
>>>> it to give the same results?
>>>>
>>> Right, I get that. The rand() was a placeholder.
>>>
>>> This original, much larger code
>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c>
>>> replicates the same loop structure and runs the same PETSc subroutines,
>>> but running it with
>>>
>>> mpirun -np $N ./eth -lattice_size 5 -vector_size 1 -repulsion 0.0 -draw_out -draw_pause -1
>>>
>>> with N=1,2,3,4 gives different results for the matrix dumped out by
>>> lines 514-519
>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>.
>>> The matrix itself is evaluated in parallel, created in lines 263-275
>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#263>
>>> and evaluated in lines 294-356
>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#294>
>>>
>>> (you can click on the line numbers above to navigate directly to them)
>>>
>>> Here is a sample <http://i43.tinypic.com/zyhf2f.jpg> of the output of
>>> lines 514-519
>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>
>>> for N=1,2,3,4 procs, left to right.
>>>
>>> They're different for different procs.
>>> They should be the same, since none of my input parameters are
>>> numprocs dependent, and I don't explicitly use the size or rank
>>> anywhere in the code.
>>>
>>
>> You are likely not dividing the rows you loop over so you are redundantly
>> computing.
>>
>
> Thanks for the reply.
>
> Line 274
> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#274>
> gets the local row indices of PETSc matrix AVG_BDIBJ
>
> Line 295
> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#295>
> iterates over the local rows and the lines below get the column
> elements. For each row, the column elements are assigned by the lines up
> to Line 344
> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#344>
> and stored locally in colvalues[]. Dunno if the details are relevant.
>
> Line 347
> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#347>
> inserts the sitestride1^th row into the matrix
>
> Line 353+
> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#353>
> does the mat assembly
>
> Then, after a lot of currently irrelevant code,
>
> Line 514+
> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>
> dumps the mat plot to graphics
>
>
> Different numprocs give different matrices.
>
> Can somebody suggest what I did wrong (or didn't do)?
>

Different values are being given to MatSetValues() for different numbers of
processes.
So

  1) Reduce this to the smallest problem size possible

  2) Print out all rows/cols/values for each call

  3) Compare 2 procs to the serial case

   Matt

>
>> Matt
>>
>>>
>>>> for (sitestride1 = Istart; sitestride1 < Iend; sitestride1++)
>>>>   {
>>>>     for (sitestride2 = 0; sitestride2 < matsize; sitestride2++)
>>>>       {
>>>>         for (alpha = 0; alpha < dim; alpha++)
>>>>           {
>>>>             for (mu = 0; mu < dim; mu++)
>>>>               for (lambda = 0; lambda < dim; lambda++)
>>>>                 {
>>>>                   vecval = rand () / rand ();
>>>>                 }
>>>>
>>>>             VecSetValue (BDB_AA, alpha, vecval, INSERT_VALUES);
>>>>
>>>>           }
>>>>         VecAssemblyBegin (BDB_AA);
>>>>         VecAssemblyEnd (BDB_AA);
>>>>         VecSum (BDB_AA, &element);
>>>>         colvalues[sitestride2] = element;
>>>>
>>>>       }
>>>>     //Insert the array of colvalues to the sitestride1^th row of H
>>>>     MatSetValues (AVG_BDIBJ, 1, &sitestride1, matsize, idx, colvalues,
>>>>                   INSERT_VALUES);
>>>>
>>>>   }
>>>>
>>>> > The code is large and complex, so I have created a smaller program
>>>> > with the same loop structure here. <http://pastebin.ca/2457643>
>>>> >
>>>> > Compiling it and running it with "mpirun -np $N ./test -draw_pause -1"
>>>> > gives different results for different values of N even though it's
>>>> > not supposed to.
>>>>
>>>> What do you expect to see?
>>>>
>>>> > Here is a sample output <http://i42.tinypic.com/2s16ccw.jpg> for
>>>> > N=1,2,3,4 from left to right.
>>>> >
>>>> > Can anyone guide me as to what I'm doing wrong? Are any of the PETSc
>>>> > routines used not parallelizable?
>>>> >
>>>> > Thanks in advance,
>>>> >
>>>> > Regards.
>>>> >
>>>> > --
>>>> > ---
>>>> > *Analabha Roy*
>>>> > C.S.I.R <http://www.csir.res.in> Senior Research Associate
>>>> > <http://csirhrdg.res.in/poolsra.htm>
>>>> > Saha Institute of Nuclear Physics <http://www.saha.ac.in>
>>>> > Section 1, Block AF
>>>> > Bidhannagar, Calcutta 700064
>>>> > India
>>>> > *Emails*: [email protected], [email protected]
>>>> > *Webpage*: http://www.ph.utexas.edu/~daneel/
>>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

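For what it's worth, steps 1) to 3) in the reply above can be tried out on a
toy version of the loop before touching the real code. The following is only
a sketch in plain C, with no PETSc or MPI involved: MATSIZE, entry_value(),
ownership_range(), fill_matrix() and matches_serial() are all hypothetical
names made up for illustration, and the contiguous row split is an assumption
modelled on the kind of range MatGetOwnershipRange() hands back for a default
layout. The ranks are simulated one after another in a single process. The
point it illustrates is that if every entry depends only on its global
(row, col) indices, the assembled matrix cannot depend on the number of
processes, whereas anything like rand() would break that.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MATSIZE 4 /* hypothetical tiny problem size (step 1) */

/* Hypothetical stand-in for the per-entry computation. It depends only on
   the global (row, col) indices, never on the rank or on rand(). */
double entry_value(int row, int col)
{
    return (double)(row + 1) / (double)(col + 1);
}

/* Assumed contiguous split of n rows over nprocs ranks: rank owns
   [*istart, *iend), and the first (n % nprocs) ranks get one extra row. */
void ownership_range(int n, int nprocs, int rank, int *istart, int *iend)
{
    int base = n / nprocs;
    int extra = n % nprocs;

    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    *istart = rank * base + (rank < extra ? rank : extra);
    *iend = *istart + base + (rank < extra ? 1 : 0);
}

/* Fill mat the way nprocs ranks would, each rank visiting only its own rows.
   If log is non-NULL, print every row/col/value as it is set (step 2). */
void fill_matrix(int nprocs, double mat[MATSIZE][MATSIZE], FILE *log)
{
    for (int rank = 0; rank < nprocs; rank++) {
        int istart, iend;

        ownership_range(MATSIZE, nprocs, rank, &istart, &iend);
        for (int row = istart; row < iend; row++) {
            for (int col = 0; col < MATSIZE; col++) {
                mat[row][col] = entry_value(row, col);
                if (log)
                    fprintf(log, "rank %d row %d col %d value %g\n",
                            rank, row, col, mat[row][col]);
            }
        }
    }
}

/* Step 3: return 1 if the nprocs-way fill equals the serial fill. */
int matches_serial(int nprocs)
{
    double serial[MATSIZE][MATSIZE];
    double parallel[MATSIZE][MATSIZE];

    memset(serial, 0, sizeof serial);
    memset(parallel, 0, sizeof parallel);
    fill_matrix(1, serial, NULL);
    fill_matrix(nprocs, parallel, NULL);
    return memcmp(serial, parallel, sizeof serial) == 0;
}
```

Calling fill_matrix(2, mat, stdout) prints one line per entry, tagged with
the simulated owning rank; comparing that log against the one from
fill_matrix(1, mat, stdout) is the toy version of step 3). In the real
program the equivalent print would presumably go right before the
MatSetValues() call.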