There is one thing. In the code, evaluating each element of AVG_BDIBJ requires a read-only matrix U_parallel, which I input from another program, and a writeable sequential vector BDB_AA that is different for each element.
I convert U_parallel to a sequential matrix U_seq using MatCopy in lines 242+ <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#242>, and each process is supposed to update its own copy of BDB_AA at every loop iteration in line 347+ <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#347>.

Is this right? Or are sequential vectors/matrices handled by the root process only? I know how to scatter a parallel vector to all processes using PETSc scatter contexts, but I don't see any way to do the same for a matrix other than MatCopy.

How do I ensure that each process has its own private writeable copy of a sequential vector?

On Tue, Sep 24, 2013 at 11:48 PM, Analabha Roy <[email protected]> wrote:

> On Tue, Sep 24, 2013 at 11:35 PM, Matthew Knepley <[email protected]> wrote:
>
>> On Tue, Sep 24, 2013 at 10:58 AM, Analabha Roy <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> Sorry for misunderstanding
>>>
>>> I modified my source thus <http://pastebin.ca/2457850> so that the
>>> rows/cols/values for each call are printed before inserting into
>>> MatSetValues()
>>>
>>> Then ran it with 1,2 processors
>>>
>>> Here are the outputs <http://pastebin.ca/2457852>
>>>
>>> Strange! Running it with 2 procs and only half the values show up!!!!!
>>
>> PetscPrintf() only prints from rank 0. Use PETSC_COMM_SELF.
>
> Sorry. Modified accordingly and here is new output
> <http://pastebin.ca/2457857> (I manually reordered the output of the 2
> procs case since the order in which it was printed was haphazard)
>
> All the elements do not match.
>
>> Matt
>>
>>> And even those do not match!!!!
>>>
>>> On Tue, Sep 24, 2013 at 11:12 PM, Matthew Knepley <[email protected]> wrote:
>>>
>>>> On Tue, Sep 24, 2013 at 10:39 AM, Analabha Roy <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> On Tue, Sep 24, 2013 at 9:33 PM, Matthew Knepley <[email protected]> wrote:
>>>>>
>>>>>> On Tue, Sep 24, 2013 at 8:35 AM, Analabha Roy <[email protected]> wrote:
>>>>>>
>>>>>>> On Tue, Sep 24, 2013 at 8:41 PM, Matthew Knepley <[email protected]> wrote:
>>>>>>>
>>>>>>>> On Tue, Sep 24, 2013 at 8:08 AM, Analabha Roy <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> On Tue, Sep 24, 2013 at 1:42 PM, Jed Brown <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Analabha Roy <[email protected]> writes:
>>>>>>>>>>
>>>>>>>>>> > Hi all,
>>>>>>>>>> >
>>>>>>>>>> > Compiling and running this code
>>>>>>>>>> > <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c>
>>>>>>>>>> > that builds a petsc matrix gives different results when run with
>>>>>>>>>> > different number of processors.
>>>>>>>>>
>>>>>>>>> Thanks for the reply.
>>>>>>>>>
>>>>>>>>>> Uh, if you call rand() on different processors, why would you
>>>>>>>>>> expect it to give the same results?
>>>>>>>>>
>>>>>>>>> Right, I get that. The rand() was a placeholder.
>>>>>>>>>
>>>>>>>>> This original much larger code
>>>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c>
>>>>>>>>> replicates the same loop structure and runs the same Petsc
>>>>>>>>> subroutines, but running it by
>>>>>>>>>
>>>>>>>>> mpirun -np $N ./eth -lattice_size 5 -vector_size 1 -repulsion 0.0 -draw_out -draw_pause -1
>>>>>>>>>
>>>>>>>>> with N=1,2,3,4 gives different results for the matrix dumped out
>>>>>>>>> by lines 514-519
>>>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>.
>>>>>>>>> The matrix itself is evaluated in parallel, created in lines 263-275
>>>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#263>
>>>>>>>>> and evaluated in lines 294-356
>>>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#294>
>>>>>>>>>
>>>>>>>>> (you can click on the line numbers above to navigate directly to
>>>>>>>>> them)
>>>>>>>>>
>>>>>>>>> Here is a sample <http://i43.tinypic.com/zyhf2f.jpg> of the
>>>>>>>>> output of lines 514-519
>>>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>
>>>>>>>>> for N=1,2,3,4 procs left to right.
>>>>>>>>>
>>>>>>>>> They're different for different procs. They should be the same,
>>>>>>>>> since none of my input parameters are numprocs dependent, and I don't
>>>>>>>>> explicitly use the size or rank anywhere in the code.
>>>>>>>>
>>>>>>>> You are likely not dividing the rows you loop over so you are
>>>>>>>> redundantly computing.
>>>>>>>
>>>>>>> Thanks for the reply.
>>>>>>>
>>>>>>> Line 274
>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#274>
>>>>>>> gets the local row indices of Petsc Matrix AVG_BDIBJ
>>>>>>>
>>>>>>> Line 295
>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#295>
>>>>>>> iterates over the local rows and the lines below get the column
>>>>>>> elements. For each row, the column elements are assigned by the
>>>>>>> lines up to Line 344
>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#344>
>>>>>>> and stored locally in colvalues[]. Dunno if the details are relevant.
>>>>>>>
>>>>>>> Line 347
>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#347>
>>>>>>> inserts the sitestride1^th row into the matrix
>>>>>>>
>>>>>>> Line 353+
>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#353>
>>>>>>> does the mat assembly
>>>>>>>
>>>>>>> Then, after a lot of currently irrelevant code,
>>>>>>>
>>>>>>> Line 514+
>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>
>>>>>>> dumps the mat plot to graphics
>>>>>>>
>>>>>>> Different numprocs give different matrices.
>>>>>>>
>>>>>>> Can somebody suggest what I did wrong (or didn't do)?
>>>>>>
>>>>>> Different values are being given to MatSetValues() for different
>>>>>> numbers of processes. So
>>>>>>
>>>>>> 1) Reduce this to the smallest problem size possible
>>>>>>
>>>>>> 2) Print out all rows/cols/values for each call
>>>>>>
>>>>>> 3) Compare 2 procs to the serial case
>>>>>
>>>>> Thanks for your excellent suggestion.
>>>>>
>>>>> I modified my code
>>>>> <https://code.google.com/p/daneelrepo/source/diff?spec=svn1435&r=1435&format=side&path=/eth_question/eth.c>
>>>>> to dump the matrix in binary
>>>>>
>>>>> Then I used this python script I had
>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/mat_bin2ascii.py>
>>>>> to convert to ascii
>>>>
>>>> Do not print the matrix, print the data you are passing to
>>>> MatSetValues().
>>>>
>>>> MatSetValues() is not likely to be broken. Every PETSc code in the
>>>> world calls this many times on every simulation.
>>>>
>>>> Matt
>>>>
>>>>> Here are the values of AVG_BDIBJ <http://pastebin.ca/2457842>,
>>>>> a 9X9 matrix (the smallest possible problem size) run with the exact same
>>>>> input parameters with 1,2,3 and 4 procs
>>>>>
>>>>> As you can see, the 1 and 2 procs match up, but the 3 and 4 procs do
>>>>> not.
>>>>>
>>>>> Serious weirdness.
>>>>>
>>>>>> Matt
>>>>>>
>>>>>>>> Matt
>>>>>>>>
>>>>>>>>>> for (sitestride1 = Istart; sitestride1 < Iend; sitestride1++)
>>>>>>>>>>   {
>>>>>>>>>>     for (sitestride2 = 0; sitestride2 < matsize; sitestride2++)
>>>>>>>>>>       {
>>>>>>>>>>         for (alpha = 0; alpha < dim; alpha++)
>>>>>>>>>>           {
>>>>>>>>>>             for (mu = 0; mu < dim; mu++)
>>>>>>>>>>               for (lambda = 0; lambda < dim; lambda++)
>>>>>>>>>>                 {
>>>>>>>>>>                   vecval = rand () / rand ();
>>>>>>>>>>                 }
>>>>>>>>>>
>>>>>>>>>>             VecSetValue (BDB_AA, alpha, vecval, INSERT_VALUES);
>>>>>>>>>>
>>>>>>>>>>           }
>>>>>>>>>>         VecAssemblyBegin (BDB_AA);
>>>>>>>>>>         VecAssemblyEnd (BDB_AA);
>>>>>>>>>>         VecSum (BDB_AA, &element);
>>>>>>>>>>         colvalues[sitestride2] = element;
>>>>>>>>>>
>>>>>>>>>>       }
>>>>>>>>>>     //Insert the array of colvalues to the sitestride1^th row of H
>>>>>>>>>>     MatSetValues (AVG_BDIBJ, 1, &sitestride1, matsize, idx, colvalues,
>>>>>>>>>>                   INSERT_VALUES);
>>>>>>>>>>
>>>>>>>>>>   }
>>>>>>>>>>
>>>>>>>>>> > The code is large and complex, so I have created a smaller
>>>>>>>>>> > program with the same loop structure here.
>>>>>>>>>> > <http://pastebin.ca/2457643>
>>>>>>>>>> >
>>>>>>>>>> > Compile it and run it with "mpirun -np $N ./test -draw_pause -1"
>>>>>>>>>> > gives different results for different values of N even though
>>>>>>>>>> > it's not supposed to.
>>>>>>>>>>
>>>>>>>>>> What do you expect to see?
>>>>>>>>>>
>>>>>>>>>> > Here is a sample output <http://i42.tinypic.com/2s16ccw.jpg>
>>>>>>>>>> > for N=1,2,3,4 from left to right.
>>>>>>>>>> >
>>>>>>>>>> > Can anyone guide me as to what I'm doing wrong? Are any of the
>>>>>>>>>> > PETSc routines used not parallelizable?
>>>>>>>>>> >
>>>>>>>>>> > Thanks in advance,
>>>>>>>>>> >
>>>>>>>>>> > Regards.
>>>>>>>>>> >
>>>>>>>>>> > --
>>>>>>>>>> > ---
>>>>>>>>>> > *Analabha Roy*
>>>>>>>>>> > C.S.I.R <http://www.csir.res.in> Senior Research
>>>>>>>>>> > Associate <http://csirhrdg.res.in/poolsra.htm>
>>>>>>>>>> > Saha Institute of Nuclear Physics <http://www.saha.ac.in>
>>>>>>>>>> > Section 1, Block AF
>>>>>>>>>> > Bidhannagar, Calcutta 700064
>>>>>>>>>> > India
>>>>>>>>>> > *Emails*: [email protected], [email protected]
>>>>>>>>>> > *Webpage*: http://www.ph.utexas.edu/~daneel/
>>>>>>>>
>>>>>>>> --
>>>>>>>> What most experimenters take for granted before they begin their
>>>>>>>> experiments is infinitely more interesting than any results to which
>>>>>>>> their experiments lead.
>>>>>>>> -- Norbert Wiener

--
---
*Analabha Roy*
C.S.I.R <http://www.csir.res.in> Senior Research Associate <http://csirhrdg.res.in/poolsra.htm>
Saha Institute of Nuclear Physics <http://www.saha.ac.in>
Section 1, Block AF
Bidhannagar, Calcutta 700064
India
*Emails*: [email protected], [email protected]
*Webpage*: http://www.ph.utexas.edu/~daneel/
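
For reference, a minimal sketch in plain C (no PETSc or MPI calls) of the "compare to the serial case" check suggested in the quoted thread: if every entry depends only on its global (row, col) index and each process fills just the rows it owns, the assembled matrix should not change with the number of processes. Everything here is an assumption for illustration. ownership_range() mimics the contiguous split that MatGetOwnershipRange() would report under a default PETSC_DECIDE layout, entry() is a hypothetical stand-in for the real element computation in eth.c, and the "processes" are emulated by a serial loop over ranks.

```c
#include <assert.h>

#define MATSIZE 9 /* the 9x9 case mentioned in the thread */

/* Contiguous row range [*istart, *iend) owned by `rank` out of `size`
 * processes. Assumption: the first (nrows % size) ranks each get one
 * extra row, as in a default PETSC_DECIDE-style layout. */
void ownership_range(int nrows, int size, int rank, int *istart, int *iend)
{
    int base, extra;

    assert(size > 0 && rank >= 0 && rank < size);
    base = nrows / size;
    extra = nrows % size;
    *istart = rank * base + (rank < extra ? rank : extra);
    *iend = *istart + base + (rank < extra ? 1 : 0);
}

/* Hypothetical stand-in for one matrix element. It depends only on the
 * global (row, col) index, never on the rank or on a random stream. */
double entry(int row, int col)
{
    return (double)(row + 1) / (double)(col + 1);
}

/* Emulate the row loop of every rank in turn; each rank writes only
 * the rows in its own ownership range. */
void fill_with(int size, double a[MATSIZE][MATSIZE])
{
    int rank, row, col, istart, iend;

    for (rank = 0; rank < size; rank++) {
        ownership_range(MATSIZE, size, rank, &istart, &iend);
        for (row = istart; row < iend; row++)
            for (col = 0; col < MATSIZE; col++)
                a[row][col] = entry(row, col);
    }
}

/* Return 1 if the matrix built with `size` emulated ranks equals the
 * one built with a single rank, 0 otherwise. */
int matches_serial(int size)
{
    double serial[MATSIZE][MATSIZE] = {{0}};
    double parallel[MATSIZE][MATSIZE] = {{0}};
    int row, col;

    fill_with(1, serial);
    fill_with(size, parallel);
    for (row = 0; row < MATSIZE; row++)
        for (col = 0; col < MATSIZE; col++)
            if (serial[row][col] != parallel[row][col])
                return 0;
    return 1;
}
```

In this sketch matches_serial(2), matches_serial(3) and matches_serial(4) all return 1, because entry() sees nothing except the global indices. If the real code fails the same comparison, the per-entry computation is presumably reading something that differs between processes (a rand() stream, for instance), which is what printing the arguments to MatSetValues() from every rank should expose.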
