> On 26 Feb 2023, at 8:07 PM, Mike Michell <[email protected]> wrote:
>
> I cannot agree with this argument, unless you also tested with the petsc 3.18.4 tarball from https://petsc.org/release/install/download/. If the library had an issue, I would trivially see an error from my code.
>
> I ran my code with valgrind and see no error with petsc 3.18.4. You can test my code with valgrind or address sanitizer using this version of petsc-3.18.4.tar.gz from https://petsc.org/release/install/download/. I expect you will see no error.
>
> Let me ask my question differently:
> Has any change been made to DMPlexMarkBoundaryFaces() recently?
Yes, and it may break your application if you are not careful about it: https://gitlab.com/petsc/petsc/-/commit/a29bf4df3e5335fbd3b27b552a624c7f2a5a2f0a

Thanks,
Pierre

> I found that the latest petsc does not recognize a parallel (but not physical) boundary as a boundary for the distributed dm (line 235 of my example code). Because of this, you saw the error from the arrays:
>
>    ! midpoint of median-dual face for inner face
>    axrf(ifa,1) = 0.5d0*(yc(nc1)+yfc(ifa)) ! for nc1 cell
>    axrf(ifa,2) = 0.5d0*(yc(nc2)+yfc(ifa)) ! for nc2 cell
>
> and these were allocated here
>
>    allocate(xc(ncell))
>    allocate(yc(ncell))
>
> as you pointed out. Or has any change been made to the distribution of the dm over procs?
>
> Thanks,
> Mike
>
>> On Sun, Feb 26, 2023 at 11:32 AM Mike Michell <[email protected]> wrote:
>>
>>> This is what I get from petsc main, which is not correct:
>>>   Overall volume computed from median-dual ...   6.37050098781844
>>>   Overall volume computed from PETSc ...         3.15470053800000
>>>
>>> This is what I get from petsc 3.18.4, which is correct:
>>>   Overall volume computed from median-dual ...   3.15470053800000
>>>   Overall volume computed from PETSc ...         3.15470053800000
>>>
>>> If there is a problem in the code, it is also strange to me that petsc 3.18.4 gives the correct answer.
>>
>> As I said, this can happen due to different layouts in memory. If you run it under valgrind, or address sanitizer, you will see that there is a problem.
>>
>> Thanks,
>>
>> Matt
>>
>>> Thanks,
>>> Mike
>>>
>>>> On Sun, Feb 26, 2023 at 11:19 AM Mike Michell <[email protected]> wrote:
>>>>
>>>>> Which version of petsc did you test? With petsc 3.18.4, the median-dual volume gives the same value as the one PETSc computes with DMPlexComputeCellGeometryFVM().
>>>>
>>>> This is only an accident of the data layout. The code you sent writes over memory in the local Fortran arrays.
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>>>> On Sat, Feb 25, 2023 at 3:11 PM Mike Michell <[email protected]> wrote:
>>>>>>
>>>>>>> My apologies for the late follow-up. There was a time conflict.
>>>>>>>
>>>>>>> A simple example code related to the issue I mentioned is attached here. The sample code does: (1) load the grid onto a dm, (2) compute the vertex-wise control volume for each node in a median-dual way, (3) halo exchange among procs to have complete control volume values, and (4) print out the field as a .vtu file. To make sure, the computed control volume is also compared with the PETSc-computed control volume via DMPlexComputeCellGeometryFVM() (see lines 771-793).
>>>>>>>
>>>>>>> Back to the original problem, I can get a proper control volume field with PETSc 3.18.4, which is the latest stable release. However, if I use PETSc from the main repo, it gives a strange control volume field. Something is certainly strange around the parallel boundaries, thus I think something went wrong with the halo communication. To help understanding, a comparison snapshot is also attached. I guess a certain part of my code is no longer compatible with PETSc, unless there is a bug in the library. Could I get comments on it?
>>>>>>
>>>>>> I can run your example. The numbers I get for "median-dual volume" do not match the "PETSc volume", and the PETSc volume is correct. Moreover, the median-dual numbers change, which suggests a memory fault.
>>>>>> I compiled it using address sanitizer, and it found an error:
>>>>>>
>>>>>>   Number of physical boundary edge ...               4  0
>>>>>>   Number of physical and parallel boundary edge ...  4  0
>>>>>>   Number of parallel boundary edge ...               0  0
>>>>>>   Number of physical boundary edge ...               4  1
>>>>>>   Number of physical and parallel boundary edge ...  4  1
>>>>>>   Number of parallel boundary edge ...               0  1
>>>>>>   =================================================================
>>>>>>   ==36587==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000022d40 at pc 0x0001068e12a8 bp 0x7ffee932cfd0 sp 0x7ffee932cfc8
>>>>>>   READ of size 8 at 0x603000022d40 thread T0
>>>>>>   =================================================================
>>>>>>   ==36588==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60300000f0f0 at pc 0x00010cf702a8 bp 0x7ffee2c9dfd0 sp 0x7ffee2c9dfc8
>>>>>>   READ of size 8 at 0x60300000f0f0 thread T0
>>>>>>     #0 0x10cf702a7 in MAIN__ test.F90:657
>>>>>>     #1 0x10cf768ee in main test.F90:43
>>>>>>     #0 0x1068e12a7 in MAIN__ test.F90:657
>>>>>>     #1 0x1068e78ee in main test.F90:43
>>>>>>     #2 0x7fff6b80acc8 in start (libdyld.dylib:x86_64+0x1acc8)
>>>>>>
>>>>>>   0x60300000f0f0 is located 0 bytes to the right of 32-byte region [0x60300000f0d0,0x60300000f0f0) allocated by thread T0 here:
>>>>>>     #2 0x7fff6b80acc8 in start (libdyld.dylib:x86_64+0x1acc8)
>>>>>>
>>>>>>   0x603000022d40 is located 0 bytes to the right of 32-byte region [0x603000022d20,0x603000022d40) allocated by thread T0 here:
>>>>>>     #0 0x114a7457f in wrap_malloc (libasan.5.dylib:x86_64+0x7b57f)
>>>>>>     #1 0x1068dba71 in MAIN__ test.F90:499
>>>>>>     #2 0x1068e78ee in main test.F90:43
>>>>>>     #3 0x7fff6b80acc8 in start (libdyld.dylib:x86_64+0x1acc8)
>>>>>>
>>>>>>   SUMMARY: AddressSanitizer: heap-buffer-overflow test.F90:657 in MAIN__
>>>>>>   Shadow bytes around the buggy address:
>>>>>>
>>>>>> which corresponds to
>>>>>>
>>>>>>    ! midpoint of median-dual face for inner face
>>>>>>    axrf(ifa,1) = 0.5d0*(yc(nc1)+yfc(ifa)) ! for nc1 cell
>>>>>>    axrf(ifa,2) = 0.5d0*(yc(nc2)+yfc(ifa)) ! for nc2 cell
>>>>>>
>>>>>> and these were allocated here
>>>>>>
>>>>>>    allocate(xc(ncell))
>>>>>>    allocate(yc(ncell))
>>>>>>
>>>>>> Hopefully the error is straightforward to see now.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Matt
>>>>>>
>>>>>>> Thanks,
>>>>>>> Mike
>>>>>>>
>>>>>>>> On Mon, Feb 20, 2023 at 12:05 PM Matthew Knepley <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> On Sat, Feb 18, 2023 at 12:00 PM Mike Michell <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> As a follow-up, I tested:
>>>>>>>>>>
>>>>>>>>>> (1) The tarball for v3.18.4 from the petsc gitlab (https://gitlab.com/petsc/petsc/-/tree/v3.18.4) has no issue with DMPlex halo exchange. This version works as I expect.
>>>>>>>>>> (2) A clone of the main branch (git clone https://gitlab.com/petsc/petsc.git) has issues with DMPlex halo exchange. Something is suspicious about this main branch, related to the DMPlex halo. The solution field I got is not correct, but it works okay with 1 proc.
>>>>>>>>>>
>>>>>>>>>> Does anyone have any comments on this issue? I am curious whether other DMPlex users have no problem regarding halo exchange. FYI, I do not declare ghost layers for the halo exchange.
>>>>>>>>>
>>>>>>>>> There should not have been any changes there, and there are definitely tests for this.
>>>>>>>>> It would be great if you could send something that failed. I could fix it and add it as a test.
>>>>>>>>
>>>>>>>> Just to follow up, we have tests of the low-level communication (Plex tests ex1, ex12, ex18, ex29, ex31), and then we have tests that use halo exchange for PDE calculations, for example SNES tutorials ex12, ex13, and ex62. The convergence rates should be off if the halo exchange were wrong. Is there any example similar to your code that is failing on your installation? Or is there a way to run your code?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Matt
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Matt
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Mike
>>>>>>>>>>
>>>>>>>>>>> Dear PETSc team,
>>>>>>>>>>>
>>>>>>>>>>> I am using PETSc for Fortran with DMPlex. I have been using this version of PETSc:
>>>>>>>>>>>   >>git rev-parse origin
>>>>>>>>>>>   >>995ec06f924a86c4d28df68d1fdd6572768b0de1
>>>>>>>>>>>   >>git rev-parse FETCH_HEAD
>>>>>>>>>>>   >>9a04a86bf40bf893fb82f466a1bc8943d9bc2a6b
>>>>>>>>>>>
>>>>>>>>>>> There has been no issue, apart from the one with the VTK viewer, which Jed fixed today (https://gitlab.com/petsc/petsc/-/merge_requests/6081/diffs?commit_id=27ba695b8b62ee2bef0e5776c33883276a2a1735).
>>>>>>>>>>>
>>>>>>>>>>> Since that MR has been merged into the main repo, I pulled the latest version of PETSc (basically I cloned it from scratch). But if I use the same configure options as before and run my code, then there is an issue with halo exchange. The code runs without an error message, but it gives a wrong solution field. I guess the issue I have is related to the graph partitioner or the halo exchange part. This is because if I run the code with 1 proc, the solution is correct. I only updated the version of PETSc and there was no change in my own code. Could I get any comments on the issue? I was wondering if there have been many changes in the halo exchange or the graph partitioning & distribution parts related to DMPlex.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Mike
>>
>> --
>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
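A minimal Fortran sketch of the check at the center of this thread: compare the faces that DMPlexMarkBoundaryFaces() marks with the faces that have only one supporting cell on the local rank. On a dm distributed without overlap, a face with a single local support cell lies either on the physical boundary or on the parallel (process) boundary, so a mismatch between the two counts on a multi-rank run shows whether the marking still covers the parallel boundary, which appears to be what the commit Pierre linked touches. This is not taken from the attached test.F90; the subroutine name is made up, dm is assumed to be an already distributed DMPlex, and the standard Fortran bindings of the listed PETSc calls are assumed to be available.

  #include <petsc/finclude/petscdmplex.h>
  subroutine count_boundary_faces(dm)
    use petscdmplex
    implicit none
    DM, intent(in) :: dm              ! an already distributed DMPlex (assumption)
    DMLabel        :: bdlabel
    PetscInt       :: fStart, fEnd, f, ssize, nMarked, nOneSided, one
    PetscErrorCode :: ierr

    one = 1

    ! faces that DMPlexMarkBoundaryFaces() considers boundary
    call DMLabelCreate(PETSC_COMM_SELF, 'marked boundary', bdlabel, ierr)
    call DMPlexMarkBoundaryFaces(dm, one, bdlabel, ierr)
    call DMLabelGetStratumSize(bdlabel, one, nMarked, ierr)

    ! faces with a single local support cell: physical OR parallel boundary
    nOneSided = 0
    call DMPlexGetHeightStratum(dm, one, fStart, fEnd, ierr)  ! faces are the height-1 points
    do f = fStart, fEnd - 1
       call DMPlexGetSupportSize(dm, f, ssize, ierr)
       if (ssize == 1) nOneSided = nOneSided + 1
    end do

    print *, 'faces marked by DMPlexMarkBoundaryFaces():', nMarked
    print *, 'faces with one local support cell        :', nOneSided

    call DMLabelDestroy(bdlabel, ierr)
  end subroutine count_boundary_faces

Printing the two counts on each rank with 3.18.4 and with main should make the behavior difference visible directly, independent of the median-dual bookkeeping in the application code.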

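Along the same lines, a minimal sketch of the "Overall volume computed from PETSc" cross-check that the attached test.F90 reportedly performs around lines 771-793: sum the cell volumes returned by DMPlexComputeCellGeometryFVM() on each rank. The subroutine and argument names are made up for illustration; dm is again assumed to be a DMPlex distributed without overlap, so the per-rank sums add up to the global volume.

  #include <petsc/finclude/petscdmplex.h>
  subroutine petsc_cell_volume(dm, volLocal)
    use petscdmplex
    implicit none
    DM, intent(in)         :: dm        ! distributed DMPlex without overlap (assumption)
    PetscReal, intent(out) :: volLocal  ! sum of this rank's cell volumes
    PetscInt       :: cStart, cEnd, c, zero
    PetscReal      :: vol, centroid(3), normal(3)
    PetscErrorCode :: ierr

    zero = 0
    call DMPlexGetHeightStratum(dm, zero, cStart, cEnd, ierr)  ! cells are the height-0 points
    volLocal = 0.0
    do c = cStart, cEnd - 1
       ! volume and centroid of cell c; the normal argument is required
       ! by the interface but is not meaningful for cells
       call DMPlexComputeCellGeometryFVM(dm, c, vol, centroid, normal, ierr)
       volLocal = volLocal + vol
    end do
  end subroutine petsc_cell_volume

An MPI_Allreduce of volLocal over PETSC_COMM_WORLD then gives the total that the median-dual construction should reproduce when the halo exchange is correct.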