Re: [petsc-dev] Plex - Metis warnings

2018-10-29 Thread Mark Adams via petsc-dev
On Mon, Oct 29, 2018 at 5:01 PM Matthew Knepley wrote: > On Mon, Oct 29, 2018 at 4:56 PM Mark Adams via petsc-dev < > petsc-dev@mcs.anl.gov> wrote: > >> I am building a fresh PETSc with GNU on Titan and I get these warnings >> about incompatible pointers in calls i

Re: [petsc-dev] [petsc-users] Convergence of AMG

2018-10-29 Thread Mark Adams via petsc-dev
On Mon, Oct 29, 2018 at 2:35 PM Smith, Barry F. wrote: > >Why not just stop it once it is equal to or less than the minimum > values set by the person. That is what it does now. It stops when it is below the value given. > Thus you need not "backtrack" by removing levels but the user

[petsc-dev] Plex - Metis warnings

2018-10-29 Thread Mark Adams via petsc-dev
I am building a fresh PETSc with GNU on Titan and I get these warnings about incompatible pointers in calls in PlexPartition to ParMetis. Mark /lustre/atlas1/geo127/proj-shared/petsc/src/dm/impls/plex/plexpartition.c: In function 'PetscPartitionerPartition_ParMetis':

Re: [petsc-dev] Error running on Titan with GPUs & GNU

2018-10-29 Thread Mark Adams via petsc-dev
On Mon, Oct 29, 2018 at 5:07 PM Matthew Knepley wrote: > On Mon, Oct 29, 2018 at 5:01 PM Mark Adams via petsc-dev < > petsc-dev@mcs.anl.gov> wrote: > >> I get this error running the tests using GPUs. An error in an LAPACK >> routine. >> > > From the c

[petsc-dev] Error running on Titan with GPUs & GNU

2018-10-29 Thread Mark Adams via petsc-dev
I get this error running the tests using GPUs. An error in an LAPACK routine. 16:50 master= /lustre/atlas/proj-shared/geo127/petsc$ make PETSC_DIR=/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda PETSC_ARCH="" test Running test examples to verify correct installation Using

Re: [petsc-dev] Plex - Metis warnings

2018-10-29 Thread Mark Adams via petsc-dev
ine/compiler working. > > Satish > > On Mon, 29 Oct 2018, Matthew Knepley via petsc-dev wrote: > > > On Mon, Oct 29, 2018 at 4:56 PM Mark Adams via petsc-dev < > > petsc-dev@mcs.anl.gov> wrote: > > > > > I am building a fresh PETSc with GNU on Tita

Re: [petsc-dev] Plex - Metis warnings

2018-10-29 Thread Mark Adams via petsc-dev
> > > Pushing language C > Popping language C > Executing: cc -o /tmp/petsc-yiGfSd/config.packages.MPI/conftest -O > /tmp/petsc-yiGfSd/config.packages.MPI/conftest.o -ldl > Testing executable

Re: [petsc-dev] Plex - Metis warnings

2018-10-29 Thread Mark Adams via petsc-dev
I can hand it off to my users with some assurance that it _can_ work! Thanks, Mark On Mon, Oct 29, 2018 at 7:23 PM Balay, Satish wrote: > On Mon, 29 Oct 2018, Mark Adams via petsc-dev wrote: > > > > > > > > > >

Re: [petsc-dev] Error running on Titan with GPUs & GNU

2018-10-29 Thread Mark Adams via petsc-dev
Still getting this error with downloaded lapack. I sent the logs on the other thread. 18:02 master= /lustre/atlas/proj-shared/geo127/petsc$ make PETSC_DIR=/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda PETSC_ARCH="" test Running test examples to verify correct installation Using

Re: [petsc-dev] Error running on Titan with GPUs & GNU

2018-10-29 Thread Mark Adams via petsc-dev
I was able to run ex56 (ksp) which does not use GMRES. This error was from a GMRES method so maybe this is an isolated problem. > >Barry > > > > On Oct 29, 2018, at 3:56 PM, Mark Adams via petsc-dev < > petsc-dev@mcs.anl.gov> wrote: > > > > I get this error ru

Re: [petsc-dev] Error running on Titan with GPUs & GNU

2018-10-29 Thread Mark Adams via petsc-dev
And a debug build seems to work: 21:04 1 master= /lustre/atlas/proj-shared/geo127/petsc$ make PETSC_DIR=/lustre/atlas/proj-shared/geo127/petsc_titan_dbg64idx_gnu_cuda PETSC_ARCH="" test Running test examples to verify correct installation Using

Re: [petsc-dev] Error running on Titan with GPUs & GNU

2018-11-01 Thread Mark Adams via petsc-dev
On Wed, Oct 31, 2018 at 12:30 PM Mark Adams wrote: > > > On Wed, Oct 31, 2018 at 6:59 AM Karl Rupp wrote: > >> Hi Mark, >> >> ah, I was confused by the Python information at the beginning of >> configure.log. So it is picking up the correct compiler. >> >> Have you tried uncommenting the check

Re: [petsc-dev] Error running on Titan with GPUs & GNU

2018-10-30 Thread Mark Adams via petsc-dev
> > > > Are there newer versions of the Gnu compiler for this system? Yes: -- /opt/modulefiles --

Re: [petsc-dev] Error running on Titan with GPUs & GNU

2018-10-31 Thread Mark Adams via petsc-dev
Karli > > > > On 10/31/18 8:15 AM, Mark Adams via petsc-dev wrote: > > After loading a cuda module ... > > > > On Wed, Oct 31, 2018 at 2:58 AM Mark Adams > <mailto:mfad...@lbl.gov>> wrote: > > > > I get an error with --with-cuda=1 > >

Re: [petsc-dev] Error running on Titan with GPUs & GNU

2018-10-31 Thread Mark Adams via petsc-dev
> gcc/4.8.2 gcc/5.3.0 gcc/6.2.0 gcc/7.1.0 > gcc/7.3.0 > > > >> >> Best regards, >> Karli >> >> >> >> On 10/31/18 8:15 AM, Mark Adams via petsc-dev wrote: >> > After loading a cuda module ... >> &g

Re: [petsc-dev] Error running on Titan with GPUs & GNU

2018-11-02 Thread Mark Adams via petsc-dev
I did not configure hypre manually, so I guess it is not using GPUs. On Fri, Nov 2, 2018 at 2:40 PM Smith, Barry F. wrote: > > > > On Nov 2, 2018, at 1:25 PM, Mark Adams wrote: > > > > And I just tested it with GAMG and it seems fine. And hypre ran, but it > is not clear that it used GPUs

Re: [petsc-dev] Error running on Titan with GPUs & GNU

2018-11-02 Thread Mark Adams via petsc-dev
FYI, I seem to have the new GPU machine at ORNL (summitdev) working with GPUs. That is good enough for now. Thanks, 14:00 master= ~/petsc/src/snes/examples/tutorials$ jsrun -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -pc_type none -ksp_type fgmres -snes_monitor_short -snes_rtol 1.e-5

Re: [petsc-dev] Error running on Titan with GPUs & GNU

2018-11-02 Thread Mark Adams via petsc-dev
And I just tested it with GAMG and it seems fine. And hypre ran, but it is not clear that it used GPUs 14:13 master= ~/petsc/src/snes/examples/tutorials$ jsrun -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -pc_type hypre -ksp_type fgmres -snes_monitor_short -snes_rtol 1.e-5

[petsc-dev] GPU web page out of date

2018-12-17 Thread Mark Adams via petsc-dev
The GPU web page looks like it is 8 years old ... this link is dead (but ex47cu seems to be in the repo): - Example that uses CUDA directly in the user function evaluation Thanks, Mark

Re: [petsc-dev] FW: Re[2]: Implementing of a variable block size BILU preconditioner

2018-12-05 Thread Mark Adams via petsc-dev
If you zero a row out then put something on the diagonal. And your matrix data file (it does not look like it has any sparsity meta-data) spans about 18 orders of magnitude. When you diagonally scale, which most solvers implicitly do, it looks like some of these numbers will just go away and you will
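For reference, a minimal sketch of the suggestion above (zero rows but keep a nonzero diagonal entry), assuming an already-assembled matrix A and a list of row indices; the 1.0 diagonal value is an arbitrary placeholder, not something from the thread.

  #include <petscmat.h>

  /* Sketch: zero a set of rows but place a value on each zeroed diagonal so the
     matrix stays nonsingular.  A, nrows, rows, x, b are assumed to exist. */
  PetscErrorCode ZeroRowsKeepDiagonal(Mat A, PetscInt nrows, const PetscInt rows[], Vec x, Vec b)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    /* The fourth argument is the value put on the diagonal of every zeroed row. */
    ierr = MatZeroRows(A, nrows, rows, 1.0, x, b);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }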

Re: [petsc-dev] [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-27 Thread Mark Adams via petsc-dev
So are these the instructions that I should give him? This grad student is a quick study but he has no computing background. So we don't care what we use, we just want it to work (easily). Thanks Do not use "--download-fblaslapack=1". Set it to 0. Same for "--download-mpich=1". Now do: > module

Re: [petsc-dev] [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-21 Thread Mark Adams via petsc-dev
I'm probably screwing up some sort of history by jumping into dev, but this is a dev comment ... (1) -matptap_via hypre: This calls the hypre package to do the PtAP through > an all-at-once triple product. In our experience, it is the most memory > efficient, but could be slow. > FYI, I visited
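To make the quoted option concrete, a hedged sketch of a Galerkin triple product through the PETSc API; A, P, and the fill estimate are assumptions, and -matptap_via hypre (from the quote) would select hypre's all-at-once algorithm at runtime.

  #include <petscmat.h>

  /* Sketch: form the coarse operator C = P^T * A * P.  The algorithm used
     underneath can be chosen at runtime, e.g. with -matptap_via hypre. */
  PetscErrorCode FormCoarseOperator(Mat A, Mat P, Mat *C)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = MatPtAP(A, P, MAT_INITIAL_MATRIX, 2.0 /* fill guess */, C);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }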

Re: [petsc-dev] MatNest and FieldSplit

2019-03-24 Thread Mark Adams via petsc-dev
I think he is saying that this line seems to have no effect (and the comment is hence wrong): KSPSetOperators(subksp[nsplits - 1], S, S); // J2 = [[4, 0] ; [0, 0.1]] J2 is a 2x2 but this block has been changed into two single equation fields. Is this KSPSetOperators supposed to copy this
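A hedged sketch of the pattern under discussion: pull the sub-KSPs out of a fieldsplit PC and reset the operators of the last one. The names pc and S follow the snippet; everything else is an assumption, not the poster's code.

  #include <petscksp.h>

  /* Sketch: replace the operator of the last split's KSP with a user matrix S.
     pc is assumed to be an already-set-up PCFIELDSPLIT. */
  PetscErrorCode SetLastSplitOperator(PC pc, Mat S)
  {
    PetscErrorCode ierr;
    PetscInt       nsplits;
    KSP           *subksp;

    PetscFunctionBegin;
    ierr = PCFieldSplitGetSubKSP(pc, &nsplits, &subksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(subksp[nsplits - 1], S, S);CHKERRQ(ierr);
    ierr = PetscFree(subksp);CHKERRQ(ierr); /* the array is allocated by the getter */
    PetscFunctionReturn(0);
  }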

Re: [petsc-dev] SNESSolve and changing dimensions

2019-04-03 Thread Mark Adams via petsc-dev
I agree that you want to adapt around a converged solution. I have code that runs time step(s), adapts, transfers solution and state, and creates a new TS & SNES, if you want to clone that. It works with PForest, but Toby and Matt are working on these abstractions so it might not be the most

Re: [petsc-dev] [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-27 Thread Mark Adams via petsc-dev
On Wed, Mar 27, 2019 at 12:06 AM Victor Eijkhout wrote: > > > On Mar 26, 2019, at 6:25 PM, Mark Adams via petsc-dev < > petsc-dev@mcs.anl.gov> wrote: > > /home1/04906/bonnheim/olympus-keaveny/Olympus/olympus.petsc-3.9.3.skx-cxx-O > on a skx-cxx-O named c478-06

Re: [petsc-dev] [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-26 Thread Mark Adams via petsc-dev
> > > The way to reduce the memory is to have the all-at-once algorithm (Mark is > an expert on this). But I am not sure how efficient it could be > implemented. > I have some data from a 3D elasticity problem with 1.4B equations on:

Re: [petsc-dev] [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-21 Thread Mark Adams via petsc-dev
> > > Could you explain this more by adding some small examples? > > Since you are considering implementing all-at-once (four nested loops, right?) I'll give you my old code. This code is hardwired for two AMG and for a geometric-AMG, where the blocks of the R (and hence P) matrices are scaled
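To make the "four nested loops" remark concrete, a hedged dense-matrix sketch of an all-at-once triple product C = P^T A P; real implementations work on sparse formats, so this is only the algorithmic skeleton, not the code being offered in the thread.

  /* Sketch: C = P^T * A * P with plain arrays, accumulating every contribution
     directly into C (all-at-once) instead of forming A*P first.
     A is n x n, P is n x m, C is m x m; all row-major, C assumed pre-zeroed. */
  static void PtAPAllAtOnce(int n, int m, const double *A, const double *P, double *C)
  {
    for (int i = 0; i < n; ++i)
      for (int j = 0; j < n; ++j) {
        const double aij = A[i * n + j];
        if (aij == 0.0) continue;          /* skip zero entries */
        for (int I = 0; I < m; ++I)        /* coarse row from P(i,I)    */
          for (int J = 0; J < m; ++J)      /* coarse column from P(j,J) */
            C[I * m + J] += P[i * m + I] * aij * P[j * m + J];
      }
  }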

Re: [petsc-dev] HYPRE_LinSysCore.h

2019-01-29 Thread Mark Adams via petsc-dev
On Tue, Jan 29, 2019 at 5:20 PM Victor Eijkhout via petsc-dev < petsc-dev@mcs.anl.gov> wrote: > > > On Jan 29, 2019, at 3:58 PM, Balay, Satish wrote: > > -args.append('--without-fei') > > > The late-1990s Finite Element Interface? > I would guess FEI is still actively used and interfaces to

[petsc-dev] is DMSetDS not in master?

2019-02-01 Thread Mark Adams via petsc-dev
10:37 master= ~/Codes/petsc$ git grep DMSetDS src/dm/interface/dm.c:.seealso: DMGetDS(), DMSetDS() 10:37 master= ~/Codes/petsc$

Re: [petsc-dev] is DMSetDS not in master?

2019-02-01 Thread Mark Adams via petsc-dev
OK, it's not in Changes and there is one comment to it. On Fri, Feb 1, 2019 at 10:50 AM Matthew Knepley wrote: > I removed it, since no one should use it anymore. You use > DMSetField()+DMCreateDS() instead. > > Thanks, > > Matt > > On Fri, Feb 1, 2019 at 10:38 AM

Re: [petsc-dev] New implementation of PtAP based on all-at-once algorithm

2019-04-12 Thread Mark Adams via petsc-dev
On Thu, Apr 11, 2019 at 11:42 PM Smith, Barry F. wrote: > > > > On Apr 11, 2019, at 9:07 PM, Mark Adams via petsc-dev < > petsc-dev@mcs.anl.gov> wrote: > > > > Interesting, nice work. > > > > It would be interesting to get the flop counters work

Re: [petsc-dev] New implementation of PtAP based on all-at-once algorithm

2019-04-15 Thread Mark Adams via petsc-dev
> > > I guess you are interested in the performance of the new algorithms on > small problems. I will try to test a petsc example such as > mat/examples/tests/ex96.c. > It's not a big deal. And the fact that they are similar on one node tells us the kernels are similar. > > >> >> And are you

Re: [petsc-dev] New implementation of PtAP based on all-at-once algorithm

2019-04-15 Thread Mark Adams via petsc-dev
Mark Adams wrote: >>> >>>> >>>> >>>> On Thu, Apr 11, 2019 at 11:42 PM Smith, Barry F. >>>> wrote: >>>> >>>>> >>>>> >>>>> > On Apr 11, 2019, at 9:07 PM, Mark Adams vi

Re: [petsc-dev] New implementation of PtAP based on all-at-once algorithm

2019-04-15 Thread Mark Adams via petsc-dev
> >> I wonder if the their symbolic setup is getting called every time. You do >> 50 solves it looks like and that should be enough to amortize a one time >> setup cost. >> > > Hypre does not have concept called symbolic. They do everything from > scratch, and won't reuse any data. > Really,

Re: [petsc-dev] New implementation of PtAP based on all-at-once algorithm

2019-04-15 Thread Mark Adams via petsc-dev
> > So you could reorder your equations and see a block diagonal matrix with >> 576 blocks, right? >> > > I'm not sure I understand the question correctly. For each mesh vertex, we > have a 576x576 diagonal matrix. The unknowns are ordered in this way: > v0, v2.., v575 for vertex 1, and another

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-07-10 Thread Mark Adams via petsc-dev
> Barry > > BTW: Unrelated comment, the code > > ierr = VecSet(yy,0);CHKERRQ(ierr); > ierr = VecCUDAGetArrayWrite(yy,);CHKERRQ(ierr); > > has an unneeded ierr = VecSet(yy,0);CHKERRQ(ierr); here. > VecCUDAGetArrayWrite() requires that you ignore the values in yy
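A hedged sketch of the pattern Barry describes: VecCUDAGetArrayWrite() hands back an array whose prior contents may be ignored and fully overwritten, so a preceding VecSet(yy,0) is redundant. The kernel launch is omitted and yy is a placeholder; depending on the PETSc version the CUDA accessors may be declared in petsccuda.h rather than petscvec.h.

  #include <petscvec.h>

  /* Sketch: write every entry of yy on the GPU without a prior VecSet(yy,0). */
  PetscErrorCode FillOnDevice(Vec yy)
  {
    PetscErrorCode ierr;
    PetscScalar   *d_y;

    PetscFunctionBegin;
    ierr = VecCUDAGetArrayWrite(yy, &d_y);CHKERRQ(ierr);
    /* ... launch a CUDA kernel here that writes every entry of d_y ... */
    ierr = VecCUDARestoreArrayWrite(yy, &d_y);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }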

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-07-10 Thread Mark Adams via petsc-dev
> > > On Wed, Jul 10, 2019 at 9:22 AM Stefano Zampini < > stefano.zamp...@gmail.com> wrote: > > Mark, > > > > if the difference is on lvec, I suspect the bug has to do with > compressed row storage. I have fixed a similar bug in MatMult. > > you want to check

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-07-10 Thread Mark Adams via petsc-dev
or->size() against A->cmap->n. > > Stefano > > Il giorno mer 10 lug 2019 alle ore 15:54 Mark Adams via petsc-dev < > petsc-dev@mcs.anl.gov> ha scritto: > >> >> >> On Wed, Jul 10, 2019 at 1:13 AM Smith, Barry F. >> wrote: >> >>>

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-07-10 Thread Mark Adams via petsc-dev
> > > 3) Is comparison between pointers appropriate? For example if (dptr != > zarray) { is scary if some arrays are zero length how do we know what the > pointer value will be? > > Yes, you need to consider these cases, which is kind of error prone. Also, I think merging transpose, and not, is a

Re: [petsc-dev] New implementation of PtAP based on all-at-once algorithm

2019-04-11 Thread Mark Adams via petsc-dev
Interesting, nice work. It would be interesting to get the flop counters working. This looks like GMG, I assume 3D. The degree of parallelism is not very realistic. You should probably run a 10x smaller problem, at least, or use 10x more processes. I guess it does not matter. This basically

[petsc-dev] running test

2019-08-13 Thread Mark Adams via petsc-dev
I want to run a test exactly as it will be executed. Can someone please tell me how to run the "cuda" test for SNES ex56? This is the test in ex56.c: test: suffix: cuda nsize: 2 requires: cuda args: -cel Thanks, Mark

Re: [petsc-dev] Is master broken?

2019-08-12 Thread Mark Adams via petsc-dev
nges is not the correct thing here. But the current set of > 21 > > > commits are all over the place] > > > > > > If you are able to migrate to this branch - its best to delete > > the old > > > one [i.e origin/mark/gamg-f

Re: [petsc-dev] Is master broken?

2019-08-12 Thread Mark Adams via petsc-dev
Satish, I think I can do this now. Mark On Mon, Aug 12, 2019 at 6:26 AM Mark Adams wrote: > Satish, > > Your new branch mark/gamg-fix-viennacl-rebased-v2 does not seem to have > Barry's fixes (the source of this thread): > > ... > , line 243: error: function call is not allowed in a

Re: [petsc-dev] Is master broken?

2019-08-12 Thread Mark Adams via petsc-dev
Satish, Your new branch mark/gamg-fix-viennacl-rebased-v2 does not seem to have Barry's fixes (the source of this thread): ... , line 243: error: function call is not allowed in a constant expression #if PETSC_PKG_CUDA_VERSION_GE(10,1,0) Here is the reflog of the cherry

Re: [petsc-dev] Is master broken?

2019-08-12 Thread Mark Adams via petsc-dev
> > >> several issues: >> >> >> https://bitbucket.org/petsc/petsc/pull-requests/1954/cuda-fixes-to-pinning-onto-cpu/diff >> >> > These links are dead. > I found one issue with not protecting the pinnedtocpu member variable in Mat and Vec. Will fix asap.

Re: [petsc-dev] Is master broken?

2019-08-02 Thread Mark Adams via petsc-dev
;> Tried again after deleting the arch dirs and still have it. > >>>> This is my branch that just merged master. I will try with just > master. > >>>> Thanks, > >>>> > >>>> On Thu, Aug 1, 2019 at 1:36 AM Smith, Barry F. > w

Re: [petsc-dev] Is master broken?

2019-08-02 Thread Mark Adams via petsc-dev
I have been cherry-picking, etc, branch mark/gamg-fix-viennacl-rebased and it is very messed up. Can someone please update this branch when all the fixes are settled down? eg, I am seeing dozens of modified files that I don't know anything about and I certainly don't want to put in a PR for them.

Re: [petsc-dev] Is master broken?

2019-08-03 Thread Mark Adams via petsc-dev
le commit for all > > the changes is not the correct thing here. But the current set of 21 > > commits are all over the place] > > > > If you are able to migrate to this branch - its best to delete the old > > one [i.e origin/mark/gamg-fix-viennacl-rebased] > > > &g

Re: [petsc-dev] hypre and CUDA

2019-08-16 Thread Mark Adams via petsc-dev
56 ... > > > https://devblogs.nvidia.com/cuda-pro-tip-nvprof-your-handy-universal-gpu-profiler/ > > If it uses the GPU, then you will get some information on the GPU > kernels called. If it doesn't use the GPU, the list will be (almost) empty. > > Best regards, > Karli > > > >

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-08-14 Thread Mark Adams via petsc-dev
I am getting this error with single: 22:21 /gpfs/alpine/geo127/scratch/adams$ jsrun -n 1 -a 1 -c 1 -g 1 ./ex56_single -cells 2,2,2 -ex56_dm_vec_type cuda -ex56_dm_mat_type aijcusparse -fp_trap [0] 81 global equations, 27 vertices [0]PETSC ERROR: *** unknown floating point error occurred ***

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-08-14 Thread Mark Adams via petsc-dev
OK, I'll run single. It's a bit perverse to run with 4 byte floats and 8 byte integers ... I could use 32 bit ints and just not scale out. On Wed, Aug 14, 2019 at 6:48 PM Smith, Barry F. wrote: > > Mark, > >Oh, I don't even care if it converges, just put in a fixed number of > iterations.

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-08-14 Thread Mark Adams via petsc-dev
On Wed, Aug 14, 2019 at 2:19 PM Smith, Barry F. wrote: > > Mark, > > This is great, we can study these for months. > > 1) At the top of the plots you say SNES but that can't be right, there is > no way it is getting such speed ups for the entire SNES solve since the > Jacobians are on CPUs

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-08-14 Thread Mark Adams via petsc-dev
I can run single, I just can't scale up. But I can use like 1500 processors. On Wed, Aug 14, 2019 at 9:31 PM Smith, Barry F. wrote: > > Oh, are all your integers 8 bytes? Even on one node? > > Once Karl's new middleware is in place we should see about reducing to 4 > bytes on the GPU. > >

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-08-14 Thread Mark Adams via petsc-dev
On Wed, Aug 14, 2019 at 3:37 PM Jed Brown wrote: > Mark Adams via petsc-dev writes: > > > On Wed, Aug 14, 2019 at 2:35 PM Smith, Barry F. > wrote: > > > >> > >> Mark, > >> > >>Would you be able to make one run using single pr

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-08-14 Thread Mark Adams via petsc-dev
> > > > Do you have any applications that specifically want Q2 (versus Q1) > elasticity or have some test problems that would benefit? > > No, I'm just trying to push things.

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-08-14 Thread Mark Adams via petsc-dev
Here are the times for KSPSolve on one node with 2,280,285 equations. These nodes seem to have 42 cores. There are 6 "devices" (GPUs) with 7 cores attached to each device. The anomalous 28 core result could be from only using 4 "devices". I figure I will use 36 cores for now. I should really do this

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-08-14 Thread Mark Adams via petsc-dev
have linear variations in stress. > > On 8/14/19 2:51 PM, Mark Adams via petsc-dev wrote: > > > > > > Do you have any applications that specifically want Q2 (versus Q1) > > elasticity or have some test problems that would benefit? > > > > > > No, I'm just trying to push things. >

[petsc-dev] hypre and CUDA

2019-08-15 Thread Mark Adams via petsc-dev
I have configured with Hypre on SUMMIT, with cuda, and it ran. I'm now trying to verify that it used GPUs (I doubt it). Any ideas on how to verify this? Should I use the cuda vecs and mats, or does Hypre not care? Can I tell hypre not to use GPUs other than configuring a non-cuda PETSc? I'm not

Re: [petsc-dev] hypre and CUDA

2019-08-15 Thread Mark Adams via petsc-dev
e just been timing it. It is very slow. 3x slower than GAMG/CPU and 20x slower than GAMG/GPU. But Hypre's parameters tend to be optimized for 2D and I have not optimized parameters. But clearly it's not using GPUs. > >Barry > > > > On Aug 15, 2019, at 10:47 AM, Mark Adams vi

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-08-14 Thread Mark Adams via petsc-dev
On Wed, Aug 14, 2019 at 2:35 PM Smith, Barry F. wrote: > > Mark, > >Would you be able to make one run using single precision? Just single > everywhere since that is all we support currently? > > Experience in engineering, at least, is that single does not work for FE elasticity. I have tried it

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-08-31 Thread Mark Adams via petsc-dev
On Sat, Aug 31, 2019 at 4:28 PM Smith, Barry F. wrote: > > Any explanation for why the scaling is much better for CPUs and than > GPUs? Is it the "extra" time needed for communication from the GPUs? > The GPU work is well load balanced so it weak scales perfectly. When you put that work in

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-09-01 Thread Mark Adams via petsc-dev
Junchao and Barry, I am using mark/fix-cuda-with-gamg-pintocpu, which is built on barry's robustify branch. Is this in master yet? If so, I'd like to get my branch merged to master, then merge Junchao's branch. Then use it. I think we were waiting for some refactoring from Karl to proceed.

Re: [petsc-dev] Should we add something about GPU support to the user manual?

2019-09-12 Thread Mark Adams via petsc-dev
> > >> And are there any thoughts on where this belongs in the manual? >> > > I think just make another chapter. > > Agreed. That way we can make it very clear that this is WIP, interfaces will change, etc. > Thanks, > > Matt > > >> --Richard >> > > > -- > What most experimenters take for

Re: [petsc-dev] MatPinToCPU

2019-07-27 Thread Mark Adams via petsc-dev
Barry, I fixed CUDA to pin to CPUs correctly for GAMG at least. There are some hacks here that we can work on. I will start testing it tomorrow, but I am pretty sure that I have not regressed. I am hoping that this will fix the numerical problems, which seem to be associated with empty

Re: [petsc-dev] PCREDUNDANT

2019-07-28 Thread Mark Adams via petsc-dev
On Sun, Jul 28, 2019 at 2:54 AM Pierre Jolivet via petsc-dev < petsc-dev@mcs.anl.gov> wrote: > Hello, > I’m facing multiple issues with PCREDUNDANT and MATMPISBAIJ: > 1) > https://www.mcs.anl.gov/petsc/petsc-current/src/mat/impls/sbaij/mpi/mpisbaij.c.html#line3354 > shouldn’t > this be sum != N?

Re: [petsc-dev] MatPinToCPU

2019-07-30 Thread Mark Adams via petsc-dev
On Mon, Jul 29, 2019 at 11:27 PM Smith, Barry F. wrote: > > Thanks. Could you please send the 24 processors with the GPU? > That is in out_cuda_24 >Note the final column of the table gives you the percentage of flops > (not rates, actual operations) on the GPU. For you biggest

[petsc-dev] Is master broken?

2019-07-31 Thread Mark Adams via petsc-dev
I am seeing this when I pull master into my branch: "/autofs/nccs-svm1_home1/adams/petsc/src/mat/impls/dense/seq/cuda/ densecuda.cu" , line 243: error: function call is not allowed in a constant expression #if PETSC_PKG_CUDA_VERSION_GE(10,1,0) and I see that this macro does

Re: [petsc-dev] Is master broken?

2019-08-01 Thread Mark Adams via petsc-dev
arch-linux2-c-debug/include/petscpkg_version.h >> contains PETSC_PKG_CUDA_VERSION_GE and similar macros. If not send >> configure.lo >> >> check what is in arch-linux2-c-debug/include/petscpkg_version.h it >> nothing or broken send configure.lo >> >> >> B

Re: [petsc-dev] MatPinToCPU

2019-07-28 Thread Mark Adams via petsc-dev
This is looking good. I'm not seeing the numerical problems, but I've just hid them by avoiding the GPU on coarse grids. Should I submit a pull request now or test more or wait for Karl? On Sat, Jul 27, 2019 at 7:37 PM Mark Adams wrote: > Barry, I fixed CUDA to pin to CPUs correctly for GAMG

Re: [petsc-dev] Is master broken?

2019-08-01 Thread Mark Adams via petsc-dev
at top of the "bad" source file crashes so in theory everything > is in order check that arch-linux2-c-debug/include/petscpkg_version.h > contains PETSC_PKG_CUDA_VERSION_GE and similar macros. If not send > configure.lo > > > > check what is in arch-linux2-c-debug/i

Re: [petsc-dev] MatPinToCPU

2019-07-27 Thread Mark Adams via petsc-dev
I'm not sure what to do here. The problem is that pinned-to-cpu vectors are calling *VecCUDACopyFromGPU* here. Should I set *x->valid_GPU_array *to something else, like PETSC_OFFLOAD_CPU, in PinToCPU so this block of code is not executed? PetscErrorCode VecGetArray(Vec x,PetscScalar **a) {
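A hedged user-level sketch of the intent: once a vector is pinned to the CPU, VecGetArray() should return the host array without triggering VecCUDACopyFromGPU. VecPinToCPU is the pinning call this thread revolves around; the rest of the routine is an assumption, not the fix being discussed.

  #include <petscvec.h>

  /* Sketch: pin a CUDA vector to the CPU so subsequent array access stays on the host. */
  PetscErrorCode UseOnHostOnly(Vec v)
  {
    PetscErrorCode ierr;
    PetscScalar   *a;

    PetscFunctionBegin;
    ierr = VecPinToCPU(v, PETSC_TRUE);CHKERRQ(ierr); /* keep the work on the CPU */
    ierr = VecGetArray(v, &a);CHKERRQ(ierr);         /* should not copy from the GPU */
    /* ... host-side work on a ... */
    ierr = VecRestoreArray(v, &a);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }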

Re: [petsc-dev] MatPinToCPU

2019-07-27 Thread Mark Adams via petsc-dev
Yea, I just figured out the problem. VecDuplicate_MPICUDA did not call PinToCPU or even copy pinnedtocpu. It just copied ops, so I added and am testing: ierr = VecCreate_MPICUDA_Private(*v,PETSC_TRUE,w->nghost,0);CHKERRQ(ierr); vw = (Vec_MPI*)(*v)->data; ierr =

[petsc-dev] CUDA GAMG coarse grid solver

2019-07-21 Thread Mark Adams via petsc-dev
I am running ex56 with -ex56_dm_vec_type cuda -ex56_dm_mat_type aijcusparse and I see no GPU communication in MatSolve (the serial LU coarse grid solver). I am thinking the dispatch of the CUDA version of this got dropped somehow. I see that this is getting called: PETSC_EXTERN PetscErrorCode
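For context, a hedged sketch of how one would explicitly request the cusparse solver package for the coarse-grid direct solve; whether that package actually provides the factorization being dispatched is exactly what the thread is probing, and ksp here is an assumed coarse-level KSP.

  #include <petscksp.h>

  /* Sketch: ask the coarse-grid LU to come from the cusparse solver package
     instead of the default (CPU) PETSc factorization. */
  PetscErrorCode UseCusparseCoarseLU(KSP ksp)
  {
    PetscErrorCode ierr;
    PC             pc;

    PetscFunctionBegin;
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
    ierr = PCFactorSetMatSolverType(pc, MATSOLVERCUSPARSE);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }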

Re: [petsc-dev] CUDA GAMG coarse grid solver

2019-07-21 Thread Mark Adams via petsc-dev
can try stepping through from MatGetFactor to see what it's doing. Thanks, Mark On Sun, Jul 21, 2019 at 11:14 AM Smith, Barry F. wrote: > > > > On Jul 21, 2019, at 8:55 AM, Mark Adams via petsc-dev < > petsc-dev@mcs.anl.gov> wrote: > > > > I am running

[petsc-dev] MatPinToCPU

2019-07-23 Thread Mark Adams via petsc-dev
I've tried to add pinning the matrix and prolongator to the CPU on coarse grids in GAMG with this: /* pin reduced coarse grid - could do something smarter */ ierr = MatPinToCPU(*a_Amat_crs,PETSC_TRUE);CHKERRQ(ierr); ierr = MatPinToCPU(*a_P_inout,PETSC_TRUE);CHKERRQ(ierr); It does not

Re: [petsc-dev] MatPinToCPU

2019-07-23 Thread Mark Adams via petsc-dev
> > > What are the symptoms of it not working? Does it appear to be still > copying the matrices to the GPU? then running the functions on the GPU? > > The object is dispatching the CUDA mat-vec etc. I suspect the pinning is incompletely done for CUDA (and MPIOpenCL) > matrices. > > Yes, git

Re: [petsc-dev] CUDA GAMG coarse grid solver

2019-07-21 Thread Mark Adams via petsc-dev
wrote: > >> >> >> > On Jul 21, 2019, at 8:55 AM, Mark Adams via petsc-dev < >> petsc-dev@mcs.anl.gov> wrote: >> > >> > I am running ex56 with -ex56_dm_vec_type cuda -ex56_dm_mat_type >> aijcusparse and I see no GPU communication in MatSolv

Re: [petsc-dev] MatMult on Summit

2019-09-21 Thread Mark Adams via petsc-dev
I came up with 36 cores/node for CPU GAMG runs. The memory bus is pretty saturated at that point. On Sat, Sep 21, 2019 at 1:44 AM Zhang, Junchao via petsc-dev < petsc-dev@mcs.anl.gov> wrote: > Here are CPU version results on one node with 24 cores, 42 cores. Click > the links for core layout. >

Re: [petsc-dev] MatMult on Summit

2019-09-21 Thread Mark Adams via petsc-dev
On Sat, Sep 21, 2019 at 12:48 AM Smith, Barry F. via petsc-dev < petsc-dev@mcs.anl.gov> wrote: > > Junchao, > >Very interesting. For completeness please run also 24 and 42 CPUs > without the GPUs. Note that the default layout for CPU cores is not good. > You will want 3 cores on each socket

Re: [petsc-dev] error with karlrupp/fix-cuda-streams

2019-09-28 Thread Mark Adams via petsc-dev
On Sat, Sep 28, 2019 at 12:55 AM Karl Rupp wrote: > Hi Mark, > > > OK, so now the problem has shifted somewhat in that it now manifests > > itself on small cases. It is somewhat random and anecdotal but it does happen on the smaller test problem now. When I try to narrow down when the problem

Re: [petsc-dev] error with karlrupp/fix-cuda-streams

2019-09-28 Thread Mark Adams via petsc-dev
The logic is basically correct because I simply zero out the yy vector (the output vector) and it runs great now. The numerics look fine without CPU pinning. AND, it worked with 1, 2, and 3 GPUs (one node, one socket), but failed with 4 GPUs, which uses the second socket. Strange. On Sat, Sep 28,

Re: [petsc-dev] error with karlrupp/fix-cuda-streams

2019-09-26 Thread Mark Adams via petsc-dev
rds, > Karli > > > > > > > On Wed, Sep 25, 2019 at 5:26 AM Karl Rupp via petsc-dev > > mailto:petsc-dev@mcs.anl.gov>> wrote: > > > > > > > > On 9/25/19 11:12 AM, Mark Adams via petsc-dev wrote: > >

Re: [petsc-dev] getting eigen estimates from GAMG to CHEBY

2019-09-26 Thread Mark Adams via petsc-dev
> > Okay, it seems like they should be stored in GAMG. > Before we stored them in the matrix. When you get to the test in Cheby you don't have caller anymore (GAMG). > Why would the PC type change anything? > Oh, the eigenvalues are the preconditioned ones, the PC (Jacobi) matters but it is

Re: [petsc-dev] getting eigen estimates from GAMG to CHEBY

2019-09-27 Thread Mark Adams via petsc-dev
As I recall we attached the eigen estimates to the matrix. Is that old attach mechanism still used/recommended? Or is there a better way to do this now? Thanks, Mark On Thu, Sep 26, 2019 at 7:45 AM Mark Adams wrote: > > >> Okay, it seems like they should be stored in GAMG. >> > > Before we
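A hedged sketch of one "attach to the matrix" mechanism, using the composed-data API (the same machinery PETSc uses to cache vector norms); the id handling and names are assumptions, not necessarily the mechanism GAMG actually used.

  #include <petscmat.h>

  static PetscInt EmaxId = -1;   /* registered once, shared by set and get */

  /* Sketch: cache an eigenvalue estimate on the matrix so a later consumer
     (e.g. a Chebyshev smoother) can pick it up without recomputing it. */
  PetscErrorCode CacheEmax(Mat A, PetscReal emax)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    if (EmaxId < 0) { ierr = PetscObjectComposedDataRegister(&EmaxId);CHKERRQ(ierr); }
    ierr = PetscObjectComposedDataSetReal((PetscObject)A, EmaxId, emax);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

  PetscErrorCode GetCachedEmax(Mat A, PetscReal *emax, PetscBool *have)
  {
    PetscErrorCode ierr;
    PetscReal      val = 0.0;
    PetscBool      flg = PETSC_FALSE;

    PetscFunctionBegin;
    if (EmaxId >= 0) { ierr = PetscObjectComposedDataGetReal((PetscObject)A, EmaxId, val, flg);CHKERRQ(ierr); }
    *emax = val; *have = flg;
    PetscFunctionReturn(0);
  }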

Re: [petsc-dev] MatMult on Summit

2019-09-24 Thread Mark Adams via petsc-dev
Yes, please, thank you. On Tue, Sep 24, 2019 at 1:46 AM Mills, Richard Tran via petsc-dev < petsc-dev@mcs.anl.gov> wrote: > Karl, that would be fantastic. Much obliged! > > --Richard > > On 9/23/19 8:09 PM, Karl Rupp wrote: > > Hi, > > `git grep cudaStreamCreate` reports that vectors, matrices

Re: [petsc-dev] MatMult on Summit

2019-09-23 Thread Mark Adams via petsc-dev
Note, the numerical problems that we have look a lot like a race condition of some sort. Happens with empty processors and goes away under cuda-memcheck (a valgrind-like thing). I did try adding WaitForGPU(), but maybe I didn't do it right or there are other synchronization mechanisms. On Mon, Sep

[petsc-dev] CUDA STREAMS

2019-10-02 Thread Mark Adams via petsc-dev
I found a CUDAVersion.cu of STREAMS and tried to build it. I got it to compile manually with: nvcc -o CUDAVersion.o -ccbin pgc++ -I/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/pgi-19.4/spectrum-mpi-10.3.0.1-20190611-4ymaahbai7ehhw4rves5jjiwon2laz3a/include

Re: [petsc-dev] Why no SpGEMM support in AIJCUSPARSE and AIJVIENNACL?

2019-10-02 Thread Mark Adams via petsc-dev
FWIW, I've heard that CUSPARSE is going to provide integer matrix-matrix products for indexing applications, and that it should be easy to extend that to double, etc. On Wed, Oct 2, 2019 at 6:00 PM Mills, Richard Tran via petsc-dev < petsc-dev@mcs.anl.gov> wrote: > Fellow PETSc developers, > > I

Re: [petsc-dev] error with karlrupp/fix-cuda-streams

2019-09-25 Thread Mark Adams via petsc-dev
> > If jsrun is not functional from configure, alternatives are > --with-mpiexec=/bin/true or --with-batch=1 > > --with-mpiexec=/bin/true seems to be working. Thanks, Mark > Satish >

Re: [petsc-dev] error with karlrupp/fix-cuda-streams

2019-09-25 Thread Mark Adams via petsc-dev
h/adams$ jsrun -g 1 -n 1 printenv GIT_PS1_SHOWDIRTYSTATE=1 XDG_SESSION_ID=494 SHELL=/bin/bash HISTSIZE=100 PETSC_ARCH=arch-summit-opt64-pgi-cuda SSH_CLIENT=160.91.202.152 48626 22 LC_ALL= USER=adams ... > > Satish > > > On Wed, 25 Sep 2019, Mark Adams via petsc-dev wrote: > > >

Re: [petsc-dev] error with karlrupp/fix-cuda-streams

2019-09-25 Thread Mark Adams via petsc-dev
> deal with in detached mode - which makes this obvious] > I got this <> and "fixed" it by deleting the branch and repulling it. I guess I needed to fetch also. Mark > > Satish > > > On Wed, 25 Sep 2019, Mark Adams via petsc-dev wrote: > > > I will test

Re: [petsc-dev] [petsc-maint] running CUDA on SUMMIT

2019-07-09 Thread Mark Adams via petsc-dev
I am stumped with this GPU bug(s). Maybe someone has an idea. I did find a bug in the cuda transpose mat-vec that cuda-memcheck detected, but I still have differences between the GPU and CPU transpose mat-vec. I've got it down to a very simple test: bicg/none on a tiny mesh with two processors.
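A hedged sketch of the kind of check that narrows down a CPU/GPU difference in the transpose mat-vec: apply both matrices and look at the norm of the difference. Acpu, Agpu, and x are placeholders for two assembled copies of the same matrix and a compatible input vector; this is a debugging aid, not the test described above.

  #include <petscmat.h>

  /* Sketch: compare MatMultTranspose() from a plain AIJ matrix and a CUDA copy. */
  PetscErrorCode CompareTransposeMult(Mat Acpu, Mat Agpu, Vec x)
  {
    PetscErrorCode ierr;
    Vec            ycpu, ygpu;
    PetscReal      norm;

    PetscFunctionBegin;
    ierr = MatCreateVecs(Acpu, &ycpu, NULL);CHKERRQ(ierr);
    ierr = VecDuplicate(ycpu, &ygpu);CHKERRQ(ierr);
    ierr = MatMultTranspose(Acpu, x, ycpu);CHKERRQ(ierr);
    ierr = MatMultTranspose(Agpu, x, ygpu);CHKERRQ(ierr);
    ierr = VecAXPY(ygpu, -1.0, ycpu);CHKERRQ(ierr);  /* ygpu <- ygpu - ycpu */
    ierr = VecNorm(ygpu, NORM_INFINITY, &norm);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "max |GPU - CPU| = %g\n", (double)norm);CHKERRQ(ierr);
    ierr = VecDestroy(&ycpu);CHKERRQ(ierr);
    ierr = VecDestroy(&ygpu);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }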

Re: [petsc-dev] Parmetis bug

2019-11-10 Thread Mark Adams via petsc-dev
Fande, It looks to me like this branch in ParMetis must be taken to trigger this error. First *Match_SHEM* and then CreateCoarseGraphNoMask. /* determine which matching scheme you will use */ switch (ctrl->ctype) { case METIS_CTYPE_RM: Match_RM(ctrl, graph); break;

Re: [petsc-dev] GPU counters

2019-11-06 Thread Mark Adams via petsc-dev
t; --Junchao Zhang > > > On Wed, Nov 6, 2019 at 8:44 AM Mark Adams via petsc-dev < > petsc-dev@mcs.anl.gov> wrote: > >> I am puzzled. >> >> I am running AMGx now, and I am getting flop counts/rates. How does that >> happen? Does PETSc use hardware counters to get flops? >> >

[petsc-dev] GPU counters

2019-11-06 Thread Mark Adams via petsc-dev
I am puzzled. I am running AMGx now, and I am getting flop counts/rates. How does that happen? Does PETSc use hardware counters to get flops?
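For what it is worth, PETSc's flop counts come from explicit logging calls in the code rather than hardware counters; a hedged sketch of how a wrapped external call would report flops, with the event name and flop estimate as pure assumptions.

  #include <petscsys.h>

  /* Sketch: bracket an external solver call with a logged event and report an
     estimated flop count; the log summary sums whatever PetscLogFlops() is told. */
  PetscErrorCode LoggedExternalSolve(void)
  {
    PetscErrorCode ierr;
    PetscClassId   classid;
    PetscLogEvent  event;

    PetscFunctionBegin;
    ierr = PetscClassIdRegister("ExtSolver", &classid);CHKERRQ(ierr);
    ierr = PetscLogEventRegister("ExtSolve", classid, &event);CHKERRQ(ierr);
    ierr = PetscLogEventBegin(event, 0, 0, 0, 0);CHKERRQ(ierr);
    /* ... call the external solver here ... */
    ierr = PetscLogFlops(1.0e9);CHKERRQ(ierr); /* made-up estimate of the work done */
    ierr = PetscLogEventEnd(event, 0, 0, 0, 0);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }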

Re: [petsc-dev] Parmetis bug

2019-11-09 Thread Mark Adams via petsc-dev
On Sat, Nov 9, 2019 at 10:51 PM Fande Kong wrote: > Hi Mark, > > Thanks for reporting this bug. I was surprised because we have sufficient > heavy tests in moose using partition weights and do not have any issue so > far. > > I have been pounding on this code with elasticity and have not seen

Re: [petsc-dev] Parmetis bug

2019-11-10 Thread Mark Adams via petsc-dev
Fande, the problem is that k below seems to index beyond the end of htable, resulting in a crazy m and a segv on the last line below. I don't have a clean valgrind machine right now; that is what is needed if no one has seen anything like this. I could add a test in a MR and get the pipeline to do it. void
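A hedged illustration of the kind of defensive check that localizes the out-of-range index described above; htable, its size, and k are taken from the description, while the assertion wrapper is purely a diagnostic sketch.

  #include <assert.h>

  /* Illustrative only: if k can run past the end of htable (size htsize), an
     assertion like this fires where the bad index is produced, before the
     bogus m causes a segv further down. */
  static int lookup_checked(const int *htable, int htsize, int k)
  {
    assert(k >= 0 && k < htsize);
    return htable[k];   /* the value that becomes the "crazy m" */
  }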

Re: [petsc-dev] ksp_error_if_not_converged in multilevel solvers

2019-10-20 Thread Mark Adams via petsc-dev
> If one just wants to run a fixed number of iterations, not checking for > convergence, why would one set ksp->errorifnotconverged to true? > > Good question. I can see not worrying too much about convergence on the coarse grids, but to not allow it ... and now that I think about it, it seems
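A hedged sketch of running a smoother for a fixed number of iterations with no convergence check (so there is nothing for errorifnotconverged to trip on); KSPConvergedSkip is the stock always-run-max_it test, and the iteration count is an arbitrary choice.

  #include <petscksp.h>

  /* Sketch: make ksp do exactly 4 iterations and skip the convergence test,
     the usual setup for a fixed-iteration coarse-grid smoother. */
  PetscErrorCode FixedIterationSmoother(KSP ksp)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = KSPSetConvergenceTest(ksp, KSPConvergedSkip, NULL, NULL);CHKERRQ(ierr);
    ierr = KSPSetTolerances(ksp, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, 4);CHKERRQ(ierr);
    ierr = KSPSetNormType(ksp, KSP_NORM_NONE);CHKERRQ(ierr); /* no residual norms needed */
    PetscFunctionReturn(0);
  }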

[petsc-dev] SuperLU + GPUs

2019-10-18 Thread Mark Adams via petsc-dev
What is the status of supporting SuperLU_DIST with GPUs? Thanks, Mark

[petsc-dev] getting eigen estimates from GAMG to CHEBY

2019-09-25 Thread Mark Adams via petsc-dev
It's been a few years since we lost the ability to cache the eigen estimates that smoothed aggregation computes for Chebyshev smoothers. I'd like to see if we can bring this back. This is slightly (IMO) complicated by the fact that the smoother PC may not be Jacobi, but I think it is close enough
