Re: [petsc-users] Kokkos Interface for PETSc

2022-02-17 Thread Richard Tran Mills via petsc-users

Hi Philip,

Sorry to be a bit late in my reply. Jed has explained the gist of what's 
involved in using the Kokkos/Kokkos-Kernels back-end for the PETSc 
solves. Depending on exactly how Xolotl creates its vectors, though, 
there may be a bit of work required to ensure that the command-line 
options specifying the matrix and GPU types get applied to the right 
objects, and that non-GPU types are not being hardcoded somewhere (by a 
call like "DMSetMatType(dm,MATAIJ)").


In addition to looking at the -log_view output, since Xolotl uses TS you 
can specify "-ts_view" and look at the output that describes the solver 
hierarchy that Xolotl sets up. If matrix types are being set correctly, 
you'll see things like


  Mat Object: 1 MPI processes
    type: seqaijkokkos

(I note that I've also sent a related message about getting Xolotl 
working with Kokkos back-ends on Summit to you, Sophie, and Phil in 
reply to an old thread about this.)


Were you also asking about how to use Kokkos for PETSc matrix assembly, 
or is that a question for later?


Cheers,
Richard

On 2/15/22 09:07, Satish Balay via petsc-users wrote:

Also - perhaps the following info might be useful

Satish



balay@sb /home/balay/petsc (main=)
$ git grep -l download-kokkos-kernels config/examples
config/examples/arch-ci-freebsd-cxx-cmplx-pkgs-dbg.py
config/examples/arch-ci-linux-cuda-double.py
config/examples/arch-ci-linux-gcc-ifc-cmplx.py
config/examples/arch-ci-linux-hip-double.py
config/examples/arch-ci-linux-pkgs-dbg-ftn-interfaces.py
config/examples/arch-ci-linux-pkgs-valgrind.py
config/examples/arch-ci-osx-cxx-pkgs-opt.py
config/examples/arch-nvhpc.py
config/examples/arch-olcf-crusher.py
config/examples/arch-olcf-spock.py
balay@sb /home/balay/petsc (main=)
$ git grep -l "requires:.*kokkos_kernels"
src/ksp/ksp/tests/ex3.c
src/ksp/ksp/tests/ex43.c
src/ksp/ksp/tests/ex60.c
src/ksp/ksp/tutorials/ex7.c
src/mat/tests/ex123.c
src/mat/tests/ex132.c
src/mat/tests/ex2.c
src/mat/tests/ex250.c
src/mat/tests/ex251.c
src/mat/tests/ex252.c
src/mat/tests/ex254.c
src/mat/tests/ex5.c
src/mat/tests/ex62.c
src/mat/tutorials/ex5k.kokkos.cxx
src/snes/tests/ex13.c
src/snes/tutorials/ex13.c
src/snes/tutorials/ex3k.kokkos.cxx
src/snes/tutorials/ex56.c
src/ts/utils/dmplexlandau/tutorials/ex1.c
src/ts/utils/dmplexlandau/tutorials/ex1f90.F90
src/ts/utils/dmplexlandau/tutorials/ex2.c
src/vec/vec/tests/ex21.c
src/vec/vec/tests/ex22.c
src/vec/vec/tests/ex23.c
src/vec/vec/tests/ex28.c
src/vec/vec/tests/ex34.c
src/vec/vec/tests/ex37.c
src/vec/vec/tests/ex38.c
src/vec/vec/tests/ex4.c
src/vec/vec/tests/ex43.c
src/vec/vec/tests/ex60.c
src/vec/vec/tutorials/ex1.c
balay@sb /home/balay/petsc (main=)
$

On Tue, 15 Feb 2022, Satish Balay via petsc-users wrote:


Also - best to use petsc repo - 'main' branch.

And for install on crusher - check config/examples/arch-olcf-crusher.py

Satish

On Tue, 15 Feb 2022, Jed Brown wrote:


We need to make these docs more explicit, but the short answer is configure with 
--download-kokkos --download-kokkos-kernels and run almost any example with 
-dm_mat_type aijkokkos -dm_vec_type kokkos. If you run with -log_view, you should 
see that all the flops take place on the device and there are few host->device 
transfers. Message packing is done on the device and it'll use GPU-aware MPI. 
There are a few examples of residual evaluation and matrix assembly on the device 
using Kokkos. You can also see libCEED examples for assembly on the device into 
Kokkos matrices and vectors without touching host memory.

"Fackler, Philip via petsc-users"  writes:


We're intending to transition the Xolotl interfaces with PETSc.

I am hoping someone can point us to some documentation (and examples) for 
using PETSc's Kokkos-based interface. If this does not yet exist, then perhaps 
some slides (like the ones Richard Mills showed at the NE-SciDAC all-hands 
meeting) showing some examples could get us started.

Thanks for any help that can be provided,

Philip Fackler
Research Software Engineer, Application Engineering Group
Advanced Computing Systems Research Section
Computer Science and Mathematics Division
Oak Ridge National Laboratory


Re: [petsc-users] config error with hypre on Summit

2018-04-29 Thread Richard Tran Mills
Mark,

I asked OLCF if I could get access to Summit so that I could ensure that
the latest PETSc builds and works on there and was told "NO". If you are
associated with an allocation on there, though, then I could get on and
poke around if the PI is willing to approve me on their project. (I already
have an OLCF account, just no access to pre-production Summit as I'm not
associated with an allocation on there.) Should we ask the PI of your
allocation to get me an account on there?

Pat: I see you are cc'ed on this message thread. You are doing a terrible job
of being "retired"! =)

Cheers,
Richard

On Sun, Apr 29, 2018 at 11:24 AM, Matthew Knepley  wrote:

> On Sun, Apr 29, 2018 at 11:38 AM, Mark Adams  wrote:
>
>> I'm getting an error configure hypre on the new IBM at ORNL, Summit.
>>
>
> I can spot your error right away. You are trying to build Hypre ;)
>
> It looks like autoconf is failing for this system, since the config.guess
> script is old. This is
> one for the sysadmin I think.
>
>   Thanks,
>
>  Matt
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/ 
>


Re: [petsc-users] How to turn off preconditioner in PETSC?

2018-03-21 Thread Richard Tran Mills
On Wed, Mar 21, 2018 at 6:12 AM, Matthew Knepley  wrote:

> On Wed, Mar 21, 2018 at 9:07 AM, 我  wrote:
>
>> Thanks for your reply! You mean a preconditioner is always a necessary
>> part of the linear iterative methods in PETSc?
>>
>
> No. As you saw, you can use no preconditioner. The bad convergence has
> nothing to do with PETSc. It is a mathematical fact. All
> iterative methods behave this way.
>
>
>> And which one is the default preconditioner in PETSc?
>>
>
> ILU(0).
>

Let me also point out: If you want to know the details of which solvers and
preconditioners are being used, you can run with "-ksp_view" and PETSc will
print exactly what it is using. If you are using SNES to solve nonlinear
problems, then you can also do "-snes_view".
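
For reference, here is a minimal sketch (hypothetical code, assuming a 
standard KSP setup) showing the call that makes options such as 
"-pc_type none", "-ksp_type cg", and "-ksp_view" take effect:

  #include <petscksp.h>

  /* Sketch: a solve that honors -ksp_type, -pc_type, -ksp_view, etc.
     The key is calling KSPSetFromOptions() before KSPSolve(). */
  PetscErrorCode SolveWithOptions(Mat A, Vec b, Vec x)
  {
    KSP            ksp;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* picks up -pc_type none, -ksp_view, ... */
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
    ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }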

--Richard


>
>
>> I want to compare them in order to illustrate that PCHYPRE is the best
>> one for my problem.
>>
>
> Then the right thing to do is read some papers and reproduce what other
> people have done, and show that Hypre is better than that.
>
>
>> If I want to get the preconditioned matrix (e.g. PAx=Pb, and I want to
>> get PA), is there a function in PETSc?
>>
>
> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPComputeExplicitOperator.html
>
> It is extremely expensive and should only be used for very small problems.
> The whole idea of iterative methods is that you do
> NOT compute this operator explicitly.
>
>Matt
>
>
>> Thanks again!
>> Daye
>>
>>
>>
>>
>> At 2018-03-21 18:45:18, "Matthew Knepley"  wrote:
>>
>> On Wed, Mar 21, 2018 at 3:35 AM, 我  wrote:
>>
>>> Hi,
>>> I want to compare the time cost with and without a preconditioner in
>>> PETSc, but I didn't know how to turn off the preconditioner. If I choose
>>> PCNONE, the solution does not even converge.
>>>
>>
>> That is how you turn off a preconditioner, -pc_type none. Without a
>> preconditioner, almost nothing converges. You can't have it both ways.
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>> If I do not declare PC at the beginning of my program, will PETSc choose
>>> a default preconditioner? I just want to turn it off. Any suggestions?
>>> Thank you very much!
>>> Daye
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/ 
>>
>>
>>
>>
>>
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/ 
>


Re: [petsc-users] questions about vectorization

2017-11-14 Thread Richard Tran Mills
Yes, that's worth a try. Xiangdong, if you want to employ the MKL
implementations for BAIJ MatMult() and friends, you can do so by
configuring petsc-master with a recent version of MKL and then using the
option "-mat_type baijmkl" (on the command line or set in your
PETSC_OPTIONS environment variable).

Note that the above requires a version of MKL that is recent enough to have
the sparse inspector-executor routines. MKL is now free, so I recommend
installing the latest version.

(You can also try using the sparse MKL routines with AIJ format matrices by
using either "-mat_type aijmkl" or "-mat_seqaij_type seqaijmkl". This will
use MKL for MatMult()-type operations and some sparse matrix-matrix
products.)
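
As a rough illustration (a hypothetical snippet; the type names above assume 
a petsc-master build configured with a recent MKL), the matrix just needs to 
consult the options database for "-mat_type" to be honored:

  #include <petscmat.h>

  /* Sketch: create a matrix whose type can be chosen at run time, e.g. with
     "-mat_type baijmkl" or "-mat_type aijmkl" as described above. */
  PetscErrorCode CreateRuntimeTypedMat(MPI_Comm comm, PetscInt m, PetscInt n, Mat *A)
  {
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatCreate(comm, A);CHKERRQ(ierr);
    ierr = MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, m, n);CHKERRQ(ierr);
    ierr = MatSetFromOptions(*A);CHKERRQ(ierr); /* honors -mat_type from the
                                                   command line or PETSC_OPTIONS */
    ierr = MatSetUp(*A);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }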

Best regards,
Richard




On Tue, Nov 14, 2017 at 2:42 PM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote:

>
>   Use MKL versions of block formats?
>
> > On Nov 14, 2017, at 4:40 PM, Richard Tran Mills <rtmi...@anl.gov> wrote:
> >
> > On Tue, Nov 14, 2017 at 12:13 PM, Zhang, Hong <hongzh...@anl.gov> wrote:
> >
> >
> >> On Nov 13, 2017, at 10:49 PM, Xiangdong <epsco...@gmail.com> wrote:
> >>
> >> 1) How about the vectorization of BAIJ format?
> >
> > BAIJ kernels are optimized with manual unrolling, but not with AVX
> intrinsics. So the vectorization relies on the compiler's ability.
> > It may or may not get vectorized depending on the compiler's
> optimization decisions. But vectorization is not essential for the
> performance of most BAIJ kernels.
> >
> > I know that this has come up in previous discussions, but I'm guessing
> that the manual unrolling actually impedes the ability of many modern
> compilers to optimize the BAIJ calculations. I suppose we ought to have a
> switch to enable or disable the use of the unrolled versions? (And, further
> down the road, some sort of performance model to tell us what the setting
> for the switch should be...)
> >
> > --Richard
> >
> >
> >> If the block size s is 2 or 4, would it be ideal for AVXs? Do I need to
> do anything special (more than AVX flag) for the compiler to vectorize it?
> >
> > In double precision, 4 would be good for AVX/AVX2, and 8 would be ideal
> for AVX512. But other block sizes would make vectorization less profitable
> because of the remainders.
> >
> >> 2) Could you please update the linear solver table to label the
> preconditioners/solvers compatible with ELL format?
> >> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html
> >
> > This is still a work in progress. The easiest thing to do would be to
> use ELL for the Jacobian matrix and other formats (e.g. AIJ) for the
> preconditioners.
> > Then you would not need to worry about which preconditioners are
> compatible. An example can be found at ts/examples/tutorials/advection-diffusion-reaction/ex5adj.c.
> > For preconditioners such as block jacobi and mg (with bjacobi or with
> sor), you can use ELL for both the preconditioner and the Jacobian,
> > and expect a considerable gain since MatMult is the dominating operation.
> >
> > The makefile for ex5adj includes a few use cases that demonstrate how
> ELL plays with various preconditioners.
> >
> > Hong (Mr.)
> >
> >> Thank you.
> >>
> >> Xiangdong
> >>
> >> On Mon, Nov 13, 2017 at 11:32 AM, Zhang, Hong <hongzh...@anl.gov>
> wrote:
> >> Most operations in PETSc would not benefit much from vectorization
> since they are memory-bounded. But this does not discourage you from
> compiling PETSc with AVX2/AVX512. We have added a new matrix format
> (currently named ELL, but will be changed to SELL shortly) that can make
> MatMult ~2X faster than the AIJ format. The MatMult kernel is
> hand-optimized with AVX intrinsics. It works on any Intel processors that
> support AVX or AVX2 or AVX512, e.g. Haswell, Broadwell, Xeon Phi, Skylake.
> On the other hand, we have been optimizing the AIJ MatMult kernel for these
> architectures as well. And one has to use AVX compiler flags in order to
> take advantage of the optimized kernels and the new matrix format.
> >>
> >> Hong (Mr.)
> >>
> >> > On Nov 12, 2017, at 10:35 PM, Xiangdong <epsco...@gmail.com> wrote:
> >> >
> >> > Hello everyone,
> >> >
> >> > Can someone comment on the vectorization of PETSc? For example, for
> the MatMult function, will it perform better or run faster if it is
> compiled with avx2 or avx512?
> >> >
> >> > Thank you.
> >> >
> >> > Best,
> >> > Xiangdong
> >>
> >>
> >
> >
>
>


Re: [petsc-users] questions about vectorization

2017-11-14 Thread Richard Tran Mills
On Tue, Nov 14, 2017 at 12:13 PM, Zhang, Hong  wrote:

>
>
> On Nov 13, 2017, at 10:49 PM, Xiangdong  wrote:
>
> 1) How about the vectorization of BAIJ format?
>
>
> BAIJ kernels are optimized with manual unrolling, but not with AVX
> intrinsics. So the vectorization relies on the compiler's ability.
> It may or may not get vectorized depending on the compiler's optimization
> decisions. But vectorization is not essential for the performance of most
> BAIJ kernels.
>

I know that this has come up in previous discussions, but I'm guessing that
the manual unrolling actually impedes the ability of many modern compilers
to optimize the BAIJ calculations. I suppose we ought to have a switch to
enable or disable the use of the unrolled versions? (And, further down the
road, some sort of performance model to tell us what the setting for the
switch should be...)

--Richard


> If the block size s is 2 or 4, would it be ideal for AVXs? Do I need to do
> anything special (more than AVX flag) for the compiler to vectorize it?
>
>
> In double precision, 4 would be good for AVX/AVX2, and 8 would be ideal
> for AVX512. But other block sizes would make vectorization less profitable
> because of the remainders.
>
> 2) Could you please update the linear solver table to label the
> preconditioners/solvers compatible with ELL format?
> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html
>
>
> This is still a work in progress. The easiest thing to do would be to
> use ELL for the Jacobian matrix and other formats (e.g. AIJ) for the
> preconditioners.
> Then you would not need to worry about which preconditioners are
> compatible. An example can be found at ts/examples/tutorials/advection-diffusion-reaction/ex5adj.c.
> For preconditioners such as block jacobi and mg (with bjacobi or with
> sor), you can use ELL for both the preconditioner and the Jacobian,
> and expect a considerable gain since MatMult is the dominating operation.
>
> The makefile for ex5adj includes a few use cases that demonstrate how ELL
> plays with various preconditioners.
>
> Hong (Mr.)
>
> Thank you.
>
> Xiangdong
>
> On Mon, Nov 13, 2017 at 11:32 AM, Zhang, Hong  wrote:
>
>> Most operations in PETSc would not benefit much from vectorization since
>> they are memory-bounded. But this does not discourage you from compiling
>> PETSc with AVX2/AVX512. We have added a new matrix format (currently named
>> ELL, but will be changed to SELL shortly) that can make MatMult ~2X faster
>> than the AIJ format. The MatMult kernel is hand-optimized with AVX
>> intrinsics. It works on any Intel processors that support AVX or AVX2 or
>> AVX512, e.g. Haswell, Broadwell, Xeon Phi, Skylake. On the other hand, we
>> have been optimizing the AIJ MatMult kernel for these architectures as
>> well. And one has to use AVX compiler flags in order to take advantage of
>> the optimized kernels and the new matrix format.
>>
>> Hong (Mr.)
>>
>> > On Nov 12, 2017, at 10:35 PM, Xiangdong  wrote:
>> >
>> > Hello everyone,
>> >
>> > Can someone comment on the vectorization of PETSc? For example, for the
>> MatMult function, will it perform better or run faster if it is compiled
>> with avx2 or avx512?
>> >
>> > Thank you.
>> >
>> > Best,
>> > Xiangdong
>>
>>
>
>


Re: [petsc-users] questions about vectorization

2017-11-14 Thread Richard Tran Mills
Xiangdong,

If you are running on an Intel-based system with support for recent
instruction sets like AVX2 or AVX-512, and you have access to the Intel
compilers, then telling the compiler to target these instruction sets
(e.g., "-xCORE-AVX2" or "-xMIC-AVX512") will probably give you some
noticeable gain in performance. It will be much less than you would expect
from something very CPU-bound like xGEMM code, but, in my experience, it
will be noticeable (remember, even if you have a memory-bound code, your
code's performance won't be bound by the memory subsystem 100% of the
time). I don't know how well the non-Intel compilers are able to
auto-vectorize, so your mileage may vary for those. As Hong has pointed
out, there are some places in the PETSc source in which we have introduced
code using AVX/AVX512 intrinsics. For those codes, you should see benefit
with any compiler that supports these intrinsics, as one is not relying on
the auto-vectorizer then.

Best regards,
Richard

On Mon, Nov 13, 2017 at 8:32 AM, Zhang, Hong  wrote:

> Most operations in PETSc would not benefit much from vectorization since
> they are memory-bounded. But this does not discourage you from compiling
> PETSc with AVX2/AVX512. We have added a new matrix format (currently named
> ELL, but will be changed to SELL shortly) that can make MatMult ~2X faster
> than the AIJ format. The MatMult kernel is hand-optimized with AVX
> intrinsics. It works on any Intel processors that support AVX or AVX2 or
> AVX512, e.g. Haswell, Broadwell, Xeon Phi, Skylake. On the other hand, we
> have been optimizing the AIJ MatMult kernel for these architectures as
> well. And one has to use AVX compiler flags in order to take advantage of
> the optimized kernels and the new matrix format.
>
> Hong (Mr.)
>
> > On Nov 12, 2017, at 10:35 PM, Xiangdong  wrote:
> >
> > Hello everyone,
> >
> > Can someone comment on the vectorization of PETSc? For example, for the
> MatMult function, will it perform better or run faster if it is compiled
> with avx2 or avx512?
> >
> > Thank you.
> >
> > Best,
> > Xiangdong
>
>


Re: [petsc-users] Can not configure PETSc-master with clang-3.9

2017-10-16 Thread Richard Tran Mills
Fande,

Did you remember to agree to the XCode license after your upgrade, if you
did an XCode upgrade? You have to do the license agreement again, otherwise
the compilers don't work at all. Apologies if this seems like a silly thing
to ask, but this has caused me a few minutes of confusion before.

--Richard

On Mon, Oct 16, 2017 at 9:52 AM, Jed Brown  wrote:

> "Kong, Fande"  writes:
>
> > Hi All,
> >
> > I just upgraded  MAC OS, and also updated all other related packages.
> Now
> > I can not configure PETSc-master any more.
>
> Your compiler paths are broken.
>
> /var/folders/6q/y12qpzw12dg5qx5x96dd5_bhtzr4_y/T/petsc-mFgio7/config.setCompilers/conftest.c:3:10:
> fatal error: 'stdlib.h' file not found
> #include <stdlib.h>
>  ^
> 1 error generated.
>


Re: [petsc-users] Logo?

2017-08-25 Thread Richard Tran Mills
Maybe it's time to make another attempt. Some years ago the PFLOTRAN
developers decided we needed a new logo and Glenn Hammond came up with a
simple one that I think looks quite nice (see www.pflotran.org). It just
has block lettering and some minimal elements like a water table curve that
convey a sense of what PFLOTRAN does. Perhaps we can think up something
simple like this for PETSc?

--Richard

On Fri, Aug 25, 2017 at 7:12 AM, Jed Brown  wrote:

> Haha, there's a reason nobody uses it.  ;-)
>
> Lukas van de Wiel  writes:
>
> > Well, I would rather not use a logo at all than use that design, to be
> > honest...
> >
> > On Fri, Aug 25, 2017 at 3:05 PM, Satish Balay  wrote:
> >
> >> Well we had this logo created many years ago - but I don't remember the
> >> last time we used it..
> >>
> >> https://bitbucket.org/petsc/petsc/src/15785cf8cfc19332123bc897ee7bf170d1911f0c/src/docs/tex/pictures/petsc_color_logo.jpg?at=master=file-view-default
> >>
> >> Satish
> >>
> >> On Fri, 25 Aug 2017, Dave May wrote:
> >>
> >> > On Fri, 25 Aug 2017 at 03:43, Mohammad Mirzadeh 
> >> wrote:
> >> >
> >> > > Hey folks,
> >> > >
> >> > > Does PETSc have any sort of official logo?
> >> > >
> >> >
> >> > The answer is no
> >> >
> >> > > I often like to include these in my acknowledgment slides but can't
> >> > > find any for PETSc. I'd appreciate it if you could point me to it if
> >> > > there is one.
> >> > >
> >> >
> >> > I take a screen shot of the red text on the lovely cornflower blue
> >> > backdrop from the website and use that as the logo.
> >> >
> >> > petsc was created before catchy logos were fashionable. :) (and
> >> > seemingly developers devote time to coding and not artistic endeavors)
> >> >
> >> >
> >> > Thanks,
> >> >   Dave
> >> >
> >> >
> >> > > Mohammad
> >> > >
> >> >
> >>
> >>
>


Re: [petsc-users] PETSC profiling on 1536 cores

2017-06-16 Thread Richard Tran Mills
Pietro,

On Fri, Jun 16, 2017 at 1:29 AM, Pietro Incardona 
wrote:

> [...]
>
> I will now try other examples more in the direction of having only CG and
> no preconditioners, to see what happens in scalability and understand what
> I am doing wrong. But in the meanwhile I have a second question: could the
> fact that I compiled PETSc with debugging = 0 make the profiler numbers
> unreliable?
>
Building with debugging = 0 is the right thing to do if you want to do any
performance studies. Building with debugging turned *on* is what might make
the profile numbers unreliable.

--Richard

>
> Thanks in advance
>
> Pietro Incardona
>
>
>
> --
> *Da:* Barry Smith 
> *Inviato:* giovedì 15 giugno 2017 23:16:50
> *A:* Pietro Incardona
> *Cc:* petsc-users@mcs.anl.gov
> *Oggetto:* Re: [petsc-users] PETSC profiling on 1536 cores
>
>
>   Please send the complete -log_view files as attachments. Both 1536 and
> 48 cores. The mailers mess up the ASCII formatting in text so I can't make
> heads or tails out of the result.
>
>   What is the MPI being used and what kind of interconnect does the
> network have? Also is the MPI specific to that interconnect or just
> something compiled off the web?
>
>
>   Barry
>
> > On Jun 15, 2017, at 4:09 PM, Pietro Incardona 
> wrote:
> >
> > Dear All
> >
> > I tried PETSc version 3.6.5 to solve a linear system with 256 000 000
> unknowns. The equation is a finite-difference Poisson equation.
> >
> > I am using Conjugate gradient (the matrix is symmetric) with no
> preconditioner. Visualizing the solution is reasonable.
> > Unfortunately the Conjugate-Gradient does not scale at all and I am
> extremely concerned about this problem, in particular about the profiling
> numbers.
> > Looking at the profiler it seems that
> >
> > 1536 cores = 24 cores x 64
> >
> > VecScatterBegin  348 1.0 2.3975e-01 1.8 0.00e+00 0.0 7.7e+06 3.1e+04
> 0.0e+00  0  0 85 99  0   0  0 85 99  0 0
> > VecScatterEnd348 1.0 2.8680e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  1  0  0  0  0   1  0  0  0  0 0
> > MatMult  348 1.0 4.1088e+00 1.4 8.18e+08 1.3 7.7e+06 3.1e+04
> 0.0e+00  2 52 85 99  0   2 52 85 99  0 281866
> >
> > I was expecting this part to be the most expensive; it takes around 4
> seconds in total, which sounds reasonable.
> >
> > Unfortunately
> >
> > on 1536 cores = 24 cores x 64
> >
> > VecTDot  696 1.0 3.4442e+01 1.4 2.52e+08 1.3 0.0e+00 0.0e+00
> 7.0e+02 12 16  0  0 65  12 16  0  0 65 10346
> > VecNorm  349 1.0 1.1101e+02 1.1 1.26e+08 1.3 0.0e+00 0.0e+00
> 3.5e+02 46  8  0  0 33  46  8  0  0 33  1610
> > VecAXPY  696 1.0 8.3134e+01 1.1 2.52e+08 1.3 0.0e+00 0.0e+00
> 0.0e+00 34 16  0  0  0  34 16  0  0  0  4286
> >
> > These take over 228 seconds. Considering that some tests on the cluster
> show no problem with MPI_Reduce, I do not understand how these numbers are
> possible.
> >
> >
> > // I also attach the profiling of the inversion on 48 cores //
> >
> > VecTDot  696 1.0 1.4684e+01 1.3 3.92e+09 1.1 0.0e+00 0.0e+00
> 7.0e+02  6 16  0  0 65   6 16  0  0 65 24269
> > VecNorm  349 1.0 4.9612e+01 1.3 1.96e+09 1.1 0.0e+00 0.0e+00
> 3.5e+02 22  8  0  0 33  22  8  0  0 33  3602
> > VecCopy  351 1.0 8.8359e+00 7.7 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  2  0  0  0  0   2  0  0  0  0 0
> > VecSet 2 1.0 1.6177e-02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
> > VecAXPY  696 1.0 8.8559e+01 1.1 3.92e+09 1.1 0.0e+00 0.0e+00
> 0.0e+00 42 16  0  0  0  42 16  0  0  0  4024
> > VecAYPX  347 1.0 4.6790e+00 1.2 1.95e+09 1.1 0.0e+00 0.0e+00
> 0.0e+00  2  8  0  0  0   2  8  0  0  0 37970
> > VecAssemblyBegin   2 1.0 5.0942e-02 2.9 0.00e+00 0.0 0.0e+00 0.0e+00
> 6.0e+00  0  0  0  0  1   0  0  0  0  1 0
> > VecAssemblyEnd 2 1.0 1.9073e-05 6.7 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
> > VecScatterBegin  348 1.0 1.2763e+00 1.5 0.00e+00 0.0 4.6e+05 2.0e+05
> 0.0e+00  0  0 97100  0   0  0 97100  0 0
> > VecScatterEnd348 1.0 4.6741e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  1  0  0  0  0   1  0  0  0  0 0
> > MatMult  348 1.0 2.8440e+01 1.1 1.27e+10 1.1 4.6e+05 2.0e+05
> 0.0e+00 13 52 97100  0  13 52 97100  0 40722
> > MatAssemblyBegin   1 1.0 7.4749e-0124.5 0.00e+00 0.0 0.0e+00 0.0e+00
> 2.0e+00  0  0  0  0  0   0  0  0  0  0 0
> > MatAssemblyEnd 1 1.0 8.3194e-01 1.0 0.00e+00 0.0 2.7e+03 5.1e+04
> 8.0e+00  0  0  1  0  1   0  0  1  0  1 0
> > KSPSetUp   1 1.0 8.2883e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
> > KSPSolve   1 1.0 1.7964e+02 1.0 2.45e+10 1.1 4.6e+05 2.0e+05
> 1.0e+03 87100 97100 98  87100 97100 98 

Re: [petsc-users] Fwd: PETSc installation on Intrepid

2013-07-17 Thread Richard Tran Mills
,kargs)
   File "/gpfs/home/jkumar/lib/petsc/config/BuildSystem/config/types.py", line 296, in checkSizeof
     raise RuntimeError(msg)

This is what my configuration looks like (adapted from
config/examples/arch-bgp-ibm-opt.py)
configure_options = [
   '--with-cc=mpixlc',
   '--with-fc=mpixlf90',
   '--with-cxx=mpixlcxx',
   'COPTFLAGS=-O3',
   'FOPTFLAGS=-O3',
   '--with-debugging=0',
   '--with-cmake=/soft/apps/fen/cmake-2.8.3/bin/cmake',
#  '--with-hdf5=/soft/apps/hdf5-1.8.0',
   '--download-parmetis=1',
   '--download-metis=1',
   '--download-plapack=1',
   '--download-hdf5=1'
   ]

I would appreciate any help building the library there.

Thanks,
Jitu






--
Richard Tran Mills, Ph.D.
Computational Earth Scientist  | Joint Assistant Professor
Hydrogeochemical Dynamics Team | EECS and Earth & Planetary Sciences
Oak Ridge National Laboratory  | University of Tennessee, Knoxville
E-mail: rmi...@ornl.gov  V: 865-241-3198 http://climate.ornl.gov/~rmills



[petsc-users] choosing a preconditioner

2010-02-20 Thread Richard Tran Mills
For domain decomposition, I'd also recommend Barry's book (co-authored with 
Bjorstad and Gropp), Domain Decomposition: Parallel Multilevel Methods for 
Elliptic Partial Differential Equations.  Google it and you'll find a preview 
in Google Books.

--Richard

Matthew Knepley wrote:
 The unfortunate part of the word preconditioner is that it is about as 
 precise as justice.
 All good preconditioners are problem specific. That said, my quick 
 suggestions are:
 
 Black box PCs: Yousef Saad's Iterative Methods etc. is a good overview
 
 MG: Bill Briggs' Multigrid Tutorial is good, and so is Multigrid... 
 by Wesseling
 
 Domain Decomp: Widlund and Toselli's Title I can't remember is good
 
 but most really good PCs come from special solutions, linearizations, 
 frozen terms,
 recognizing strong vs. weak coupling, etc.
 
Matt

-- 
Richard Tran Mills, Ph.D.|   E-mail: rmills at climate.ornl.gov
Computational Scientist  |   Phone:  (865) 241-3198
Computational Earth Sciences Group   |   Fax:(865) 574-0405
Oak Ridge National Laboratory|   http://climate.ornl.gov/~rmills