Ashish Patel writes: > Hi Jed, > VmRss is on a higher side and seems to match what PetscMallocGetMaximumUsage is reporting. HugetlbPages was 0 for me. > > Mark, running without the near nullspace also
Mark Adams writes: >>> Yea, my interpretation of these methods is also that "PetscMemoryGetMaximumUsage" should be >= "PetscMallocGetMaximumUsage". >>> But you are seeing the opposite. >
Matthew Knepley writes: >> I'm developing routines that will read/write CGNS files to DMPlex and vice >> versa. >> One of the recurring challenges is the bookkeeping of global numbering for >>
Interfaces like KSPSetOperators (https://petsc.org/main/manualpages/KSP/KSPSetOperators/) have Amat and Pmat arguments.
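A minimal C sketch of that Amat/Pmat split; Aop and Apre are placeholder matrices (for instance a higher-order operator and a sparser low-order approximation used only to build the preconditioner), and x and b are assumed to exist:

Mat Aop, Apre;   /* assembled elsewhere: operator and preconditioning matrix */
Vec x, b;        /* solution and right-hand side, assembled elsewhere */
KSP ksp;
PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
PetscCall(KSPSetOperators(ksp, Aop, Apre)); /* Amat = Aop, Pmat = Apre */
PetscCall(KSPSetFromOptions(ksp));
PetscCall(KSPSolve(ksp, b, x));
PetscCall(KSPDestroy(&ksp));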
s online, I'll post on linkedin but
> ideally we can motivate someone who is already known.
>
> best regards,
> Martin
>
> On Thu, 2024-03-21 at 23:13 -0600, Jed Brown wrote:
>> Barry Smith writes:
>>
>> > > We already have the generated ftn-a
Barry Smith writes: >> We already have the generated ftn-auto-interfaces/*.h90. The INTERFACE keyword could be replaced with CONTAINS (making these definitions instead of just interfaces), and then the bodies
Barry Smith writes: > In my limited understanding of the Fortran iso_c_binding, if we do not provide an equivalent Fortran stub (the user calls) that uses the iso_c_binding to call PETSc C code, then when the user
Barry Smith writes: > We've always had some tension between adding new features to bfort vs developing an entirely new tool (for example in Python (maybe calling a little LLVM to help parse the C function), for maybe
If you're having PETSc use coloring and have confirmed that the stencil is sufficient, then it would be nonsmoothness (again, consider the limiter you've chosen) preventing quadratic convergence (assuming that doesn't kick in eventually). Note
One option is to form the preconditioner using the FV1 method, which is sparser and satisfies h-ellipticity, while using FV2 for the residual and (optionally) for matrix-free operator application. FV1 is a highly diffusive method so in a sense,
For a bit of assistance, you can use DMComposite and DMRedundantCreate; see
src/snes/tutorials/ex21.c and ex22.c.
Note that when computing redundantly, it's critical that the computation be
deterministic (i.e., not using atomics or randomness without matching seeds) so
the redundant copies stay consistent across ranks.
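A minimal sketch along the lines of those tutorials, assuming a 1D DMDA field coupled to a few redundant scalars (sizes here are made up):

DM packer, da, red;
PetscCall(DMCompositeCreate(PETSC_COMM_WORLD, &packer));
PetscCall(DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 65, 1, 1, NULL, &da));
PetscCall(DMSetUp(da));
PetscCall(DMCompositeAddDM(packer, da));
PetscCall(DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &red)); /* 3 dofs, owned by rank 0, copied to all ranks */
PetscCall(DMCompositeAddDM(packer, red));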
>
> Best, Yi
>
> -----Original Message-----
> From: Jed Brown
> Sent: Wednesday, December 20, 2023 5:40 PM
> To: Yi Hu ; petsc-users@mcs.anl.gov
> Subject: Re: [petsc-users] fortran interface to snes matrix-free jacobian
>
> Are you wanting an analytic matr
Are you wanting an analytic matrix-free operator or one created for you based
on finite differencing? If the latter, just use -snes_mf or -snes_mf_operator.
https://petsc.org/release/manual/snes/#jacobian-evaluation
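For the programmatic equivalent of -snes_mf_operator, here is a C sketch (the option form needs no code at all): the Jacobian action is applied matrix-free by differencing while an assembled P is used to build the preconditioner. r, P, FormFunction, FormJacobianP, and user are placeholders for your own objects and callbacks.

SNES snes;
Mat  J;
PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes));
PetscCall(SNESSetFunction(snes, r, FormFunction, &user));
PetscCall(MatCreateSNESMF(snes, &J));                  /* matrix-free action of the Jacobian */
PetscCall(SNESSetJacobian(snes, J, P, FormJacobianP, &user));
PetscCall(SNESSetFromOptions(snes));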
Yi Hu writes:
> Dear PETSc team,
>
> My solution scheme relies on a
> Thank you,
>
> Philip Fackler
> Research Software Engineer, Application Engineering Group
> Advanced Computing Systems Research Section
> Computer Science and Mathematics Division
> Oak Ridge National Laboratory
>
> From: Jed Brown
> Sen
I had a one-character typo in the diff above. This MR to release should work
now.
https://gitlab.com/petsc/petsc/-/merge_requests/7120
Jed Brown writes:
> 17 GB for a 1D DMDA, wow. :-)
>
> Could you try applying this diff to make it work for DMDA (it's currently
> handl
17 GB for a 1D DMDA, wow. :-)
Could you try applying this diff to make it work for DMDA (it's currently
handled by DMPlex)?
diff --git i/src/dm/impls/da/fdda.c w/src/dm/impls/da/fdda.c
index cad4d926504..bd2a3bda635 100644
--- i/src/dm/impls/da/fdda.c
+++ w/src/dm/impls/da/fdda.c
@@ -675,19
Pierre Jolivet writes:
>> On 10 Dec 2023, at 8:40 AM, Stephan Köhler
>> wrote:
>>
>> Dear PETSc/Tao team,
>>
>> there is a bug in the vector interface: in the function
>> VecNorm, see, e.g.,
>> https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm line
>> 197 the check
It uses nonblocking point-to-point by default since that tends to perform
better and is less prone to MPI implementation bugs, but you can select
`-sf_type window` to try it, or use other strategies here depending on the sort
of problem you're working with.
#define PETSCSFBASIC "basic"
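A sketch of making the same selection from code rather than the command line, assuming sf is a star forest your code already obtained (e.g., from DMGetSectionSF()):

PetscSF sf;
/* ... sf obtained from a DM, a Mat, or PetscSFCreate() ... */
PetscCall(PetscSFSetType(sf, PETSCSFWINDOW));  /* one-sided MPI windows */
PetscCall(PetscSFSetFromOptions(sf));          /* or override at run time with -sf_type window */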
meshes are Cartesian, but non-uniform.
>
> Thanks,
> Kevin
>
> On Thu, Nov 30, 2023 at 1:02 AM Jed Brown wrote:
>
>> Is it necessary that it be VTK format or can it be PETSc's binary format
>> or a different mesh format? VTK (be it legacy .vtk or the XML-based .vtu
Is it necessary that it be VTK format or can it be PETSc's binary format or a
different mesh format? VTK (be it legacy .vtk or the XML-based .vtu, etc.) is a
bad format for parallel reading, no matter how much effort might go into an
implementation.
"Kevin G. Wang" writes:
> Good morning
'redundant'. If it's diffusive, then algebraic
multigrid would be a good place to start.
> Let us know what we can do to answer this question more accurately.
>
> Cheers,
>
> Sophie
>
> From: Jed Brown
> Sent: Tuesday, November 28,
"Fackler, Philip via petsc-users" writes:
> That makes sense. Here are the arguments that I think are relevant:
>
> -fieldsplit_1_pc_type redundant -fieldsplit_0_pc_type sor -pc_type fieldsplit
> -pc_fieldsplit_detect_coupling
What sort of physics are in splits 0 and 1?
SOR is not a good GPU
I don't think you want to hash floating point values, but I've had a number of
reasons to want spatial hashing for near-neighbor queries in PETSc and that
would be a great contribution. (Spatial hashes have a length scale and compute
integer bins.)
Brandon Denton via petsc-users writes:
>
What sort of problem are you solving? Algebraic multigrid like gamg or hypre
are good choices for elliptic problems. Sparse triangular solves have horrific
efficiency even on one GPU so you generally want to do your best to stay away
from them.
"Ramoni Z. Sedano Azevedo" writes:
> Hey!
>
> I
What modules do you have loaded? I don't know if it currently works with
cuda-11.7. I assume you're following these instructions carefully.
https://docs.nersc.gov/development/programming-models/mpi/cray-mpich/#cuda-aware-mpi
In our experience, GPU-aware MPI continues to be brittle on these
It's probably easier to apply boundary conditions when you have the serial
mesh. You may consider contributing the reader if it's a format that others use.
"onur.notonur via petsc-users" writes:
> Hi,
>
> I hope this message finds you all in good health and high spirits.
>
> I wanted to
You can place it in a parallel Mat (that has rows or columns on only one rank
or a subset of ranks) and then MatCreateSubMatrix with all new rows/columns on
a different rank or subset of ranks.
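A rough C sketch of that first approach, assuming a square matrix A whose rows all start on rank 0 and a target layout with an equal share per rank (N assumed divisible by the number of ranks, purely for brevity):

Mat         A, B;
IS          isrow, iscol;
PetscInt    N = 128, nlocal, first;
PetscMPIInt rank, size;
PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));
nlocal = N / size;
first  = rank * nlocal;
PetscCall(ISCreateStride(PETSC_COMM_WORLD, nlocal, first, 1, &isrow)); /* rows this rank will own in B */
PetscCall(ISCreateStride(PETSC_COMM_WORLD, nlocal, first, 1, &iscol)); /* columns this rank will own in B */
PetscCall(MatCreateSubMatrix(A, isrow, iscol, MAT_INITIAL_MATRIX, &B));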
That said, you usually have a function that assembles the matrix and you can
just call that on the
Matthew Knepley writes:
> On Wed, Oct 11, 2023 at 1:03 PM Jed Brown wrote:
>
>> I don't see an attachment, but his thesis used conservative variables and
>> defined an effective length scale in a way that seemed to assume constant
>> shape function gradients. I'm
I don't see an attachment, but his thesis used conservative variables and
defined an effective length scale in a way that seemed to assume constant shape
function gradients. I'm not aware of systematic literature comparing the
covariant and contravariant length measures on anisotropic meshes,
Do you want to write a new code using only PETSc or would you be up for
collaborating on ceed-fluids, which is a high-performance compressible SUPG
solver based on DMPlex with good GPU support? It uses the metric to compute
covariant length for stabilization. We have YZƁ shock capturing, though
SuiteSparse includes a sparse QR algorithm. The main issue is that (even with
pivoting) the R factor has the same nonzero structure as a Cholesky factor of
A^T A, which is generally much denser than a factor of A, and this degraded
sparsity impacts Q as well.
I wonder if someone would like to
Jacob Faibussowitsch writes:
> More generally, it would be interesting to know the breakdown of installed
> CUDA versions for users. Unlike compilers etc, I suspect that cluster admins
> (and those running on local machines) are much more likely to be updating
> their CUDA toolkits to the
Rohan Yadav writes:
> With modern GPU sizes, for example A100's with 80GB of memory, a vector of
> length 2^31 is not that much memory -- one could conceivably run a CG solve
> with local vectors > 2^31.
Yeah, each vector would be 8 GB (single precision) or 16 GB (double). You can't
store a
t
> be able to take some time to implement a more sustainable solution soon.
>
> Thanks again,
> David
>
> On Fri, Aug 4, 2023 at 9:23 AM Jed Brown wrote:
>
>> Some other TS implementations have a concept of extrapolation as an
>> initial guess. Such method-speci
like
> it would be unnecessary if we instead used a callback in
> `SNESSetComputeInitialGuess` that had access to the internals of
> `TS_Alpha`.
>
> Thanks, David
>
> On Thu, Aug 3, 2023 at 11:28 PM Jed Brown wrote:
>
>> I think you can use TSGetSNES() and SNESSetComputeInitialGuess
I think you can use TSGetSNES() and SNESSetComputeInitialGuess() to modify the
initial guess for SNES. Would that serve your needs? Is there anything else you
can say about how you'd like to compute this initial guess? Is there a paper or
something?
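A minimal sketch of that suggestion; MyInitialGuess and ctx are placeholders for whatever extrapolation or guess you have in mind:

static PetscErrorCode MyInitialGuess(SNES snes, Vec x, void *ctx)
{
  PetscFunctionBeginUser;
  /* fill x, e.g., by extrapolating from previous steps stored in ctx */
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* ... after the TS has been set up ... */
SNES snes;
PetscCall(TSGetSNES(ts, &snes));
PetscCall(SNESSetComputeInitialGuess(snes, MyInitialGuess, &ctx));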
David Kamensky writes:
> Hi,
>
> My
> it seems to be related to AL methods ... but requires that the matrix be
> symmetric?
>
> On Fri, Jul 28, 2023 at 7:04 PM Jed Brown wrote:
>
>> See src/snes/tutorials/ex70.c for the code that I think was used for that
>> paper.
>>
>> Alexander Lindsay write
built an appropriate mesh and problem size for the problem they want to solve
> and added appropriate turbulence modeling (although my general assumption
> is often violated).
>
> > And to confirm, are you doing a nonlinearly implicit velocity-pressure
> solve?
>
> Yes
AMG is subtle here. With AMG for systems, you typically feed it elements of the
near null space. In the case of (smoothed) aggregation, the coarse space will
have a regular block structure with block sizes equal to the number of
near-null vectors. You can use pc_fieldsplit options to select
I think random matrices will produce misleading results. The chance of randomly
generating a matrix that resembles an application is effectively zero. I think
you'd be better off with some model problems varying parameters that control
the physical regime (e.g., shifts to a Laplacian, advection
Zisheng Ye via petsc-users writes:
> Dear PETSc Team
>
> We are testing the GPU support in PETSc's KSPSolve, especially for the GAMG
> and Hypre preconditioners. We have encountered several issues that we would
> like to ask for your suggestions.
>
> First, we have a couple of questions when
It looks like Victor is working on hypre-ILU so it is active. PETSc used to
have PILUT support, but it was so buggy/leaky that we removed the interface.
Alexander Lindsay writes:
> Haha no I am not sure. There are a few other preconditioning options I will
> explore before knocking on this
Matthew Knepley writes:
>> The matrix entries are multiplied by 2, that is, the number of processes
>> used to execute the code.
>>
>
> No. This was mostly intended for GPUs, where there is 1 process. If you
> want to use multiple MPI processes, then each process can only introduce
> some
You should partition the entries so each entry is submitted by only one
process. Note that duplicate entries (on the same or different processes) are
summed as you've seen. For example, in finite elements, it's typical to
partition the elements and each process submits entries from its elements.
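A schematic of that finite-element pattern, where GetElementData, NODES_PER_ELEM, estart, and eend stand in for your own mesh data:

for (PetscInt e = estart; e < eend; e++) {              /* elements owned by this rank */
  PetscInt    idx[NODES_PER_ELEM];                      /* global dof indices of element e */
  PetscScalar Ke[NODES_PER_ELEM * NODES_PER_ELEM];      /* element matrix, row-major */
  GetElementData(e, idx, Ke);
  PetscCall(MatSetValues(A, NODES_PER_ELEM, idx, NODES_PER_ELEM, idx, Ke, ADD_VALUES));
}
PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
/* entries shared on partition boundaries are summed during assembly */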
e error
> message to petsc-ma...@mcs.anl.gov--
> ------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF
> with errorcode 56.
>
>
> Does cgns work for degree >= 4?
>
>
> Junming
And here's an MR to do what you want without any code/arg changes.
https://gitlab.com/petsc/petsc/-/merge_requests/6588
Jed Brown writes:
> Duan Junming writes:
>
>> Dear Jed,
>>
>>
>> Thank you for the suggestion.
>>
>> When I run tests/e
Duan Junming writes:
> Dear Jed,
>
>
> Thank you for the suggestion.
>
> When I run tests/ex33.c with
>
> ./ex33 -dm_plex_simplex 0 -dm_plex_box_faces 1,1 -mesh_transform annulus
> -dm_coord_space 0 -dm_coord_petscspace_degree 3 -dm_refine 1 -dm_view
> cgns:test.cgns
>
> and load it using
Matthew Knepley writes:
> On Mon, Jun 12, 2023 at 6:01 AM Duan Junming wrote:
>
>> Dear Matt,
>>
>> Thank you for the reply. I have a more specific question about the
>> spectral element example. Do you have any suggestions that how to write
>> all the nodes in each cell to .vtu?
>>
> It is the
Alexander Lindsay writes:
> This has been a great discussion to follow. Regarding
>
>> when time stepping, you have enough mass matrix that cheaper preconditioners
>> are good enough
>
> I'm curious what some algebraic recommendations might be for high Re in
> transients.
What mesh aspect
Matthew Knepley writes:
> On Fri, May 5, 2023 at 10:55 AM Vilmer Dahlberg via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
>> Hi.
>>
>>
>> I'm trying to read a mesh of higher element order, in this example a mesh
>> consisting of 10-node tetrahedral elements, from gmsh, into PETSC. But It
Good to hear this works for you. I believe there is still a problem with high
order tetrahedral elements (we've been coping with it for months and someone
asked last week) and plan to look at it as soon as possible now that my
semester finished.
Zongze Yang writes:
> Hi, Matt,
>
> The issue
Boundary faces are often labeled already on a mesh, but you can use this to set
a label for all boundary faces.
https://petsc.org/main/manualpages/DMPlex/DMPlexMarkBoundaryFaces/
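A minimal sketch, assuming dm is your DMPlex:

DMLabel label;
PetscCall(DMCreateLabel(dm, "boundary"));
PetscCall(DMGetLabel(dm, "boundary", &label));
PetscCall(DMPlexMarkBoundaryFaces(dm, 1, label));  /* every boundary face gets value 1 */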
"Ferrand, Jesus A." writes:
> Greetings.
>
> In terms of DMPlex terminology, I need a list of points corresponding to
Edoardo alinovi writes:
> Hello Barry,
>
> Welcome to the party! Thank you guys for your precious suggestions, they
> are really helpful!
>
> It's been a while since I am messing around and I have tested many
> combinations. Schur + selfp is the best preconditioner, it converges within
> 5 iters
Sebastian Blauth writes:
> Hello everyone,
>
> I wanted to briefly follow up on my question (see my last reply).
> Does anyone know / have an idea why the LSC preconditioner in PETSc does
> not seem to scale well with the problem size (the outer fgmres solver I
> am using scales nearly
Sebastian Blauth writes:
> I agree with your comment for the Stokes equations - for these, I have
> already tried and used the pressure mass matrix as part of a (additive)
> block preconditioner and it gave mesh independent results.
>
> However, for the Navier Stokes equations, is the Schur
Look at config/examples/arch-ci-*.py for the configurations. They're driven
from .gitlab-ci.yml
Alexander Lindsay writes:
> Hi, is there a place I can look to understand the testing recipes used in
> PETSc CI, e.g. what external packages are included (if any), what C++
> dialect is used for
> that it persists, could provide a reproduction scenario.
>
>
>
> On Sat, Apr 1, 2023 at 9:53 PM Jed Brown wrote:
>
>> Mark McClure writes:
>>
>> > Thank you, I will try BCGSL.
>> >
>> > And good to know that this is worth pursuing, and
Mark McClure writes:
> Thank you, I will try BCGSL.
>
> And good to know that this is worth pursuing, and that it is possible. Step
> 1, I guess I should upgrade to the latest release on Petsc.
>
> How can I make sure that I am "using an MPI that follows the suggestion for
> implementers about
If you use unpreconditioned BCGS and ensure that you assemble the same matrix
(depends how you do the communication for that), I think you'll get bitwise
reproducible results when using an MPI that follows the suggestion for
implementers about determinism. Beyond that, it'll depend somewhat on
Great that you got it working. We would accept a merge request that made our
infrastructure less PETSc-specific so long as it doesn't push more complexity
on the end user. That would likely make it easier for you to pull updates in
the future.
Daniele Prada writes:
> Dear Matthew, dear
This suite has been good for my solid mechanics solvers. (It's written here as
a coarse grid solver because we do matrix-free p-MG first, but you can use it
directly.)
https://github.com/hypre-space/hypre/issues/601#issuecomment-1069426997
Blaise Bourdin writes:
> On Mar 27, 2023, at 9:11
Try -pc_gamg_reuse_interpolation 0. I thought this was disabled by default, but
I see pc_gamg->reuse_prol = PETSC_TRUE in the code.
Blaise Bourdin writes:
> On Mar 24, 2023, at 3:21 PM, Mark Adams wrote:
>
> * Do you set:
>
> PetscCall(MatSetOption(Amat, MAT_SPD, PETSC_TRUE));
>
>
You can use -pc_gamg_threshold .02 to slow the coarsening, and either use a stronger
smoother or increase the number of iterations used for estimation (or increase the
tolerance). I assume your system is SPD and you've set the near-null space.
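A sketch of the setup being assumed here, with A the assembled operator and coords standing in for a Vec of nodal coordinates (elasticity-style problem):

MatNullSpace nns;
PetscCall(MatNullSpaceCreateRigidBody(coords, &nns));  /* rigid-body modes as the near-null space */
PetscCall(MatSetNearNullSpace(A, nns));
PetscCall(MatNullSpaceDestroy(&nns));
PetscCall(MatSetOption(A, MAT_SPD, PETSC_TRUE));
/* run with: -pc_type gamg -pc_gamg_threshold 0.02 */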
Blaise Bourdin writes:
> Hi,
>
> I am having issue with GAMG for
You can test a benchmark problem with both. It probably doesn't make a lot of
difference with the solver configuration you've selected (most of those
operations are memory bandwidth limited).
If your residual and Jacobian assembly code is written to vectorize, you may
get significant benefit
>
>> Mike, can you test that this branch works with your large problems? I
>> tested that .vtu works in parallel for small problems, where works = loads
>> correctly in Paraview and VisIt.
>>
>> https://gitlab.com/petsc/petsc/-/merge_requests/6081
>>
>> Da
21:27, Jed Brown wrote:
>
>> Dave May writes:
>>
>> > On Tue 14. Feb 2023 at 17:17, Jed Brown wrote:
>> >
>> >> Can you share a reproducer? I think I recall the format requiring
>> certain
>> >> things to be Int32.
>> &g
Dave May writes:
> On Tue 14. Feb 2023 at 17:17, Jed Brown wrote:
>
>> Can you share a reproducer? I think I recall the format requiring certain
>> things to be Int32.
>
>
> By default, the byte offset used with the appended data format is UInt32. I
> belie
Can you share a reproducer? I think I recall the format requiring certain
things to be Int32.
Mike Michell writes:
> Thanks for the note.
> I understood that PETSc calculates the offsets for me through "boffset"
> variable in plexvtu.c file. Please correct me if it is wrong.
>
> If plexvtu.c
Ces VLC writes:
> El El vie, 10 feb 2023 a las 21:44, Barry Smith escribió:
>
>>
>>What is the use case you are looking for that cannot be achieved by
>> just distributing a single precision application? If the user is happy when
>> they happen to have GPUs to use single precision
Is the small matrix dense? Then you can use MatSetValues. If the small matrix
is sparse, you can assemble it with larger dimension (empty rows and columns)
and use MatAXPY.
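Sketches of both cases; A is the large matrix, and m, n, rows, cols, vals, and B are placeholders:

/* dense small matrix: insert the m-by-n block of values directly */
PetscCall(MatSetValues(A, m, rows, n, cols, vals, INSERT_VALUES));
PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

/* sparse small matrix: assemble B with the same global size as A
   (unused rows/columns left empty), then add it in */
PetscCall(MatAXPY(A, 1.0, B, DIFFERENT_NONZERO_PATTERN));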
김성익 writes:
> Hello,
>
>
> I want to put small matrix to large matrix.
> The schematic of operation is as below.
>
You're probably looking for ./configure --prefix=/opt/petsc. It's documented in
./configure --help.
Tim Meehan writes:
> Hi - I am trying to set up a local workstation for a few other developers who
> need PETSc installed from the latest release. I figured that it would be
> easiest for me
Copying my private reply that appeared off-list. If you have one base with
different element types, that's in scope for what I plan to develop soon.
Congrats, you crashed cgnsview.
$ cgnsview dl/HybridGrid.cgns
Error in startup script: file was not found
while executing
"CGNSfile
Matthew Knepley writes:
> On Mon, Jan 16, 2023 at 6:15 PM Jed Brown wrote:
>
>> How soon do you need this? I understand the grumbling about CGNS, but it's
>> easy to build, uses HDF5 parallel IO in a friendly way, supports high order
>> elements, and is generally pr
How soon do you need this? I understand the grumbling about CGNS, but it's easy
to build, uses HDF5 parallel IO in a friendly way, supports high order
elements, and is generally pretty expressive. I wrote a parallel writer (with
some limitations that I'll remove) and plan to replace the current
Dave May writes:
> On Thu 12. Jan 2023 at 17:58, Blaise Bourdin wrote:
>
>> Out of curiosity, what is the rationale for _reading_ high order gmsh
>> meshes?
>>
>
> GMSH can use a CAD engine like OpenCascade. This provides geometric
> representations via things like BSplines. Such geometric
It's confusing, but this line makes high order simplices always read as
discontinuous coordinate spaces. I would love if someone would revisit that,
perhaps also using DMPlexSetIsoperiodicFaceSF(), which should simplify the code
and avoid the confusing cell coordinates pattern. Sadly, I don't
Mark Lohry writes:
> I definitely need multigrid. I was under the impression that GAMG was
> relatively cuda-complete, is that not the case? What functionality works
> fully on GPU and what doesn't, without any host transfers (aside from
> what's needed for MPI)?
>
> If I use -ksp-pc_type gamg
up the vector
>> and copy down the result.
>>
>>
>> On Tue, Jan 10, 2023 at 1:52 PM Barry Smith wrote:
>>
>>>
>>> We don't have colored smoothers currently in PETSc.
>>>
>>> > On Jan 10, 2023, at 12:56 PM, Jed Brown wrote:
>>
Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi if the
node size is not uniform). These are good choices for scale-resolving CFD on GPUs.
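Equivalent selection from code, assuming ksp already exists:

PC pc;
PetscCall(KSPGetPC(ksp, &pc));
PetscCall(PCSetType(pc, PCPBJACOBI));   /* PCVPBJACOBI if the block size varies by node */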
Mark Lohry writes:
> I'm running GAMG with CUDA, and I'm wondering how the nominally serial
> smoother algorithms are implemented on
The make convention would be to respond to `libdir`, which is probably the
simplest if we can defer that choice until install time. It probably needs to
be known at build time, thus should go in configure.
https://www.gnu.org/software/make/manual/html_node/Directory-Variables.html
Satish Balay
Junchao Zhang writes:
>> I don't think it's remotely crazy. libCEED supports both together and it's
>> very convenient when testing on a development machine that has one of each
>> brand GPU and simplifies binary distribution for us and every package that
>> uses us. Every day I wish PETSc could
Mark Adams writes:
> Support of HIP and CUDA hardware together would be crazy,
I don't think it's remotely crazy. libCEED supports both together and it's very
convenient when testing on a development machine that has one of each brand GPU
and simplifies binary distribution for us and every
This default probably shouldn't be zero, and probably lengthening steps should
be more gentle after a recent failure. But Mark, please let us know if what's
there works for you.
"Zhang, Hong via petsc-users" writes:
> Hi Mark,
>
> You might want to try -ts_adapt_time_step_increase_delay to
This is what I'd expect to observe if you didn't preallocate correctly for the
second matrix, which has more nonzeros per row.
https://petsc.org/release/docs/manual/mat/#sec-matsparse
김성익 writes:
> Hello,
>
>
>
> I have a question about memory of matsetvalue.
>
> When I assembly the local
Indeed, this is exactly how we do quasistatic analysis for solid mechanics in
Ratel (https://gitlab.com/micromorph/ratel) -- make sure to choose an L-stable
integrator (backward Euler being the most natural choice). Implicit dynamics
can be done by choosing a suitable integrator, like TSALPHA2,
Matthew Knepley writes:
> On Fri, Dec 16, 2022 at 12:22 AM Praveen C wrote:
>
>> Thank you very much. I do see correct normals now.
>>
>> Is there a way to set the option
>>
>> -dm_localize_height 1
>>>
>>
>> within the code ?
>>
>
> The problem is that the localization happens within the Gmsh
I ran your code successfully with and without GPU-aware MPI. I see a bit of
time in MatSetValue -- you can make it a bit faster using one MatSetValues call
per row, but it's typical that assembling a matrix like this (sequentially on
the host) will be more expensive than some unpreconditioned
Do you have slip/symmetry boundary conditions, where some components are
constrained? In that case, there is no uniform block size and I think you'll
need DMPlexCreateRigidBody() and MatSetNearNullSpace().
The PCSetCoordinates() code won't work for non-constant block size.
-pc_type gamg should
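A sketch of that setup, assuming dm is the DMPlex for the displacement field and A is the assembled operator:

MatNullSpace nns;
PetscCall(DMPlexCreateRigidBody(dm, 0, &nns));   /* field 0 assumed to be displacement */
PetscCall(MatSetNearNullSpace(A, nns));
PetscCall(MatNullSpaceDestroy(&nns));
/* then -pc_type gamg can use the rigid-body modes even with constrained components */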
The description matches MATNEST (MATCOMPOSITE is for a sum or product of
matrices) or parallel decompositions. Also consider the assembly style of
src/snes/tutorials/ex28.c, which can create either a monolithic or block
(MATNEST) matrix without extra storage or conversion costs.
Mark Adams
Barry Smith writes:
>> We could test at runtime whether child threads exist/are created when
>> calling BLAS and deliver a warning.
>
> How does one test for this? Some standard Unix API for checking this?
I'm not sure; the IDs of child threads are in /proc/$pid/task/ and (when opened
by a
It isn't always wrong to link threaded BLAS. For example, a user might need to
call threaded BLAS on the side (but the application can only link one) or a
sparse direct solver might want threading for the supernode. We could test at
runtime whether child threads exist/are created when calling
If you're using iterative solvers, compare memory bandwidth first, then cache.
Flops aren't very important unless you use sparse direct solvers or have SNES
residual/Jacobian evaluation that is expensive and has been written for
vectorization.
If you can get the 6650U with LPDDR5-6400, it'll
You do if preconditioners (like AMG) will use it or if using functions like
MatSetValuesBlocked(). If you have uniform block structure, it doesn't hurt.
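A sketch of the blocked path, assuming 3 dofs per node; browidx, bcolidx, and blockvals are placeholders, and indices are counted in blocks:

PetscCall(MatSetBlockSize(A, 3));    /* call before preallocation/assembly */
PetscCall(MatSetValuesBlocked(A, 1, &browidx, 1, &bcolidx, blockvals, ADD_VALUES));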
Edoardo alinovi writes:
> Hi Guys,
>
> Very quick one. Do I need to set the block size with MPIAIJ?
Francesc Levrero-Florencio writes:
> Hi Jed,
>
> Thanks for the answer.
>
> We do have a monolithic arc-length implementation based on the TS/SNES logic,
> but we are also exploring having a custom SNESSHELL because the arc-length
> logic is substantially more complex than that of traditional
First, I believe arc-length continuation is the right approach in this problem
domain. I have a branch starting an implementation, but need to revisit it in
light of some feedback (and time has been too short lately).
My group's nonlinear mechanics solver uses TSBEULER because it's convenient
ess has some velocity and
> pressure dofs) ... but in order to leverage field split we need those index
> sets in order to avoid the equal size constraint?
>
> On Tue, Nov 1, 2022 at 11:57 PM Jed Brown wrote:
>
>> In most circumstances, you can and should interlace in some form such
In most circumstances, you can and should interlace in some form such that each
block in fieldsplit is distributed across all ranks. If you interlace at scalar
granularity as described, then each block needs to be able to do that. So for
the Stokes equations with equal order elements (like
This looks like one block row per process? (BAIJ formats store explicit zeros
that appear within nonzero blocks.) You'd use d_nnz[] = {1}, o_nnz[] = {1} on
each process.
If each of the dummy numbers there was replaced by a nonzero block (so the
diagram would be sketching nonzero 3x3 blocks of
You can get lucky with null spaces even with factorization preconditioners,
especially if the right hand side is orthogonal to the null space. But it's
fragile and you shouldn't rely on that being true as you change the problem.
You can either remove the null space in your problem formulation
I recommend calling this one preallocation function, which will preallocate
scalar and block formats. It takes one value per block row, counting in blocks.
https://petsc.org/release/docs/manualpages/Mat/MatXAIJSetPreallocation/
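A sketch of the call, where bs is the block size and dnnz/onnz are per-block-row counts of diagonal- and off-diagonal-part blocks that you compute:

PetscCall(MatXAIJSetPreallocation(A, bs, dnnz, onnz, NULL, NULL));
/* one call covers AIJ, BAIJ, and SBAIJ; the NULLs skip the upper-triangular
   counts that only SBAIJ needs */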
Edoardo alinovi writes:
> Hello Barry,
>
> I am doing some