On Mon, 7 Dec 2009 10:53:10 -0500, Alex Peyser peyser.alex at gmail.com wrote:
Have y'all considered a different approach? I've had endless trouble integrating petsc into my system because the object-oriented C is only partially object oriented while simultaneously obscuring access to the underlying functionality.
On Mon, Dec 7, 2009 at 3:05 PM, Chris Kees christopher.e.kees at usace.army.mil wrote:
This whole thread has been really interesting. I'm interested in Barry's suggestion, and I know a handful of people not on the list who would also be interested. I've got a couple of questions/responses to the MatSetValues, point-wise physics, and support tools issues that were raised.
On Dec 7, 2009, at 9:53 AM, Alex Peyser wrote:
Have y'all considered a different approach? I've had endless trouble integrating petsc into my system because the object-oriented C is only partially object oriented while simultaneously obscuring access to the underlying functionality.
On Sat, 5 Dec 2009 16:50:38 -0600, Matthew Knepley knepley at gmail.com wrote:
You assign a few threads per element to calculate the FEM integral. You could maintain this unassembled if you only need actions. You can also store it with much less memory as just values at quadrature points.
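[Editor's note: a minimal, PETSc-free sketch of the unassembled-action idea Matt describes — the global stiffness matrix is never formed; only a tiny per-element kernel is stored and the operator action y = A*u is computed element by element. The function name and the 1D P1 Poisson setting are illustrative assumptions, not code from the thread.]

```python
def apply_stiffness_unassembled(u, n, h):
    """Matrix-free action y = A*u for 1D P1 Poisson on n elements of size h.

    Per element the local stiffness is (1/h) * [[1, -1], [-1, 1]]; only
    this tiny kernel exists in code -- the global matrix is never assembled.
    """
    y = [0.0] * (n + 1)
    for e in range(n):             # loop over elements
        a, b = u[e], u[e + 1]      # the element's two local dofs
        y[e]     += (a - b) / h    # accumulate the element-matrix rows
        y[e + 1] += (b - a) / h
    return y
```

For a linear field u(x) = x the action vanishes at interior nodes (zero second derivative), which makes a cheap sanity check without ever touching a sparse matrix format.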
On Fri, Dec 4, 2009 at 10:42 PM, Barry Smith bsmith at mcs.anl.gov wrote:
Suggestion:
1) Discard PETSc
2) Develop a general Py{CL, CUDA, OpenMP-C} system that dispatches tasks
onto GPUs and multi-core systems (generally we would have one python process
per compute node and local
This is an interesting proposal. Two thoughts:
Residual and Jacobian evaluation cannot be written in Python (though it can be prototyped there). After a discretization is chosen, the physics is usually representable as a tiny kernel (Riemann solvers/pointwise operations at quadrature points).
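[Editor's note: a sketch of the "tiny kernel" separation Jed describes, using a 1D upwind Riemann solver for linear advection as the pointwise physics. The physics is a pure function of left/right states; the residual loop that calls it is generic and knows nothing about the equation. All names are illustrative.]

```python
def riemann_upwind(uL, uR, c):
    """Pointwise physics: upwind flux at one interface for u_t + c u_x = 0."""
    return c * (uL if c >= 0 else uR)

def residual(u, c, h):
    """Generic finite-volume residual r_i = (F_{i+1/2} - F_{i-1/2}) / h
    on a periodic grid; only riemann_upwind knows the physics."""
    n = len(u)
    # F[i] is the flux at the left face of cell i (u[-1] wraps periodically)
    F = [riemann_upwind(u[i - 1], u[i], c) for i in range(n)]
    return [(F[(i + 1) % n] - F[i]) / h for i in range(n)]
```

This is exactly the shape that ports to an accelerator: the loop is boilerplate the framework can generate, while the few lines of riemann_upwind are the only user code.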
On Sat, Dec 5, 2009 at 1:06 PM, Jed Brown jed at 59a2.org wrote:
This is an interesting proposal. Two thoughts:
Residual and Jacobian evaluation cannot be written in Python (though it can be prototyped there). After a discretization is chosen, the physics is usually representable as a tiny kernel (Riemann solvers/pointwise operations at quadrature points).
On Sat, 5 Dec 2009 13:09:33 -0600, Matthew Knepley knepley at gmail.com wrote:
Then kernels are moved to an accelerator.
These kernels necessarily involve user code (physics). It's a lot to ask users to maintain two versions of their physics, one which is debuggable and another which is fast.
As someone who has a finite-element code built upon PETSc/Sieve with the top-level code in Python, I am in favor of Barry's approach.
As Matt mentions, debugging across multiple languages is more complex. Unit testing helps solve some of this because tests associated with the low-level code involve only
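[Editor's note: a small illustration of the unit-testing point above — when the low-level code is a pure kernel, its tests involve only that kernel, with no PETSc, MPI, or Python driver in the loop. local_stiffness is a hypothetical stand-in, not an API from the thread.]

```python
def local_stiffness(h):
    """1D P1 element stiffness matrix: the kind of tiny low-level kernel
    whose tests need none of the surrounding multi-language stack."""
    k = 1.0 / h
    return [[k, -k], [-k, k]]

def test_kernel():
    K = local_stiffness(0.5)
    assert K[0][1] == K[1][0]         # symmetry
    assert K[0][0] + K[0][1] == 0.0   # constant fields lie in the null space
    assert K[0][0] == 2.0             # 1/h scaling

test_kernel()
```

Because such tests exercise a pure function, a failure localizes the bug to the kernel itself rather than to the cross-language glue.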
Somehow this drifted off the list, hopefully the deep citations provide
sufficient context.
On Sat, 5 Dec 2009 13:42:31 -0600, Matthew Knepley knepley at gmail.com wrote:
On Sat, Dec 5, 2009 at 1:32 PM, Jed Brown jed at 59a2.org wrote:
On Sat, 5 Dec 2009 13:20:20 -0600, Matthew Knepley wrote:
Cython can accelerate almost any Python code nearly immediately (although it supports a somewhat restricted subset of Python). This is simply because it converts the code to equivalent C that is compiled and run within CPython. Then, chunks of the Python code can be explicitly typed, which can
On Sat, Dec 5, 2009 at 2:29 PM, Ethan Coon etc2103 at columbia.edu wrote:
I'm a big fan of Barry's approach as well. However, the current state of debugging tools is not up to snuff for this type of model. In using petsc4py regularly, debugging cython and python (user-defined) functions
This is a very interesting issue. Suppose you write the RHSFunction in Python and pass it to SNES. Are you saying that pdb cannot stop in that method when you step over SNESSolve() in Python? That would suck. If, on the other hand, you passed it in C, I can see how you are relegated to obscure
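[Editor's note: a PETSc-free sketch of the callback pattern under debate — a solver loop (standing in for SNESSolve) repeatedly calls a user-supplied residual function; a pdb breakpoint placed inside that callback fires even when the call is driven from library code. newton_1d and its arguments are illustrative assumptions, not petsc4py API.]

```python
def newton_1d(residual, x0, tol=1e-10, max_it=50):
    """Toy scalar Newton iteration with a finite-difference Jacobian,
    standing in for a library solve that drives a user callback."""
    x = x0
    for _ in range(max_it):
        r = residual(x)   # user callback; `import pdb; pdb.set_trace()`
                          # inside `residual` would stop execution here
        if abs(r) < tol:
            break
        eps = 1e-7
        J = (residual(x + eps) - r) / eps   # finite-difference derivative
        x -= r / J
    return x

# usage: find sqrt(2) as the root of x^2 - 2
root = newton_1d(lambda x: x * x - 2.0, 1.0)
```

The debugging question is whether this still works when the outer loop lives in compiled C (SNESSolve) rather than in Python, since pdb can only step through frames the interpreter owns.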
On Fri, 4 Dec 2009 22:42:35 -0600, Barry Smith bsmith at mcs.anl.gov wrote:
generally we would have one python process per compute node and local
parallelism would be done via the low-level kernels to the cores
and/or GPUs.
I think one MPI process per node is fine for MPI performance on good
On Sat, Dec 5, 2009 at 3:50 PM, Jed Brown jed at 59a2.org wrote:
On Fri, 4 Dec 2009 22:42:35 -0600, Barry Smith bsmith at mcs.anl.gov wrote:
generally we would have one python process per compute node and local
parallelism would be done via the low-level kernels to the cores
and/or GPUs.
On Sat, 5 Dec 2009 16:02:38 -0600, Matthew Knepley knepley at gmail.com wrote:
I need to understand better. You are asking about the case where we have many GPUs and one CPU? If it's always one or two GPUs per CPU, I do not see the problem.
Barry initially proposed one Python thread per node,
On Sat, Dec 5, 2009 at 4:25 PM, Jed Brown jed at 59a2.org wrote:
On Sat, 5 Dec 2009 16:02:38 -0600, Matthew Knepley knepley at gmail.com wrote:
I need to understand better. You are asking about the case where we have many GPUs and one CPU? If it's always one or two GPUs per CPU, I do not see the problem.
On Sat, Dec 5, 2009 at 6:01 PM, Jed Brown jed at 59a2.org wrote:
On Sat, 5 Dec 2009 16:50:38 -0600, Matthew Knepley knepley at gmail.com wrote:
You assign a few threads per element to calculate the FEM integral. You could maintain this unassembled if you only need actions. You can also store it with much less memory as just values at quadrature points.
Suggestion:
1) Discard PETSc
2) Develop a general Py{CL, CUDA, OpenMP-C} system that dispatches
tasks onto GPUs and multi-core systems (generally we would have one
python process per compute node and local parallelism would be done
via the low-level kernels to the cores and/or GPUs.)
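[Editor's note: a minimal sketch of the architecture Barry proposes — one Python process per node holding a registry of kernels and dispatching each task to whichever backend (CUDA, OpenCL, multicore CPU) provides it. The class and backend names are stand-ins; a real system would launch PyCUDA/PyOpenCL or OpenMP-C code rather than plain Python callables.]

```python
class Dispatcher:
    """One instance per compute node; routes named kernels to backends."""

    def __init__(self):
        self.backends = {}   # backend name -> {kernel name: callable}

    def register(self, backend, kernel_name, fn):
        self.backends.setdefault(backend, {})[kernel_name] = fn

    def run(self, kernel_name, *args, prefer=("cuda", "opencl", "cpu")):
        # fall through the preference list to the first backend that
        # actually provides this kernel
        for b in prefer:
            fn = self.backends.get(b, {}).get(kernel_name)
            if fn is not None:
                return fn(*args)
        raise KeyError(f"no backend provides kernel {kernel_name!r}")

# usage: only a CPU implementation is registered, so dispatch falls
# through cuda and opencl to the cpu backend
d = Dispatcher()
d.register("cpu", "axpy", lambda a, x, y: [a * xi + yi for xi, yi in zip(x, y)])
result = d.run("axpy", 2.0, [1.0, 2.0], [10.0, 20.0])
```

The point of the design is that user code names a kernel once; the per-node Python process decides at run time whether it executes on the cores or a GPU.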