Yaron,
Anything is possible :-) and maybe not terribly difficult to get started.
You could use DAGetMatrix() to give you the properly pre-allocated huge Mat.
Have each process loop over the rectangular portion[s] of the domain that
it mostly owns (that is, if a rectangular portion lies ...
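Something like this untested sketch against the 2.3-era DA interface
(M, N and the stencil values are placeholders for a 5-point star, with
boundary rows simply set to identity):

  #include "petscda.h"

  int main(int argc, char **argv)
  {
    DA          da;
    Mat         A;
    PetscInt    i, j, xs, ys, xm, ym, M = 64, N = 64;
    MatStencil  row, col[5];
    PetscScalar v[5];

    PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);
    DACreate2d(PETSC_COMM_WORLD, DA_NONPERIODIC, DA_STENCIL_STAR,
               M, N, PETSC_DECIDE, PETSC_DECIDE, 1, 1,
               PETSC_NULL, PETSC_NULL, &da);
    DAGetMatrix(da, MATMPIAIJ, &A);             /* pre-allocated Mat */
    DAGetCorners(da, &xs, &ys, PETSC_NULL, &xm, &ym, PETSC_NULL);
    for (j = ys; j < ys + ym; j++) {
      for (i = xs; i < xs + xm; i++) {          /* owned nodes only */
        row.i = i; row.j = j;
        if (i == 0 || j == 0 || i == M - 1 || j == N - 1) {
          v[0] = 1.0;                           /* boundary: identity row */
          MatSetValuesStencil(A, 1, &row, 1, &row, v, INSERT_VALUES);
        } else {                                /* interior: 5-point star */
          col[0].i = i;     col[0].j = j;     v[0] =  4.0;
          col[1].i = i - 1; col[1].j = j;     v[1] = -1.0;
          col[2].i = i + 1; col[2].j = j;     v[2] = -1.0;
          col[3].i = i;     col[3].j = j - 1; v[3] = -1.0;
          col[4].i = i;     col[4].j = j + 1; v[4] = -1.0;
          MatSetValuesStencil(A, 1, &row, 5, col, v, INSERT_VALUES);
        }
      }
    }
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
    MatDestroy(A);
    DADestroy(da);
    PetscFinalize();
    return 0;
  }

Since DAGetMatrix() returns the Mat with the right parallel layout and
preallocation for the DA's stencil, the MatSetValuesStencil() calls above
should never trigger extra mallocs during assembly.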
On 2/2/07, Shi Jin jinzishuai at yahoo.com wrote:
I found out that on a shared-memory machine (60 GB RAM,
16 CPUs), the code runs around 4 times slower than
on a distributed-memory cluster (4 GB RAM, 4 CPUs/node),
although they yield identical results.
However, I read the PETSc FAQ and found ...
There are 2 aspects to performance.
- MPI performance [while message passing]
- sequential performance for the numerical stuff.
So it could be that the SMP box has better MPI performance. This can
be verified with -log_summary from both runs [and looking at the
VecScatter times].
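For example, on each machine:

  mpiexec -n 4 ./yourcode -log_summary > out.log

[mpirun on some systems] and then compare the VecScatterBegin/VecScatterEnd
and MatMult rows in the two summaries: the scatter lines show the
message-passing cost, MatMult the memory-bandwidth-bound kernel.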
On Fri, 2 Feb 2007, Satish Balay wrote:
However, with the sequential numerical codes it primarily depends
upon the bandwidth between the CPU and the memory. On the SMP box,
depending upon how the memory subsystem is designed, the effective
memory bandwidth per CPU could be a small fraction ...
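A crude way to see this, independent of PETSc, is a triad-style loop
run as one copy per CPU at the same time (untested sketch; n and the
repeat count are arbitrary):

  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/time.h>

  /* Crude triad-style bandwidth probe: a[i] = b[i] + s*c[i].
     Start one copy per CPU simultaneously and compare the
     per-copy rate with a single-copy run. */

  static double wtime(void)
  {
    struct timeval tv;
    gettimeofday(&tv, 0);
    return tv.tv_sec + 1e-6 * tv.tv_usec;
  }

  int main(void)
  {
    const size_t n = 20000000;            /* 3 x 160 MB working set */
    double *a = malloc(n * sizeof(double));
    double *b = malloc(n * sizeof(double));
    double *c = malloc(n * sizeof(double));
    double  s = 3.0, t;
    size_t  i;
    int     rep, nrep = 10;

    if (!a || !b || !c) return 1;
    for (i = 0; i < n; i++) { a[i] = 0.0; b[i] = 1.0; c[i] = 2.0; }
    t = wtime();
    for (rep = 0; rep < nrep; rep++)
      for (i = 0; i < n; i++) a[i] = b[i] + s * c[i];
    t = wtime() - t;
    /* print a value of a[] so the compiler cannot drop the loop;
       traffic counted as 2 reads + 1 write of 8 bytes per element */
    printf("check %g  rate %.0f MB/s\n", a[n / 2],
           nrep * 3.0 * n * sizeof(double) / 1e6 / t);
    return 0;
  }

If the per-copy MB/s drops sharply as you add copies on the SMP box,
that is exactly the "small fraction" effect.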
There is a point that is not clear to me.
When you run on your shared-memory machine...
- Are you running your code as a 'sequential' program with a global,
  shared memory space?
- Or are you running it through MPI, as a distributed-memory
  application using MPI message passing (where ...
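(One quick way to check: a standard PETSc binary is always an MPI
program, so printing the communicator size shows how many processes are
actually passing messages, whatever memory they share underneath:

  PetscMPIInt size;
  MPI_Comm_size(PETSC_COMM_WORLD, &size);
  PetscPrintf(PETSC_COMM_WORLD, "Running with %d MPI processes\n", size);

With -np 1 you get the 'sequential' case; with -np 16 it is message
passing even though the memory is physically shared.)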
Hi Ben,
It will probably work, but it will be more expensive. If you use an
implicit algorithm to solve the flow, it really pays off to have the
boundary conditions implicit as well. Explicit boundary conditions
mean you will need additional iterations, which is really unnecessary
in your ...
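For a concrete picture, a sketch of the implicit treatment (untested;
brows[]/nb, the global indices of the Dirichlet rows, and the value g
are placeholders, and it assumes the interior assembly puts nothing
else into those rows; otherwise MatZeroRows after assembly does the
same job):

  PetscScalar one = 1.0, g = 0.0;   /* g: the Dirichlet value */
  PetscInt    k;

  for (k = 0; k < nb; k++) {        /* brows[0..nb-1]: boundary rows */
    MatSetValues(A, 1, &brows[k], 1, &brows[k], &one, INSERT_VALUES);
    VecSetValues(b, 1, &brows[k], &g, INSERT_VALUES);
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
  VecAssemblyBegin(b);
  VecAssemblyEnd(b);
  /* KSPSolve(ksp, b, x) now enforces x[brows[k]] = g as part of the
     implicit solve */

The boundary condition is then satisfied by the same solve that computes
the interior, with no extra outer iterations to impose it afterwards.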