problem compiling PETSC on MacOS Leopard

2007-11-18 Thread Bernard Knaepen
 unsigned char *,
  unsigned);
void setprogname(const char *);
int sradixsort(const unsigned char **, int, const unsigned char *,
  unsigned);
void sranddev(void);
void srandomdev(void);
void *reallocf(void *, size_t);
long long
   strtoq(const char *, char **, int);
unsigned long long
   strtouq(const char *, char **, int);
extern char *suboptarg;
void *valloc(size_t);
# 3 conftest.c 2

 Popping language C
================================================================================

TEST checkCxxCompiler from config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:541)
TESTING: checkCxxCompiler from config.setCompilers(python/BuildSystem/config/setCompilers.py:541)
   Locate a functional Cxx compiler
================================================================================

TEST checkFortranCompiler from config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:708)
TESTING: checkFortranCompiler from config.setCompilers(python/BuildSystem/config/setCompilers.py:708)
   Locate a functional Fortran compiler
Checking for program /opt/intel/fc/10.0.020/bin/mpif90...not found
Checking for program /usr/X11R6/bin/mpif90...not found
Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mpif90...not found
Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpif90...found
   Defined make macro FC to mpif90
   Pushing language FC
sh: mpif90 -c -o conftest.o   conftest.F
Executing: mpif90 -c -o conftest.o   conftest.F
sh:
Possible ERROR while running compiler: ret = 256
error message = {ifort: error #10106: Fatal error in /opt/intel/fc/10.0.020/bin/fpp, terminated by segmentation violation
}
Source:
   program main

   end
   Popping language FC
   Error testing Fortran compiler: Cannot compile FC with mpicc.
MPI installation mpif90 is likely incorrect.
   Use --with-mpi-dir to indicate an alternate MPI.
*********************************************************************
  UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details):
---------------------------------------------------------------------
Fortran compiler you provided with --with-fc=mpif90 does not work
*********************************************************************
   File "./config/configure.py", line 190, in petsc_configure
     framework.configure(out = sys.stdout)
   File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/framework.py", line 878, in configure
     child.configure()
   File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py", line 1267, in configure
     self.executeTest(self.checkFortranCompiler)
   File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/base.py", line 93, in executeTest
     return apply(test, args, kargs)
   File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py", line 714, in checkFortranCompiler
     for compiler in self.generateFortranCompilerGuesses():
   File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py", line 631, in generateFortranCompilerGuesses
     raise RuntimeError('Fortran compiler you provided with --with-fc='+self.framework.argDB['with-fc']+' does not work')



problem compiling PETSC on MacOS Leopard

2007-11-18 Thread Barry Smith

Please direct these problems to petsc-maint instead of petsc-users.

From the log file
Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpif90...found
Defined make macro FC to mpif90
Pushing language FC
sh: mpif90 -c -o conftest.o   conftest.F
Executing: mpif90 -c -o conftest.o   conftest.F
sh:
Possible ERROR while running compiler: ret = 256
error message = {ifort: error #10106: Fatal error in /opt/intel/fc/10.0.020/bin/fpp, terminated by segmentation violation
}
Source:
program main

end

So mpif90 is crashing on a simple Fortran program with nothing
in it. Can you try compiling exactly as above from the command line?


Barry


On Sun, 18 Nov 2007, Bernard Knaepen wrote:

 Hello,

 I would like to compile PETSc on Leopard but I am encountering a problem 
 during configuration. The script stops with:

 dolfin:petsc-2.3.3-p8 bknaepen$ ./config/configure.py --with-cc=mpicc 
 --with-fc=mpif90 --with-cxx=mpicxx

 ==========================================================================
Configuring PETSc to compile on your system
 ==========================================================================
 TESTING: checkFortranCompiler from config.setCompilers(python/BuildSystem/config/setCompilers.py:708)
 *********************************************************************
 UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details):
 ---------------------------------------------------------------------
 Fortran compiler you provided with --with-fc=mpif90 does not work
 *********************************************************************


 My MPI installation is mpich2 1.0.6p1 and I have the latest ifort compiler 
 installed (10.0.20). I have tested mpif90 and it works OK. I have copied the 
 configure.log file below.


 Any help would be appreciated, thanks,

 Bernard.




   Pushing language C
   Popping language C
   Pushing language Cxx
   Popping language Cxx
   Pushing language FC
   Popping language FC
 sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/packages/config.guess
 Executing: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/packages/config.guess
 sh: i686-apple-darwin9.1.0

 sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/packages/config.sub i686-apple-darwin9.1.0
 Executing: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/packages/config.sub i686-apple-darwin9.1.0
 sh: i686-apple-darwin9.1.0

 
 
 Starting Configure Run at Sun Nov 18 10:29:29 2007
 Configure Options: --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx 
 --with-shared=0 --configModules=PETSc.Configure 
 --optionsModule=PETSc.compilerOptions
 Working directory: /Users/bknaepen/Unix/petsc-2.3.3-p8
 Python version:
 2.5.1 (r251:54863, Oct  5 2007, 21:08:09)
 [GCC 4.0.1 (Apple Inc. build 5465)]
 
   Pushing language C
   Popping language C
   Pushing language Cxx
   Popping language Cxx
   Pushing language FC
   Popping language FC
 
 TEST configureExternalPackagesDir from config.framework(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/framework.py:807)
 TESTING: configureExternalPackagesDir from config.framework(python/BuildSystem/config/framework.py:807)

 TEST configureLibrary from PETSc.packages.PVODE(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/packages/PVODE.py:10)
 TESTING: configureLibrary from PETSc.packages.PVODE(python/PETSc/packages/PVODE.py:10)
 Find a PVODE installation and check if it can work with PETSc

 TEST configureLibrary from PETSc.packages.NetCDF(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/packages/NetCDF.py:10)
 TESTING: configureLibrary from PETSc.packages.NetCDF(python/PETSc/packages/NetCDF.py:10)
 Find a NetCDF installation and check if it can work with PETSc

 TEST configureMercurial from config.sourceControl(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:23)
 TESTING: configureMercurial from config.sourceControl(python/BuildSystem/config/sourceControl.py:23)
 Find the Mercurial executable
 Checking for program 
problem compiling PETSC on MacOS Leopard

2007-11-18 Thread Satish Balay

Executing: mpif90 -c -o conftest.o   conftest.F
sh:
Possible ERROR while running compiler: ret = 256
error message = {ifort: error #10106: Fatal error in 
/opt/intel/fc/10.0.020/bin/fpp, terminated by segmentation violation


ifort is giving a SEGV (segmentation violation) - hence configure failed. There
must be some compatibility issue between ifort and Leopard.

Satish

On Sun, 18 Nov 2007, Bernard Knaepen wrote:

 Hello,
 
 I would like to compile PETSc on Leopard but I am encountering a problem
 during configuration. The script stops with:
 
 dolfin:petsc-2.3.3-p8 bknaepen$ ./config/configure.py --with-cc=mpicc
 --with-fc=mpif90 --with-cxx=mpicxx




Parallel ISCreateGeneral()

2007-11-18 Thread Tim Stitt
Hi all,

Just wanted to know whether the length of the index set passed to 
ISCreateGeneral() in a parallel code is the global length, or the length 
of the local elements on each process?

Thanks,

Tim.

-- 
Dr. Timothy Stitt timothy_dot_stitt_at_ichec.ie
HPC Application Consultant - ICHEC (www.ichec.ie)

Dublin Institute for Advanced Studies
5 Merrion Square - Dublin 2 - Ireland

+353-1-6621333 (tel) / +353-1-6621477 (fax)




Parallel ISCreateGeneral()

2007-11-18 Thread Matthew Knepley
IS are not really parallel, so all the lengths, etc. only refer to local things.

  Matt

On Nov 18, 2007 11:22 AM, Tim Stitt timothy.stitt at ichec.ie wrote:
 Hi all,

 Just wanted to know whether the length of the index set passed to
 ISCreateGeneral() in a parallel code is the global length, or the length
 of the local elements on each process?

 Thanks,

 Tim.

 --
 Dr. Timothy Stitt timothy_dot_stitt_at_ichec.ie
 HPC Application Consultant - ICHEC (www.ichec.ie)

 Dublin Institute for Advanced Studies
 5 Merrion Square - Dublin 2 - Ireland

 +353-1-6621333 (tel) / +353-1-6621477 (fax)





-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener




Parallel ISCreateGeneral()

2007-11-18 Thread Matthew Knepley
On Nov 18, 2007 11:34 AM, Tim Stitt timothy.stitt at ichec.ie wrote:
 OK..so I should be using the aggregate length returned by
 MatGetOwnershipRange() routine?

If you are using it to permute a Mat, yes.

  Matt

 Thanks Matt for your help.







-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener




Parallel ISCreateGeneral()

2007-11-18 Thread Tim Stitt
Matt,

It is in the setup for the MatLUFactorSymbolic() and MatLUFactorNumeric() 
calls, which require index sets. I have distributed my rows across the 
processes and am now just a bit confused about the arguments to the 
ISCreateGeneral() routine for setting up the IS sets used by the factor 
routines in parallel.

So my basic question is: what, in general, are the length and the integers 
that get passed to ISCreateGeneral() when doing this type of calculation in 
parallel? Are they local index values (0..#rows on process-1) or do they 
refer to the distributed indices of the global matrix?

Tim.

Matthew Knepley wrote:
 On Nov 18, 2007 11:34 AM, Tim Stitt timothy.stitt at ichec.ie wrote:
   
 OK..so I should be using the aggregate length returned by
 MatGetOwnershipRange() routine?
 

 If you are using it to permute a Mat, yes.

   Matt

   


-- 
Dr. Timothy Stitt timothy_dot_stitt_at_ichec.ie
HPC Application Consultant - ICHEC (www.ichec.ie)

Dublin Institute for Advanced Studies
5 Merrion Square - Dublin 2 - Ireland

+353-1-6621333 (tel) / +353-1-6621477 (fax)




Parallel ISCreateGeneral()

2007-11-18 Thread Matthew Knepley
On Nov 18, 2007 11:52 AM, Tim Stitt timothy.stitt at ichec.ie wrote:
 Matt,

 It is in setup for MatLUFactorSymbolic() and MatLUFactorNumeric() calls
 which require index sets. I have distributed my rows across the
 processes and now just a bit confused about the arguments to the
 ISCreateGeneral() routine to set up the IS sets used by the Factor
 routines in parallel.

 So my basic question is what in general is the length and integers that
 get passed to ISCreateGeneral() when doing this type of calculation in
 parallel? Are they local index values (0..#rows on process-1) or do they
 refer to the distributed indices of the global matrix?

To be consistent, these would be local sizes and global numberings. However,
I am not sure why you would be doing this. I do not believe any of the parallel
LU packages accept an ordering from the user (they calculate their own),
and I would really only use them from a KSP (or PC at the least).

  Matt

 Tim.







-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener




Parallel ISCreateGeneral()

2007-11-18 Thread Tim Stitt
Oh...ok I am now officially confused.

I have developed a serial code for getting the first k rows of the 
inverse of a sparse matrix, thanks to help from the PETSc 
users/developers this past week.

In that code I was calling MatLUFactorSymbolic() and 
MatLUFactorNumeric() to factor the sparse matrix, and then calling 
MatSolve() with each of the first k columns of the identity matrix as the 
RHS. I then varied the matrix type from the command line to test MUMPS, 
SuperLU, etc. for the best performance.

Now I just want to translate the code into a parallel version, so I now 
assemble rows in a distributed fashion and am working on translating 
the MatLUFactorSymbolic() and MatLUFactorNumeric() calls, which require 
index sets; hence my original question.

Are you saying that I now shouldn't be calling those routines?

Tim.

Matthew Knepley wrote:
 On Nov 18, 2007 11:52 AM, Tim Stitt timothy.stitt at ichec.ie wrote:
   
 Matt,

 It is in setup for MatLUFactorSymbolic() and MatLUFactorNumeric() calls
 which require index sets. I have distributed my rows across the
 processes and now just a bit confused about the arguments to the
 ISCreateGeneral() routine to set up the IS sets used by the Factor
 routines in parallel.

 So my basic question is what in general is the length and integers that
 get passed to ISCreateGeneral() when doing this type of calculation in
 parallel? Are they local index values (0..#rows on process-1) or do they
 refer to the distributed indices of the global matrix?
 

 To be consistent, these would be local sizes and global numberings. However,
 I am not sure why you would be doing this. I do not believe any of the 
 parallel
 LU packages accept an ordering from the user (they calculate their own),
 and I would really only use them from a KSP (or PC at the least).

   Matt

   


-- 
Dr. Timothy Stitt timothy_dot_stitt_at_ichec.ie
HPC Application Consultant - ICHEC (www.ichec.ie)

Dublin Institute for Advanced Studies
5 Merrion Square - Dublin 2 - Ireland

+353-1-6621333 (tel) / +353-1-6621477 (fax)




Dual core performance estimate

2007-11-18 Thread Ben Tay
Hi,

Someone was talking about Core 2 Duo performance on OS X in a previous
email. It seems that, due to memory issues, it's not possible to get 2x the
performance. There was also some mention of AMD vs. Intel dual core.

For computation using PETSc, is there any reason to buy one instead of the
other? Also, suppose I use WinXP + MPICH2 + PETSc on a dual core: what sort
of performance increase can we expect compared to PETSc without MPI on the
same machine?

Or is that too difficult to answer, since there are too many factors?

thank you

regards


Dual core performance estimate

2007-11-18 Thread Aron Ahmadia
Hi Ben,

You're asking a question that is very specific to the program you're
running.  I think the general consensus on this list has been that for
the more common uses of PETSc, getting dual-cores will not speed up
your performance as much as dual-processors.  For OS X, dual-cores are
pretty much the baseline now, so I wouldn't worry too much about it.

~A

On Nov 18, 2007 3:34 PM, Ben Tay zonexo at gmail.com wrote:
 Hi,

 someone was talking abt core 2 duo performance on os x in some previous
 email. it seems that due to memory issues, it's not possible to get 2x the
 performance. there's also some mention of amd vs intel dual core.

 for computation using PETSc, is there any reason to buy one instead of the
 other? Also, supposed I use winxp + mpich2 + PETSc on a dual core, what sort
 of performance increase can we expect as compared to PETSc + nompi on the
 same machine?

 or is that too difficult an answer to give since there are too many factors?

 thank you

 regards




Dual core performance estimate

2007-11-18 Thread Gideon Simpson
I asked the original question, and I have a follow-up. Like it or not,
multi-core CPUs have been thrust upon us by the manufacturers, and many of
us are more likely to have access to a shared-memory, multi-core/multi-processor
machine than to a properly built cluster with MPI in mind.

So, two questions in this direction:

1.  How feasible would it be to implement OpenMP in PETSc so that
multi-core CPUs could be properly used?

2.  Even if we are building a cluster, it looks like AMD/Intel are
thrusting multi-core upon us. To that end, what is the feasibility of
merging MPI and OpenMP, so that between nodes we use MPI, but within
each node OpenMP is used to take advantage of the multiple cores?

-gideon

On Nov 18, 2007, at 3:53 PM, Aron Ahmadia wrote:

 Hi Ben,

 You're asking a question that is very specific to the program you're
 running.  I think the general consensus on this list has been that for
 the more common uses of PETSc, getting dual-cores will not speed up
 your performance as much as dual-processors.  For OS X, dual-cores are
 pretty much the baseline now, so I wouldn't worry too much about it.

 ~A





Dual core performance estimate

2007-11-18 Thread Barry Smith

   Gideon,

On Sun, 18 Nov 2007, Gideon Simpson wrote:

 I asked the original question, and I have a follow up.  Like it or not, 
 multi-core CPUs have been thrust upon us by the manufacturers and many of us 
 are more likely to have access to a shared memory, multi core/multi processor 
 machine, than a properly built cluster with MPI in mind.

 So, two questions in this direction:

 1.  How feasible would it be to implement  OpenMP in PETSc so that multi core 
 CPUs could be properly used?

 2.  Even if we are building a cluster, it looks like AMD/Intel are thrusting 
 multi-core upon us. To that end, what is the feasibility of merging MPI and 
 OpenMP, so that between nodes we use MPI, but within each node OpenMP is 
 used to take advantage of the multiple cores?

 -gideon


Unfortunately, using MPI+OpenMP on multi-core systems for the iterative
solution of linear systems will not help AT ALL. Sparse matrix algorithms
(like matrix-vector products and triangular solves) are memory-bandwidth limited.
The speed of the memory is not enough to support 2 (or more) processes both
trying to pull sparse matrices from memory at the same time; the details of
the parallelism are not the issue.

   Now it is possible that other parts of a PETSc code, like evaluating
nonlinear functions, evaluating Jacobians, and other stuff, may NOT be
memory-bandwidth limited. Those parts of the code might benefit from using
OpenMP on those pieces, while using only a single thread in the linear
solvers. That is, you would run PETSc with one MPI process per node, and
in parts of your code you would use OpenMP loop-level parallelism or
OpenMP task parallelism.

Barry


