
Re: [petsc-users] PETSc/TAU configuration

2014-09-23 Thread Matthew Hills



Hi PETSc Team,


I successfully configured PETSc with TAU using:

  ./configure --with-mpi=1 --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 \
    --download-f-blas-lapack=${SESKADIR}/packages/downloads/fblaslapacklinpack-3.1.1.tar.gz

  make PETSC_DIR=/home/hills/seska/packages/petsc PETSC_ARCH=linux-gnu-cxx-opt all

While building I received this error message:

[100%] Built target petsc

**ERROR
  Error during compile, check linux-gnu-cxx-opt/conf/make.log
  Send it and linux-gnu-cxx-opt/conf/configure.log to petsc-ma...@mcs.anl.gov



Because of their large file size, I've included links to the log files below.
Any assistance would be greatly appreciated.

make.log - https://www.dropbox.com/s/n6y8nl8go4s41tq/make.log?dl=0
configure.log - https://www.dropbox.com/s/ipvt4urvgldft8x/configure.log?dl=0


Matthew 


 Date: Mon, 22 Sep 2014 10:24:15 -0500
 From: ba...@mcs.anl.gov
 To: hillsma...@outlook.com
 CC: petsc-users@mcs.anl.gov
 Subject: Re: [petsc-users] PETSc/TAU configuration
 
 On Mon, 22 Sep 2014, Matthew Hills wrote:
 
  Hi PETSc Team,
  
  I'm still experiencing difficulties with configuring PETSc with TAU. My
  current steps are:
  
  building OpenMPI
  1. ./configure --prefix=${SESKADIR}/packages/openmpi 
  2. make all install
  
  set library and binary paths
  1. export LD_LIBRARY_PATH=${SESKADIR}/lib:${SESKADIR}/packages/openmpi/lib:${SESKADIR}/packages/pdt/x86_64/lib:${SESKADIR}/packages/tau/x86_64/lib:${SESKADIR}/packages/petsc/${PETSC_ARCH}/lib:$LD_LIBRARY_PATH
  2. export PATH=${SESKADIR}/bin:${SESKADIR}/packages/petsc/${PETSC_ARCH}/bin:$PATH
  
  build PDT (pdtoolkit-3.20)
  1. ./configure -GNU
  2. export PATH=${SESKADIR}/packages/pdt/x86_64/bin:$PATH
  3. make
  4. make install
  
  build TAU (tau-2.23.1) using OpenMPI
  1. ./configure -prefix=`pwd` -cc=mpicc -c++=mpicxx -fortran=mpif90 
  -pdt=${SESKADIR}/packages/pdt -mpiinc=${SESKADIR}/packages/openmpi/include 
  -mpilib=${SESKADIR}/packages/openmpi/lib -bfd=download
  2. export PATH=${SESKADIR}/packages/tau/x86_64/bin:$PATH
  3. make install
  
  build fblaslapacklinpack-3.1.1
  1. make
 
 Should have said '--download-fblaslapack' would be fine here [as it
 uses mpif90 - not tau_cc.sh]. Building separately is also fine.
 
  build PETSc using TAU_CC/MPI
  1. export 
  TAU_MAKEFILE=${SESKADIR}/packages/tau/x86_64/lib/Makefile.tau-mpi-pdt
  2. ./configure --prefix='pwd' --with-mpi=1 --with-cc=tau_cc.sh  
  --with-cxx=mpicxx --with-fc=mpif90  
  --with-blas-lapack-dir=${SESKADIR}/packages/fblaslapack
 
 --prefix='pwd' doesn't make sense. Please remove it.
 
  Error: Tried looking for file: 
  /tmp/petsc-U9YCMv/config.setCompilers/conftest
  Error: Failed to link with TAU options
  Error: Command(Executable) is -- gcc
 
 configure.log looks complete [and indicates a successful run]. Did the
 messages above come up during the configure step on the terminal?
 
 Can you try the following and see if PETSc builds successfully? [but
 recommend rerunning configure first - without --prefix option]
 
 make PETSC_DIR=/home/hills/seska/packages/petsc PETSC_ARCH=linux-gnu-cxx-opt 
 all
 
 Satish
 
  
  
  Attached you'll find my configure log. Any assistance would be greatly 
  appreciated.
  
  Warm regards,
  Matthew
  
   Date: Tue, 16 Sep 2014 08:21:41 -0500
   From: ba...@mcs.anl.gov
   To: hillsma...@outlook.com
   CC: petsc-users@mcs.anl.gov
   Subject: Re: [petsc-users] PETSc/TAU configuration
   
   I haven't tried using TAU in a while - but here are some obvious things 
   to try.
   
   1. --download-mpich [or openmpi] with TAU does not make sense.
   
   You would have to build MPICH/OpenMPI first.
   
   Then build TAU to use this MPI.
   
   And then build PETSc to use this TAU_CC/MPI
   
   2. I would use only tau_cc.sh - and not bother with c++/fortran
   
   i.e. [with TAU built with a given mpicc] - configure PETSc with:
   ./configure --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90
   
   3. Do not use any --download-package options when using tau_cc.sh. First check
   if you are able to use TAU with PETSc - without external packages [you
   would need BLAS and MPI: use the system blas/lapack - and
   build MPI as mentioned above for use with TAU and later PETSc]
   
   And if you really need these external packages [assuming the above basic
   build with TAU works] - I would recommend the following two-step build
   process:
   
   build packages without TAU
   4.1. ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 
   --download-PACKAGE PETSC_ARCH=arch-packages
   
   4.2. Now strip out the petsc-relevant stuff from this location
   rm -f arch-packages/include/petsc*.h
   
   4.3. Now build PETSc with TAU - using these prebuilt-packages
   
   ./configure --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 
  

Re: [petsc-users] GPU speedup in Poisson solvers

2014-09-23 Thread Dominic Meiser

Hi Karli,

PR #178 gets you most of the way. src/ksp/ksp/examples/tests/ex32.c uses 
DMDA's which require a few additional fixes. I haven't opened a pull 
request for these yet but I will do that before Thursday.


Regarding the rebase, wouldn't it be preferable to just resolve the 
conflicts in the merge commit? In any event, I've merged these branches 
several times into local integration branches created off of recent 
petsc/master branches so I'm pretty familiar with the conflicts and how 
to resolve them. I can help with the merge or do a rebase, whichever you 
prefer.


Cheers,
Dominic


On 09/22/2014 10:37 PM, Karl Rupp wrote:

Hi Dominic,

I've got some time available at the end of this week for a merge to 
next. Is there anything other than PR #178 needed? It currently shows 
some conflicts, so is there any chance to rebase it on ~Thursday?


Best regards,
Karli



On 09/22/2014 09:38 PM, Dominic Meiser wrote:

On 09/22/2014 12:57 PM, Chung Shen wrote:

Dear PETSc Users,

I am new to PETSc and trying to determine if GPU speedup is possible
with the 3D Poisson solvers. I configured 2 copies of 'petsc-master'
on a standalone machine, one with CUDA toolkit 5.0 and one without
(both without MPI):
Machine: HP Z820 Workstation, Redhat Enterprise Linux 5.0
CPU: (x2) 8-core Xeon E5-2650 2.0GHz, 128GB Memory
GPU: (x2) Tesla K20c (706MHz, 5.12GB Memory, Cuda Compatibility: 3.5,
Driver: 313.09)

I used 'src/ksp/ksp/examples/tests/ex32.c' as a test and was getting
about 20% speedup with GPU. Is this reasonable or did I miss something?

Attached is a comparison chart with two sample logs. The y-axis is the
elapsed time in seconds and the x-axis corresponds to the size of the
problem. In particular, I wonder whether the numbers of calls to
'VecCUSPCopyTo' and 'VecCUSPCopyFrom' shown in the GPU log are
excessive.


Thanks in advance for your reply.

Best Regards,

Chung Shen

A few comments:

- To get reliable timing you should configure PETSc without debugging
(i.e. --with-debugging=no)
- The ILU preconditioning in your GPU benchmark is done on the CPU. The
host-device data transfers are killing performance. Can you try to run
with the additional option -pc_factor_mat_solver_package cusparse? This
will perform the preconditioning on the GPU (a sketch of the equivalent
code follows after this list).
- If you're interested in running benchmarks in parallel you will need a
few patches that are not yet in petsc/master. I can put together a
branch that has the needed fixes.
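
On the solver-package point above, the same selection can also be made in
code; this is a minimal sketch, assuming the PETSc 3.5-era names
PCFactorSetMatSolverPackage and MATSOLVERCUSPARSE, equivalent to the
command-line option:

  #include <petscksp.h>

  /* Sketch: keep the ILU factorization and triangular solves on the GPU by
     selecting the cusparse solver package for the preconditioner. */
  PetscErrorCode SetGPUFactorization(KSP ksp)
  {
    PC             pc;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCSetType(pc,PCILU);CHKERRQ(ierr);
    ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERCUSPARSE);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }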

Cheers,
Dominic






--
Dominic Meiser
Tech-X Corporation
5621 Arapahoe Avenue
Boulder, CO 80303
USA
Telephone: 303-996-2036
Fax: 303-448-7756
www.txcorp.com



[petsc-users] SNESSetJacobian

2014-09-23 Thread anton
Starting from version 3.5 the matrix parameters in SNESSetJacobian are 
no longer pointers, hence my question:
What is the most appropriate place to call SNESSetJacobian if I need to
change the Jacobian during the solution?

What about FormFunction?

Thanks,
Anton


Re: [petsc-users] Valgrind Errors

2014-09-23 Thread Hong
James,
The fix is pushed to petsc-maint (release)
https://bitbucket.org/petsc/petsc/commits/c974faeda5a26542265b90934a889773ab380866

Thanks for your report!

Hong

On Mon, Sep 15, 2014 at 5:05 PM, Hong hzh...@mcs.anl.gov wrote:
 James :
 I'm fixing it in branch
 hzhang/matmatmult-bugfix
 https://bitbucket.org/petsc/petsc/commits/a7c7454dd425191f4a23aa5860b8c6bac03cfd7b

 Once it is further cleaned, and other routines are checked, I will
 patch petsc-release.

 Hong

 Hi Barry,

 Thanks for the response. You're right: neither ex70 nor my own code gives
 those valgrind errors when run in parallel. Changing the
 type to MATAIJ also fixes the issue.

 Thanks for the help, I appreciate it.

 James





 -Original Message-
 From: Hong [mailto:hzh...@mcs.anl.gov]
 Sent: Friday, September 12, 2014 4:29 PM
 To: Dominic Meiser
 Cc: Barry Smith; James Balasalle; Zhang, Hong; petsc-users@mcs.anl.gov
 Subject: Re: [petsc-users] Valgrind Errors

 I'll check it.
 Hong

 On Fri, Sep 12, 2014 at 3:40 PM, Dominic Meiser dmei...@txcorp.com
 wrote:
  On 09/12/2014 02:11 PM, Barry Smith wrote:
 
  James (and Hong),
 
   Do you ever see this problem in parallel runs?
 
   You are not doing anything wrong.
 
   Here is what is happening.
 
  MatGetBrowsOfAoCols_MPIAIJ(), which is used by
  MatMatMult_MPIAIJ_MPIAIJ(), assumes that the VecScatters for the
  matrix-vector products are

 gen_to   = (VecScatter_MPI_General*)ctx->todata;
 gen_from = (VecScatter_MPI_General*)ctx->fromdata;
 
  but when run on one process the scatters are not of that form; hence
  the code accesses values in what it thinks is one struct but is
  actually a different one. Hence the valgrind errors.
 
  But since the matrix only lives on one process there is actually
  nothing to move between processes, hence no error happens in the
  computation. You can avoid the issue completely by using MATAIJ
  as the matrix type instead of MATMPIAIJ; on one process it then
 automatically uses MATSEQAIJ.
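
 A minimal sketch of that suggestion (the sizes and function name are
 illustrative, not from the thread):

  #include <petscmat.h>

  /* Create the matrix with the communicator-adaptive MATAIJ type: on one
     process this becomes MATSEQAIJ, on several it becomes MATMPIAIJ, so a
     single-process run never touches the MPI scatter structs. */
  PetscErrorCode CreateAdaptiveAIJ(MPI_Comm comm,PetscInt n,Mat *A)
  {
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatCreate(comm,A);CHKERRQ(ierr);
    ierr = MatSetSizes(*A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
    ierr = MatSetType(*A,MATAIJ);CHKERRQ(ierr);
    ierr = MatSetUp(*A);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }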
 
  I don’t think the bug has anything in particular to do with the
  MatTranspose.
 
 Hong,
 
    Can you please fix this code? Essentially you can bypass parts
  of the code when the Mat is on only one process. (Maybe this also
  happens for MPIBAIJ matrices?) Send a response letting me know you
 saw this.
 
  Thanks
 
Barry
 
  I had to fix a few issues similar to this a while back. The method
  VecScatterGetTypes_Private introduced in pull request 176 might be
  useful in this context.
 
  Cheers,
  Dominic
 











Re: [petsc-users] superlu_dist and MatSolveTranspose

2014-09-23 Thread Hong
Antoine,
I just found out that superlu_dist does not support MatSolveTranspose yet
(see Sherry's email below).
Once superlu_dist provides this support, we can add it to the
petsc/superlu_dist interface.

Thanks for your patience.

Hong
---

Hong,
Sorry, the transposed solve is not there yet; it's not as simple as the
serial version, because here it requires setting up an entirely different
communication pattern.

I will try to find time to do it.

Sherry

On Tue, Sep 23, 2014 at 8:11 AM, Hong hzh...@mcs.anl.gov wrote:

 Sherry,
 Can superlu_dist be used for solving A^T x = b?

 Using the option
 options.Trans = TRANS;
 with the existing petsc-superlu_dist interface, I cannot get correct solution.

 Hong


On Mon, Sep 22, 2014 at 12:47 PM, Hong hzh...@mcs.anl.gov wrote:
 I'll add it. It would not take too long, just a matter of priority.
 I'll try to get it done in a day or two, then let you know when it works.

 Hong

 On Mon, Sep 22, 2014 at 12:11 PM, Antoine De Blois
 antoine.debl...@aero.bombardier.com wrote:
 Dear all,

 Sorry for the delay on this topic.

 Thank you Gaetan for your suggestion. I had thought about doing that 
 originally, but I had left it out since I thought that a rank owned the 
 entire row of the matrix (and not only the sub-diagonal part). I will 
 certainly give it a try.

 I still need the MatSolveTranspose since I need the ability to reuse the 
 residual jacobian matrix from the flow (a 1st order approximation of it), 
 which is assembled in a non-transposed format. This way the adjoint system 
 is solved in a pseudo-time step manner, where the product of the exact 
 jacobian matrix and the adjoint vector is used as a source term in the rhs.
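
 For illustration only, one pseudo-time adjoint update of the kind described
 above might look like the following sketch; every name (Jexact, dIdu, lambda,
 ...) is hypothetical and the signs depend on your formulation:

  #include <petscksp.h>

  /* One pseudo-time adjoint iteration: the transposed product of the exact
     Jacobian with the current adjoint vector feeds the right-hand side, and
     the transposed solve with the approximate residual Jacobian (already set
     on the KSP) gives the correction. All names are illustrative. */
  PetscErrorCode AdjointPseudoStep(KSP ksp,Mat Jexact,Vec dIdu,Vec lambda,Vec rhs,Vec dlambda)
  {
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatMultTranspose(Jexact,lambda,rhs);CHKERRQ(ierr); /* rhs = Jexact^T * lambda      */
    ierr = VecAYPX(rhs,-1.0,dIdu);CHKERRQ(ierr);              /* rhs = dIdu - Jexact^T*lambda */
    ierr = KSPSolveTranspose(ksp,rhs,dlambda);CHKERRQ(ierr);  /* Japprox^T dlambda = rhs      */
    ierr = VecAXPY(lambda,1.0,dlambda);CHKERRQ(ierr);         /* lambda += dlambda            */
    PetscFunctionReturn(0);
  }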

 Hong, do you have an estimation of the time required to implement it in 
 superlu_dist?

 Best,
 Antoine

 -Message d'origine-
 De : Hong [mailto:hzh...@mcs.anl.gov]
 Envoyé : Friday, August 29, 2014 9:14 PM
 À : Gaetan Kenway
 Cc : Antoine De Blois; petsc-users@mcs.anl.gov
 Objet : Re: [petsc-users] superlu_dist and MatSolveTranspose

 We can add MatSolveTranspose() to the petsc interface with superlu_dist.

 Jed,
 Are you working on it? If not, I can work on it.

 Hong

 On Fri, Aug 29, 2014 at 6:14 PM, Gaetan Kenway gaet...@gmail.com wrote:
 Hi Antoine

 We are also using PETSc for solving adjoint systems resulting from
 CFD. To get around the matSolveTranspose issue we just assemble the
 transpose matrix directly and then call KSPSolve(). If this is
 possible in your application I think it is probably the best approach

 Gaetan
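
 For readers who cannot assemble the transpose directly, a minimal sketch of
 the same idea using an explicit MatTranspose (not Gaetan's exact approach;
 all names are illustrative):

  #include <petscksp.h>

  /* Form the explicit transpose once and hand it to a plain KSPSolve. */
  PetscErrorCode SolveTransposedSystem(MPI_Comm comm,Mat A,Vec b,Vec x)
  {
    Mat            At;
    KSP            ksp;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatTranspose(A,MAT_INITIAL_MATRIX,&At);CHKERRQ(ierr);
    ierr = KSPCreate(comm,&ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp,At,At);CHKERRQ(ierr);   /* 3.5-style two-Mat form */
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);       /* e.g. -pc_type lu -pc_factor_mat_solver_package superlu_dist */
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
    ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
    ierr = MatDestroy(&At);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }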


 On Fri, Aug 29, 2014 at 3:58 PM, Antoine De Blois
 antoine.debl...@aero.bombardier.com wrote:

 Hello Jed,

  Thank you for your quick response. So I spent some time to dig deeper
  into my problem. I coded a shell script that sweeps through a bunch
  of ksp_type, pc_type and sub_pc_type combinations. So please disregard
  the comment about the transpose solve not converging properly. I had taken
  that conclusion from my own code (and not from ex10 with the extracted
  matrix), and a KSPSetFromOptions call was missing. Apologies for that.

  What remains is the performance issue. The MatSolveTranspose takes a
  very long time to converge. For a matrix of 3 million rows,
  MatSolveTranspose takes roughly 5 minutes on 64 CPUs, whereas the
  MatSolve is almost instantaneous! When I run my code under gdb, PETSc seems to
  be stalled in MatLUFactorNumeric_SeqAIJ_Inode() for a long time.
  I also ran top on the compute node to check the RAM usage. It was
  hovering around 2 GB, so memory usage does not seem to be an issue here.

 #0  0x2afe8dfebd08 in MatLUFactorNumeric_SeqAIJ_Inode ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li
 bpetsc.so.3.5
 #1  0x2afe8e07f15c in MatLUFactorNumeric ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li
 bpetsc.so.3.5
 #2  0x2afe8e2afa99 in PCSetUp_ILU ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li
 bpetsc.so.3.5
 #3  0x2afe8e337c0d in PCSetUp ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li
 bpetsc.so.3.5
 #4  0x2afe8e39d643 in KSPSetUp ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li
 bpetsc.so.3.5
 #5  0x2afe8e39e3ee in KSPSolveTranspose ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li
 bpetsc.so.3.5
 #6  0x2afe8e300f8c in PCApplyTranspose_ASM ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li
 bpetsc.so.3.5
 #7  0x2afe8e338c13 in PCApplyTranspose ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li
 bpetsc.so.3.5
 #8  0x2afe8e3a8a84 in KSPInitialResidual ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li
 bpetsc.so.3.5
 #9  0x2afe8e376c32 in KSPSolve_GMRES ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li
 

Re: [petsc-users] SNESSetJacobian

2014-09-23 Thread Barry Smith

On Sep 23, 2014, at 9:50 AM, anton po...@uni-mainz.de wrote:

 Starting from version 3.5 the matrix parameters in SNESSetJacobian are no 
 longer pointers, hence my question:
 What is the most appropriate place to call SNESSetJacobian if I need  to 
 change the Jacobian during solution?
 What about FormFunction?

   Could you please explain why you need to change the Mat? Our hope was that
people would not need to change it. Note that you can change the type of a
matrix at any time. So, for example, inside your FormJacobian you can have code
like MatSetType(J,MATAIJ); this wipes out the old matrix data structure and
gives you an empty matrix of the new type, ready to be preallocated and then
filled. Let us know what you need.
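
A minimal sketch of such a 3.5-style FormJacobian callback (the preallocation
and fill are elided; this is an illustration, not code from the thread):

  #include <petscsnes.h>

  /* Sketch of a 3.5-style Jacobian callback: the Mats are passed by value
     (no longer Mat*), so any rebuild happens inside the callback itself. */
  PetscErrorCode FormJacobian(SNES snes,Vec x,Mat J,Mat P,void *ctx)
  {
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatSetType(P,MATAIJ);CHKERRQ(ierr);   /* wipe the old data structure */
    /* ... preallocate and insert the entries of the new Jacobian for x ... */
    ierr = MatAssemblyBegin(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    if (J != P) {   /* assemble the operator matrix too if it is different */
      ierr = MatAssemblyBegin(J,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(J,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    }
    PetscFunctionReturn(0);
  }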

  Barry

 
 Thanks,
 Anton



Re: [petsc-users] superlu_dist and MatSolveTranspose

2014-09-23 Thread Antoine De Blois
Morning Hong,

Alright, fully understood. Please keep me posted on that matter.
Regards,
Antoine

-Message d'origine-
De : Hong [mailto:hzh...@mcs.anl.gov] 
Envoyé : Tuesday, September 23, 2014 11:49 AM
À : Hong
Cc : Antoine De Blois; Gaetan Kenway; petsc-users@mcs.anl.gov; Sherry Li
Objet : Re: [petsc-users] superlu_dist and MatSolveTranspose

Antoine,
I just found out that superlu_dist does not support MatSolveTranspose yet (see
Sherry's email below).
Once superlu_dist provides this support, we can add it to the 
petsc/superlu_dist interface.

Thanks for your patience.

Hong
---

Hong,
Sorry, the transposed solve is not there yet; it's not as simple as the serial
version, because here it requires setting up an entirely different communication
pattern.

I will try to find time to do it.

Sherry

On Tue, Sep 23, 2014 at 8:11 AM, Hong hzh...@mcs.anl.gov wrote:

 Sherry,
 Can superlu_dist be used for solving A^T x = b?

 Using the option
 options.Trans = TRANS;
 with the existing petsc-superlu_dist interface, I cannot get correct solution.

 Hong


On Mon, Sep 22, 2014 at 12:47 PM, Hong hzh...@mcs.anl.gov wrote:
 I'll add it. It would not take too long, just a matter of priority.
 I'll try to get it done in a day or two, then let you know when it works.

 Hong

 On Mon, Sep 22, 2014 at 12:11 PM, Antoine De Blois 
 antoine.debl...@aero.bombardier.com wrote:
 Dear all,

 Sorry for the delay on this topic.

 Thank you Gaetan for your suggestion. I had thought about doing that 
 originally, but I had left it out since I thought that a rank owned the 
 entire row of the matrix (and not only the sub-diagonal part). I will 
 certainly give it a try.

 I still need the MatSolveTranspose since I need the ability to reuse the 
 residual jacobian matrix from the flow (a 1st order approximation of it), 
 which is assembled in a non-transposed format. This way the adjoint system 
 is solved in a pseudo-time step manner, where the product of the exact 
 jacobian matrix and the adjoint vector is used as a source term in the rhs.

 Hong, do you have an estimation of the time required to implement it in 
 superlu_dist?

 Best,
 Antoine

 -Message d'origine-
 De : Hong [mailto:hzh...@mcs.anl.gov] Envoyé : Friday, August 29, 
 2014 9:14 PM À : Gaetan Kenway Cc : Antoine De Blois; 
 petsc-users@mcs.anl.gov Objet : Re: [petsc-users] superlu_dist and 
 MatSolveTranspose

 We can add MatSolveTranspose() to the petsc interface with superlu_dist.

 Jed,
 Are you working on it? If not, I can work on it.

 Hong

 On Fri, Aug 29, 2014 at 6:14 PM, Gaetan Kenway gaet...@gmail.com wrote:
 Hi Antoine

 We are also using PETSc for solving adjoint systems resulting from 
 CFD. To get around the matSolveTranspose issue we just assemble the 
 transpose matrix directly and then call KSPSolve(). If this is 
 possible in your application I think it is probably the best 
 approach

 Gaetan


 On Fri, Aug 29, 2014 at 3:58 PM, Antoine De Blois 
 antoine.debl...@aero.bombardier.com wrote:

 Hello Jed,

  Thank you for your quick response. So I spent some time to dig
  deeper into my problem. I coded a shell script that sweeps through
  a bunch of ksp_type, pc_type and sub_pc_type combinations. So please
  disregard the comment about the transpose solve not converging properly. I
  had taken that conclusion from my own code (and not from ex10
  with the extracted matrix), and a KSPSetFromOptions call was missing. Apologies for
  that.

  What remains is the performance issue. The MatSolveTranspose takes
  a very long time to converge. For a matrix of 3 million rows,
  MatSolveTranspose takes roughly 5 minutes on 64 CPUs, whereas the
  MatSolve is almost instantaneous! When I run my code under gdb, PETSc seems
  to be stalled in MatLUFactorNumeric_SeqAIJ_Inode() for a long time.
  I also ran top on the compute node to check the RAM usage. It was
  hovering around 2 GB, so memory usage does not seem to be an issue here.

 #0  0x2afe8dfebd08 in MatLUFactorNumeric_SeqAIJ_Inode ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/
 li
 bpetsc.so.3.5
 #1  0x2afe8e07f15c in MatLUFactorNumeric ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/
 li
 bpetsc.so.3.5
 #2  0x2afe8e2afa99 in PCSetUp_ILU ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/
 li
 bpetsc.so.3.5
 #3  0x2afe8e337c0d in PCSetUp ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/
 li
 bpetsc.so.3.5
 #4  0x2afe8e39d643 in KSPSetUp ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/
 li
 bpetsc.so.3.5
 #5  0x2afe8e39e3ee in KSPSolveTranspose ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/
 li
 bpetsc.so.3.5
 #6  0x2afe8e300f8c in PCApplyTranspose_ASM ()
from
 /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/
 li
 bpetsc.so.3.5
 #7  

Re: [petsc-users] GPU speedup in Poisson solvers

2014-09-23 Thread Karl Rupp

Hi Dominic,

 PR #178 gets you most of the way. src/ksp/ksp/examples/tests/ex32.c uses

DMDA's which require a few additional fixes. I haven't opened a pull
request for these yet but I will do that before Thursday.

Regarding the rebase, wouldn't it be preferable to just resolve the
conflicts in the merge commit? In any event, I've merged these branches
several times into local integration branches created off of recent
petsc/master branches so I'm pretty familiar with the conflicts and how
to resolve them. I can help with the merge or do a rebase, whichever you
prefer.


Ok, I'll give the merge a try and see how things go. :-)

Best regards,
Karli



Re: [petsc-users] GPU speedup in Poisson solvers

2014-09-23 Thread Dominic Meiser

On 09/23/2014 01:45 PM, Karl Rupp wrote:

Hi Dominic,

 PR #178 gets you most of the way. src/ksp/ksp/examples/tests/ex32.c 
uses

DMDA's which require a few additional fixes. I haven't opened a pull
request for these yet but I will do that before Thursday.

Regarding the rebase, wouldn't it be preferable to just resolve the
conflicts in the merge commit? In any event, I've merged these branches
several times into local integration branches created off of recent
petsc/master branches so I'm pretty familiar with the conflicts and how
to resolve them. I can help with the merge or do a rebase, whichever you
prefer.


Ok, I'll give the merge a try and see how things go. :-)

Best regards,
Karli


I can join a Google+ or Skype session to assist if that helps. Let me
know if you run into problems.


Cheers,
Dominic

--
Dominic Meiser
Tech-X Corporation
5621 Arapahoe Avenue
Boulder, CO 80303
USA
Telephone: 303-996-2036
Fax: 303-448-7756
www.txcorp.com



[petsc-users] DMPlex with spring elements

2014-09-23 Thread Miguel Angel Salazar de Troya
Hi all

I was wondering if it would be possible to build a model similar to the
example snes/ex12.c, but with spring elements (for elasticity) instead of
simplicial elements. The spring elements would form a grid, so each element
would have two nodes and each node two components. There would be more
differences: instead of calling the functions f0,f1,g0,g1,g2 and g3
to build the residual and the Jacobian, I would call a routine that would
build the residual vector and the Jacobian matrix directly. I would not
have shape functions whatsoever. My problem is discrete, I don't have a PDE,
and my equations are algebraic. What is the best way in PETSc to solve this
problem? Is there any example that I can follow? Thanks in advance.

Miguel



-- 
*Miguel Angel Salazar de Troya*
Graduate Research Assistant
Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign
(217) 550-2360
salaz...@illinois.edu


Re: [petsc-users] GPU speedup in Poisson solvers

2014-09-23 Thread Dominic Meiser

On 09/23/2014 01:45 PM, Karl Rupp wrote:

Hi Dominic,

 PR #178 gets you most of the way. src/ksp/ksp/examples/tests/ex32.c 
uses

DMDA's which require a few additional fixes. I haven't opened a pull
request for these yet but I will do that before Thursday.

Regarding the rebase, wouldn't it be preferable to just resolve the
conflicts in the merge commit? In any event, I've merged these branches
several times into local integration branches created off of recent
petsc/master branches so I'm pretty familiar with the conflicts and how
to resolve them. I can help with the merge or do a rebase, whichever you
prefer.


Ok, I'll give the merge a try and see how things go. :-)

Best regards,
Karli



Hi Karli,

I just updated the branch for PR #178 with the additional fixes for the 
DMDA issues. This branch now has all my GPU related bug fixes.


Cheers,
Dominic

--
Dominic Meiser
Tech-X Corporation
5621 Arapahoe Avenue
Boulder, CO 80303
USA
Telephone: 303-996-2036
Fax: 303-448-7756
www.txcorp.com



[petsc-users] Memory requirements in SUPERLU_DIST

2014-09-23 Thread Zin Lin
Hi
I am solving a frequency-domain Maxwell problem for a dielectric structure
of size 90x90x50 (the total matrix size is (90x90x50x6)^2, which includes
the three vector components as well as real and imaginary parts).
I am using SUPERLU_DIST for the direct solver with the following options

parsymbfact = 1, (parallel symbolic factorization)
permcol = PARMETIS, (parallel METIS)
permrow = NATURAL (natural ordering).

First, I tried to use 4096 cores with 2 GB of memory per core, which totals
about 8 TB of memory.
I get the following error:

Using ParMETIS for parallel ordering.
Structual symmetry is:100%
   Current memory used:  1400271832 bytes
   Maximum memory used:  1575752120 bytes
***Memory allocation failed for SetupCoarseGraph: adjncy. Requested size:
148242928 bytes

So it seems to be an insufficient memory allocation problem (which
apparently happens at the METIS analysis phase?).

Then, I tried to use 64 large-memory cores which have a total of 2 TB of
memory (so more memory per core), and it seems to work fine (though the
solver takes about 900 sec).
What I don't understand is why memory per core matters rather than the
total memory. If the work space is distributed across the processors,
shouldn't it work as long as I choose a sufficient number of smaller-memory
cores? What kind of role does the memory per core play in the algorithm, in
contrast to the total memory over all the cores?

The issue is I would rather use a large number of small-memory cores than
any number of the large-memory cores. The latter are two times more
expensive in terms of service units (I am running on STAMPEDE at TACC) and
not many cores are available either.

Any idea would be appreciated.

Zin

-- 
Zin Lin


Re: [petsc-users] Memory requirements in SUPERLU_DIST

2014-09-23 Thread Barry Smith

  This is something you'd better ask Sherry about. She's the one who wrote and
understands SuperLU_DIST.


   Barry

On Sep 23, 2014, at 7:00 PM, Zin Lin zinlin.zin...@gmail.com wrote:

 Hi 
 I am solving a frequency domain Maxwell problem for a dielectric structure of 
 size 90x90x50, (the total matrix size is (90x90x50x6)^2 which includes the 
 three vector components as well as real and imaginary parts.)
 I am using SUPERLU_DIST for the direct solver with the following options
 
 parsymbfact = 1, (parallel symbolic factorization)
 permcol = PARMETIS, (parallel METIS)
 permrow = NATURAL (natural ordering).
 
 First, I tried to use 4096 cores with 2GB / core memory which totals to about 
 8 TB of memory.
 I get the following error:
 
 Using ParMETIS for parallel ordering.
 Structual symmetry is:100%
Current memory used:  1400271832 bytes
Maximum memory used:  1575752120 bytes
 ***Memory allocation failed for SetupCoarseGraph: adjncy. Requested size: 
 148242928 bytes
 
 So it seems to be an insufficient memory allocation problem (which apparently 
 happens at the METIS analysis phase?).
 
 Then, I tried to use 64 large-memory cores which have a total of 2 TB memory 
 (so larger memory per each core), it seems to work fine (though the solver 
 takes about 900 sec ).
 What I don't understand is  why memory per core matters rather than the total 
 memory? If the work space is distributed across the processors, shouldn't it 
 work as long as I choose a sufficient number of smaller-memory cores? What 
 kind of role does the memory per core play in the algorithm in contrast to 
 the total memory over all the cores? 
 
 The issue is I would rather use a large number of small-memory cores than any 
 number of the large-memory cores. The latter are two times more expensive in 
 terms of service units (I am running on STAMPEDE at TACC) and not many cores 
 are available either.
 
 Any idea would be appreciated.
 
 Zin
 
 -- 
 Zin Lin
 



Re: [petsc-users] DMPlex with spring elements

2014-09-23 Thread Matthew Knepley
On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya 
salazardetr...@gmail.com wrote:

 Hi all

 I was wondering if it could be possible to build a model similar to the
 example snes/ex12.c, but with spring elements (for elasticity) instead of
 simplicial elements. Spring elements in a grid, therefore each element
 would have two nodes and each node two components. There would be more
 differences, because instead of calling the functions f0,f1,g0,g1,g2 and g3
 to build the residual and the jacobian, I would call a routine that would
 build the residual vector and the jacobian matrix directly. I would not
 have shape functions whatsoever. My problem is discrete, I don't have a PDE
 and my equations are algebraic. What is the best way in petsc to solve this
 problem? Is there any example that I can follow? Thanks in advance


Yes, ex12 is fairly specific to FEM. However, I think the right tools for
what you want are
DMPlex and PetscSection. Here is how I would proceed:

  1) Make a DMPlex that encodes a simple network that you wish to simulate

  2) Make a PetscSection that gets the data layout right. It's hard from the
above for me to understand where your degrees of freedom actually are. This
is usually the hard part.

  3) Calculate the residual, so you can check an exact solution. Here you use
PetscSectionGetDof/Offset() for each mesh piece that you are interested in.
Again, it's hard to be more specific when I do not understand your
discretization; a rough sketch follows below.

  Thanks,

Matt
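
As a rough illustration of steps 2) and 3) above, here is a sketch (not from
any PETSc example) that assumes a one-dimensional DMPlex whose cells are the
spring edges and a default PetscSection with two dofs on every vertex; the
stiffness k, the function name, and the sign convention are all illustrative:

  #include <petscdmplex.h>

  /* Sum each spring's contribution F_e = k*(u_j - u_i) into the residual,
     using PetscSection offsets to find the two dofs stored at each vertex. */
  PetscErrorCode FormSpringResidual(DM dm,PetscReal k,Vec U,Vec F)
  {
    PetscSection      s;
    const PetscScalar *u;
    PetscScalar       *f;
    const PetscInt    *cone;
    PetscInt          eStart,eEnd,e,offA,offB,d;
    PetscErrorCode    ierr;

    PetscFunctionBeginUser;
    ierr = DMGetDefaultSection(dm,&s);CHKERRQ(ierr);               /* layout: 2 dofs per vertex */
    ierr = DMPlexGetHeightStratum(dm,0,&eStart,&eEnd);CHKERRQ(ierr); /* edges are the cells here */
    ierr = VecGetArrayRead(U,&u);CHKERRQ(ierr);
    ierr = VecGetArray(F,&f);CHKERRQ(ierr);
    for (e = eStart; e < eEnd; ++e) {
      ierr = DMPlexGetCone(dm,e,&cone);CHKERRQ(ierr);              /* the two end vertices */
      ierr = PetscSectionGetOffset(s,cone[0],&offA);CHKERRQ(ierr);
      ierr = PetscSectionGetOffset(s,cone[1],&offB);CHKERRQ(ierr);
      for (d = 0; d < 2; ++d) {                                    /* two components per node */
        PetscScalar du = u[offB+d] - u[offA+d];
        f[offA+d] -= k*du;
        f[offB+d] += k*du;
      }
    }
    ierr = VecRestoreArrayRead(U,&u);CHKERRQ(ierr);
    ierr = VecRestoreArray(F,&f);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }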


 Miguel



 --
 *Miguel Angel Salazar de Troya*
 Graduate Research Assistant
 Department of Mechanical Science and Engineering
 University of Illinois at Urbana-Champaign
 (217) 550-2360
 salaz...@illinois.edu




-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener


Re: [petsc-users] DMPlex with spring elements

2014-09-23 Thread Abhyankar, Shrirang G.
You may also want to take a look at the DMNetwork framework that can be
used for general unstructured networks that don't use PDEs. Its
description is given in the manual and an example is in
src/snes/examples/tutorials/network/pflow.

Shri 

From:  Matthew Knepley knep...@gmail.com
Date:  Tue, 23 Sep 2014 22:40:52 -0400
To:  Miguel Angel Salazar de Troya salazardetr...@gmail.com
Cc:  petsc-users@mcs.anl.gov petsc-users@mcs.anl.gov
Subject:  Re: [petsc-users] DMPlex with spring elements


On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya
salazardetr...@gmail.com wrote:

Hi all
I was wondering if it could be possible to build a model similar to the
example snes/ex12.c, but with spring elements (for elasticity) instead of
simplicial elements. Spring elements in a grid, therefore each element
would have two nodes and each node two components. There would be more
differences, because instead of calling the functions f0,f1,g0,g1,g2 and
g3 to build the residual and the jacobian, I would call a routine that
would build the residual vector and the jacobian matrix directly. I would
not have shape functions whatsoever. My problem is discrete, I don't have
a PDE and my equations are algebraic. What is the best way in petsc to
solve this problem? Is there any example that I can follow? Thanks in
advance




Yes, ex12 is fairly specific to FEM. However, I think the right tools for
what you want are
DMPlex and PetscSection. Here is how I would proceed:

  1) Make a DMPlex that encodes a simple network that you wish to simulate

  2) Make a PetscSection that gets the data layout right. It's hard from
the above for me to understand where your degrees of freedom actually are.
This is usually the hard part.

  3) Calculate the residual, so you can check an exact solution. Here you use
PetscSectionGetDof/Offset() for each mesh piece that you are interested in.
Again, it's hard to be more specific when I do not understand your
discretization.

  Thanks,

Matt
 

Miguel



-- 
Miguel Angel Salazar de Troya
Graduate Research Assistant
Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign
(217) 550-2360 tel:%28217%29%20550-2360
salaz...@illinois.edu









-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener



Re: [petsc-users] Memory requirements in SUPERLU_DIST

2014-09-23 Thread Jed Brown
Zin Lin zinlin.zin...@gmail.com writes:
 What I don't understand is  why memory per core matters rather than the
 total memory? If the work space is distributed across the processors,
 shouldn't it work as long as I choose a sufficient number of smaller-memory
 cores? 

METIS is a serial partitioner.  ParMETIS is parallel, but often performs
worse and still doesn't scale to very large numbers of cores, so it is
not the default for most direct solver packages.

 What kind of role does the memory per core play in the algorithm in
 contrast to the total memory over all the cores?

 The issue is I would rather use a large number of small-memory cores than
 any number of the large-memory cores. The latter are two times more
 expensive in terms of service units (I am running on STAMPEDE at TACC) and
 not many cores are available either.


