Re: [petsc-users] PETSc/TAU configuration
Hi PETSc Team,

I successfully configured PETSc with TAU using:

  ./configure --with-mpi=1 --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 --download-f-blas-lapack=${SESKADIR}/packages/downloads/fblaslapacklinpack-3.1.1.tar.gz
  make PETSC_DIR=/home/hills/seska/packages/petsc PETSC_ARCH=linux-gnu-cxx-opt all

While building I received this error message:

  [100%] Built target petsc
  **ERROR Error during compile, check linux-gnu-cxx-opt/conf/make.log
  Send it and linux-gnu-cxx-opt/conf/configure.log to petsc-ma...@mcs.anl.gov

Because of their large file size, I've attached links to the log files below. Any assistance would be greatly appreciated.

make.log - https://www.dropbox.com/s/n6y8nl8go4s41tq/make.log?dl=0
configure.log - https://www.dropbox.com/s/ipvt4urvgldft8x/configure.log?dl=0

Matthew

Date: Mon, 22 Sep 2014 10:24:15 -0500
From: ba...@mcs.anl.gov
To: hillsma...@outlook.com
CC: petsc-users@mcs.anl.gov
Subject: Re: [petsc-users] PETSc/TAU configuration

On Mon, 22 Sep 2014, Matthew Hills wrote:

  Hi PETSc Team, I'm still experiencing difficulties with configuring PETSc with TAU. I'm currently:

  building OpenMPI
  1. ./configure --prefix=${SESKADIR}/packages/openmpi
  2. make all install

  setting library paths
  1. export LD_LIBRARY_PATH=${SESKADIR}/lib:${SESKADIR}/packages/openmpi/lib:${SESKADIR}/packages/pdt/x86_64/lib:${SESKADIR}/packages/tau/x86_64/lib:${SESKADIR}/packages/petsc/${PETSC_ARCH}/lib:$LD_LIBRARY_PATH
  2. export PATH=${SESKADIR}/bin:${SESKADIR}/packages/petsc/${PETSC_ARCH}/bin:$PATH

  building PDT (pdtoolkit-3.20)
  1. ./configure -GNU
  2. export PATH=${SESKADIR}/packages/pdt/x86_64/bin:$PATH
  3. make
  4. make install

  building TAU (tau-2.23.1) using OpenMPI
  1. ./configure -prefix=`pwd` -cc=mpicc -c++=mpicxx -fortran=mpif90 -pdt=${SESKADIR}/packages/pdt -mpiinc=${SESKADIR}/packages/openmpi/include -mpilib=${SESKADIR}/packages/openmpi/lib -bfd=download
  2. export PATH=${SESKADIR}/packages/tau/x86_64/bin:$PATH
  3. make install

  building fblaslapacklinpack-3.1.1
  1. make

Should have said: '--download-fblaslapack' would be fine here [as it uses mpif90 - not tau_cc.sh]. Building separately is also fine.

  building PETSc using TAU_CC/MPI
  1. export TAU_MAKEFILE=${SESKADIR}/packages/tau/x86_64/lib/Makefile.tau-mpi-pdt
  2. ./configure --prefix='pwd' --with-mpi=1 --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 --with-blas-lapack-dir=${SESKADIR}/packages/fblaslapack

--prefix='pwd' doesn't make sense. Please remove it.

  Error: Tried looking for file: /tmp/petsc-U9YCMv/config.setCompilers/conftest
  Error: Failed to link with TAU options
  Error: Command(Executable) is -- gcc

configure.log looks complete [and indicates a successful run]. Did the messages above come during the configure step on the terminal?

Can you try the following and see if PETSc builds successfully? [but I recommend rerunning configure first - without the --prefix option]

  make PETSC_DIR=/home/hills/seska/packages/petsc PETSC_ARCH=linux-gnu-cxx-opt all

Satish

  Attached you'll find my configure log. Any assistance would be greatly appreciated.

  Warm regards,
  Matthew

Date: Tue, 16 Sep 2014 08:21:41 -0500
From: ba...@mcs.anl.gov
To: hillsma...@outlook.com
CC: petsc-users@mcs.anl.gov
Subject: Re: [petsc-users] PETSc/TAU configuration

I haven't tried using TAU in a while - but here are some obvious things to try.

1. --download-mpich [or openmpi] with TAU does not make sense. You would have to build MPICH/OpenMPI first, then build TAU to use this MPI, and then build PETSc to use this TAU_CC/MPI.

2. I would use only tau_cc.sh - and not bother with c++/fortran, i.e. [with TAU built with a given mpicc] configure PETSc with:

  ./configure --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90

3. Do not use any --download-package option when using tau_cc.sh. First check if you are able to use TAU with PETSc without external packages [you would need blas and mpi; use the system blas/lapack, and build MPI as mentioned above for use with TAU and later PETSc].

4. If you really need these external packages [assuming the above basic build with TAU works], I would recommend the following two-step build process:

4.1. Build the packages without TAU:

  ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-PACKAGE PETSC_ARCH=arch-packages

4.2. Now strip out the PETSc-relevant stuff from this location:

  rm -f arch-packages/include/petsc*.h

4.3. Now build PETSc with TAU, using these prebuilt packages:

  ./configure --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90
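For reference, a minimal PETSc test program (not from the thread; file name and contents assumed) can confirm that the TAU-instrumented build links and runs before any external packages are added. Compile it with the same tau_cc.sh wrapper used for the PETSc build:

  /* check.c: assumed minimal smoke test for a TAU-instrumented PETSc build */
  #include <petscsys.h>

  int main(int argc, char **argv)
  {
    PetscErrorCode ierr;

    /* If TAU's compiler wrapper mislinked anything, this is where it fails */
    ierr = PetscInitialize(&argc, &argv, NULL, NULL);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSc initialized under TAU wrappers.\n");CHKERRQ(ierr);
    ierr = PetscFinalize();
    return ierr;
  }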
Re: [petsc-users] GPU speedup in Poisson solvers
Hi Karli,

PR #178 gets you most of the way. src/ksp/ksp/examples/tests/ex32.c uses DMDAs, which require a few additional fixes. I haven't opened a pull request for these yet, but I will do that before Thursday.

Regarding the rebase, wouldn't it be preferable to just resolve the conflicts in the merge commit? In any event, I've merged these branches several times into local integration branches created off of recent petsc/master branches, so I'm pretty familiar with the conflicts and how to resolve them. I can help with the merge or do a rebase, whichever you prefer.

Cheers,
Dominic

On 09/22/2014 10:37 PM, Karl Rupp wrote:

  Hi Dominic,

  I've got some time available at the end of this week for a merge to next. Is there anything other than PR #178 needed? It currently shows some conflicts, so is there any chance to rebase it on ~Thursday?

  Best regards,
  Karli

On 09/22/2014 09:38 PM, Dominic Meiser wrote:

  On 09/22/2014 12:57 PM, Chung Shen wrote:

    Dear PETSc Users,

    I am new to PETSc and trying to determine if GPU speedup is possible with the 3D Poisson solvers. I configured 2 copies of 'petsc-master' on a standalone machine, one with CUDA toolkit 5.0 and one without (both without MPI):

    Machine: HP Z820 Workstation, Red Hat Enterprise Linux 5.0
    CPU: (x2) 8-core Xeon E5-2650 2.0GHz, 128GB Memory
    GPU: (x2) Tesla K20c (706MHz, 5.12GB Memory, CUDA Compatibility: 3.5, Driver: 313.09)

    I used 'src/ksp/ksp/examples/tests/ex32.c' as a test and was getting about 20% speedup with GPU. Is this reasonable or did I miss something? Attached is a comparison chart with two sample logs. The y-axis is the elapsed time in seconds and the x-axis corresponds to the size of the problem. In particular, I wonder if the numbers of calls to 'VecCUSPCopyTo' and 'VecCUSPCopyFrom' shown in the GPU log are excessive?

    Thanks in advance for your reply.

    Best Regards,
    Chung Shen

  A few comments:
  - To get reliable timing you should configure PETSc without debugging (i.e. --with-debugging=no).
  - The ILU preconditioning in your GPU benchmark is done on the CPU. The host-device data transfers are killing performance. Can you try to run with the additional option -pc_factor_mat_solver_package cusparse? This will perform the preconditioning on the GPU.
  - If you're interested in running benchmarks in parallel you will need a few patches that are not yet in petsc/master. I can put together a branch that has the needed fixes.

  Cheers,
  Dominic

--
Dominic Meiser
Tech-X Corporation
5621 Arapahoe Avenue
Boulder, CO 80303
USA
Telephone: 303-996-2036
Fax: 303-448-7756
www.txcorp.com
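To illustrate Dominic's second point, here is a minimal sketch (not code from the thread) of selecting the cusparse factorization programmatically, assuming a 3.5-era CUDA-enabled PETSc build with A, b, x already assembled as GPU types (e.g. MATAIJCUSPARSE/VECCUSP); it is the API equivalent of passing -pc_factor_mat_solver_package cusparse on the command line:

  #include <petscksp.h>

  /* Sketch: ILU factored and applied by cusparse so preconditioning stays
     on the GPU; A, b, x assumed GPU-resident. */
  PetscErrorCode SolveOnGPU(Mat A, Vec b, Vec x)
  {
    KSP            ksp;
    PC             pc;
    PetscErrorCode ierr;

    ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCILU);CHKERRQ(ierr);
    /* Perform the factorization and triangular solves with cusparse */
    ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERCUSPARSE);CHKERRQ(ierr);
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
    ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
    return 0;
  }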
[petsc-users] SNESSetJacobian
Starting from version 3.5, the matrix parameters in SNESSetJacobian are no longer pointers. Hence my question: what is the most appropriate place to call SNESSetJacobian if I need to change the Jacobian during solution? What about FormFunction?

Thanks,
Anton
Re: [petsc-users] Valgrind Errors
James,

The fix is pushed to petsc-maint (release):
https://bitbucket.org/petsc/petsc/commits/c974faeda5a26542265b90934a889773ab380866

Thanks for your report!

Hong

On Mon, Sep 15, 2014 at 5:05 PM, Hong hzh...@mcs.anl.gov wrote:

  James:
  I'm fixing it in branch hzhang/matmatmult-bugfix
  https://bitbucket.org/petsc/petsc/commits/a7c7454dd425191f4a23aa5860b8c6bac03cfd7b
  Once it is further cleaned, and other routines are checked, I will patch petsc-release.
  Hong

    Hi Barry,

    Thanks for the response. You're right, it (both ex70 and my own code) doesn't give those valgrind errors when I run it in parallel. Changing the type to MATAIJ also fixes the issue.

    Thanks for the help, I appreciate it.

    James

-----Original Message-----
From: Hong [mailto:hzh...@mcs.anl.gov]
Sent: Friday, September 12, 2014 4:29 PM
To: Dominic Meiser
Cc: Barry Smith; James Balasalle; Zhang, Hong; petsc-users@mcs.anl.gov
Subject: Re: [petsc-users] Valgrind Errors

I'll check it.

Hong

On Fri, Sep 12, 2014 at 3:40 PM, Dominic Meiser dmei...@txcorp.com wrote:

  On 09/12/2014 02:11 PM, Barry Smith wrote:

    James (and Hong),

    Do you ever see this problem in parallel runs? You are not doing anything wrong. Here is what is happening. MatGetBrowsOfAoCols_MPIAIJ(), which is used by MatMatMult_MPIAIJ_MPIAIJ(), assumes that the VecScatters for the matrix-vector products are

      gen_to   = (VecScatter_MPI_General*)ctx->todata;
      gen_from = (VecScatter_MPI_General*)ctx->fromdata;

    but when run on one process the scatters are not of that form; hence the code accesses values in what it thinks is one struct but is actually a different one. Hence the valgrind errors. But since the matrix only lives on one process, there is actually nothing to move between processors, hence no error happens in the computation.

    You can avoid the issue completely by using the MATAIJ matrix type instead of MATMPIAIJ; then on one process it automatically uses MATSEQAIJ. I don't think the bug has anything in particular to do with the MatTranspose.

    Hong,

    Can you please fix this code? Essentially you can bypass parts of the code when the Mat is on only one process. (Maybe this also happens for MPIBAIJ matrices?) Send a response letting me know you saw this.

    Thanks

    Barry

  I had to fix a few issues similar to this a while back. The method VecScatterGetTypes_Private introduced in pull request 176 might be useful in this context.

  Cheers,
  Dominic
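A minimal sketch of the workaround Barry describes (sizes assumed, not from the thread): asking for MATAIJ lets PETSc choose MATSEQAIJ on one process and MATMPIAIJ in parallel, sidestepping the uniprocessor VecScatter code path:

  Mat      A;
  PetscInt n = 1000;   /* assumed global size */

  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
  MatSetType(A, MATAIJ);  /* not MATMPIAIJ: becomes MATSEQAIJ on one process */
  MatSetUp(A);
  /* ... assemble, then call MatMatMult()/MatTransposeMatMult() as before ... */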
Re: [petsc-users] superlu_dist and MatSolveTranspose
Antoine,

I just found out that superlu_dist does not support MatSolveTranspose yet (see Sherry's email below). Once superlu_dist provides this support, we can add it to the petsc/superlu_dist interface.

Thanks for your patience.

Hong

---
Hong,
Sorry, the transposed solve is not there yet; it's not as simple as the serial version, because here it requires setting up an entirely different communication pattern. I will try to find time to do it.
Sherry

On Tue, Sep 23, 2014 at 8:11 AM, Hong hzh...@mcs.anl.gov wrote:

  Sherry,
  Can superlu_dist be used for solving A^T x = b? Using the option options.Trans = TRANS; with the existing petsc-superlu_dist interface, I cannot get a correct solution.
  Hong

On Mon, Sep 22, 2014 at 12:47 PM, Hong hzh...@mcs.anl.gov wrote:

  I'll add it. It would not take too long, just a matter of priority. I'll try to get it done in a day or two, then let you know when it works.
  Hong

On Mon, Sep 22, 2014 at 12:11 PM, Antoine De Blois antoine.debl...@aero.bombardier.com wrote:

  Dear all,

  Sorry for the delay on this topic. Thank you Gaetan for your suggestion. I had thought about doing that originally, but I had left it out since I thought that a rank owned the entire row of the matrix (and not only the sub-diagonal part). I will certainly give it a try.

  I still need the MatSolveTranspose since I need the ability to reuse the residual Jacobian matrix from the flow (a 1st-order approximation of it), which is assembled in a non-transposed format. This way the adjoint system is solved in a pseudo-time-step manner, where the product of the exact Jacobian matrix and the adjoint vector is used as a source term in the rhs.

  Hong, do you have an estimate of the time required to implement it in superlu_dist?

  Best,
  Antoine

-----Message d'origine-----
De : Hong [mailto:hzh...@mcs.anl.gov]
Envoyé : Friday, August 29, 2014 9:14 PM
À : Gaetan Kenway
Cc : Antoine De Blois; petsc-users@mcs.anl.gov
Objet : Re: [petsc-users] superlu_dist and MatSolveTranspose

We can add MatSolveTranspose() to the petsc interface with superlu_dist.

Jed, are you working on it? If not, I can work on it.

Hong

On Fri, Aug 29, 2014 at 6:14 PM, Gaetan Kenway gaet...@gmail.com wrote:

  Hi Antoine

  We are also using PETSc for solving adjoint systems resulting from CFD. To get around the matSolveTranspose issue we just assemble the transpose matrix directly and then call KSPSolve(). If this is possible in your application I think it is probably the best approach.

  Gaetan

On Fri, Aug 29, 2014 at 3:58 PM, Antoine De Blois antoine.debl...@aero.bombardier.com wrote:

  Hello Jed,

  Thank you for your quick response. So I spent some time to dig deeper into my problem. I coded a shell script that passes through a bunch of ksp_type, pc_type and sub_pc_type. So please disregard the comment about "does not converge properly for transpose". I had taken that conclusion from my own code (and not from the ex10 and extracted matrix), and a KSPSetFromOptions was missing. Apologies for that.

  What remains is the performance issue. The MatSolveTranspose takes a very long time to converge. For a matrix of 3 million rows, MatSolveTranspose takes roughly 5 minutes on 64 cpus, whereas the MatSolve is almost instantaneous! When I gdb my code, petsc seems to be stalled in MatLUFactorNumeric_SeqAIJ_Inode() for a long time. I also did a top on the compute node to check the RAM usage. It was hovering over 2 gig, so memory usage does not seem to be an issue here.
  #0  0x2afe8dfebd08 in MatLUFactorNumeric_SeqAIJ_Inode () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
  #1  0x2afe8e07f15c in MatLUFactorNumeric () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
  #2  0x2afe8e2afa99 in PCSetUp_ILU () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
  #3  0x2afe8e337c0d in PCSetUp () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
  #4  0x2afe8e39d643 in KSPSetUp () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
  #5  0x2afe8e39e3ee in KSPSolveTranspose () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
  #6  0x2afe8e300f8c in PCApplyTranspose_ASM () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
  #7  0x2afe8e338c13 in PCApplyTranspose () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
  #8  0x2afe8e3a8a84 in KSPInitialResidual () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
  #9  0x2afe8e376c32 in KSPSolve_GMRES () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
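For reference, a minimal sketch of the workaround Gaetan suggests (variable names assumed, not from the thread): form the explicit transpose once and solve with the forward path, so the missing transposed solve in superlu_dist is never needed:

  Mat At;
  KSP ksp;

  MatTranspose(A, MAT_INITIAL_MATRIX, &At);  /* explicit A^T, built once */
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, At, At);
  KSPSetFromOptions(ksp);  /* e.g. -pc_type lu -pc_factor_mat_solver_package superlu_dist */
  KSPSolve(ksp, b, x);     /* solves A^T x = b without MatSolveTranspose */

If the system is solved repeatedly with an updated A, MatTranspose can be reused with MAT_REUSE_MATRIX to avoid reallocating At.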
Re: [petsc-users] SNESSetJacobian
On Sep 23, 2014, at 9:50 AM, anton po...@uni-mainz.de wrote:

  Starting from version 3.5, the matrix parameters in SNESSetJacobian are no longer pointers. Hence my question: what is the most appropriate place to call SNESSetJacobian if I need to change the Jacobian during solution? What about FormFunction?

Could you please explain why you need to change the Mat? Our hope was that people would not need to change it.

Note that you can change the type of a matrix at any time. So for example inside your FormJacobian you can have code like

  MatSetType(J,MATAIJ)

This wipes out the old matrix data structure and gives you an empty matrix of the new type, ready to be preallocated and then filled.

Let us know what you need.

Barry

  Thanks,
  Anton
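A minimal sketch of what Barry suggests, assuming a standard 3.5-style FormJacobian callback (the fill code is a placeholder, not from the thread); note the Mat arguments are now passed by value, matching Anton's observation:

  PetscErrorCode FormJacobian(SNES snes, Vec x, Mat J, Mat P, void *ctx)
  {
    /* Wipe the old data structure; P is now an empty AIJ matrix */
    MatSetType(P, MATAIJ);
    /* ... preallocate and fill the new Jacobian entries here ... */
    MatAssemblyBegin(P, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(P, MAT_FINAL_ASSEMBLY);
    return 0;
  }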
Re: [petsc-users] superlu_dist and MatSolveTranspose
Morning Hong,

Alright, fully understood. Please keep me posted on that matter.

Regards,
Antoine

-----Message d'origine-----
De : Hong [mailto:hzh...@mcs.anl.gov]
Envoyé : Tuesday, September 23, 2014 11:49 AM
À : Hong
Cc : Antoine De Blois; Gaetan Kenway; petsc-users@mcs.anl.gov; Sherry Li
Objet : Re: [petsc-users] superlu_dist and MatSolveTranspose

  Antoine,

  I just found out that superlu_dist does not support MatSolveTranspose yet (see Sherry's email below). Once superlu_dist provides this support, we can add it to the petsc/superlu_dist interface.

  Thanks for your patience.

  Hong

  [...]
Re: [petsc-users] GPU speedup in Poisson solvers
Hi Dominic,

  PR #178 gets you most of the way. src/ksp/ksp/examples/tests/ex32.c uses DMDAs, which require a few additional fixes. I haven't opened a pull request for these yet, but I will do that before Thursday.

  Regarding the rebase, wouldn't it be preferable to just resolve the conflicts in the merge commit? In any event, I've merged these branches several times into local integration branches created off of recent petsc/master branches, so I'm pretty familiar with the conflicts and how to resolve them. I can help with the merge or do a rebase, whichever you prefer.

Ok, I'll give the merge a try and see how things go. :-)

Best regards,
Karli
Re: [petsc-users] GPU speedup in Poisson solvers
On 09/23/2014 01:45 PM, Karl Rupp wrote:

  Hi Dominic,

  [...]

  Ok, I'll give the merge a try and see how things go. :-)

  Best regards,
  Karli

I can join a google+ or skype session to assist if that helps. Let me know if you run into problems.

Cheers,
Dominic
[petsc-users] DMPlex with spring elements
Hi all,

I was wondering if it would be possible to build a model similar to the example snes/ex12.c, but with spring elements (for elasticity) instead of simplicial elements: spring elements in a grid, so each element would have two nodes and each node two components. There would be more differences: instead of calling the functions f0, f1, g0, g1, g2 and g3 to build the residual and the Jacobian, I would call a routine that builds the residual vector and the Jacobian matrix directly. I would not have shape functions whatsoever. My problem is discrete; I don't have a PDE and my equations are algebraic.

What is the best way in petsc to solve this problem? Is there any example that I can follow?

Thanks in advance
Miguel

--
*Miguel Angel Salazar de Troya*
Graduate Research Assistant
Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign
(217) 550-2360
salaz...@illinois.edu
Re: [petsc-users] GPU speedup in Poisson solvers
On 09/23/2014 01:45 PM, Karl Rupp wrote:

  [...]

  Ok, I'll give the merge a try and see how things go. :-)

  Best regards,
  Karli

Hi Karli,

I just updated the branch for PR #178 with the additional fixes for the DMDA issues. This branch now has all my GPU related bug fixes.

Cheers,
Dominic
[petsc-users] Memory requirements in SUPERLU_DIST
Hi,

I am solving a frequency-domain Maxwell problem for a dielectric structure of size 90x90x50 (the total matrix size is (90x90x50x6)^2, which includes the three vector components as well as real and imaginary parts). I am using SUPERLU_DIST for the direct solver with the following options:

  parsymbfact = 1 (parallel symbolic factorization)
  permcol = PARMETIS (parallel METIS)
  permrow = NATURAL (natural ordering)

First, I tried to use 4096 cores with 2GB/core memory, which totals about 8 TB of memory. I get the following error:

  Using ParMETIS for parallel ordering.
  Structual symmetry is:100%
  Current memory used: 1400271832 bytes
  Maximum memory used: 1575752120 bytes
  ***Memory allocation failed for SetupCoarseGraph: adjncy. Requested size: 148242928 bytes

So it seems to be an insufficient memory allocation problem (which apparently happens at the METIS analysis phase?). Then, I tried to use 64 large-memory cores which have a total of 2 TB memory (so larger memory per core), and it seems to work fine (though the solver takes about 900 sec).

What I don't understand is why memory per core matters rather than the total memory. If the work space is distributed across the processors, shouldn't it work as long as I choose a sufficient number of smaller-memory cores? What kind of role does the memory per core play in the algorithm, in contrast to the total memory over all the cores?

The issue is I would rather use a large number of small-memory cores than any number of the large-memory cores. The latter are two times more expensive in terms of service units (I am running on STAMPEDE at TACC) and not many cores are available either.

Any idea would be appreciated.

Zin

--
Zin Lin
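For context, when SuperLU_DIST is driven through PETSc these options are usually set at runtime; a hedged sketch of the equivalents (option names as in the 3.5-era superlu_dist interface, where PetscOptionsSetValue takes two arguments; newer PETSc adds a leading PetscOptions argument, so verify against your build):

  /* Assumed mapping of the options described above onto PETSc's
     superlu_dist interface; set before the KSP/PC options are processed */
  PetscOptionsSetValue("-pc_type", "lu");
  PetscOptionsSetValue("-pc_factor_mat_solver_package", "superlu_dist");
  PetscOptionsSetValue("-mat_superlu_dist_colperm", "PARMETIS");  /* permcol */
  PetscOptionsSetValue("-mat_superlu_dist_rowperm", "NATURAL");   /* permrow */
  PetscOptionsSetValue("-mat_superlu_dist_parsymbfact", "1");     /* parallel symbolic factorization */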
Re: [petsc-users] Memory requirements in SUPERLU_DIST
This is something you better ask Sherry about. She's the one who wrote and understands SuperLU_DIST.

Barry

On Sep 23, 2014, at 7:00 PM, Zin Lin zinlin.zin...@gmail.com wrote:

  Hi,
  I am solving a frequency-domain Maxwell problem for a dielectric structure of size 90x90x50 [...]
Re: [petsc-users] DMPlex with spring elements
On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya salazardetr...@gmail.com wrote:

  I was wondering if it would be possible to build a model similar to the example snes/ex12.c, but with spring elements (for elasticity) instead of simplicial elements. [...] My problem is discrete; I don't have a PDE and my equations are algebraic. What is the best way in petsc to solve this problem? Is there any example that I can follow?

Yes, ex12 is fairly specific to FEM. However, I think the right tools for what you want are DMPlex and PetscSection. Here is how I would proceed:

1) Make a DMPlex that encodes a simple network that you wish to simulate.

2) Make a PetscSection that gets the data layout right. It's hard from the above for me to understand where your degrees of freedom actually are. This is usually the hard part.

3) Calculate the residual, so you can check an exact solution. Here you use PetscSectionGetDof/Offset() for each mesh piece that you are interested in. Again, it's hard to be more specific when I do not understand your discretization.

Thanks,

Matt

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
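A rough sketch of steps 1) and 2) for a chain of two springs (sizes and layout assumed, not from the thread; 3.5-era API, where DMSetDefaultSection attaches the layout - later versions renamed it DMSetSection/DMSetLocalSection):

  #include <petscdmplex.h>

  /* Sketch: 3 nodes joined by 2 two-node spring cells, 2 dofs per node */
  static PetscErrorCode CreateSpringNetwork(MPI_Comm comm, DM *dm)
  {
    const int      cells[]  = {0, 1,  1, 2};                     /* spring connectivity */
    const double   coords[] = {0.0, 0.0,  1.0, 0.0,  2.0, 0.0};  /* node coordinates in 2D */
    PetscSection   s;
    PetscInt       vStart, vEnd, v;
    PetscErrorCode ierr;

    /* 1) topological dim 1 (springs), 2 cells, 3 vertices, 2 corners per cell, embedded in 2D */
    ierr = DMPlexCreateFromCellList(comm, 1, 2, 3, 2, PETSC_FALSE, cells, 2, coords, dm);CHKERRQ(ierr);
    /* 2) two displacement components on every vertex, nothing on the spring cells */
    ierr = DMPlexGetDepthStratum(*dm, 0, &vStart, &vEnd);CHKERRQ(ierr);
    ierr = PetscSectionCreate(comm, &s);CHKERRQ(ierr);
    ierr = PetscSectionSetChart(s, vStart, vEnd);CHKERRQ(ierr);
    for (v = vStart; v < vEnd; ++v) {
      ierr = PetscSectionSetDof(s, v, 2);CHKERRQ(ierr);
    }
    ierr = PetscSectionSetUp(s);CHKERRQ(ierr);
    ierr = DMSetDefaultSection(*dm, s);CHKERRQ(ierr);
    ierr = PetscSectionDestroy(&s);CHKERRQ(ierr);
    return 0;
  }

The residual in step 3) would then loop over the spring cells, using DMPlexGetCone() to find each spring's two vertices and PetscSectionGetOffset() to locate their dofs in the local vector.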
Re: [petsc-users] DMPlex with spring elements
You may also want to take a look at the DMNetwork framework, which can be used for general unstructured networks that don't use PDEs. Its description is given in the manual and an example is in src/snes/examples/tutorials/network/pflow.

Shri

From: Matthew Knepley knep...@gmail.com
Date: Tue, 23 Sep 2014 22:40:52 -0400
To: Miguel Angel Salazar de Troya salazardetr...@gmail.com
Cc: petsc-users@mcs.anl.gov
Subject: Re: [petsc-users] DMPlex with spring elements

  Yes, ex12 is fairly specific to FEM. However, I think the right tools for what you want are DMPlex and PetscSection. [...]
Re: [petsc-users] Memory requirements in SUPERLU_DIST
Zin Lin zinlin.zin...@gmail.com writes:

  What I don't understand is why memory per core matters rather than the total memory. If the work space is distributed across the processors, shouldn't it work as long as I choose a sufficient number of smaller-memory cores?

METIS is a serial partitioner. ParMETIS is parallel, but often performs worse and still doesn't scale to very large numbers of cores, so it is not the default for most direct solver packages.

  What kind of role does the memory per core play in the algorithm, in contrast to the total memory over all the cores? The issue is I would rather use a large number of small-memory cores than any number of the large-memory cores. The latter are two times more expensive in terms of service units (I am running on STAMPEDE at TACC) and not many cores are available either.