Re: [deal.II] Re: measuring cpu and wall time for assembly routine

2022-10-19 Thread Wolfgang Bangerth
On 10/19/22 08:45, Simon Wiesheier wrote: What I want to do boils down to the following: Given the reference co-ordinates of a point 'p', along with the cell on which 'p' lives, give me the value and gradient of a finite element function evaluated at 'p'. My idea was to create a quadrature

Re: [deal.II] Re: measuring cpu and wall time for assembly routine

2022-10-19 Thread Martin Kronbichler
Dear Simon, You seem to be looking for FEPointEvaluation. That class is shown in step-19 and provides, for simple FiniteElement types, a much faster way to evaluate solutions at arbitrary points within a cell. Do you want to give it a try? The issue you are facing is that FEValues that you are

[deal.II] deal.II Newsletter #230

2022-10-19 Thread 'Rene Gassmoeller' via deal.II User Group
Hello everyone! This is deal.II newsletter #230. It automatically reports recently merged features and discussions about the deal.II finite element library. ## Below you find a list of recently proposed or merged features: #14362: Execute explicit instantiations of

Re: [deal.II] Re: measuring cpu and wall time for assembly routine

2022-10-19 Thread Simon Wiesheier
"It's an environment variable." I did $DEAL_II_NUM_THREADS and the variable is not set. But if it were set to one, why would this explain the gap between CPU and wall time? "My point is the constructor should not be called millions of times. You are not going to be able to get that function

Re: [deal.II] Re: measuring cpu and wall time for assembly routine

2022-10-19 Thread Bruno Turcksin
Simon, On Wed, Oct 19, 2022 at 09:33, Simon Wiesheier wrote: > Thank you for your answer! > > "Did you set DEAL_II_NUM_THREADS=1?" > > How can I double-check that? > ccmake . > only shows me the variables CMAKE_BUILD_TYPE and deal.II_DIR . > But I do not know if this is the right place to

Re: [deal.II] Re: measuring cpu and wall time for assembly routine

2022-10-19 Thread Simon Wiesheier
Thank you for your answer! "Did you set DEAL_II_NUM_THREADS=1?" How can I double-check that? ccmake . only shows me the variables CMAKE_BUILD_TYPE and deal.II_DIR . But I do not know if this is the right place to look. "That could explain why CPU and Wall time are different. Finally, if I

Re: [deal.II] Re: Run time analysis for step-75 with matrix-free or matrix-based method

2022-10-19 Thread 'yy.wayne' via deal.II User Group
Besides, the Trilinos direct solver applied is Amesos_Lapack (a mistake). Changing to Klu therefore saves more time. Wrote on Wednesday, October 19, 2022 at 20:30:53 UTC+8: > I ran both the matrix-based and matrix-free modes in release mode; both sped > up a lot. The matrix-free CG iteration speeds up 30 times

[deal.II] Re: measuring cpu and wall time for assembly routine

2022-10-19 Thread Bruno Turcksin
Simon, The best way to profile a code is to use a profiler. It can give a lot more information than simple timers can. You say that your code is not parallelized, but by default deal.II is multithreaded. Did you set DEAL_II_NUM_THREADS=1? That could explain why CPU and Wall time are
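To double-check from inside a program whether the variable mentioned above is set, one can query the process environment directly. The following is a minimal sketch using only the C++17 standard library; the helper names `read_env` and `describe_thread_limit` are hypothetical and not part of deal.II, and the sketch only reports the variable, it does not show how deal.II interprets it.

```cpp
#include <cassert>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper: returns the value of an environment variable,
// or std::nullopt if the variable is not set in this process.
std::optional<std::string> read_env(const char *name)
{
  assert(name != nullptr);
  const char *value = std::getenv(name);
  if (value == nullptr)
    return std::nullopt;
  return std::string(value);
}

// Hypothetical helper: human-readable summary of DEAL_II_NUM_THREADS.
std::string describe_thread_limit()
{
  const std::optional<std::string> value = read_env("DEAL_II_NUM_THREADS");
  if (value.has_value())
    return "DEAL_II_NUM_THREADS=" + *value;
  return "DEAL_II_NUM_THREADS is not set";
}
```

Printing `describe_thread_limit()` at program start-up shows what the running process actually sees, which may differ from what an interactive shell reports.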

Re: [deal.II] Re: Run time analysis for step-75 with matrix-free or matrix-based method

2022-10-19 Thread Martin Kronbichler
Dear Wayne, For performance it certainly matters, because some components of our codes have more low-level checks in debug mode than others, and because the compiler optimizations do not have the same effect on all parts of our code. Make sure to test the release mode and see if it makes more

[deal.II] measuring cpu and wall time for assembly routine

2022-10-19 Thread Simon
Dear all, I implemented two different versions to compute a stress for a given strain and want to compare the associated computation times in release mode. version 1: stress = fun1(strain) cpu time: 4.52 s wall time: 4.53 s version 2: stress = fun2(strain) cpu time: 32.5 s
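A comparison of this kind can be sketched with the standard library alone. The snippet below is an illustration under stated assumptions, not the poster's code: `Timings` and `measure` are hypothetical names, CPU time is taken from `std::clock` (which on POSIX systems reports processor time of the whole process, summed over threads) and wall time from `std::chrono::steady_clock`.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <utility>

// Hypothetical result type: elapsed CPU and wall-clock time in seconds.
struct Timings
{
  double cpu_seconds;
  double wall_seconds;
};

// Hypothetical helper: runs `work` once and reports both clocks.
template <typename Callable>
Timings measure(Callable &&work)
{
  const std::clock_t cpu_start  = std::clock();
  const auto         wall_start = std::chrono::steady_clock::now();

  std::forward<Callable>(work)();

  const std::clock_t cpu_end  = std::clock();
  const auto         wall_end = std::chrono::steady_clock::now();

  // std::clock returns (clock_t)(-1) if processor time is unavailable.
  assert(cpu_start != static_cast<std::clock_t>(-1));
  assert(cpu_end != static_cast<std::clock_t>(-1));

  Timings result;
  result.cpu_seconds =
    static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
  result.wall_seconds =
    std::chrono::duration<double>(wall_end - wall_start).count();
  return result;
}
```

Calling `measure([&] { stress = fun1(strain); })` and the same for `fun2` would give one pair of numbers per version. Because CPU time is accumulated over all threads of the process, it can exceed wall time when some library code runs multithreaded.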

Re: [deal.II] Re: Run time analysis for step-75 with matrix-free or matrix-based method

2022-10-19 Thread 'yy.wayne' via deal.II User Group
Thanks, Martin! I never considered Debug versus optimized mode before. The CMake output says I'm using Debug mode. Some more information: The computation is done with deal.II 9.4.0 in Oracle VirtualBox, with 1 MPI process in Qt Creator, and the CPU is an Intel 10600KF. I didn't change the CMakeLists and

Re: [deal.II] Re: Run time analysis for step-75 with matrix-free or matrix-based method

2022-10-19 Thread Martin Kronbichler
Dear Wayne, I am a bit surprised by your numbers and find them rather high, at least with the chosen problem sizes. I would expect the matrix-free solver to run in less than a second for 111,000 unknowns on typical computers, not almost 10 seconds. I honestly have to say that I do not have a

[deal.II] Re: Run time analysis for step-75 with matrix-free or matrix-based method

2022-10-19 Thread 'yy.wayne' via deal.II User Group
Thanks for your reply, Peter. The matrix-free run is basically the same as in step-75, except that I substitute the coarse grid solver. For fe_degree=6 without GMG, and with fe_degree decreasing by 1 on each level for pMG, the solve_system() function runtime is 24.1 s. It's decomposed to *MatrixFree MG operators

[deal.II] Re: Run time analysis for step-75 with matrix-free or matrix-based method

2022-10-19 Thread Peter Munch
Hi Wayne, your numbers make total sense. Don't forget that you are running for high order: degree=6! The number of non-zeroes per element-stiffness matrix is ((degree + 1)^dim)^2 and the cost of computing the element stiffness matrix is even ((degree + 1)^dim)^3 if I am not mistaken (3
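To put numbers on the scaling quoted above, the two expressions can be evaluated directly. This is a small sketch under the assumption of a scalar tensor-product element with (degree + 1)^dim unknowns per cell; the function names are hypothetical, and the cubic cost is the estimate from the message, stated there with "if I am not mistaken".

```cpp
#include <cassert>
#include <cstdint>

// Unknowns per cell for a scalar tensor-product element (assumption):
// (degree + 1)^dim.
std::uint64_t dofs_per_cell(unsigned int degree, unsigned int dim)
{
  assert(dim >= 1 && dim <= 3);
  std::uint64_t n = 1;
  for (unsigned int d = 0; d < dim; ++d)
    n *= (degree + 1);
  return n;
}

// Entries of the dense element stiffness matrix: ((degree + 1)^dim)^2.
std::uint64_t element_matrix_entries(unsigned int degree, unsigned int dim)
{
  const std::uint64_t n = dofs_per_cell(degree, dim);
  return n * n;
}

// Estimated cost of computing that matrix, as quoted in the message:
// ((degree + 1)^dim)^3.
std::uint64_t element_matrix_cost(unsigned int degree, unsigned int dim)
{
  const std::uint64_t n = dofs_per_cell(degree, dim);
  return n * n * n;
}
```

For degree=6 in two dimensions this gives 49 unknowns per cell, 2401 matrix entries, and an estimated cost proportional to 117649; for degree=1 the same quantities are 4, 16, and 64.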