Pete,

Bruno, I assumed that the thread with 100% CPU usage was somehow feeding the others in step-8.

It's more like this: for some functions, we split the work across as many threads as there are CPUs. But the next function you call may not be parallelized, in which case everything runs on a single thread. On average, that one thread shows a load of 100% whereas the others show less.


I just tested the step-8 program with PreconditionIdentity(), which showed 100%
CPU usage on all 8 CPUs. The results follow. Assuming that having no preconditioner
only slows things down, maybe getting 3 times the CPU power will make up for it. I
haven't checked solve times yet. The preconditioner for step-8 was
PreconditionSSOR<> with relaxation parameter = 1.2. Is there an optimal
preconditioner/relaxation parameter for 3d elasticity problems that you know
of? Or is their determination only by trial and error?

1.2 seems to be what a lot of people use.

As for thread use: if you use PreconditionIdentity, *all* major operations that CG calls are parallelized. On the other hand, using PreconditionSSOR, you will spend at least 50% of your time in the preconditioner, but SSOR is a sequential method in which you need to compute the update for one vector element before you can move on to the next. It therefore cannot be parallelized, and consequently your average thread load will be less than 100%.

Neither of these is a good preconditioner in the big scheme of things if you envision going to large problems. For those, you ought to use variations of the multigrid method.


Wolfgang, what I meant by efficiency was that the CPU usage in the threads for
step-17 NEW and OLD decreased as the number of DoFs (i.e., the cycle number) grew.

If the load decreased for both codes, I would attribute this to memory traffic. If the problem is small enough, much of it fits into the caches of the processor/cores, and so you get high throughput. If the problem becomes bigger, the processors wait longer for data. Waiting is, IIRC, still counted as processor load, but it may make operations that are not parallelized take longer relative to those that are, and so overall lead to a lower average thread load.

But that's only a theory that would require a lot more digging to verify.

Best
 W.


--
------------------------------------------------------------------------
Wolfgang Bangerth               email:            bange...@colostate.edu
                                www: http://www.math.tamu.edu/~bangerth/

--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see 
https://groups.google.com/d/forum/dealii?hl=en
