Dear experts,
Has anybody used Score-P or Tau in a libMesh application? Did you get
accurate network cost measurements? How did you do the instrumentation?
Did you rebuild libMesh with it?
Thanks for any help!
Regards,
Fábio
--
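For what it's worth, a hedged sketch of Score-P's manual user
instrumentation, which can be added to application code without touching
libMesh itself; the routine name and region label below are made up, and
the exact wrapper invocation depends on the Score-P installation:

    // Compile through the Score-P wrapper with user instrumentation
    // enabled, e.g.:  scorep --user mpicxx -DSCOREP_USER_ENABLE ...
    #include <scorep/SCOREP_User.h>

    void assemble_stiffness() // hypothetical application routine
    {
      SCOREP_USER_REGION_DEFINE(assembly_handle)
      SCOREP_USER_REGION_BEGIN(assembly_handle, "assemble_stiffness",
                               SCOREP_USER_REGION_TYPE_COMMON)

      // ... element loop / matrix and vector insertion ...

      SCOREP_USER_REGION_END(assembly_handle)
    }

Plain compiler/MPI-level instrumentation of the application (without
rebuilding libMesh) is usually the first thing to try; rebuilding libMesh
through the wrappers should give finer-grained data but takes more effort.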
I am trying to compute the time spent evaluating certain routines in a
code. I saw in some of the examples that sections of the code were bounded
by the Start/Stop_Log tags. How does one incorporate these tags into a
common PerfLog class object to print out the time spent evaluating the diff
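A minimal sketch of one way to do this with libMesh's PerfLog class,
assuming you want a log separate from the library-wide one; the label
strings and the evaluate_diff() routine are placeholders:

    #include "libmesh/perf_log.h"

    // One PerfLog shared by all the routines you want to time.
    libMesh::PerfLog perf_log("diff evaluation");

    void evaluate_diff() // hypothetical routine to be timed
    {
      perf_log.push("evaluate_diff");
      // ... work to be timed ...
      perf_log.pop("evaluate_diff");
    }

    // At the end of the run, e.g. in main():
    //   libMesh::out << perf_log.get_log() << std::endl;

As far as I can tell, the START_LOG/STOP_LOG macros in libmesh_logging.h do
essentially the same push/pop, but against the library's internal perflog
object, so those timings end up in the standard libMesh performance table.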
On Thu, 14 Aug 2014, walter kou wrote:
> I am currently using scalar-valued FE in libMesh (like the 3D Linear
> Elastic Cantilever example). It seems libMesh 0.9.3 adds more features to
> FEMSystem, such as vector-valued FE.
You can use vector-valued elements without FEMSystem and vice-versa.
> 1)
Dear all,
I am currently using scalar-valued FE in libMesh (like the 3D Linear
Elastic Cantilever example). It seems libMesh 0.9.3 adds more features to
FEMSystem, such as vector-valued FE.
My question is:
1) What is the advantage of the FEMSystem-based approach over the approach
based on scalar-valued FE?
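For comparison, a hedged sketch of the two ways of declaring the
displacement field; the variable names and orders are arbitrary, and either
choice can be used with or without FEMSystem:

    #include "libmesh/system.h"
    #include "libmesh/enum_order.h"
    #include "libmesh/enum_fe_family.h"

    using namespace libMesh;

    void add_displacement_variables(System & sys, bool use_vector_fe)
    {
      if (use_vector_fe)
        // One vector-valued variable with dim components:
        sys.add_variable("disp", FIRST, LAGRANGE_VEC);
      else
        {
          // Three scalar components, as in the 3D cantilever example:
          sys.add_variable("u", FIRST, LAGRANGE);
          sys.add_variable("v", FIRST, LAGRANGE);
          sys.add_variable("w", FIRST, LAGRANGE);
        }
    }

Roughly, the main practical difference shows up in the assembly loop: with
LAGRANGE_VEC you work with one set of vector-valued shape functions instead
of three copies of the scalar code.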
It may be a good idea to combine this study with looking at the output
of -log_summary (assuming PETSc is your solver engine) to see if there
is a correlation with the growth in communication or other components
of matrix assembly.
Dmitry.
On Mon, Jun 10, 2013 at 10:16 AM, Ataollah Mesgarnejad wrote:
Cody,
I'm not sure if you saw the graph, so I uploaded it again here:
https://dl.dropboxusercontent.com/u/19391830/scaling.jpg.
In all these runs the NDOFs/Processor is less than 1. What is bothering me
is that enforce_constraints_exactly is taking up more and more time as the
number of processors increases.
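For context, a hedged sketch of the call that this stage corresponds to; it
is typically applied to a solution vector after a solve or a projection
(the System reference and function name here are placeholders):

    #include "libmesh/system.h"
    #include "libmesh/dof_map.h"
    #include "libmesh/numeric_vector.h"

    using namespace libMesh;

    void apply_constraints(System & system)
    {
      // Overwrite constrained dofs (hanging nodes, Dirichlet/periodic
      // constraints) with the values implied by their constraint rows.
      system.get_dof_map().enforce_constraints_exactly
        (system, system.solution.get());
      system.update(); // refresh current_local_solution
    }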
Ata,
You might be scaling past the reasonable limit for libMesh. I don't know
what solver you are using, but for a strong scaling study we generally
don't go below 10,000 local DOFs. This is the recommended floor for PETSc
too:
http://www.mcs.anl.gov/petsc/documentation/faq.html#slowerparallel
Dear all,
I've been doing some scaling tests on my code. When I look at the time (or
% of time) spent at each stage in the libMesh log, I see that the
enforce_constraints_exactly stage in DofMap is scaling very badly. I was
wondering if anyone can comment.
Here is my EquationSystems.print_info():
Equat
> If you do decide to do some benchmarks could you post them here?
> Especially if it got a lot slower...
>
will do.
df
On Wed, Dec 31, 2008 at 2:17 PM, David Fuentes wrote:
>
> Have there been any significant performance increases in libMesh
> since release 0.6.3 ?
Nothing really comes to mind... I think the Threading Building Blocks
stuff was already there by then. Roy did some stuff to the MPI
interface which
Have there been any significant performance increases in libMesh
since release 0.6.3 ?
I wanted to profile the latest and greatest, but I'm not ready to catch up
to the trunk version if it's not necessary.
df
Are there any performance tests of libMesh published anywhere?
I want to get an idea of how the assembly routines for a
representative stiffness matrix and load vector perform in 3D
when everything is compiled in optimized mode.
thanks,
df
Dear Ben,
On Tue, 9 Sep 2008, Benjamin Kirk wrote:
I'm not very familiar with the hardware part, but the lspci output
looks like I have *two* GigE devices, so they are not shared between
the two processors, right?
Try /sbin/ifconfig to see how many interfaces are actually configured. Most
ne
Nevermind...;-)
- Original Message -
From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
Cc: [email protected]
Sent: Tue Sep 09 11:52:23 2008
Subject: Re: [Libmesh-users] Performance of EquationSystems::reinit() with
ParallelMesh
son <[EMAIL PROTECTED]>; [email protected]
Sent: Tue Sep 09 11:42:54 2008
Subject: Re: [Libmesh-users] Performance of EquationSystems::reinit() with
ParallelMesh
On Tue, 9 Sep 2008, Kirk, Benjamin (JSC-EG) wrote:
> At one point the send_list was used to do exactly that. In
On Tue, 9 Sep 2008, Kirk, Benjamin (JSC-EG) wrote:
> At one point the send_list was used to do exactly that. In fact,
> that is the only reason it exists. I need to look back and see when
> that changed. If the send list is broken I'll factor roy's stuff
> into the DofMap to fix it.
Whoa, wai
From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
To: John Peterson <[EMAIL PROTECTED]>
Cc: libmesh-users
Sent: Tue Sep 09 11:34:14 2008
Subject: Re: [Libmesh-users] Performance of EquationSystems::reinit() with
ParallelMesh
On Tue, 9 Sep 2008, John Peterson wrote:
> Did you send a patch to the list?
On Tue, 9 Sep 2008, John Peterson wrote:
Did you send a patch to the list? I'm going back through my email but
not seeing it...
No, just to Tim (and Ben, since he'd mentioned having some time to
benchmark a hopefully similar problem). When the patch didn't show
any clear improvement, I didn'
On Tue, Sep 9, 2008 at 9:46 AM, Roy Stogner <[EMAIL PROTECTED]> wrote:
>
> On Tue, 9 Sep 2008, Benjamin Kirk wrote:
>
>> That is my thinking. The System::project_vector() code does some all-to-all
>> communication,
>
> Would you take a look at that patch of mine? I thought it removed
> half of th
>> Running with 1 CPU/node will hopefully perform better
>> since you are not sharing a gigE connection between processors.
>
> I'm not very familiar with the hardware part, but the lspci output
> looks like I have *two* GigE devices, so they are not shared between
> the two processors, right?
T
On Tuesday 09 September 2008 16:01:04 Tim Kroeger wrote:
> Dear John,
>
> On Tue, 9 Sep 2008, John Peterson wrote:
> > On linux, lspci will tell you something about the hardware connected
> > to the PCI bus. This may list the interconnect device(s).
>
> lspci seems not to be installed on that machine, although it is linux.
Dear Ben,
On Tue, 9 Sep 2008, Benjamin Kirk wrote:
> Running with 1 CPU/node will hopefully perform better
> since you are not sharing a gigE connection between processors.
I'm not very familiar with the hardware part, but the lspci output
looks like I have *two* GigE devices, so they are not shared between
the two processors, right?
Dear John,
On Tue, 9 Sep 2008, John Peterson wrote:
>> Attached is the output with 20 nodes and 1 CPU per node. Unfortunately, it's
>> even slower than 10 nodes with 2 CPUs each.
>
> Interesting. And you have exclusive access to these nodes via some
> sort of scheduling software? There's no cha
On Tue, Sep 9, 2008 at 9:44 AM, Tim Kroeger
<[EMAIL PROTECTED]> wrote:
> Dear all,
>
> On Tue, 9 Sep 2008, Benjamin Kirk wrote:
>
>>> There are about 120 nodes with 2 CPUs each. Please find attached the
>>> content of /proc/cpuinfo of one of these nodes (should be typical for
>>> all of them). Wh
Sure. I'll take a look at it this afternoon.
On 9/9/08 9:46 AM, "Roy Stogner" <[EMAIL PROTECTED]> wrote:
>
>
> On Tue, 9 Sep 2008, Benjamin Kirk wrote:
>
>> That is my thinking. The System::project_vector() code does some all-to-all
>> communication,
>
> Would you take a look at that patch
On Tue, 9 Sep 2008, Benjamin Kirk wrote:
> That is my thinking. The System::project_vector() code does some all-to-all
> communication,
Would you take a look at that patch of mine? I thought it removed
half of the all-to-all communication (the global vector of old degrees
of freedom, but not t
Dear Ben,
On Tue, 9 Sep 2008, Benjamin Kirk wrote:
> That is my thinking. The System::project_vector() code does some all-to-all
> communication, and this seems to be scaling quite badly as you get to larger
> processor counts. Running with 1 CPU/node will hopefully perform better
> since you are not sharing a gigE connection between processors.
> Compute node and head node give exactly the same output. So does this
> mean I have a very slow interconnect, and is this the reason for the
> bad scalability?
That is my thinking. The System::project_vector() code does some all-to-all
communication, and this seems to be scaling quite badly as you get to larger
processor counts.
Dear Ben,
On Tue, 9 Sep 2008, Benjamin Kirk wrote:
> If you have not done so already it would be instructive to see how using one
> CPU per node performs.
Okay, I have started with 1 node and 1 CPU (i.e. serial) now. It
might not be finished before knocking-off time, though. I will then
(i.e
On Tue, Sep 9, 2008 at 9:18 AM, Tim Kroeger
<[EMAIL PROTECTED]> wrote:
> Dear John,
>
> On Tue, 9 Sep 2008, John Peterson wrote:
>
>> On linux, lspci will tell you something about the hardware connected
>> to the PCI bus. This may list the interconnect device(s).
>
> lspci seems not to be installed on that machine, although it is linux.
Dear John,
On Tue, 9 Sep 2008, John Peterson wrote:
> On linux, lspci will tell you something about the hardware connected
> to the PCI bus. This may list the interconnect device(s).
lspci seems not to be installed on that machine, although it is linux.
>>>
>>> Try /sbin/lspci
> There are about 120 nodes with 2 CPUs each. Please find attached the
> content of /proc/cpuinfo of one of these nodes (should be typical for
> all of them). When I run with n CPUs, I usually mean that I run on
> n/2 nodes using both CPUs each (although there is also the possibility
> to use one
On Tue, Sep 9, 2008 at 9:08 AM, Tim Kroeger
<[EMAIL PROTECTED]> wrote:
> Dear Ben,
>
> On Tue, 9 Sep 2008, Benjamin Kirk wrote:
>
On linux, lspci will tell you something about the hardware connected
to the PCI bus. This may list the interconnect device(s).
>>>
>>> lspci seems not to be installed on that machine, although it is linux.
Dear Ben,
On Tue, 9 Sep 2008, Benjamin Kirk wrote:
On linux, lspci will tell you something about the hardware connected
to the PCI bus. This may list the interconnect device(s).
lspci seems not to be installed on that machine, although it is linux.
Try /sbin/lspci - there is a good chance /sbin is not in your path.
On Tue, 9 Sep 2008, Tim Kroeger wrote:
> Dear John,
>
> On Tue, 9 Sep 2008, John Peterson wrote:
>
>> On linux, lspci will tell you something about the hardware connected
>> to the PCI bus. This may list the interconnect device(s).
>
> lspci seems not to be installed on that machine, although it is linux.
>> On linux, lspci will tell you something about the hardware connected
>> to the PCI bus. This may list the interconnect device(s).
>
> lspci seems not to be installed on that machine, although it is linux.
Try /sbin/lspci - there is a good chance /sbin is not in your path.
-Ben
Dear John,
On Tue, 9 Sep 2008, John Peterson wrote:
> On linux, lspci will tell you something about the hardware connected
> to the PCI bus. This may list the interconnect device(s).
lspci seems not to be installed on that machine, although it is linux.
Best Regards,
Tim
--
Dr. Tim Kroeger
On Tue, Sep 9, 2008 at 3:53 AM, Tim Kroeger
<[EMAIL PROTECTED]> wrote:
>
> Concerning the interconnect, I actually don't know. Is there some easy way
> to find out, i.e. some file like /proc/something? (The machine is located
> far away, so it's not with the admin sitting next door to me.)
On linux, lspci will tell you something about the hardware connected
to the PCI bus. This may list the interconnect device(s).
>> So the project_vector() performance went from 168-179 sec before the patch to
>> 134-148 sec after the patch... but the total time used only went down by
>> about 3 seconds, not 30, because apparently "All" started using up the
>> remainder?
>
> Very strange, really. The application was defini
> I noticed something very reminiscent of this just two days ago. In my case
> I run a transient solution to steady-state and then stop the simulation.
>
> I then re-read this result, refine the mesh, project the solution, and
> re-converge on the refined mesh.
>
> I can't quantify it at the moment but this took a lot longer than expected.
Tim,
How many variables and vectors are in your system?
-Ben
On 9/5/08 9:42 AM, "Tim Kroeger" <[EMAIL PROTECTED]> wrote:
> Dear Roy,
>
> On Thu, 4 Sep 2008, Roy Stogner wrote:
>
>>> I see, you are also calling serial vectors "global vectors" now.
>>
>> Just one subset of serial vectors: those for which every coefficient
>> is valid.
Dear Roy,
On Thu, 4 Sep 2008, Roy Stogner wrote:
>> I see, you are also calling serial vectors "global vectors" now.
>
> Just one subset of serial vectors: those for which every coefficient
> is valid.
Okay, so we have parallel and serial vectors, where serial vectors can
be divided into global
Dear Ben,
On Thu, 4 Sep 2008, Benjamin Kirk wrote:
> I noticed something very reminiscent of this just two days ago.
> [...]
> I can't quantify it at the moment but this took a lot longer than expected.
> It just "seemed really long." It was somewhat acceptable in my case since I
> then did a hun
I have just gotten back up to speed with the mailing list after a few
distractions.
I noticed something very reminiscent of this just two days ago. In my case
I run a transient solution to steady-state and then stop the simulation.
I then re-read this result, refine the mesh, project the solution,
On Thu, 4 Sep 2008, Tim Kroeger wrote:
> On Wed, 3 Sep 2008, Roy Stogner wrote:
>
> What about better documentation of this? I have attached a patch for it.
>
>>> but I can't find a corresponding "serialize()" method.
>>
>> There is no reverse method to produce a parallel vector from a global vector.
Dear Roy,
On Wed, 3 Sep 2008, Roy Stogner wrote:
As far as I understand, currently the NumericVector interface class (with
its implementations such as PetscVector) is used as a parallel as well as a
serial vector.
Yes.
I notice that there is a method localize() that seems to transform a
se
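A minimal sketch of the two localize() flavors as the interface stands,
assuming a System named sys; the send_list variant only pulls the ghost
coefficients each processor actually needs:

    #include "libmesh/system.h"
    #include "libmesh/dof_map.h"
    #include "libmesh/numeric_vector.h"
    #include <vector>

    using namespace libMesh;

    void gather_solution(System & sys)
    {
      // 1) Localize the distributed solution into a std::vector holding
      //    every coefficient on every processor:
      std::vector<Number> all_coeffs;
      sys.solution->localize(all_coeffs);

      // 2) Localize into another NumericVector, restricted to the local
      //    dofs plus the ghost dofs listed in the DofMap's send_list:
      sys.solution->localize(*sys.current_local_solution,
                             sys.get_dof_map().get_send_list());
    }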
On Thu, Sep 4, 2008 at 12:41 AM, Roy Stogner <[EMAIL PROTECTED]> wrote:
>
>
> On Wed, 3 Sep 2008, Roy Stogner wrote:
>
>> On Tue, 2 Sep 2008, Tim Kroeger wrote:
>>
>>> By the way: I found a funny typo in libmesh_logging.h, see attached patch.
>>
>> That's a silly one; thanks! The fix is committed now.
On Wed, 3 Sep 2008, Roy Stogner wrote:
> On Tue, 2 Sep 2008, Tim Kroeger wrote:
>
>> By the way: I found a funny typo in libmesh_logging.h, see attached patch.
>
> That's a silly one; thanks! The fix is committed now.
Spoke too soon. Apparently just running "svn commit" isn't enough
when the
On Tue, 2 Sep 2008, Tim Kroeger wrote:
> Okay, I understand. I didn't think about the mesh to occupy so much memory,
> in particular because I have a lot of systems on that mesh.
Unfortunately it does. I don't know about your application, but keep
in mind that more systems results in larger d
On Mon, 1 Sep 2008, Tim Kroeger wrote:
> Since I didn't get any reply yet, I am not sure whether you got the mail
> below. On the other hand, perhaps you just didn't answer because you found
> there was nothing to say.
Actually, neither was the case - I didn't answer because I had very
little t
Dear libMesh developer team,
Could someone please tell me what the current state of the item
mentioned below is? I just want to know, so that I can decide whether
I should wait, work around it somehow, or whatever.
(In the current state, my code does not show any speedup on 20
processors comp
The fem data file.
2008/8/2 Shengli Xu <[EMAIL PROTECTED]>
> John,
>
> The test case is attached. It is a static analysis of 3D beam.
> Could you please give me the result of libMesh0.5.0 libMesh0.6.2?
>
> Thanks,
>
> 2008/8/1 John Peterson <[EMAIL PROTECTED]>
>
> Shengli and Adam,
>>
>> Could yo
John,
The test case is attached. It is a static analysis of 3D beam.
Could you please give me the result of libMesh0.5.0 libMesh0.6.2?
Thanks,
2008/8/1 John Peterson <[EMAIL PROTECTED]>
> Shengli and Adam,
>
> Could you please send us the simplest possible test case which
> displays the poor performance?
Shengli and Adam,
Could you please send us the simplest possible test case which
displays the poor performance? If the problem can be replicated with
one of the example problems directly, then please let us know what
command line or input arguments you used to show the slow matrix
insertion results.
Dear libMesh team,
As I'm asking so many questions anyway, here's another one: How will
the performance of EquationSystems::reinit() behave when ParallelMesh
is ready for use by user applications? The reason for my question is
that for my code on a large number of processors,
EquationSystems:
On Wed, 16 Apr 2008, Tim Kroeger wrote:
> As I'm asking so many questions anyway, here's another one: How will
> the performance of EquationSystems::reinit() behave when ParallelMesh
> is ready for use by user applications?
At first, just as badly as before. System::project_vector() is still
on