Dear Roy,
On Thu, 9 Apr 2009, Tim Kroeger wrote:
On Thu, 9 Apr 2009, Roy Stogner wrote:
Ugh, definitely. Although as an immediate start: do you have
tracefiles turned on?
No, I didn't. Actually, up to now, I didn't know this option existed.
I have restarted it now with that option enabled,
On Thu 2009-04-09 14:03, Kirk, Benjamin (JSC-EG311) wrote:
> While we do not currently manage the array ourselves, what I am proposing is
> that we make that step, and further to do it in the base class so all the
> derived classes can benefit.
You still don't have to deal with allocating the mem
>> As for the performance implications, I think we can get nearly all of it
>> back by optimizing the underlying PetscVector (but actually do it in the
>> NumericVector base class) to bypass the PETSc API whenever possible.
>>
>> Specifically, we would use the VecCreateGhostWithArray (sp?) method
On Thu, Apr 9, 2009 at 10:19 AM, Kirk, Benjamin (JSC-EG311)
wrote:
>
> Then there is the fact that the GlobalToLocalMap is derived from a
> std::map...
>
> std::map is very convenient for building up the list, but is not so
> efficient in terms of access or memory requirements.
>
> What I would sug
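[Ben's suggestion is cut off above; a common remedy for this exact tradeoff, sketched here as an assumption rather than what he actually proposed, is to build with std::map and then flatten into a sorted std::vector for compact, cache-friendly binary-search lookups:]

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

using FlatMap = std::vector<std::pair<unsigned int, unsigned int>>;

// Flatten the std::map (convenient while building: sorted, unique keys)
// into a contiguous sorted vector (no per-node allocation overhead).
inline FlatMap flatten(const std::map<unsigned int, unsigned int> &m)
{
  return FlatMap(m.begin(), m.end());
}

// Binary-search lookup by global index; the assert mirrors the
// library's debug-mode "index must be present" checks.
inline unsigned int lookup_local(const FlatMap &flat, unsigned int global)
{
  auto it = std::lower_bound(flat.begin(), flat.end(),
                             std::make_pair(global, 0u));
  assert(it != flat.end() && it->first == global);
  return it->second;
}
```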
On Thu, 9 Apr 2009, Kirk, Benjamin (JSC-EG311) wrote:
> Two obvious optimizations seem possible, though. the
> this->first_local_index() and this->last_local_index() both perform the same
> underlying PETSc call to get *both* [local_min, local_max) and only return
> the QOI.
>
> This could be st
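[The rest of Ben's message is truncated; one plausible shape of the optimization, sketched with a hypothetical RangeCache class (not libMesh's actual code), is to issue the single underlying query once and cache both bounds, so first_local_index() and last_local_index() stop each repeating the same backend call:]

```cpp
#include <cassert>
#include <utility>

// Toy model: the real backend call would be PETSc's
// VecGetOwnershipRange, which returns both bounds at once.
class RangeCache
{
public:
  // Stand-in for the single underlying PETSc query.
  std::pair<int, int> query_ownership_range() const { return {100, 150}; }

  int first_local_index() { ensure_cached(); return _first; }
  int last_local_index()  { ensure_cached(); return _last; }

private:
  void ensure_cached()
  {
    if (!_cached)
      {
        // One query fills both cached bounds.
        const std::pair<int, int> r = query_ownership_range();
        _first  = r.first;
        _last   = r.second;
        _cached = true;
      }
  }

  bool _cached = false;
  int _first = 0, _last = 0;
};
```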
>> Of course, restructuring the whole libMesh library to use a
>> ghosted concept that matches PETSc's concept is utopian, so we can't
>> remove all inefficiencies.
>
> The only issue there is local vs. global index lookups: we currently
> only expose global dof indices to the user, and s
On Thu, 9 Apr 2009, Tim Kroeger wrote:
> BTW: Why is --enable-tracefiles not enabled by default? It shouldn't bother
> anybody since it doesn't do anything as long as the application doesn't
> crash, does it? And if it *does* crash, this option is very useful.
Hmm... I think I was worried ab
On Thu, 9 Apr 2009, Tim Kroeger wrote:
> Of course, restructuring the whole libMesh library to use a
> ghosted concept that matches PETSc's concept is utopian, so we can't
> remove all inefficiencies.
The only issue there is local vs. global index lookups: we currently
only expose global
Dear Roy,
On Thu, 9 Apr 2009, Roy Stogner wrote:
> On Thu, 9 Apr 2009, Tim Kroeger wrote:
>
>> Assertion `it!=_global_to_local_map.end()' failed.
>> [16]
>> /home/tkroeger/archives/libMesh/libmesh/include/numerics/petsc_vector.h,
>> line 956, compiled Mar 31 2009 at 08:17:12
>>
>> There seems
Dear Jed,
On Thu, 9 Apr 2009, Jed Brown wrote:
> On Thu 2009-04-09 08:12, Tim Kroeger wrote:
>
>> What exactly do you mean? In other words, what do I have to do to
>> produce the profiling output that you would like to see?
>
> Run with -log_summary so that we can see how much time is being spen
On Thu, 9 Apr 2009, Tim Kroeger wrote:
> Assertion `it!=_global_to_local_map.end()' failed.
> [16] /home/tkroeger/archives/libMesh/libmesh/include/numerics/petsc_vector.h,
> line 956, compiled Mar 31 2009 at 08:17:12
>
> There seems to be another bug.
>
> What do you think, should I track this d
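[The assertion Tim hits lives in PetscVector's ghosted read path. A toy model, much simpler than libMesh's actual class, shows the structure: owned entries are stored contiguously, ghost entries sit behind a global-to-local map, and operator() asserts that any non-owned index was actually registered as a ghost — exactly the check that fires above:]

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for a ghosted vector; names loosely follow
// the PetscVector members mentioned in the thread.
struct GhostedVector
{
  int first_local, last_local;             // owned range [first, last)
  std::vector<double> values;              // owned entries, then ghosts
  std::map<int, int> _global_to_local_map; // ghost global index -> ghost offset

  double operator()(int i) const
  {
    if (i >= first_local && i < last_local)
      return values[i - first_local];      // owned: direct offset
    auto it = _global_to_local_map.find(i);
    assert(it != _global_to_local_map.end()); // fires if i was never ghosted
    return values[(last_local - first_local) + it->second];
  }
};
```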
On Thu 2009-04-09 08:12, Tim Kroeger wrote:
> Dear Jed,
>
> On Wed, 8 Apr 2009, Jed Brown wrote:
>
>> I would be interested to see profiling output from Tim's code with
>> ghosted vectors.
>
> What exactly do you mean? In other words, what do I have to do to
> produce the profiling output that y
Dear Roy,
On Wed, 8 Apr 2009, Tim Kroeger wrote:
> What do you think, should I perform another two computations with 24
> CPUs (on three nodes) and see how fast that is?
Of course, "What do you think, should I" means "I will" in this case,
and the result is:
Assertion `it!=_global_to_local_ma
Dear Jed,
On Wed, 8 Apr 2009, Jed Brown wrote:
> I would be interested to see profiling output from Tim's code with
> ghosted vectors.
What exactly do you mean? In other words, what do I have to do to
produce the profiling output that you would like to see?
> Also, note that there is a bit of
On Wed 2009-04-08 08:48, Kirk, Benjamin (JSC-EG311) wrote:
> As for the performance implications, I think we can get nearly all of it
> back by optimizing the underlying PetscVector (but actually do it in the
> NumericVector base class) to bypass the PETSc API whenever possible.
>
> Specifically,
>> How many vectors are you storing in this application?
>
> Quite a lot. (-:
>
> (About 8 systems, most of which are transient, and some of which have
> additional vectors.)
>
>> Redundant vector storage being the limiting factor in scaling at this scale
>> surprises me...
>
> Roy was also s
Dear Ben,
On Wed, 8 Apr 2009, Kirk, Benjamin (JSC-EG311) wrote:
> How many vectors are you storing in this application?
Quite a lot. (-:
(About 8 systems, most of which are transient, and some of which have
additional vectors.)
> Redundant vector storage being the limiting factor in scaling
How many vectors are you storing in this application?
Redundant vector storage being the limiting factor in scaling at this scale
surprises me...
----- Original Message -----
From: Tim Kroeger
To: Roy Stogner
Cc: libmesh-devel
Sent: Wed Apr 08 01:46:34 2009
Subject: Re: [Libmesh-devel
Dear Roy,
On Tue, 7 Apr 2009, Roy Stogner wrote:
> My last reservation: is there any performance penalty? The ghosted
> vectors should be more scalable than serial vectors on N processors,
> but they've got overhead that may cost CPU time on 2-4 processors.
> When you were regression testing tho
On Tue, 7 Apr 2009, Derek Gaston wrote:
> On Apr 7, 2009, at 10:01 AM, Roy Stogner wrote:
>
>> It wouldn't be too hard, if we were just talking about PETSc. But
>> unless we want to break our other interfaces, we'll need the
>> equivalent of ghosted vectors from LASPACK and Trilinos (and
>> Dist
On Apr 7, 2009, at 10:01 AM, Roy Stogner wrote:
> It wouldn't be too hard, if we were just talking about PETSc. But
> unless we want to break our other interfaces, we'll need the
> equivalent of ghosted vectors from LASPACK and Trilinos (and
> DistributedVector? anyone using that for explicit pro
On Tue, 7 Apr 2009, Derek Gaston wrote:
On Apr 7, 2009, at 9:22 AM, Roy Stogner wrote:
On Mon, 6 Apr 2009, Tim Kroeger wrote:
I would vote for making ghosted vectors the default now.
I'm tempted to agree. (Which probably means everyone else is way
ahead of me;
On Apr 7, 2009, at 9:22 AM, Roy Stogner wrote:
On Mon, 6 Apr 2009, Tim Kroeger wrote:
I would vote for making ghosted vectors the default now.
I'm tempted to agree. (Which probably means everyone else is way
ahead of me; they just talked me into making --enable-second a default
option, year
On Mon, 6 Apr 2009, Tim Kroeger wrote:
> I would vote for making ghosted vectors the default now.
I'm tempted to agree. (Which probably means everyone else is way
ahead of me; they just talked me into making --enable-second a default
option, years after I wrote it.)
My last reservation: is the
Dear Roy,
On Thu, 26 Mar 2009, Tim Kroeger wrote:
> Perhaps I should just run both the ghosted and the non-ghosted version
> of my application two times each and look whether the final results
> between ghosted/non-ghosted differ considerably more than within these
> groups.
I did this now: Two r
On Thu, 26 Mar 2009, Jed Brown wrote:
> On Thu 2009-03-26 12:40, Roy Stogner wrote:
>> I'd be tempted to add that behavior as a very-non-default option to
>> FEMSystem... except, for it to really be useful you'd need a
>> deterministic solve() too, and I'm sure the same distributed FP
>> ordering
On Thu 2009-03-26 12:40, Roy Stogner wrote:
> I'd be tempted to add that behavior as a very-non-default option to
> FEMSystem... except, for it to really be useful you'd need a
> deterministic solve() too, and I'm sure the same distributed FP
> ordering problems come up there.
Nah, just MatView an
On Thu, 26 Mar 2009, Tim Kroeger wrote:
> Oops... at which line does your g++ complain if you don't include stdlib.h?
exit() was undefined.
> This non-reproducibility complicates the check whether the ghosted
> vectors change the code behaviour a lot, because it is no longer
> clear what "chang
On Thu 2009-03-26 10:54, Roy Stogner wrote:
> I can't say I'm happy about not having perfectly deterministic apps,
> but I'm not sure what to do about it. Maybe someone familiar with
> PETSc MatSetValues can chime in with "We're deterministic! Roy
> doesn't know what he's talking about" or "We *c
Dear Roy,
On Thu, 26 Mar 2009, Roy Stogner wrote:
> I needed "#include <stdlib.h>" to get test2 to work; g++ is getting
> more and more nitpicky about standards compliance.
Oops... at which line does your g++ complain if you don't include
stdlib.h?
> I can reproduce the non-reproducibility... but not to
On Thu, 26 Mar 2009, Tim Kroeger wrote:
> You might want to check whether you can reproduce the non-reproducibility.
> To do this, please download www.mevis.de/~tim/a.tar.gz and unpack it (small
> this time). Then run the attached test.cpp on 8 CPUs, which creates a simple
> grid, assembles
Dear Roy,
On Wed, 25 Mar 2009, Tim Kroeger wrote:
Unfortunately, the residuals still don't coincide. What surprises me
even more is that they even differ between identical runs, i.e. ghost
dofs are disabled both times. In other words, my application gets
results that are not reproducible, and
Dear Roy,
On Mon, 23 Mar 2009, Roy Stogner wrote:
> On Mon, 23 Mar 2009, Tim Kroeger wrote:
>
>> Of course, my residuals are again slightly off (fifth version now). I think
>> I should try out what happens when I switch back to non-ghosted vectors
>> now. I will do that as soon as the current
On Mon, 23 Mar 2009, Tim Kroeger wrote:
> My application does not crash any more. Well, at least it didn't crash at
> the `usual' point, and it's still running now, and up to now the results are
> more or less equal to those with non-ghosted vectors. If this remains
> true until the app
Dear Roy,
On Mon, 16 Mar 2009, Roy Stogner wrote:
> Anyway, I've checked the fixes into SVN; now might be a good time for
> those of us on the bleeding edge to update.
Great work! Thank you very much!
My application does not crash any more. Well, at least it didn't
crash at the `usual' point
On Fri, 13 Mar 2009, Tim Kroeger wrote:
> Could you please check whether this enables you to reproduce the crash? You
> need to run the program on 8 processors (with ghosted enabled of course). I
> used METHOD=devel, but I guess it will crash in the other modes as well.
Well, it crashes in
On Fri, 13 Mar 2009, Tim Kroeger wrote:
> Could you please check whether this enables you to reproduce the crash? You
> need to run the program on 8 processors (with ghosted enabled of course). I
> used METHOD=devel, but I guess it will crash in the other modes as well.
This is some very im
On Fri, 13 Mar 2009, Tim Kroeger wrote:
Okay, good to know. Well, I will try to track all refinement steps in
order to make the problem more quickly reproducible.
I did this now, and it reproduces the crash.
Let me tell you briefly about the refinement/coarsening steps of my
application: They occ
Dear Roy,
The bad news first: Your patch did not fix my crash. However, it
again led to slightly differing residuals (the fourth version now).
On Thu, 12 Mar 2009, Roy Stogner wrote:
> On Wed, 11 Mar 2009, Tim Kroeger wrote:
>
>> Still, I wonder why it cannot be replicated by writing the grid
On Wed, 11 Mar 2009, Tim Kroeger wrote:
> Still, I wonder why it cannot be replicated by writing the grid to a file and
> re-reading it in. Will that change the distribution of the dofs to the
> processors?
Often, yes. IIRC our partitioning results (at least with
METIS/ParMETIS, maybe not wi
On Thu, 12 Mar 2009, Tim Kroeger wrote:
> By the way, why doesn't System::project_vector() in case of a parallel vector
> use a ghosted (rather than a serial) vector as the temporary?
Because we're still getting ghosted vectors working in the first
place. Make a small change, test the small change, r
On Wed, 11 Mar 2009, Tim Kroeger wrote:
> On Wed, 11 Mar 2009, Roy Stogner wrote:
>
>> But let's try the either-easy-or-futile way first: I'll write a
>> (possibly redundant) patch to make sure we're properly getting
>> constraint dependency dofs into the send list, and you can try running
>> with
On Wed, 11 Mar 2009, John Peterson wrote:
On Wed, Mar 11, 2009 at 12:52 PM, Tim Kroeger
but the first TransientLinearImplicitSystem triggers the crash, when
projecting its _transient_old_local_solution.
variables, both of the same FE-type.
All variables are first order Lagrange. Hence, the
On Wed, Mar 11, 2009 at 12:52 PM, Tim Kroeger
wrote:
> On Wed, 11 Mar 2009, Roy Stogner wrote:
>
>> On Wed, 11 Mar 2009, Tim Kroeger wrote:
>>
>>> It happens in System::project_vector() for the case of a ghosted vector.
>>> This calls DofMap::enforce_constraints_exactly(), which in line 680 of
>>>
On Wed, 11 Mar 2009, Roy Stogner wrote:
> On Wed, 11 Mar 2009, Tim Kroeger wrote:
>
>> It happens in System::project_vector() for the case of a ghosted vector.
>> This calls DofMap::enforce_constraints_exactly(), which in line 680 of
>> dof_map_constraints.C calls NumericVector::operator(). It
On Wed, 11 Mar 2009, Tim Kroeger wrote:
> It happens in System::project_vector() for the case of a ghosted vector.
> This calls DofMap::enforce_constraints_exactly(), which in line 680 of
> dof_map_constraints.C calls NumericVector::operator(). It hits the
> libmesh_assert() near the end of
On Tue, 10 Mar 2009, Roy Stogner wrote:
> On Tue, 10 Mar 2009, Roy Stogner wrote:
>
>> if I can't find the bug right away.
>
> Found it.
Great.
Now, the application crashes at the "original" crash point.
It happens in System::project_vector() for the case of a ghosted
vector. This calls DofMa
On Tue, 10 Mar 2009, Roy Stogner wrote:
> if I can't find the bug right away.
Found it. In UnstructuredMesh::contract() we loop over elements in no
particular order and delete the subactive ones, but in devel or debug
mode we test elem->level() to make sure it's not 0. elem->level() on
a non-l
On Tue, 10 Mar 2009, Tim Kroeger wrote:
> Talking about asserts, what do you think about the attached patch?
A very good idea - I'll commit it now.
> (I'm now using METHOD=devel and try to catch my crash using asserts, and I
> suspect now that it's in the ghosted part of PetscVector somewhere
On Mon, 9 Mar 2009, Roy Stogner wrote:
On Mon, 9 Mar 2009, Tim Kroeger wrote:
But wouldn't we be catching that with a libmesh_assert()? The
preconditions on operator= in petsc_vector.C seem pretty thorough.
But only in debug mode, right?
Right.
Talking about asserts, what do you think
On Mon, Mar 9, 2009 at 10:07 AM, Tim Kroeger
wrote:
> On Mon, 9 Mar 2009, John Peterson wrote:
>
>> In optimized mode you get -DNDEBUG, in debug mode you get -DDEBUG, and
>> in devel-mode you get neither.
>
> I didn't even know that a "devel-mode" exists. How do I invoke that? I
> can't find it,
On Mon, 9 Mar 2009, Tim Kroeger wrote:
> Just to be curious: What is the most confusing code in the library then?
Matter of opinion, but in mine it's the FE base class structure.
The standard Curiously Recurring Template Pattern is confusing enough,
our version of it isn't quite the standard, an
On Mon, 9 Mar 2009, John Peterson wrote:
> In optimized mode you get -DNDEBUG, in debug mode you get -DDEBUG, and
> in devel-mode you get neither.
I didn't even know that a "devel-mode" exists. How do I invoke that?
I can't find it, neither in the output of "./configure --help" nor in
the inst
On Mon, Mar 9, 2009 at 9:41 AM, Tim Kroeger
wrote:
> On Mon, 9 Mar 2009, John Peterson wrote:
>
>> On Mon, Mar 9, 2009 at 9:23 AM, Tim Kroeger
>> wrote:
>>>
>>> (Feature-request: Use LIBMESH_DEBUG rather
>>> than DEBUG.)
>>
>> This would probably be ok... since assert's happen when NDEBUG is
>> *
On Mon, 9 Mar 2009, John Peterson wrote:
> On Mon, Mar 9, 2009 at 9:23 AM, Tim Kroeger
> wrote:
>> (Feature-request: Use LIBMESH_DEBUG rather
>> than DEBUG.)
>
> This would probably be ok... since assert's happen when NDEBUG is
> *not* defined if I remember correctly, and don't care about DEBUG.
On Mon, Mar 9, 2009 at 9:23 AM, Tim Kroeger
wrote:
> (Feature-request: Use LIBMESH_DEBUG rather
> than DEBUG.)
This would probably be ok... since assert's happen when NDEBUG is
*not* defined if I remember correctly, and don't care about DEBUG.
Can anyone else comment?
--
John
-
Dear Roy,
On Fri, 6 Mar 2009, Roy Stogner wrote:
>> It was the "solution" vector, not "current_local_solution". I don't know
>> why it crashed there, but that was reproducible. Perhaps I did something
>> else wrong. (In earlier times, when I didn't quite understand the
>> difference between
On Fri, 6 Mar 2009, Tim Kroeger wrote:
> On Thu, 5 Mar 2009, Roy Stogner wrote:
>
>> This is surprising. What were you calling scale() on? As of now the
>> only ghosted vectors are *supposed* to be current_local_solution and kin.
>
> It was the "solution" vector, not "current_local_solution".
Dear Roy,
On Thu, 5 Mar 2009, Roy Stogner wrote:
> On Thu, 5 Mar 2009, Tim Kroeger wrote:
>
>> Well, actually the crash was due to the missing a.close() between a=b
>> and a.scale(). After adding that, but before my latest patch, it didn't
>> crash any more, but it produced totally wrong result
On Thu, 5 Mar 2009, Tim Kroeger wrote:
Well, actually the crash was due to the missing a.close() between a=b
and a.scale(). After adding that, but before my latest patch, it didn't
crash any more, but it produced totally wrong results. That led me to the
idea that PetscVector::scale() (and
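[The missing-close() crash Tim describes follows from a contract that a toy model can make explicit. This ToyVector is an assumption-laden sketch, not libMesh's NumericVector: assignment leaves the vector "open" (standing in for pending parallel communication in the real PetscVector), and scale() asserts the vector was close()d first — the call order a=b; a.close(); a.scale() is what fixed the crash:]

```cpp
#include <cassert>
#include <vector>

// Toy model of the close() contract.
struct ToyVector
{
  std::vector<double> data;
  bool closed = true;

  // Assignment leaves the vector "open" until close() flushes it,
  // mimicking deferred communication in a parallel vector.
  ToyVector &operator=(const std::vector<double> &rhs)
  {
    data = rhs;
    closed = false;
    return *this;
  }

  void close() { closed = true; } // flush pending updates

  void scale(double s)
  {
    assert(closed); // scaling an un-closed vector is the reported bug
    for (double &x : data)
      x *= s;
  }
};
```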
Dear Roy,
On Mon, 2 Mar 2009, Roy Stogner wrote:
> On Mon, 2 Mar 2009, Tim Kroeger wrote:
>
>> Update: It doesn't crash any more. With the patch that I sent you in the
>> previous mail, it seems to work.
>
> That's interesting. Did you ever track down the source of the
> problem? The patch yo
On Mon, 2 Mar 2009, Tim Kroeger wrote:
Update: It doesn't crash any more. With the patch that I sent you in the
previous mail, it seems to work.
That's interesting. Did you ever track down the source of the
problem? The patch you provided looks like it could have fixed some
inaccuracy bugs
Dear Roy,
On Wed, 25 Feb 2009, Tim Kroeger wrote:
> On Wed, 25 Feb 2009, Tim Kroeger wrote:
>
>> I'm currently trying to test the ghosted vectors on my application,
>> but the queue on the cluster is full, so I have to wait.
>
> Update: My application crashes with the ghosted vectors, but I'm not
Dear Roy,
On Wed, 25 Feb 2009, Tim Kroeger wrote:
On Wed, 25 Feb 2009, Tim Kroeger wrote:
Besides, this can't be the right fix, can it? Why should any parallel
communication be required to copy one consistent ghosted vector to
another? It seems as if VecCopy is simply failing to copy the wh
Dear Roy,
On Wed, 25 Feb 2009, Tim Kroeger wrote:
> I'm currently trying to test the ghosted vectors on my application,
> but the queue on the cluster is full, so I have to wait.
Update: My application crashes with the ghosted vectors, but I'm not
yet sure whether it's a bug in the ghosted code
On Wed 2009-02-25 11:18, Tim Kroeger wrote:
> On Tue, 24 Feb 2009, Roy Stogner wrote:
> > Besides, this can't be the right fix, can it? Why should any parallel
> > communication be required to copy one consistent ghosted vector to
> > another? It seems as if VecCopy is simply failing to copy the
On Tue, 24 Feb 2009, Roy Stogner wrote:
> On Wed, 18 Feb 2009, Roy Stogner wrote:
>
>> I'm swamped right now. The next step is to reduce ex10 to a simpler
>> case (maybe solving du/dt = 0 in nD with some non-zero ICs?) where I
>> can step through the whole projection in gdb, but it's likely to be
On Tue, 24 Feb 2009, Roy Stogner wrote:
> I think I'll make --enable-ghosted-vectors a configure option to
> make the code easier to play with while we're still not confident
> about it.
Quick update:
It's ./configure --enable-ghosted
It's not turned on by default or by --enable-everything
---
On Wed, 18 Feb 2009, Roy Stogner wrote:
> I'm swamped right now. The next step is to reduce ex10 to a simpler
> case (maybe solving du/dt = 0 in nD with some non-zero ICs?) where I
> can step through the whole projection in gdb, but it's likely to be
> weeks before I can get to that.
I apparent
On Wed, 18 Feb 2009, Tim Kroeger wrote:
> What's the current state about the ghosted vectors?
They appear to be working for stationary problems, but there's
definitely a bug which corrupts transient results - possibly in the
System::project_vector code, possibly in the
enforce_constraints_exactl
Dear Roy,
What's the current state about the ghosted vectors? Are you waiting
for some input from me, or did you just not have time in the
meanwhile?
Best Regards,
Tim
--
Dr. Tim Kroeger
tim.kroe...@mevis.fraunhofer.de    Phone +49-421-218-7710
tim.kroe...@cevis.uni-bremen.de
On Fri, 6 Feb 2009, Roy Stogner wrote:
> This particular fix worked for the problem at hand, but ex14 still
> isn't working past the first solve - I suspect it may be a flaw in the
> new projection case; I'll try to track it down this weekend.
ex14 is fixed now; I hadn't completely fixed the bug