Re: [Gmsh] "Re: GMSH parsing is very slow"

Christophe Geuzaine Thu, 31 Jan 2019 06:39:37 -0800


> On 31 Jan 2019, at 15:19, Al Sc <[email protected]> wrote:
> 
> Dear Christophe,
> 
> I tried out your gmsh-file with "Point{100+1:100+N} In Volume{1};" and it's 
> indeed much faster!
> That indeed seems to be the solution I needed -- thanks a lot!
> 
> As I don't really know much about the internal functionality of gmsh, it 
> would have been almost impossible for me to come up with that solution on my 
> own.
> From your documentation it is clear that the input to gmsh is more than just 
> a geometry specification, and it's rather kind of a script which generates 
> the geometry.
> However, given only the documentation, a naive user like me will conclude 
> that you do not *need* any advanced scripting functionality to generate a 
> basic model. E.g. that you can restrict yourself to "just using the 
> geometry-specification subset of the language" without any performance 
> penalty. Hence, a naive user concludes that:
> " Point{100+1:100+N} In Volume{1};"
> and
> " Point{100} In Volume{1}; ...
>  ... Point{10100} In Volume{1};"
> are, in fact, equivalent (not only in semantics, but also in "parser 
> performance").


They are equivalent. The difference is *when* you call them.

> 
> I'd suggest that you try to implement the model parsing in a way that the 
> "topology structure" is not immediately re-computed, but only invalidated.
> Because, in my opinion, the observed performance-drop of gmsh with repeated 
> "Point In Volume" statements is still highly counter-intuitive, even to users 
> more advanced than me.
> 

Indeed. This is all made clear when you use the API, where the distinction 
between CAD operations and topological model operations is made explicit. 

For example, if using the Python API you try to do

for i in range(N):
  gmsh.model.occ.addPoint(...)
  gmsh.model.mesh.embed(...)

you will get an error, stating that the point does not exist in the model - 
since no call to gmsh.model.occ.synchronize() has been made after the point was 
added to the CAD. To make it work you would then do

for i in range(N):
  gmsh.model.occ.addPoint(...)
  gmsh.model.occ.synchronize(...)
  gmsh.model.mesh.embed(...)

When N is large you would rapidly conclude that calling "synchronize" that many 
times will kill your performance (the API doc makes this explicit).

The .geo parser strives to be automatic and backward compatible with very old 
versions of Gmsh. So each "Point ... In ..." command thus checks if the CAD has 
been modified - and if it has, triggers a synchronization automatically. We 
could add a note in the documentation about this, by stating explicitly which 
operations will trigger a sync. I'm adding this to our TODO list.

Christophe




> Best regards,
> A.S.
> 
> Am Do., 31. Jan. 2019 um 14:53 Uhr schrieb Christophe Geuzaine 
> <[email protected]>:
> 
> This was precisely the point of my example: if you embed the point after each 
> point is created, you force a rebuild of the topology of the model. So the 
> efficient script would be
> 
> SetFactory("OpenCASCADE");
> Box(1) = {0,0,0, 1,1,1};
> N=10000;
> For i In {1:N}
>   Point(100+i) = {0.25 + 5e-5*i, 0.1,0.1};
> EndFor
> Point{100+1:100+N} In Volume{1};
> 
> Christophe
> 
> 
> > On 31 Jan 2019, at 14:48, Al Sc <[email protected]> wrote:
> > 
> > Dear Christophe,
> > 
> > my example files are scientific data, however originate from processing 
> > certain proprietary 3D models that were shared with me under certain 
> > restrictions. Therefore, it's difficult to share my original file with you.
> > However, I was indeed able to reproduce the issue using only a slight 
> > modification of your example file!
> > 
> > Compare  the following two files:
> > test1.geo:
> > %%% %%% %%% %%% %%% %%% %%% %%% %%% %%%
> > SetFactory("OpenCASCADE");
> > Box(1) = {0,0,0, 1,1,1};
> > For i In {1:1000}
> >   Point(100+i) = {0.25 + 1e-4*i, 0.1,0.1};
> >   Point{100+i} In Volume{1};
> > EndFor
> > %%% %%% %%% %%% %%% %%% %%% %%% %%% %%% 
> > 
> > test2.geo:
> > %%% %%% %%% %%% %%% %%% %%% %%% %%% %%% 
> > SetFactory("OpenCASCADE");
> > Box(1) = {0,0,0, 1,1,1};
> > For i In {1:10000}
> >   Point(100+i) = {0.25 + 1e-5*i, 0.1,0.1};
> >   Point{100+i} In Volume{1};
> > EndFor
> > %%% %%% %%% %%% %%% %%% %%% %%% %%% %%% 
> > 
> > The first file contains 1000 internal pts, and the second one contains 10 
> > times as much.
> > Now run:
> >   $ time gmsh -1 test1.geo
> >     -> Takes 0.88 sec
> >   $ time gmsh -1 test2.geo 
> >     -> Takes 75 sec 
> > This is a nearly 100 (!!!) times increase in parsing run time, when the 
> > model size is increased by only a factor of 10.
> > This nicely illustrates what I meant by "quadratic growth of the parsing 
> > run time as a function of the number of internal pts."
> > As a consequence of this "quadratic growth", the run time for larger models 
> > quickly becomes enormous.
> > 
> > Best regards
> > 
> > A.S.
> > 
> > Am Do., 31. Jan. 2019 um 14:31 Uhr schrieb Christophe Geuzaine 
> > <[email protected]>:
> > 
> > Can you send a test file?
> > 
> > I tried this, i.e. adding 1000 embedded points in a volume, and it is quite 
> > fast (< 2 seconds):
> > 
> > SetFactory("OpenCASCADE");
> > Box(1) = {0,0,0, 1,1,1};
> > For i In {1:1000}
> >   Point(100+i) = {0.25 + 5e-4*i, 0.1,0.1};
> >   Point{100+i} In Volume{1};
> > EndFor
> > 
> > Maybe you modify the model after each insertion of an embedded point, which 
> > would force a rebuild of the topological model?
> > 
> > Christophe
> > 
> > 
> > > On 31 Jan 2019, at 13:04, Al Sc <[email protected]> wrote:
> > > 
> > > Dear Sir or Madam,
> > > 
> > > some more details on the previously-reported case of extremely slow 
> > > parsing of a gmsh file:
> > > The file contains 34795 points -- of which 23041 points are "Point In 
> > > Volume" (e.g. embedded_vertices of a GRegion?). Moreover, it contains 
> > > 4998 "Plane Surface"s and one single "Volume". Almost all plane surfaces 
> > > are triangles and quads -- except for <10 facets, which each have a very 
> > > high vertex count (typ. 7000).
> > > 
> > > I found out that if I remove all "Point In Volume" objects, then the 
> > > parsing takes only 20 seconds. However, if all the "Point In Volume" 
> > > objects are not removed, parsing takes around 3000 seconds. This led me 
> > > to the hypothesis that there is some huge inefficiency with parsing gmsh 
> > > files with "Point In Volume" (probably quadratic run time in 
> > > N(PointInVolume)??).
> > > 
> > > Furthermore, I analyzed the runs using "perf" (linux utility).
> > > Run 1 (about 3000 seconds):
> > >   $ time perf record ./gmsh-3.0.6-Linux64/bin/gmsh -1 input_orig.geo
> > > Run 2 (about 20 seconds):
> > >   $ time perf record ./gmsh-3.0.6-Linux64/bin/gmsh -1 
> > > input_no_point_in_volume.geo
> > > In Run 1, "perf report" yields the following top-scoring functions:
> > >   11.57%  gmsh     gmsh                 [.] GModel::getEdgeByTag
> > >    7.73%  gmsh     gmsh                 [.] GEO_Internals::synchronize
> > >    6.80%  gmsh     libc-2.12.so         [.] _int_free
> > >    6.31%  gmsh     gmsh                 [.] GModel::getVertexByTag
> > >    4.91%  gmsh     libc-2.12.so         [.] malloc
> > >    4.53%  gmsh     gmsh                 [.] InterpolateCurve
> > >    4.40%  gmsh     libstdc++.so.6.0.22  [.] std::_Rb_tree_increment
> > >    3.88%  gmsh     libc-2.12.so         [.] memcpy
> > >    3.69%  gmsh     gmsh                 [.] List_Nbr
> > >    3.35%  gmsh     libc-2.12.so         [.] _int_malloc
> > >    3.24%  gmsh     gmsh                 [.] GEntity::GEntity
> > >    3.13%  gmsh     gmsh                 [.] avl_lookup
> > >    2.63%  gmsh     gmsh                 [.] gmshFace::resetNativePtr
> > >    2.53%  gmsh     gmsh                 [.] GEdge::addFace
> > >    2.52%  gmsh     gmsh                 [.] CompareVertex
> > >    1.97%  gmsh     gmsh                 [.] std::_List_base<GEdge*, 
> > > std::allocator<GEdge*>
> > >    1.88%  gmsh     gmsh                 [.] GModel::getFaceByTag
> > >    1.76%  gmsh     gmsh                 [.] Tree_Action
> > >    1.74%  gmsh     gmsh                 [.] gmshEdge::degenerate
> > >    1.63%  gmsh     gmsh                 [.] CompareCurve
> > >    1.53%  gmsh     gmsh                 [.] std::_List_base<int, 
> > > std::allocator<int> >::_M
> > >    1.38%  gmsh     gmsh                 [.] GModel::deletePhysicalGroups
> > >    1.23%  gmsh     gmsh                 [.] std::_List_base<GEdgeSigned, 
> > > std::allocator<GE
> > >    1.10%  gmsh     gmsh                 [.] GVertex::addEdge
> > >    0.85%  gmsh     gmsh                 [.] List_Read
> > >    0.75%  gmsh     gmsh                 [.] GEdge::getBeginVertex
> > >    0.73%  gmsh     gmsh                 [.] GFace::computeMeanPlane
> > >    0.73%  gmsh     libc-2.12.so         [.] free
> > >    0.67%  gmsh     gmsh                 [.] CTX::instance
> > >    0.65%  gmsh     gmsh                 [.] gmshEdge::resetMeshAttributes
> > > 
> > > This suggests that maybe GModel::getEdgeByTag is eventually called a 
> > > quadratic number of times in the number of PointInVolume-objects -- and 
> > > this causes the drastic slow-down.
> > > 
> > > Could you please investigate this? Thanks a lot!
> > > (As already mentioned, this issue also occurs with gmsh-4.x.x)
> > > 
> > > Best regards
> > > A. S.
> > > _______________________________________________
> > > gmsh mailing list
> > > [email protected]
> > > http://onelab.info/mailman/listinfo/gmsh
> > 
> > — 
> > Prof. Christophe Geuzaine
> > University of Liege, Electrical Engineering and Computer Science 
> > http://www.montefiore.ulg.ac.be/~geuzaine
> > 
> > 
> > 
> 
> — 
> Prof. Christophe Geuzaine
> University of Liege, Electrical Engineering and Computer Science 
> http://www.montefiore.ulg.ac.be/~geuzaine
> 
> 
> 

— 
Prof. Christophe Geuzaine
University of Liege, Electrical Engineering and Computer Science 
http://www.montefiore.ulg.ac.be/~geuzaine




_______________________________________________
gmsh mailing list
[email protected]
http://onelab.info/mailman/listinfo/gmsh

Re: [Gmsh] "Re: GMSH parsing is very slow"

Reply via email to