Dear Christoph, to be honest, a remark deeply hidden somewhere in the documentation wouldn't have helped me. That's because my approach typically is not to read the documentation like a book (i.e. from front to back) but rather to skip forwards to the paragraphs I currently need for the implementation. I believe most users work in a similar way. My personal suggestion would be that you "detect" in your backwards-compatible gmsh parser the event where -something-like- more than 1000 CAD-to-topology synchronizations (or whatever you call it) are implicitly done. And if this happens, just _FAIL_. This slightly breaks your compatibility, but only in corner cases where performance is too bad for practical purposes anyway. A user may want to override this behavior using a special command line option like "-cad-sync-limit infinity" - in case he thinks otherwise.
Best regards A. S. Am Do., 31. Jan. 2019 um 15:39 Uhr schrieb Christophe Geuzaine < [email protected]>: > > > > On 31 Jan 2019, at 15:19, Al Sc <[email protected]> wrote: > > > > Dear Christophe, > > > > I tried out your gmsh-file with "Point{100+1:100+N} In Volume{1};" and > it's indeed much faster! > > That indeed seems to be the solution I needed -- thanks a lot! > > > > As I don't really know much about the internal functionality of gmsh, it > would have been almost impossible for me to come up with that solution on > my own. > > From your documentation it is clear that the input to gmsh is more than > just a geometry specification, and it's rather kind of a script which > generates the geometry. > > However, given only the documentation, a naive user like me will > conclude that you do not *need* any advanced scripting functionality to > generate a basic model. E.g. that you can restrict yourself to "just using > the geometry-specification subset of the language" without any performance > penalty. Hence, a naive user concludes that: > > " Point{100+1:100+N} In Volume{1};" > > and > > " Point{100} In Volume{1}; ... > > ... Point{10100} In Volume{1};" > > are, in fact, equivalent (not only in semantics, but also in "parser > performance"). > > They are equivalent. The difference is *when* you call them. > > > > > I'd suggest that you try to implement the model parsing in a way that > the "topology structure" is not immediately re-computed, but only > invalidated. > > Because, in my opinion, the observed performance-drop of gmsh with > repeated "Point In Volume" statements is still highly counter-intuitive, > even to users more advanced than me. > > > > Indeed. This is all made clear when you use the API, where the distinction > between CAD operations and topological model operations is made explicit. > > For example, if using the Python API you try to do > > for i in range(N): > gmsh.model.occ.addPoint(...) > gmsh.model.mesh.embed(...) > > you will get an error, stating that the point does not exist in the model > - since no call to gmsh.model.occ.synchronize() has been made after the > point was added to the CAD. To make it work you would then do > > for i in range(N): > gmsh.model.occ.addPoint(...) > gmsh.model.occ.synchronize(...) > gmsh.model.mesh.embed(...) > > When N is large you would rapidly conclude that calling "synchronize" that > many times will kill your performance (the API doc makes this explicit). > > The .geo parser strives to be automatic and backward compatible with very > old versions of Gmsh. So each "Point ... In ..." command thus checks if the > CAD has been modified - and if it has, triggers a synchronization > automatically. We could add a note in the documentation about this, by > stating explicitly which operations will trigger a sync. I'm adding this to > our TODO list. > > Christophe > > > > > > Best regards, > > A.S. > > > > Am Do., 31. Jan. 2019 um 14:53 Uhr schrieb Christophe Geuzaine < > [email protected]>: > > > > This was precisely the point of my example: if you embed the point after > each point is created, you force a rebuild of the topology of the model. So > the efficient script would be > > > > SetFactory("OpenCASCADE"); > > Box(1) = {0,0,0, 1,1,1}; > > N=10000; > > For i In {1:N} > > Point(100+i) = {0.25 + 5e-5*i, 0.1,0.1}; > > EndFor > > Point{100+1:100+N} In Volume{1}; > > > > Christophe > > > > > > > On 31 Jan 2019, at 14:48, Al Sc <[email protected]> wrote: > > > > > > Dear Christophe, > > > > > > my example files are scientific data, however originate from > processing certain proprietary 3D models that were shared with me under > certain restrictions. Therefore, it's difficult to share my original file > with you. > > > However, I was indeed able to reproduce the issue using only a slight > modification of your example file! > > > > > > Compare the following two files: > > > test1.geo: > > > %%% %%% %%% %%% %%% %%% %%% %%% %%% %%% > > > SetFactory("OpenCASCADE"); > > > Box(1) = {0,0,0, 1,1,1}; > > > For i In {1:1000} > > > Point(100+i) = {0.25 + 1e-4*i, 0.1,0.1}; > > > Point{100+i} In Volume{1}; > > > EndFor > > > %%% %%% %%% %%% %%% %%% %%% %%% %%% %%% > > > > > > test2.geo: > > > %%% %%% %%% %%% %%% %%% %%% %%% %%% %%% > > > SetFactory("OpenCASCADE"); > > > Box(1) = {0,0,0, 1,1,1}; > > > For i In {1:10000} > > > Point(100+i) = {0.25 + 1e-5*i, 0.1,0.1}; > > > Point{100+i} In Volume{1}; > > > EndFor > > > %%% %%% %%% %%% %%% %%% %%% %%% %%% %%% > > > > > > The first file contains 1000 internal pts, and the second one contains > 10 times as much. > > > Now run: > > > $ time gmsh -1 test1.geo > > > -> Takes 0.88 sec > > > $ time gmsh -1 test2.geo > > > -> Takes 75 sec > > > This is a nearly 100 (!!!) times increase in parsing run time, when > the model size is increased by only a factor of 10. > > > This nicely illustrates what I meant by "quadratic growth of the > parsing run time as a function of the number of internal pts." > > > As a consequence of this "quadratic growth", the run time for larger > models quickly becomes enormous. > > > > > > Best regards > > > > > > A.S. > > > > > > Am Do., 31. Jan. 2019 um 14:31 Uhr schrieb Christophe Geuzaine < > [email protected]>: > > > > > > Can you send a test file? > > > > > > I tried this, i.e. adding 1000 embedded points in a volume, and it is > quite fast (< 2 seconds): > > > > > > SetFactory("OpenCASCADE"); > > > Box(1) = {0,0,0, 1,1,1}; > > > For i In {1:1000} > > > Point(100+i) = {0.25 + 5e-4*i, 0.1,0.1}; > > > Point{100+i} In Volume{1}; > > > EndFor > > > > > > Maybe you modify the model after each insertion of an embedded point, > which would force a rebuild of the topological model? > > > > > > Christophe > > > > > > > > > > On 31 Jan 2019, at 13:04, Al Sc <[email protected]> wrote: > > > > > > > > Dear Sir or Madam, > > > > > > > > some more details on the previously-reported case of extremely slow > parsing of a gmsh file: > > > > The file contains 34795 points -- of which 23041 points are "Point > In Volume" (e.g. embedded_vertices of a GRegion?). Moreover, it contains > 4998 "Plane Surface"s and one single "Volume". Almost all plane surfaces > are triangles and quads -- except for <10 facets, which each have a very > high vertex count (typ. 7000). > > > > > > > > I found out that if I remove all "Point In Volume" objects, then the > parsing takes only 20 seconds. However, if all the "Point In Volume" > objects are not removed, parsing takes around 3000 seconds. This led me to > the hypothesis that there is some huge inefficiency with parsing gmsh files > with "Point In Volume" (probably quadratic run time in N(PointInVolume)??). > > > > > > > > Furthermore, I analyzed the runs using "perf" (linux utility). > > > > Run 1 (about 3000 seconds): > > > > $ time perf record ./gmsh-3.0.6-Linux64/bin/gmsh -1 input_orig.geo > > > > Run 2 (about 20 seconds): > > > > $ time perf record ./gmsh-3.0.6-Linux64/bin/gmsh -1 > input_no_point_in_volume.geo > > > > In Run 1, "perf report" yields the following top-scoring functions: > > > > 11.57% gmsh gmsh [.] GModel::getEdgeByTag > > > > 7.73% gmsh gmsh [.] > GEO_Internals::synchronize > > > > 6.80% gmsh libc-2.12.so [.] _int_free > > > > 6.31% gmsh gmsh [.] GModel::getVertexByTag > > > > 4.91% gmsh libc-2.12.so [.] malloc > > > > 4.53% gmsh gmsh [.] InterpolateCurve > > > > 4.40% gmsh libstdc++.so.6.0.22 [.] std::_Rb_tree_increment > > > > 3.88% gmsh libc-2.12.so [.] memcpy > > > > 3.69% gmsh gmsh [.] List_Nbr > > > > 3.35% gmsh libc-2.12.so [.] _int_malloc > > > > 3.24% gmsh gmsh [.] GEntity::GEntity > > > > 3.13% gmsh gmsh [.] avl_lookup > > > > 2.63% gmsh gmsh [.] gmshFace::resetNativePtr > > > > 2.53% gmsh gmsh [.] GEdge::addFace > > > > 2.52% gmsh gmsh [.] CompareVertex > > > > 1.97% gmsh gmsh [.] std::_List_base<GEdge*, > std::allocator<GEdge*> > > > > 1.88% gmsh gmsh [.] GModel::getFaceByTag > > > > 1.76% gmsh gmsh [.] Tree_Action > > > > 1.74% gmsh gmsh [.] gmshEdge::degenerate > > > > 1.63% gmsh gmsh [.] CompareCurve > > > > 1.53% gmsh gmsh [.] std::_List_base<int, > std::allocator<int> >::_M > > > > 1.38% gmsh gmsh [.] > GModel::deletePhysicalGroups > > > > 1.23% gmsh gmsh [.] > std::_List_base<GEdgeSigned, std::allocator<GE > > > > 1.10% gmsh gmsh [.] GVertex::addEdge > > > > 0.85% gmsh gmsh [.] List_Read > > > > 0.75% gmsh gmsh [.] GEdge::getBeginVertex > > > > 0.73% gmsh gmsh [.] GFace::computeMeanPlane > > > > 0.73% gmsh libc-2.12.so [.] free > > > > 0.67% gmsh gmsh [.] CTX::instance > > > > 0.65% gmsh gmsh [.] > gmshEdge::resetMeshAttributes > > > > > > > > This suggests that maybe GModel::getEdgeByTag is eventually called a > quadratic number of times in the number of PointInVolume-objects -- and > this causes the drastic slow-down. > > > > > > > > Could you please investigate this? Thanks a lot! > > > > (As already mentioned, this issue also occurs with gmsh-4.x.x) > > > > > > > > Best regards > > > > A. S. > > > > _______________________________________________ > > > > gmsh mailing list > > > > [email protected] > > > > http://onelab.info/mailman/listinfo/gmsh > > > > > > — > > > Prof. Christophe Geuzaine > > > University of Liege, Electrical Engineering and Computer Science > > > http://www.montefiore.ulg.ac.be/~geuzaine > > > > > > > > > > > > > — > > Prof. Christophe Geuzaine > > University of Liege, Electrical Engineering and Computer Science > > http://www.montefiore.ulg.ac.be/~geuzaine > > > > > > > > — > Prof. Christophe Geuzaine > University of Liege, Electrical Engineering and Computer Science > http://www.montefiore.ulg.ac.be/~geuzaine > > > >
_______________________________________________ gmsh mailing list [email protected] http://onelab.info/mailman/listinfo/gmsh
