Your point on the initialization is perfectly clear, thanks.

Could libgraph_tool_topology.get_dists suffer from the same memory
re-allocation problems? If not, what could explain the execution-time
increase (approx. 23x) observed for libgraph_tool_topology.get_dists?
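
For reference, here is roughly how I time the call (a minimal sketch;
the graph file, the source vertex, and the max_dist value are
placeholders, not my actual setup):

    import time
    from graph_tool.all import load_graph, shortest_distance

    g = load_graph("graph.gt")  # placeholder file name

    start = time.time()
    dist = shortest_distance(g, source=g.vertex(0), max_dist=3)
    print("shortest_distance took %.1f ms" % ((time.time() - start) * 1e3))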

I'll try disabling OpenMP.

Best,
François.

2015-03-10 16:33 GMT+01:00 Tiago Peixoto:

> On 10.03.2015 15:39, François wrote:
>
> > Hello,
> >
> > Firstly, thanks for pushing this update out so quickly; that's
> > awesome. I tested your fix, and it seems that the vector
> > initialization was not the only thing to blame: with the fix, the
> > execution time is still around 300ms.
> >
> > I checked that your fix is used, and timed the execution of the
> > shortest_distance function up to the call to
> > libgraph_tool_topology.get_dists.
> >
> > In the "sub-sample" graph:
> >
> >   * the initialization takes on average 3ms;
> >   * the call to libgraph_tool_topology.get_dists takes on average
> >     11ms.
> >
> > In the full graph:
> >
> >   * the initialization takes on average 100ms;
> >   * the call to libgraph_tool_topology.get_dists takes on average
> >     230ms.
>
> Do you have OpenMP enabled? If yes, it is best to disable it with
> openmp_set_num_threads(1) before running the code.
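>
> For instance (a minimal sketch; the graph g and the timed call below
> stand in for the actual benchmark):
>
>     from graph_tool.all import *
>
>     # Pin graph-tool to a single thread before timing, so that
>     # OpenMP scheduling overhead does not distort the measurements.
>     openmp_set_num_threads(1)
>
>     dist = shortest_distance(g, source=g.vertex(0))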
>
> Note, however, that you will always get slower code for a larger
> graph... A 10x slow-down in each function is expected if your graph is
> 10x larger.
>
> Note also that the distance initialization is done inside get_dists().
>
> > Surprisingly (or not?), when I change the pmap initialization [link
> > to source:
> > https://git.skewed.de/count0/graph-tool/blob/master/src/graph_tool/topology/__init__.py#L1248]
> > to:
> >
> >     vals = numpy.arange(g.num_vertices(), dtype='int')
> >     pmap = g.new_vertex_property("int64_t", vals=vals)
> >
> > The numpy call takes 36ms and the creation of the new vertex
> > property takes 103ms.
>
> These are very tight loops, so things are very sensitive to cache
> effects, memory re-allocation issues, and so on.  For instance, here I
> get:
>
>     In [8]: g = lattice([1000, 10000])
>
>     In [9]: g
>     Out[9]: <Graph object, undirected, with 10000000 vertices and 19989000
> edges at 0x7fad583720b8>
>
>     In [10]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
>     CPU times: user 16.7 ms, sys: 0 ns, total: 16.7 ms
>     Wall time: 19.6 ms
>
>     In [11]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
>     CPU times: user 13.3 ms, sys: 960 ms, total: 973 ms
>     Wall time: 976 ms
>
>     In [12]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
>     CPU times: user 20 ms, sys: 13.3 ms, total: 33.3 ms
>     Wall time: 32.6 ms
>
>     In [13]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
>     CPU times: user 3.33 ms, sys: 950 ms, total: 953 ms
>     Wall time: 954 ms
>
>     In [14]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
>     CPU times: user 20 ms, sys: 10 ms, total: 30 ms
>     Wall time: 30 ms
>
>     In [15]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
>     CPU times: user 20 ms, sys: 43.3 ms, total: 63.3 ms
>     Wall time: 62.1 ms
>
>     In [16]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
>     CPU times: user 23.3 ms, sys: 6.67 ms, total: 30 ms
>     Wall time: 28.9 ms
>
>     In [17]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
>     CPU times: user 13.3 ms, sys: 173 ms, total: 187 ms
>     Wall time: 186 ms
>
>     In [18]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
>     CPU times: user 20 ms, sys: 13.3 ms, total: 33.3 ms
>     Wall time: 30.7 ms
>
>     In [19]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
>     CPU times: user 6.67 ms, sys: 853 ms, total: 860 ms
>     Wall time: 858 ms
>
> Notice how the timing fluctuates wildly from ~30ms to ~900ms!
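>
> One way to factor these fluctuations out is to repeat the measurement
> and keep the fastest run (a minimal sketch using the standard timeit
> module; numpy and g are the objects from the session above):
>
>     import timeit
>
>     # Run the allocation 10 times and keep the fastest; the minimum
>     # is the least affected by allocator and cache noise.
>     best = min(timeit.repeat(
>         lambda: numpy.arange(g.num_vertices(), dtype='int'),
>         repeat=10, number=1))
>     print("best of 10: %.1f ms" % (best * 1e3))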
>
> With the property map creation, things are similar:
>
>     In [20]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
>     CPU times: user 50 ms, sys: 0 ns, total: 50 ms
>     Wall time: 48.2 ms
>
>     In [21]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
>     CPU times: user 23.3 ms, sys: 20 ms, total: 43.3 ms
>     Wall time: 43.5 ms
>
>     In [22]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
>     CPU times: user 40 ms, sys: 3.33 ms, total: 43.3 ms
>     Wall time: 42.3 ms
>
>     In [23]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
>     CPU times: user 30 ms, sys: 490 ms, total: 520 ms
>     Wall time: 519 ms
>
>     In [24]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
>     CPU times: user 50 ms, sys: 6.67 ms, total: 56.7 ms
>     Wall time: 56.6 ms
>
>     In [25]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
>     CPU times: user 26.7 ms, sys: 1.12 s, total: 1.15 s
>     Wall time: 1.14 s
>
>     In [26]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
>     CPU times: user 43.3 ms, sys: 13.3 ms, total: 56.7 ms
>     Wall time: 54.6 ms
>
>     In [27]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
>     CPU times: user 20 ms, sys: 1.11 s, total: 1.13 s
>     Wall time: 1.12 s
>
> These jumps in time are probably just memory re-allocations.
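>
> If the allocation cost matters across repeated calls, one option is to
> allocate the distance map once and reuse it (a sketch, assuming the
> dist_map argument of shortest_distance and the int64_t value type used
> above):
>
>     # Create the property map once, outside the loop...
>     dist_map = g.new_vertex_property("int64_t")
>
>     # ...and let every call reuse its storage instead of allocating a
>     # fresh map on each invocation.
>     for v in (g.vertex(0), g.vertex(1)):
>         shortest_distance(g, source=v, dist_map=dist_map)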
>
> The actual initialization itself is basically the same when using a
> numpy array or a property map:
>
>     In [33]: %time pmap.a[:] = 10
>     CPU times: user 30 ms, sys: 0 ns, total: 30 ms
>     Wall time: 20.7 ms
>
>     In [34]: %time pmap.a[:] = 10
>     CPU times: user 26.7 ms, sys: 0 ns, total: 26.7 ms
>     Wall time: 18.9 ms
>
>     In [35]: %time pmap.a[:] = 10
>     CPU times: user 20 ms, sys: 0 ns, total: 20 ms
>     Wall time: 20.1 ms
>
>     In [36]: %time vals[:] = 10
>     CPU times: user 16.7 ms, sys: 3.33 ms, total: 20 ms
>     Wall time: 18.3 ms
>
>     In [37]: %time vals[:] = 10
>     CPU times: user 26.7 ms, sys: 0 ns, total: 26.7 ms
>     Wall time: 21.7 ms
>
>     In [38]: %time vals[:] = 10
>     CPU times: user 16.7 ms, sys: 0 ns, total: 16.7 ms
>     Wall time: 15 ms
>
> Therefore, I'm not really sure there is any issue with the
> initialization.
>
> Best,
> Tiago
>
> --
> Tiago de Paula Peixoto



-- 
François Kawala



