Re: [graph-tool] Slow filling graph

2018-11-29 Thread elastica
> Note that you can still remove the last loop altogether by accessing the
property maps as arrays: (tree.a * weight.a).sum() 

This innocuous remark leads in fact to a huge speedup: you gain a factor 13,
a lot of surprise! Now, Graph Tool has the best performance among Networkit,
Igraph, Scipy and NetworkX of course. However, to put in perspective, Graph
Tool is only 2 times faster than pure Python code where Kruskal is
implemented with a basic Union-Find, we are far from genuine C/C++
performance, because usually graph implementations are 20 times faster in
C/C++ than pure Python. 

Below, the complete benchmark. Note that Scipy code has the potential to
become the faster, the problem with scipy graphs is the need to convert
adjacency lists (or edges list) to Compressed Sparse Row and you have to
write it in Python, this is VERY slow.


# ***

from time import clock
from sys import stderr, stdin

import networkit as nk
from networkit.graph import RandomMaximumSpanningForest


from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.sparse import csr_matrix


from igraph import Graph

import graph_tool as gt
from graph_tool.topology import min_spanning_tree  
import numpy as np


#  Pure Python Code ---

def find(s, parents):
ups=[s]
while True:
up=parents[s]
if up ==None:
break
s=up
ups.append(s)
for v in ups[:-1]:
parents[v]=s
return s

def merge(s, t, parents, heights):
rep_s=find(s, parents)
rep_t=find(t, parents)

if rep_s==rep_t:
return False
height_s=heights[rep_s]
height_t=heights[rep_t]
if height_s <  height_t:
parents[rep_s]=rep_t
else:
parents[rep_t]=rep_s
if height_s==height_t:
heights[rep_s]+=1
return True  

def kruskal_python(edges, n):
edges.sort(key=lambda t:t[2])
parents=[None]*n
heights=[0]*n
k=0
cost=0
cnt=0
while cnthttp://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/
___
graph-tool mailing list
graph-tool@skewed.de
https://lists.skewed.de/mailman/listinfo/graph-tool


Re: [graph-tool] Slow filling graph

2018-11-29 Thread Tiago de Paula Peixoto
Am 29.11.18 um 16:51 schrieb elastica:
>
> def kruskal_gt(wedges,n):
>     g= gt.Graph(directed=False)
>     weight = g.new_edge_property("long long")
>     g.add_edge_list(np.array(wedges), eprops=[weight])
>     tree=min_spanning_tree(g, weights=weight)
>     return sum(b*weight[e] for (e,b) in zip(g.edges(), tree))

Yes, this is what I meant. Note that you can still remove the last loop 
altogether by accessing the property maps as arrays:

    (tree.a * weight.a).sum()

Best,
Tiago
-- 
Tiago de Paula Peixoto 




signature.asc
Description: OpenPGP digital signature
___
graph-tool mailing list
graph-tool@skewed.de
https://lists.skewed.de/mailman/listinfo/graph-tool


Re: [graph-tool] Slow filling graph

2018-11-29 Thread elastica
Again, doesn't work but the code was visible in the preview.

Here the code outside any tag:



def kruskal_gt(wedges,n):
g= gt.Graph(directed=False)
weight = g.new_edge_property("long long")
g.add_edge_list(np.array(wedges), eprops=[weight])
tree=min_spanning_tree(g, weights=weight)
return sum(b*weight[e] for (e,b) in zip(g.edges(), tree))






--
Sent from: 
http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/
___
graph-tool mailing list
graph-tool@skewed.de
https://lists.skewed.de/mailman/listinfo/graph-tool


Re: [graph-tool] Slow filling graph

2018-11-29 Thread elastica
;)

Oops, that's infortunate, sorry for the confusion, yet I pasted the code
inside a raw text tag.


Again  :





--
Sent from: 
http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/
___
graph-tool mailing list
graph-tool@skewed.de
https://lists.skewed.de/mailman/listinfo/graph-tool


Re: [graph-tool] Slow filling graph

2018-11-29 Thread Tiago de Paula Peixoto
Am 29.11.18 um 14:59 schrieb elastica:
> Hi,
>
> Is the following code correspond to the code you have in mind:
>
>
>
> ?
>
>

It seems you forgot to actually paste the code.

-- 
Tiago de Paula Peixoto 




signature.asc
Description: OpenPGP digital signature
___
graph-tool mailing list
graph-tool@skewed.de
https://lists.skewed.de/mailman/listinfo/graph-tool


Re: [graph-tool] Slow filling graph

2018-11-29 Thread elastica
Hi, 

Is the following code correspond to the code you have in mind:



?



--
Sent from: 
http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/
___
graph-tool mailing list
graph-tool@skewed.de
https://lists.skewed.de/mailman/listinfo/graph-tool


Re: [graph-tool] Slow filling graph

2018-11-29 Thread Tiago de Paula Peixoto
Am 29.11.18 um 10:20 schrieb elastica:
> The *add_edge_list* internally changes edges ordering to the lexicographic
> one so in order to set weights, I have to sort the weighted-edge list,
> adding some little overhead.

That's not true; no re-ordering is performed by add_edge_list().

> On the other hand, *add_edge_list()* is much
> more efficient than adding edges one by one with the add_edge method (about
> 7 times faster!). Unfortunately, this is still slow, even slower than
> NetworkX (about 1.2 times slower and we all know that NetworkX highest
> quality is not speed).

That's because you are looping over the edges _twice_. Take a more careful
look at the documentation for add_edge_list(), and you find that it accepts
an "eprops" parameters, which lists the edge property maps that need to be
filled as well. Just pass your weights to this parameter, and feed it a list
of (source, target, weight).

For it to be even faster, you can make your edge list be a Numpy array ---
in which case the entire loop will be in C++.

-- 
Tiago de Paula Peixoto 



signature.asc
Description: OpenPGP digital signature
___
graph-tool mailing list
graph-tool@skewed.de
https://lists.skewed.de/mailman/listinfo/graph-tool


Re: [graph-tool] Slow filling graph

2018-11-29 Thread elastica
Hi Alexandre and thanks for your useful response


The *add_edge_list* internally changes edges ordering to the lexicographic
one so in order to set weights, I have to sort the weighted-edge list,
adding some little overhead. On the other hand, *add_edge_list()* is much
more efficient than adding edges one by one with the add_edge method (about
7 times faster!). Unfortunately, this is still slow, even slower than
NetworkX (about 1.2 times slower and we all know that NetworkX highest
quality is not speed).

Here is the new version:

# ***
from time import clock
from sys import stderr, stdin

import graph_tool as gt
from graph_tool.topology import min_spanning_tree  

import networkx as nx


#  NetworkX Code ---

def kruskal_networkX(edges, n):
G=nx.Graph()
for a, b, w in edges:
G.add_edge(a,b,weight=w)
return sum(d['weight'] for (_,_,d) in
nx.minimum_spanning_tree(G).edges(data=True))



#  Graph_tool Code ---

def kruskal_gt(wedges,n):
wedges.sort()
g= gt.Graph(directed=False)
edges, weights= zip(*[((a,b),w) for (a,b,w) in wedges])
weight = g.new_edge_property("long long")
g.add_edge_list(edges)
for e,(_,_,w) in zip(g.edges(), wedges):
weight[e]=w
tree=min_spanning_tree(g, weights=weight)
return sum(b*weight[e] for (e,b) in zip(g.edges(), tree))

#  Benchmark ---

def go(solver, L):
global duration

N=len(L)
k=1

while k gt.out 
kruskal_networkX : 15.851
kruskal_gt : 19.713



--
Sent from: 
http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/
___
graph-tool mailing list
graph-tool@skewed.de
https://lists.skewed.de/mailman/listinfo/graph-tool


Re: [graph-tool] Slow filling graph

2018-11-26 Thread Alexandre Hannud Abdo
Ni! Hi elastica,

In graph-tool you can efficiently add edges with properties using the
`add_edge_list` function:

https://graph-tool.skewed.de/static/doc/graph_tool.html#graph_tool.Graph.add_edge_list

I'd also suggest that you benchmark python code with the `cProfile` module,
it is much easier and more informative:

https://docs.python.org/3/library/profile.html

Cheers,
ale
.~´

On Sat, Nov 24, 2018 at 7:36 PM  wrote:

> Hi
>
> I was experimenting Kruskal performance against a data file for 100 graphs
> (n, m<2) :
>
>
> # =
> from time import clock
> from sys import stderr, stdin
>
> import graph_tool as gt
> from graph_tool.topology import min_spanning_tree
>
> import networkx as nx
>
>
> #  NetworkX Code ---
>
> def kruskal_networkX(edges, n):
> G=nx.Graph()
> for a, b, w in edges:
> G.add_edge(a,b,weight=w)
> return sum(d['weight'] for (_,_,d) in
> nx.minimum_spanning_tree(G).edges(data=True))
>
>
>
> #  Graph_tool Code ---
>
> def kruskal_gt(wedges,n):
> g= gt.Graph(directed=False)
> edges, weights= zip(*[((a,b),w) for (a,b,w) in wedges])
> weight = g.new_edge_property("long long")
> for a,b,w in wedges:
> e=g.add_edge(a,b)
> weight[e]=w
> tree=min_spanning_tree(g, weights=weight)
> return sum(b*weight[e] for (e,b) in zip(g.edges(), tree))
>
> #  Benchmark ---
>
> def go(solver, L):
> global duration
>
> N=len(L)
> k=1
>
> while k edges=[]
> n=L[k]
> k+=1
>
> for a in range(n):
> d=L[k]
> k+=1
> for j in range(d):
> b=L[k]
> k+=1
> w=L[k]
> k+=1
> if a edges.append([a,b-1,w])
> start=clock()
> print(solver(edges, n))
> print("")
> duration+=clock()-start
>
> # data
> L=list(int(z) for z in stdin.read().split() if z.isdigit())
>
> for solver in [kruskal_networkX, kruskal_gt]:
> duration=0
> go(solver, L)
> print("%s : %.3f" %(solver.__name__, duration), file=stderr)
>
> # =
>
>
>
> Outputs are in agreement but gt execution is very slow:
>
> $ python3 gt.py < big.in > bench.out
> kruskal_networkX : 15.721
> kruskal_gt : 134.839
>
>
> the bottleneck being within the following loop-filling the graph:
>
>
> # 
> for a,b,w in wedges:
> e=g.add_edge(a,b)
> weight[e]=w
> # 
>
>
> Probably, i'm filling the graph in the wrong way.
>
> You can dowload the data file here:
>
> https://drive.google.com/file/d/1y0fX1kopgzQEWgmRHRnouWHnMkuliwfw/view?usp=sharing
> ___
> graph-tool mailing list
> graph-tool@skewed.de
> https://lists.skewed.de/mailman/listinfo/graph-tool
>
___
graph-tool mailing list
graph-tool@skewed.de
https://lists.skewed.de/mailman/listinfo/graph-tool