[graph-tool] joblib parallel for with graph-tool filtering?

Tasos Fri, 04 May 2018 01:03:02 -0700

When using graph-tool together with joblib, do we need to pass graph.copy()
in the "Parallel" call, as in the code below, if each task filters vertices
with .set_vertex_filter? graph.copy() makes memory usage extreme on large
graphs (2M vertices, 4M edges), but in my head it avoids any concurrency
problems. Or is passing 'graph' without '.copy()' OK?


What is the best way to run parallel graph searches and filtering (a
different vertex per thread) with graph-tool and joblib (or without joblib)?


###
# defined and filled earlier
import numpy as np
import graph_tool
import graph_tool.search

g_graph = graph_tool.Graph(directed=False)
eprop_ang = g_graph.new_edge_property("float")

###
from joblib import Parallel, delayed
import multiprocessing
import os
import tempfile
import shutil
import datetime

path2 = tempfile.mkdtemp()
out_path2 = os.path.join(path2, 'z6path_out2.mmap')
# 'dims' (number of output columns per vertex) is defined earlier
out2 = np.memmap(out_path2, dtype=np.float32,
                 shape=(g_graph.num_vertices(), dims), mode='w+')

num_cores = 30
num_pre_workers = 60

def runparallel(graph, row, out2):
    # shortest paths from vertex 'row', weighted by eprop_ang
    dist, pred = graph_tool.search.dijkstra_search(
        graph, source=graph.vertex(row), weight=eprop_ang)
    ## etc etc
    #####

    v_filter = graph.new_vertex_property('bool', val=False)
    for v in SOMETHING_LOCAL:
        v_filter[v] = True
    graph.set_vertex_filter(v_filter)
    # do something with the filtered 'graph' (subgraph)
    # and save output to out2
    out2[row] = RESULT
    ##
    graph.clear_filters()


Parallel(n_jobs=num_cores, pre_dispatch=num_pre_workers, verbose=1)(
    delayed(runparallel)(g_graph.copy(), r, out2)
    for r in range(g_graph.num_vertices()))
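
To make the concern concrete, here is a minimal sketch of the alternative to
copying. It uses a toy adjacency dict and the standard library's thread pool
instead of graph-tool and joblib; both substitutions are assumptions made only
so the example runs anywhere. Each task keeps its filter in a local set rather
than setting a filter on the shared object, so nothing shared is mutated and
no per-task copy is needed. The names ADJ, reachable_within and task are
hypothetical.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

# Toy undirected graph as an adjacency dict; a stand-in for the real graph.
ADJ = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3],
    3: [1, 2, 4],
    4: [3],
}


def reachable_within(adj, source, allowed):
    """BFS from `source` that only visits vertices contained in `allowed`.

    `allowed` plays the role of a per-task vertex filter. It is local to
    the call, so the shared `adj` structure is only ever read.
    """
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return sorted(seen)


def task(row):
    # Hypothetical local filter: the source vertex plus its direct neighbours.
    allowed = {row, *ADJ[row]}
    return row, reachable_within(ADJ, row, allowed)


# One task per vertex; all tasks read the same ADJ without copying it.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(task, ADJ))
```

In graph-tool the closest analogue is probably a per-task
graph_tool.GraphView(graph, vfilt=v_filter), which as far as I understand
leaves the filter state of the original graph untouched. Whether that is safe
and fast under joblib is exactly the open question here.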




