[graph-tool] efficient random sampling of paths between two nodes

Franco Peschiera Sat, 21 Mar 2020 15:23:57 -0700

Good day to all,

I wrote an initial message (through the issues site) asking about the
possibility of using the all_paths to random sample paths between two nodes
in a graph.


It was pointed out that the: 1. The algorithm is deterministic and that 2.
It’s not possible to adapt it to iterate in a unbiased random way.

Now I have a couple more questions, namely searching for advice on how to
achieve what I want.
------------------------------

*Context*:

I’ve modeled certain parts of combinatorial optimization problem as a set
of graphs that I’m exploiting to efficiently generate variables on the fly
during the optimization routine. A variable in my problem is represented by
a path between two nodes in one of those graphs.
------------------------------

*Specifically, what I want*:

   1. I have a big graph.
   2. During each iteration I choose two nodes on the graph (based on some
   logic irrelevant to the question).
   3. I get all the paths between those two nodes (or I randomly sample
   them if there are too many).

For the sampling in the third step *I’m currently doing* the following
(assuming a sample of size X and a population of paths of Y):

   1. I get the generator of paths using the function all_paths.
   2. I iterate over the generator using Reservoir sampling
   <https://en.wikipedia.org/wiki/Reservoir_sampling> and I stop if 1) I
   explored all paths or 2) I reach a multiple of X (e.g., 3X).

Many times, the sample X is really small compared to the population Y). X
can be 500-2000, and Y can be 100.000+.
------------------------------

*My problem is*:

Since I could potentially do many iterations (and samplings), I would like
to have an unbiased sampling method for the Y paths without having to
enumerate them all.

One option is to do the sampling during the construction of the paths. I
have in mind to just use the get_out_neighbors method on the start node to
do my own depth-first search, randomly choosing a neighbor at each moment
until I get to the cutoff or the end node. This, I could do for each path
until I get to my limit of paths to sample. I’m not sure, though, if this
is 1) similar in logic to the all_paths method and 2) if this is close in
efficiency (due to using python to iterate and sample instead of C++).

Another option, although a lot more far-fetched, is to create my own
all_paths method in C++ and re-compile my own version of graph-tool. Is
this realistic?

Are there better options to doing this?

Thanks,

Franco Peschiera

_______________________________________________
graph-tool mailing list
[email protected]
https://lists.skewed.de/mailman/listinfo/graph-tool

[graph-tool] efficient random sampling of paths between two nodes

Reply via email to